retrain is a TOML-first RLVR (Reinforcement Learning with Verifiable Rewards) trainer for LLMs, built to make experiments easier to run, compare, and repeat.
If you are new, start with install -> explore commands -> run a tiny config.
Requires Python 3.11+.
```shell
# CLI + docs exploration
uv tool install retrain

# Local GPU training (adds torch)
uv tool install "retrain[local]"

# Remote Tinker backend
uv tool install "retrain[tinker]"
```

If you are developing this repo directly:

```shell
pip install -e ".[dev]"
```

Use these first to understand what exists before you train:
```shell
retrain --help
retrain man
retrain man --topic quickstart
retrain man --list-topics
retrain backends
retrain doctor
```

Useful inspection commands while iterating:
```shell
retrain explain retrain.toml   # dry-run: what this config would do
retrain status logs            # summarize runs/campaigns under logs/
retrain plugins                # list built-ins + discovered plugins
retrain init-plugin --kind transform --name my_transform --with-test
retrain man --json --topic quickstart
retrain man --path             # editable bundled manual source
```

Create `mini.toml`. Note that `max_tokens = 1024` below is an intentional smoke-test profile; the standard default for full runs is `max_tokens = 10240`.
```toml
[model]
model = "Qwen/Qwen3-4B-Instruct-2507"

[algorithm]
advantage_mode = "grpo"
transform_mode = "none"

[training]
max_steps = 20
batch_size = 2
group_size = 8
max_tokens = 1024
lr = 4e-5

[backend]
backend = "local"
adapter_path = "adapters/mini"

[logging]
log_dir = "logs/mini"
```

Run it:

```shell
retrain mini.toml
```

Override fields from the CLI without editing the TOML:
```shell
retrain mini.toml --seed 42 --max-steps 40 --wandb-project my-project
```

Or scaffold a config from a template:

```shell
retrain init --template quickstart
retrain retrain.toml
```

Other templates:
```shell
retrain init --list
retrain init --template experiment
retrain init --template campaign
retrain init --interactive
```

The normal retrain loop is:
- Define a TOML config (`retrain.toml` or `campaign.toml`)
- Dry-run with `retrain explain ...`
- Train with `retrain ...`
- Inspect with `retrain status logs`
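Put together, one iteration of the loop looks like this. This is a sketch using only the commands shown above and assumes a `retrain.toml` already exists (e.g. from `retrain init`):

```shell
retrain explain retrain.toml   # dry-run: sanity-check what the config would do
retrain retrain.toml           # train
retrain status logs            # inspect the resulting run
```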
Use `retrain man --topic capacity` only when you are sizing longer runs.
- Experiment-first workflow: config -> explain -> run -> compare
- Composable advantage pipeline: GRPO/MaxRL + GTPO/HICRA/SEPA
- Pluggable backends and inference engines
- Pluggable rewards (match, math, judge, custom)
- Campaign sweeps from one TOML
- LoRA-Squeeze rank analysis/compression
- Checkpoint resume and run status tooling
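As one illustration of the composable pipeline, an advantage mode can be paired with a transform in `[algorithm]`. Only `advantage_mode = "grpo"` and `transform_mode = "none"` are confirmed by the quickstart config; the `"gtpo"` string below is a hypothetical name for the GTPO transform, so check `retrain man` for the exact spelling:

```toml
[algorithm]
advantage_mode = "grpo"   # base advantage, as in the quickstart config
transform_mode = "gtpo"   # hypothetical mode name for the GTPO transform
```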
Use verifiers environments from TOML:
```toml
[environment]
provider = "verifiers"
id = "primeintellect/gsm8k"
args = { split = "train" }
auto_install = true
max_turns = 8
```

Use custom advantage + transform plugins from TOML:
```toml
[algorithm]
advantage_mode = "my_advantages.hipa_like_advantages"
transform_mode = "my_transforms.make_transform_spec"
```

Use a full algorithm plugin (this overrides the composable advantage + transform path):
```toml
[algorithm]
algorithm_mode = "my_algorithms.my_algorithm"
```

Full docs: retrain.readthedocs.io
- Getting Started
- Configuration Reference
- Advantage Functions
- SEPA Scheduling
- Campaigns
- Capacity Planning
- LoRA-Squeeze
- Reward Functions
- Inference Engines
Contributor note: run `retrain man --check` in CI to detect stale auto-generated manual blocks, run `retrain man --sync` locally to refresh them, and run `uv run mkdocs build --strict` before publishing docs changes.