Add autonomous ML experiment support by Saba9 · Pull Request #521 · gradio-app/trackio

Saba9 · 2026-04-20T17:47:56Z

Summary

3 new CLI commands for agent workflows: trackio best, trackio compare, trackio summary — collapse multi-step agent workflows into single calls with --json output
Run status tracking — runs automatically marked as running/finished/failed, exposed via CLI and Python API
Structured alert data — trackio.alert(data={...}) for machine-readable payloads agents can parse without text extraction
Metric watchers — trackio.watch() + trackio.should_stop() for automatic NaN/spike/stagnation/threshold detection
Python API completeness — run.metrics(), run.history(), run.summary, run.status on Api.Run
Test harness — synthetic training simulator + agent test runner with 5 experiment types (LR search, architecture search, failure recovery, long monitoring, multi-objective)
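The watcher behavior listed above (NaN, spike, and stagnation detection feeding a stop signal) can be sketched in plain Python. This is a minimal illustration and not Trackio's implementation: the `MetricWatcher` class, its parameters (`window`, `spike_factor`, `patience`, `mode`), and the event names are all assumptions made for this example.

```python
import math
from collections import deque


class MetricWatcher:
    """Hypothetical watcher for a single metric (illustration only)."""

    def __init__(self, window=20, spike_factor=3.0, patience=5, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.values = deque(maxlen=window)  # bounded history of recent values
        self.spike_factor = spike_factor
        self.patience = patience
        self.mode = mode
        self.best = None
        self.steps_since_best = 0
        self.stop = False

    def update(self, value):
        """Record one value and return the list of conditions it triggered."""
        if math.isnan(value):
            self.stop = True
            return ["nan"]

        events = []
        # Spike: the new value is far above the mean of the recent window.
        if len(self.values) >= 3:
            mean = sum(self.values) / len(self.values)
            if mean > 0 and value > self.spike_factor * mean:
                events.append("spike")
        self.values.append(value)

        # Stagnation: no improvement over the best value for `patience` updates.
        improved = (
            self.best is None
            or (self.mode == "min" and value < self.best)
            or (self.mode == "max" and value > self.best)
        )
        if improved:
            self.best = value
            self.steps_since_best = 0
        else:
            self.steps_since_best += 1
            if self.steps_since_best >= self.patience:
                events.append("stagnation")
                self.stop = True
        return events

    def should_stop(self):
        return self.stop


# Usage: a training loop would check should_stop() after each logged value.
watcher = MetricWatcher(patience=3, mode="min")
for loss in [1.0, 0.8, 0.9, 0.85, 0.81]:
    watcher.update(loss)
    if watcher.should_stop():
        break
```

In this sketch a NaN or a stagnating metric sets the stop flag, while a spike is only reported. Which conditions should actually halt training is a design choice the summary does not specify.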

Test plan

All 14 e2e-local tests pass
ruff check and ruff format pass clean
Manual verification: trackio best, trackio compare, trackio summary with JSON and human-readable output
Manual verification: run status tracking (running → finished, running → failed on crash)
Manual verification: structured alert data round-trips through storage and CLI
Manual verification: metric watchers fire alerts and trigger should_stop()
Run agent_runner.py end-to-end with all 5 experiments

🤖 Generated with Claude Code

…ers, run status, structured alerts

Adds agent-facing query APIs and tooling to make Trackio the definitive tracker for autonomous ML experiments driven by AI coding agents.

New CLI commands:
- `trackio best`: rank runs by metric, return winner + leaderboard
- `trackio compare`: side-by-side run comparison across metrics
- `trackio summary`: full experiment overview with status/configs/metrics

New features:
- Run status tracking (running/finished/failed) with automatic lifecycle mgmt
- Structured alert data (`data={}` param on `trackio.alert()`)
- Metric watchers (`trackio.watch()`) for auto NaN/spike/stagnation detection
- `trackio.should_stop()` for training loop early stopping
- Python API: `run.metrics()`, `run.history()`, `run.summary`, `run.status`

Test infrastructure:
- Synthetic training simulator (no ML deps, runs in seconds)
- Agent test runner with 5 experiment types

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
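As a rough illustration of what "rank runs by metric, return winner + leaderboard" could mean, here is a small standalone sketch. It does not call Trackio. The `rank_runs` function, its input shape (run name mapped to a final metric value), and the keys of the returned dictionary are assumptions made for this example and may not match the JSON that `trackio best` emits.

```python
import json


def rank_runs(final_metrics, direction="min"):
    """Rank runs by one metric value (hypothetical helper).

    final_metrics: dict mapping run name -> metric value.
    direction: "min" if lower is better, "max" if higher is better.
    """
    if direction not in ("min", "max"):
        raise ValueError("direction must be 'min' or 'max'")
    if not final_metrics:
        raise ValueError("no runs to rank")
    leaderboard = sorted(
        final_metrics.items(),
        key=lambda item: item[1],
        reverse=(direction == "max"),
    )
    return {"best": leaderboard[0][0], "leaderboard": leaderboard}


# Usage: pick the run with the lowest validation loss (made-up values).
result = rank_runs({"lr-1e-3": 0.42, "lr-3e-4": 0.31, "lr-1e-4": 0.55})
print(json.dumps(result, indent=2))
```

Returning the whole leaderboard along with the winner lets an agent inspect the runners-up in the same call, without a second query.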

Resolve merge conflicts integrating autonomous ML features with main branch changes (run_id support, RemoteClient, query command, OAuth, error handling). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

gradio-pr-bot · 2026-04-20T18:25:53Z

🪼 branch checks and previews

•	Name	Status	URL
🦄	Changes	detected!	Details

gradio-pr-bot · 2026-04-20T18:25:56Z

🦄 change detected

This Pull Request includes changes to the following packages.

Package	Version
`trackio`	`minor`

Add autonomous ML experiment support

‼️ Changeset not approved. Ensure the version bump is appropriate for all packages before approving.

Maintainers can approve the changeset by checking this checkbox.

Something isn't right?

Maintainers can change the version label to modify the version bump.
If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

HuggingFaceDocBuilderDev · 2026-04-20T18:26:25Z

🪼 branch checks and previews

•	Name	Status	URL
	Spaces	ready!	Spaces preview

Install Trackio from this PR (includes built frontend)

pip install "https://huggingface.co/buckets/trackio/trackio-wheels/resolve/2283983183942ecd90479c36298cb09d951f0348/trackio-0.24.0-py3-none-any.whl"

HuggingFaceDocBuilderDev · 2026-04-20T18:28:20Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Fixes:
- Fix 1: Run status "failed" no longer overwritten by "finished": finish() now accepts a status parameter, _cleanup_current_run passes status="failed" directly
- Fix 2: Watcher patience supports both min and max mode via new mode parameter (was hardcoded to minimization only)
- Fix 3: Replace dead --minimize/--maximize flags with --direction {min,max} on trackio best
- Fix 5: Watcher _values now uses deque(maxlen=window) to bound memory; alert dedup via per-condition flags that reset on return to normal
- Fix 6: Warn if watchers exist when init() clears them; docstring documents ordering requirement
- Fix 7: Drop Api.Run.summary cache, recompute on each access
- Fix 9: set_run_status/get_run_status now accept and use run_id, INSERT handles run_id NOT NULL column in new schema
- Fix 12: Remove unused enumerate variable in agent_runner

Tests:
- test_watchers.py: 18 tests covering nan, spike, max/min threshold, dedup, patience min/max mode, window bounds, manager propagation
- test_run_status.py: 6 tests covering running→finished, failed status, idempotent finish, Api.Run.status, multi-run status
- test_cli_agent_commands.py: 10 tests covering best/compare/summary in JSON and human-readable modes, error cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
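The status behavior described in Fix 1 (a "failed" run must not be relabeled "finished", and finish must be idempotent) can be sketched with a small in-memory model. This is an illustration under assumptions, not the code from the PR: the `RunStatus` class and the `run_training` helper are hypothetical, and the real change stores status in a database keyed by run_id.

```python
class RunStatus:
    """Hypothetical in-memory run lifecycle: running -> finished | failed."""

    TERMINAL = ("finished", "failed")

    def __init__(self):
        self.status = "running"

    def finish(self, status="finished"):
        if status not in self.TERMINAL:
            raise ValueError(f"status must be one of {self.TERMINAL}")
        # Only the first terminal status is kept, so a later default
        # finish() call cannot overwrite "failed" with "finished".
        if self.status == "running":
            self.status = status
        return self.status


def run_training(run, step_fn, num_steps):
    """Drive a training loop and record how it ended.

    The exception is swallowed here only to keep the sketch short;
    real code would normally re-raise it.
    """
    try:
        for step in range(num_steps):
            step_fn(step)
    except Exception:
        run.finish(status="failed")
    finally:
        run.finish()  # no-op if the run was already marked failed
    return run.status
```

Passing the terminal status explicitly, as Fix 1 describes, avoids depending on the order in which cleanup hooks run.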

Saba9 and others added 4 commits April 20, 2026 10:45

Fix lint warnings in test harness scripts

f8303cd

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove remaining unused variable in simulator

4df7718

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge origin/main into saba/auto-ml

c1f8311

add changeset

b959e49

Saba9 and others added 2 commits April 20, 2026 11:50

Fix undefined variable in agent_runner after enumerate removal

2283983

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Saba9 wants to merge 7 commits into main from saba/auto-ml

3 participants
