[Changelog] Move nightly commit/push lifecycle into cli.py auto-bump subcommand#5867
[Changelog] Move nightly commit/push lifecycle into cli.py auto-bump subcommand#5867hujc7 wants to merge 1 commit into
Conversation
The nightly changelog cron's wedge mode on 2026-05-29 (workflow's git-add glob missing pyproject.toml after isaac-sim#5785 added the pyproject write path) was a producer/consumer drift bug between cli.py and nightly-changelog.yml. The two surfaces carried duplicated knowledge of which files cli.py touches, and the YAML side never got the memo when the producer grew a new write site. Lift the entire post-compile lifecycle (stage, commit, rebase-on-conflict loop, push) out of inline shell in the workflow and into a new ``cli.py auto-bump`` subcommand. The orchestrator derives its staging set from each ``Package.compile`` return value, so the glob is an in-process list of files cli.py just wrote — drift is impossible by construction. The workflow shrinks to one ``python3 cli.py auto-bump`` call (follow-up on main once this lands on develop and release). Changes: - ``Package.write_changelog_entry`` and ``Package.write_version`` now return the list of paths they wrote (empty in dry-run). - ``Package.compile`` returns ``(compiled, touched)`` — the existing truthy/falsy semantics carry through ``cmd_compile``'s ``any_compiled`` tracking. - New ``GitRepo`` class: thin subprocess wrapper around the ``git`` CLI with one nested exception (``NonFastForward``) for the policy decision the orchestrator cares about. - New ``AutoBumpRun`` class: end-to-end lifecycle owner. Hardcoded bot identity (deterministic public values, not secrets), three-retry push loop with fetch+rebase on conflict, partial-success across packages. - New ``cmd_auto_bump`` handler + subparser. Tests in tools/changelog/test/test_auto_bump.py cover seven scenarios against a tempdir working tree + bare-repo remote with no GitHub interaction: - Happy path commits and pushes. - pyproject.toml staging regression (today's bug class). - Non-fast-forward push → rebase loop → second push succeeds. - Exhausted retries raise GitError. - Partial success: one bad package, one good; healthy commit ships, exit 1. - Dry-run skips commit/push entirely. - No fragments is a clean no-op. Workflow YAML on main consumes the new subcommand in a follow-up PR.
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Verdict: ✅ LGTM
This PR correctly addresses the producer/consumer drift class of bug that caused the 2026-05-29 nightly failure.
What I Checked
| Area | Status |
|---|---|
| CLI subcommand implementation | ✅ Correct |
| Error handling | ✅ Robust |
| Replaces workflow steps properly | ✅ Yes |
| Test coverage | ✅ Excellent (7 scenarios) |
Strengths
-
Drift-proof by construction —
Package.compile()returning(compiled, touched)means the staging set is derived from what cli.py actually wrote, not a duplicated glob in YAML. -
Clean architecture —
GitRepo(subprocess wrapper with typed exceptions) andAutoBumpRun(orchestrator) have clear single responsibilities. -
Robust race handling — The 3-retry push loop with fetch+rebase on
NonFastForwardhandles concurrent commits gracefully without spinning forever. -
Partial success semantics — One broken package does not block healthy packages from shipping; exit code reflects failures appropriately.
-
Comprehensive tests — All 7 scenarios exercise real tempdir repos with bare-remote pushes, no mocking subprocess. This catches integration issues the PR gate would miss with unit-only tests.
Minor Notes (not blocking)
_extract_version_from_diffgracefully returns"?"on parse failure — appropriate for informational commit messages.- Bot identity hardcoded as class constants is intentional (public values, per the design doc).
No issues found. Ready for human reviewer sign-off.
Reviewed at HEAD: 15e9d206bfeb8fe53f865510399398ad92835f36
1. Summary
.github/workflows/nightly-changelog.ymland into a newcli.py auto-bumpsubcommand.git addglob missedpyproject.tomlafter Clean up sub module packaging and remove setup.py #5785 added the pyproject write site). AutoBumpRun stages whatever eachPackage.compileactually wrote — the staging set is an in-process list, not a duplicated glob in YAML.2. Background
The 2026-05-29 incident (https://github.com/isaac-sim/IsaacLab/actions/runs/26621920244) traced back to the workflow's
git addcarrying duplicated knowledge of which files cli.py modifies. #5785 added a write site in cli.py without updating the YAML's glob; the cron noticed only after #5831 cleared an upstream wedge that had been masking the issue.#5859 patches the immediate symptom on main. This PR removes the duplication so future producer changes (new file-write sites in cli.py) don't need paired YAML edits to keep the cron healthy. Discussed at length in the iteration thread following #5859.
3. Design
3.1
Package.compilereturns the touched-paths manifestwrite_changelog_entryandwrite_versionnow return the list of paths they wrote (empty in dry-run).Package.compilereturns(compiled, touched)—compiledpreserves the existingany_compiledsignal forcmd_compile;touchedis the in-process manifest the orchestrator stages.No callable wrappers, no external manifest file, no subprocess hand-off. The list lives in memory for the duration of the run.
3.2
GitRepo— thin subprocess wrapperSingle responsibility: shell out to
gitand raise typed exceptions on the failure modes that need policy. One nested exception (GitRepo.NonFastForward) for the recoverable case the orchestrator retries. Stateless beyondcwd. Easy to test by pointing at a tempdir git repo.3.3
AutoBumpRun— per-run orchestratorOwns the per-run state (branch, remote, dry-run flag, touched list, failure list,
any_compiledflag). Therun()method reads top-down:_compile_allwalksPackage.discover()with per-package try/except (partial success)._stage_and_commitconfigures the bot author identity (deterministic public values, hardcoded as class constants per [[feedback_oop_when_possible]] for non-trivial code), stagesself.touched, commits with the structured message._push_with_retryis a three-attempt fetch+rebase+push loop. Rebase only fires when the race actually fires, and always against a clean (committed) working tree — eliminates the unstaged-residue trap that bricked the cron today._exit_codereturns 1 if any package raised during compile, 0 otherwise.The bot identity values are public, not secret (they appear on every nightly commit). Hardcoding them as class constants keeps cli.py self-contained without leaking secrets — the App token itself stays in repo secrets and is consumed by the workflow before cli.py runs.
3.4 Test coverage in
tools/changelog/test/test_auto_bump.pySeven scenarios, all against a tempdir working tree + bare-repo remote with no GitHub interaction:
test_auto_bump_happy_path_commits_and_pushes— single package with a fragment compiles and lands on the bare remote with the bot's author identity.test_auto_bump_stages_and_commits_pyproject— regression for today's bug: package with[project] versioninpyproject.tomlproduces a commit that carries both extension.toml AND pyproject.toml bumps.test_auto_bump_rebases_on_non_fast_forward— sidecar commit pre-pushed to bare remote; first push fails non-fast-forward; rebase loop fetches the new tip, replays the auto-commit, second push succeeds; bot's commit lands on top with human's commit one below.test_auto_bump_raises_after_exhausting_retries— every retry races against another sidecar commit; orchestrator raisesGitErrorafterPUSH_RETRIESattempts rather than spinning.test_auto_bump_ships_healthy_packages_when_one_fails— two packages, one with a broken CHANGELOG.rst header; healthy commit ships to remote, exit code 1.test_auto_bump_dry_run_does_not_commit— no commits, no push, fragments preserved on disk.test_auto_bump_with_no_fragments_is_a_noop— exit 0, no commit, no push.All 7 pass. Full suite goes from 90 → 97.
4. Follow-up
This PR is develop-scoped: pure
cli.py+ tests. The workflow YAML edit on main that consumes the new subcommand is a one-line follow-up:Will land on main after this PR is in develop AND cherry-picked to release/3.0.0-beta2, so every cron-target branch carries
auto-bumpbefore main's workflow tries to invoke it.5. What stays in cli.py vs the workflow
After both PRs land:
The PR gate (
.github/workflows/changelog-check.yml) already runs the changelog test suite on PRs touchingtools/changelog/**, so future producer changes get caught at PR time by the new test_auto_bump.py without any workflow YAML edits.6. Test plan
pytest tools/changelog/test/— 97/97 pass (was 90, +7 new)../isaaclab.sh -fclean.[rejected]for both git-side and ruleset-side rejections).