Tune Shadow-Hand Vision default iters from 50k to 5k #5837

Open
AntoineRichard wants to merge 1 commit into isaac-sim:develop from AntoineRichard:antoiner/tune/shadow-hand-vision-iters

@AntoineRichard
Collaborator

Description

The Shadow-Hand Vision example (in-hand cube reposing with camera observations) currently ships with a default training schedule of 50000 iterations, which is a 10-30 hour wall-clock job on a current GPU. Empirical training curves show convergence well before 5k iterations, so the 50k default is a long no-op tail that discourages operators from running the example as shipped.

This PR drops the default to 5000 for both training frameworks that have a vision config:

  • rsl_rl_ppo_cfg.py: ShadowHandVisionFFPPORunnerCfg.max_iterations 50000 → 5000
  • rl_games_ppo_vision_cfg.yaml: max_epochs 50000 → 5000

skrl has no Shadow-Hand vision config, so nothing changes there.

Users who want the prior long schedule can still pass --max_iterations 50000 on the CLI; both scripts/reinforcement_learning/rsl_rl/train.py and scripts/reinforcement_learning/rl_games/train.py already plumb that flag through to the agent config (rsl_rl:78, rl_games:73).
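The override path described above can be sketched as follows. This is a minimal illustration, not the code in either train.py: `RunnerCfgSketch`, `parse_args`, and the two `apply_override_*` helpers are hypothetical names, and the `params` → `config` → `max_epochs` nesting of the rl_games dictionary is an assumption about the YAML layout. Only the `--max_iterations` flag, the `max_iterations` and `max_epochs` keys, and the 5000/50000 values come from this PR.

```python
import argparse
from dataclasses import dataclass


@dataclass
class RunnerCfgSketch:
    """Hypothetical stand-in for ShadowHandVisionFFPPORunnerCfg (not the real class)."""

    max_iterations: int = 5000  # new default introduced by this PR


def parse_args(argv):
    """Parse only the flag relevant here; None means 'keep the config default'."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--max_iterations", type=int, default=None)
    return parser.parse_args(argv)


def apply_override_rsl_rl(cfg, args):
    """Overwrite the rsl_rl-style attribute only when the flag was given."""
    if args.max_iterations is not None:
        cfg.max_iterations = args.max_iterations
    return cfg


def apply_override_rl_games(cfg, args):
    """Overwrite the rl_games-style key only when the flag was given (nesting assumed)."""
    if args.max_iterations is not None:
        cfg["params"]["config"]["max_epochs"] = args.max_iterations
    return cfg


if __name__ == "__main__":
    default_run = apply_override_rsl_rl(RunnerCfgSketch(), parse_args([]))
    long_run = apply_override_rsl_rl(
        RunnerCfgSketch(), parse_args(["--max_iterations", "50000"])
    )
    print(default_run.max_iterations, long_run.max_iterations)
```

Using `None` as the flag default keeps "flag not passed" distinguishable from any explicit value, so the shipped default of 5000 applies only when the user says nothing.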

Reproducing the example with the new default:

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py \
    --task Isaac-Repose-Cube-Shadow-Vision-Direct-v0 \
    --num_envs 4096 --headless

To restore the old long schedule:

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py \
    --task Isaac-Repose-Cube-Shadow-Vision-Direct-v0 \
    --num_envs 4096 --headless --max_iterations 50000

Fixes # (no issue)

Type of change

  • Documentation update (default-value tuning for an example training schedule; no API changes, opt-in path via existing --max_iterations CLI flag)

Screenshots

N/A — config-default change only.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have added a changelog fragment under source/<pkg>/changelog.d/ for every touched package (do not edit CHANGELOG.rst or bump extension.toml — CI handles that)
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

Tests not added: this PR only changes default values of two config constants. The convergence claim that motivates the change is an empirical observation; landing a test for it would mean running the full training, which is exactly the multi-hour cost this PR exists to avoid. Happy to add a smoke-level configclass loadability test if a reviewer wants one.

The Shadow-Hand Vision (in-hand cube reposing with camera obs) PPO
agents ship with a default training schedule of 50000 iterations,
which is a 10-30 hour wall-clock job on a current GPU. Empirical
training curves show convergence well before 5k iterations, so the
50k default amounts to a long no-op tail that scares operators away
from running the example as shipped.

Drop the default to 5000 for both training frameworks
(ShadowHandVisionFFPPORunnerCfg for rsl_rl, max_epochs in
rl_games_ppo_vision_cfg.yaml). Users who still want the long
schedule can pass --max_iterations 50000 on the train.py CLI; both
scripts already plumb that flag through to the agent config.
github-actions (bot) added the isaac-lab (Related to Isaac Lab team) label on May 28, 2026

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

🤖 Automated Code Review

Summary

This PR appropriately reduces the default training iterations for the Shadow-Hand Vision task from 50k to 5k, providing a much better out-of-box experience for users exploring this example.

Review

✅ Configuration Consistency
Both RL frameworks (rsl_rl and rl_games) are updated in lockstep, and the author correctly identified that skrl has no vision config requiring changes.

✅ Backward Compatibility
The existing --max_iterations CLI flag allows users who need longer training schedules to easily restore the previous behavior. No API changes or breaking modifications.

✅ Documentation
The changelog fragment is well-written and follows the project's established format. It clearly explains:

  • What changed
  • Why it changed (the 50k default is a 10-30 hour wall-clock job)
  • How to opt into the old behavior

✅ Code Quality

  • Clean diff affecting only the intended constants
  • No unrelated changes
  • Pre-commit checks pass

Notes

The decision not to add tests is reasonable here—verifying convergence claims would require running the actual multi-hour training job, which defeats the purpose of this PR. The change is to default values only; the underlying training logic remains untested by this PR but is unaffected.


Verdict: LGTM 👍

This is a well-scoped quality-of-life improvement. The 10× reduction in default iterations will make the Shadow-Hand Vision example much more accessible for new users while preserving full flexibility for those who need extended training.

@greptile-apps
Contributor

greptile-apps Bot commented May 28, 2026

Greptile Summary

Reduces the default training budget for the Shadow-Hand Vision task from 50 000 to 5 000 iterations across both rsl_rl and rl_games configs. All other hyperparameters (learning rate, save cadence, checkpoint frequency, entropy coefficient) are untouched.

  • ShadowHandVisionFFPPORunnerCfg.max_iterations (rsl_rl_ppo_cfg.py) and max_epochs (rl_games_ppo_vision_cfg.yaml) are each set to 5000. If wall-clock time scales roughly linearly with iteration count, the 10-30 hour job cited in the PR description shrinks to roughly 1-3 hours on a comparable GPU.
  • The skrl framework has no vision config for this task, so it is correctly left unchanged; the existing --max_iterations CLI flag provides an opt-in path back to the old long schedule.
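The wall-clock effect of the new default can be estimated with a short calculation. This sketch assumes run time is proportional to iteration count, which ignores fixed startup cost; the 10-30 hour range is the figure from the PR description, and `scaled_hours` is a hypothetical helper name.

```python
def scaled_hours(old_hours, old_iters=50000, new_iters=5000):
    """Scale a wall-clock estimate linearly with the iteration budget (an assumption)."""
    return old_hours * new_iters / old_iters


low, high = scaled_hours(10), scaled_hours(30)
print(f"{low:.0f}-{high:.0f} hours")  # 1-3 hours
```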

Confidence Score: 5/5

Safe to merge — both changes are isolated single-value edits to default config constants with no effect on code paths, APIs, or other tasks.

The PR touches exactly two numeric literals across two config files and adds a changelog entry. All surrounding hyperparameters (save cadence, learning rate, entropy coefficient, etc.) remain consistent with the new budget. The existing CLI escape hatch covers users who want the old 50k schedule. No logic, no tests, no APIs are affected.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| source/isaaclab_tasks/isaaclab_tasks/direct/shadow_hand/agents/rsl_rl_ppo_cfg.py | Reduces ShadowHandVisionFFPPORunnerCfg.max_iterations from 50000 to 5000; save_interval (250) and all other hyperparameters are unchanged and remain consistent with the new budget. |
| source/isaaclab_tasks/isaaclab_tasks/direct/shadow_hand/agents/rl_games_ppo_vision_cfg.yaml | Reduces max_epochs from 50000 to 5000; save_frequency (200) and save_best_after (100) remain coherent with the reduced total, yielding 25 checkpoints instead of 250. |
| source/isaaclab_tasks/changelog.d/antoiner-tune-shadow-hand-vision-iters.rst | New changelog fragment accurately describes both changes, the motivation, and the escape hatch via --max_iterations. |
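The checkpoint counts quoted in this review follow from integer division of the iteration budget by the save cadence. The sketch below assumes one periodic save every `save_every` iterations and ignores any extra final or best-model checkpoints a framework may write; `checkpoint_count` is a hypothetical helper name.

```python
def checkpoint_count(total_iterations, save_every):
    """Number of periodic checkpoints, assuming one save every `save_every` iterations."""
    return total_iterations // save_every


# Save cadences as cited in the review: save_interval=250 (rsl_rl), save_frequency=200 (rl_games).
for name, save_every in (("rsl_rl", 250), ("rl_games", 200)):
    old = checkpoint_count(50000, save_every)
    new = checkpoint_count(5000, save_every)
    print(f"{name}: {old} -> {new}")
# rsl_rl: 200 -> 20
# rl_games: 250 -> 25
```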

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Shadow-Hand Vision Training] --> B{Framework?}
    B -->|rsl_rl| C[ShadowHandVisionFFPPORunnerCfg\nmax_iterations: 50000 → 5000\nsave_interval: 250]
    B -->|rl_games| D[rl_games_ppo_vision_cfg.yaml\nmax_epochs: 50000 → 5000\nsave_frequency: 200]
    B -->|skrl| E[No vision config — unchanged]
    C --> F[~20 checkpoints saved]
    D --> G[~25 checkpoints saved]
    F --> H[Override via --max_iterations 50000]
    G --> H

Reviews (1): Last reviewed commit: "Tune Shadow-Hand Vision default iters to..."

AntoineRichard added this to the Isaac Lab 3.0 Beta 2 milestone on Jun 1, 2026
AntoineRichard moved this to Ready to merge in Isaac Lab on Jun 1, 2026
AntoineRichard moved this from Ready to merge to In review in Isaac Lab on Jun 1, 2026