First experiments: add GR00T closed-loop docs and language instruction support#519
cvolk wants to merge 19 commits into main
Conversation
docs/pages/quickstart/first_experiments/running_a_real_policy.rst
alexmillane
left a comment
Looks great! Another great improvement!
Bridges from zero_action to a real policy: shows the container prerequisite (-g flag), the two argument changes vs zero_action (policy_type + enable_cameras), and batch evaluation via the GR00T jobs config. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Convert first_experiments.rst into a directory with an index and two pages: 'Exploring Environment Variations' (zero-action experiments) and 'Running a Real Policy' (GR00T N1.6 closed-loop). Matches the structure used by the Example Workflows section. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
first_arena_env.rst was linking to the old flat first_experiments path; update both occurrences to first_experiments/index. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Shows a 5x5 grid of closed-loop GR00T N1.6 runs varying background, lighting, and destination object. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
- Add closing sentence to Exploring Environment Variations pointing forward to Running a Real Policy
- Explain the switch from --num_steps to --num_episodes in the GR00T command
- Fix stale job count: seven -> six (billiard_hall_wooden_bowl removed)
- Fix GIF caption: destination object -> pick-up object

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
'Batch' implies parallel execution; eval_runner.py runs jobs sequentially. 'Multi-job' is accurate and maps directly to the jobs config concept. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
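A minimal Python sketch of the distinction, under stated assumptions: `run_jobs_sequentially` and the dict layout are illustrative, not the real `eval_runner.py` API. The point is that jobs from the config run one after another, never in parallel.

```python
from typing import Any


def run_jobs_sequentially(jobs_config: dict[str, dict[str, Any]]) -> list[str]:
    """Run each named job in config order, one after another (no parallelism)."""
    completed: list[str] = []
    for job_name, job_spec in jobs_config.items():
        # Placeholder for the actual policy rollout and metric collection.
        completed.append(f"{job_name}: evaluated {job_spec.get('object', 'unknown')}")
    return completed


jobs = {
    "droid_pnp_srl_gr00t_blue_block": {"object": "blue_block"},
    "droid_pnp_srl_gr00t_orange": {"object": "orange"},
}
for line in run_jobs_sequentially(jobs):
    print(line)
```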
Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Add mustard_bottle, sugar_box, and mug jobs. Distribute wooden_bowl as destination across 5 of the 9 jobs (blue_block, orange, tomato_sauce_can, mustard_bottle, mug) and bowl_ycb across the remaining 4. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Force-pushed from 55b9269 to 02df632
Allows callers to pass a natural-language instruction directly on the command line. The value takes precedence over the task's own get_task_description(), which in turn takes precedence over the policy config YAML fallback. Remove the hardcoded language_instruction from droid_manip_gr00t_closedloop_config.yaml so the instruction is always supplied explicitly rather than silently falling back to a stale string that doesn't match the object being evaluated. Update the GR00T closed-loop docs command to pass the instruction explicitly. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Wire language_instruction through Job, eval_runner, and rollout_policy so per-job instructions in the jobs config take precedence over the task's own get_task_description(). Add explicit language instructions to all jobs in droid_pnp_srl_gr00t_jobs_config.json. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
- Raise ValueError in Gr00tClosedloopPolicy and Gr00tRemotePolicy set_task_description when no instruction is provided, preventing silent evaluation with an empty prompt
- Rename droid_pick_and_place_srl -> pick_and_place_maple_table in droid_pnp_srl_gr00t_jobs_config.json and drop the duplicate billiard_hall_wooden_bowl job, aligning with the intended state
- Fix "Two things change" -> "Three things change" in docs now that --language_instruction is a third notable CLI change

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
> zero-action experiments. This functionality can be used to test how the policy adapts to each new
> object and lighting condition, as we shall see in the next section.
>
> **Multi-job evaluation across object variations**
@xyao-nv I think you had a comment on that, right?

Can we call it "Sequential batch evaluation across object variations"?
https://isaac-sim.github.io/IsaacLab-Arena/main/pages/policy_evaluation/evaluation_types.html
Signed-off-by: Clemens Volk <cvolk@nvidia.com>
> def set_task_description(self, task_description: str | None) -> dict[str, Any]:
>     """Set the language instruction of the task being evaluated."""
>     if task_description is None:
>         task_description = self.policy_config.language_instruction
>     if not task_description:
> Running your First Experiments
> ==============================
>
> The following pages walk you through your first Arena experiments — first verifying that
"exploring"?
To be consistent with the wording below.
Heading renamed: "Batch Evaluation" -> "Multi-Job Evaluation"

Can we rename it to "Sequential batch evaluation"?
> - ``--enable_cameras`` turns on the robot's cameras, which GR00T requires for observations
> - ``--language_instruction`` sets the natural-language instruction sent to the model
>
> GR00T also requires absolute joint positions, so use ``--embodiment droid_abs_joint_pos``

The default modality config that ships with this checkpoint is set to use absolute joint positions.
> droid_pnp_srl_gr00t_blue_block:
>     num_episodes 3
>     object_moved_rate 0.0000
>     success_rate 0.0000

Maybe we can acknowledge these bad results: add a note stating that this checkpoint is not post-trained on these object setups, so a 0% success rate is reasonable to observe.
It would also remind users that we are an evaluation platform, not responsible for policy success rate, given there are no known bugs in our evaluation.
Although, to me, the claim is not reasonable given it is branded as a true foundation model.
> To go beyond the pre-trained GR00T N1.6 foundation model — for example, fine-tuning on your own
> teleoperation data — see :doc:`../../../pages/example_workflows/imitation_learning/index` for
> end-to-end imitation learning workflows.

How about adding a pointer to the RL example too? That way we are not tightly coupled to GR00T in policy evaluation.
xyao-nv
left a comment
Thx for adding them!
Let's put the refactorings (aka rm -rf) into our v0.3 todos!
Summary
- Add `language_instruction` as an optional per-job field in the jobs config (for `eval_runner.py`) and as a `--language_instruction` CLI argument in `policy_runner`. In both cases the value takes precedence over the task's own description.
- Remove the hardcoded `language_instruction` from `droid_manip_gr00t_closedloop_config.yaml` and add explicit per-object instructions to `droid_pnp_srl_gr00t_jobs_config.json`.