Skip redundant GPU rendering for action-chunking policies #521

@kanghui0204

Summary

Action-chunking policies (e.g., GR00T) output a chunk of 16-32 actions per inference call. The policy replays this chunk over subsequent env.step() calls, but only the first step of each chunk needs a fresh camera frame -- all intermediate steps discard the rendered result. Today, Isaac Lab renders on every env.step() regardless, wasting significant GPU time.

This PR is a proof-of-concept demonstrating two optimizations that eliminate unnecessary rendering and roughly double throughput on chunk-replay steps (see Benchmark Results). The implementation works but relies on a workaround: mutating cfg.sim.render_interval at runtime. We propose a cleaner long-term API change to Isaac Lab's env.step().

Branch

hkang/render-opt (based on latest main)

Benchmark Results

Hardware: NVIDIA L20 (48 GB), single GPU, remote policy (GR00T server + Isaac Sim client).

Setup: ActionChunkingClientSidePolicy against the GR00T remote server, task gr1_open_microwave (chunk_length=16), 8 envs, 100 steps. Per-step throughput falls into two regimes:

  • With render (inference step): ~4-6 step/s -- this is the step where needs_obs_next_step() returns True, so render_interval is restored to normal and env.step() renders a camera frame for the next inference
  • Without render (chunk-replay step): ~9.5 step/s -- this is the step where the policy is consuming buffered actions from the chunk, needs_obs_next_step() returns False, so render_interval is set to a huge value and env.step() skips rendering entirely

The ~2x speed difference confirms that render-skipping is working. For a chunk_length of 16, only 1 out of every 16 steps needs to render, so the majority of steps run at the faster rate.
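To see what the two per-step rates imply for a whole chunk, the average can be worked out directly. This is a rough sketch: it takes 5 step/s as the midpoint of the reported ~4-6 step/s range, and it assumes the unoptimized baseline runs every step at that rendered rate, which this issue does not measure separately. The function name is illustrative.

```python
def effective_steps_per_second(chunk_length, render_rate, replay_rate):
    """Average step/s over one chunk: 1 rendered step + (chunk_length - 1) replay steps."""
    chunk_seconds = 1.0 / render_rate + (chunk_length - 1) / replay_rate
    return chunk_length / chunk_seconds


# Assumed inputs: midpoint of the ~4-6 step/s range, and ~9.5 step/s for replay steps.
RENDER_RATE = 5.0
REPLAY_RATE = 9.5

optimized = effective_steps_per_second(16, RENDER_RATE, REPLAY_RATE)
baseline = RENDER_RATE  # assumption: without the optimization, every step renders
speedup = optimized / baseline

print(f"optimized ~{optimized:.2f} step/s, speedup ~{speedup:.2f}x")
```

Under these assumptions the chunk-averaged rate lands near 9 step/s, so the end-to-end gain is somewhat below the per-step ~2x because one step in every 16 still pays for a render.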

How to Reproduce

Prerequisites: Two Docker containers on the same GPU node -- one for the GR00T policy server, one for the Isaac Sim client.

1. Start GR00T policy server

# Build & run the server container (from repo root)
bash docker/run_gr00t_server.sh \
  -m /path/to/models \
  --port 5555 \
  --policy_type isaaclab_arena_gr00t.policy.gr00t_remote_policy.Gr00tRemoteServerSidePolicy \
  --policy_config_yaml_path isaaclab_arena_gr00t/policy/config/gr1_manip_gr00t_closedloop_config.yaml

Wait until you see: [PolicyServer] listening on tcp://0.0.0.0:5555

2. Run the client (Isaac Sim container)

# Inside the isaaclab_arena Docker container
/isaac-sim/python.sh -m isaaclab_arena.evaluation.policy_runner \
  --headless \
  --enable_cameras \
  --policy_type isaaclab_arena.policy.action_chunking_client.ActionChunkingClientSidePolicy \
  --remote_host $(hostname) \
  --remote_port 5555 \
  --num_envs 8 \
  --num_steps 100 \
  gr1_open_microwave

You should see a repeating throughput pattern every 16 steps (= chunk_length): the first step is slow (~4-6 step/s) because needs_obs_next_step() returned True on the previous step, so this env.step() renders a camera frame. The remaining 15 steps are fast (~9-10 step/s) because the policy is replaying buffered actions and needs_obs_next_step() returns False, causing env.step() to skip rendering.

3. Compare with baseline

To see the baseline (without render optimization), revert the render_interval change in isaaclab_arena_manager_based_env.py and remove the render-skip logic in policy_runner.py.

What Changed and Why It's a Workaround

This PR modifies 5 files (42 lines added):

Optimization 1: Render once per env.step() instead of twice

In isaaclab_arena_manager_based_env.py, we set render_interval = decimation in __post_init__(). With default settings (decimation=4, render_interval=2), Isaac Lab renders twice per env.step(), but only the final frame is consumed by observation_manager.compute(). This change reduces it to 1 render per step. This is clean and correct.
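The render count per env.step() can be reproduced with a small model. This is a simplified sketch, not Isaac Lab's code: it assumes the render check is a modulo on a running sim-step counter that is incremented once per physics substep, as described in the workaround notes of this issue. The function name is hypothetical.

```python
def renders_per_env_step(decimation, render_interval, start_counter=0):
    """Count renders during one env.step() under a modulo-based render check."""
    counter = start_counter
    renders = 0
    for _ in range(decimation):
        counter += 1  # one physics substep
        if counter % render_interval == 0:
            renders += 1
    return renders


default_renders = renders_per_env_step(decimation=4, render_interval=2)
optimized_renders = renders_per_env_step(decimation=4, render_interval=4)
print(default_renders, optimized_renders)
```

With render_interval equal to decimation, exactly one substep per env.step() satisfies the check, which is the frame the observation manager consumes.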

Optimization 2: Skip rendering when the policy doesn't need observations

This is the workaround part. We add PolicyBase.needs_obs_next_step() -> bool so action-chunking policies can signal "I'm replaying buffered actions, don't need a fresh camera frame." In policy_runner.py, we toggle cfg.sim.render_interval at runtime:

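# Current workaround: mutate config before each env.step()
# _render_interval is the original cfg.sim.render_interval (the value to restore)
_NO_RENDER = 2**31 - 1  # large enough that the modulo render check never fires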

if not policy.needs_obs_next_step():
    unwrapped.cfg.sim.render_interval = _NO_RENDER  # skip render
else:
    unwrapped.cfg.sim.render_interval = _render_interval  # restore

obs, _, terminated, truncated, _ = env.step(actions)

Why this is ugly:

  • We mutate cfg.sim.render_interval (a config value) at runtime as a side-channel to control rendering behavior
  • This only works because render_interval happens to be read inside the physics loop as a modulo condition -- it's an implementation detail, not an API contract
  • If Isaac Lab changes how render_interval is used internally, this breaks silently
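For reference, the signal itself can be very small. The following is a minimal sketch of how an action-chunking policy might implement needs_obs_next_step(); the class name, the infer_chunk callable, and the simplified get_action(obs) signature are assumptions for illustration, not the actual ActionChunkingClientSidePolicy.

```python
from collections import deque


class ChunkReplayPolicy:
    """Hypothetical policy that replays a buffered action chunk."""

    def __init__(self, infer_chunk):
        self._infer_chunk = infer_chunk  # callable: obs -> list of actions
        self._buffer = deque()

    def get_action(self, obs):
        # Run inference only when the buffer is empty; otherwise replay.
        if not self._buffer:
            self._buffer.extend(self._infer_chunk(obs))
        return self._buffer.popleft()

    def needs_obs_next_step(self):
        # True when the next get_action() call will run inference,
        # so the upcoming env.step() must produce a fresh camera frame.
        return not self._buffer


policy = ChunkReplayPolicy(lambda obs: [0, 1, 2, 3])  # chunk_length = 4
flags = []
for _ in range(8):
    policy.get_action(obs=None)
    flags.append(policy.needs_obs_next_step())
print(flags)
```

The flag turns True only on the last action of each chunk, which is why the slow, rendered env.step() immediately precedes each inference call.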

Proposed Clean Solution (Requires Isaac Lab Change)

The right fix is for Isaac Lab's ManagerBasedRLEnv.step() to accept a render parameter:

# Proposed Isaac Lab API
def step(self, action, render: bool = True):
    is_rendering = render and (self.sim.has_gui() or self.sim.has_rtx_sensors())
    ...

Then the rollout loop becomes clean and explicit:

actions = policy.get_action(env, obs)
obs, _, terminated, truncated, _ = env.step(
    actions,
    render=policy.needs_obs_next_step(),
)

This would:

  • Eliminate runtime config mutation
  • Make the semantics explicit: the caller decides whether to render
  • Leave existing reset re-rendering (num_rerenders_on_reset) unchanged
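The effect of the proposed parameter on a rollout can be sketched with a stub environment. Everything here is hypothetical: StubEnv stands in for an env whose step() accepts a render flag, and the chunk bookkeeping is inlined instead of calling a real policy. It assumes the observation for the very first inference comes from reset.

```python
class StubEnv:
    """Hypothetical stand-in for an env whose step() takes a render flag."""

    def __init__(self):
        self.render_calls = 0

    def step(self, action, render=True):
        if render:
            self.render_calls += 1  # a real env would render cameras here
        return {"frame_is_fresh": render}


def rollout(env, chunk_length, num_steps):
    """Step the env, rendering only on the last action of each chunk."""
    remaining = 0  # buffered actions left in the current chunk
    for _ in range(num_steps):
        if remaining == 0:
            remaining = chunk_length  # inference refills the buffer
        remaining -= 1
        needs_obs_next_step = remaining == 0
        env.step(action=None, render=needs_obs_next_step)
    return env.render_calls


render_count = rollout(StubEnv(), chunk_length=16, num_steps=100)
print(render_count)
```

For the benchmark configuration (chunk_length=16, 100 steps) this model renders on 6 of the 100 steps, with no config mutation involved.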

Headless vs. Non-Headless Consideration

An open question is whether render-skipping should apply only in headless mode. In non-headless (GUI) mode, skipping renders would freeze the viewport on most steps, making the simulation appear broken. A possible guard:

# Only skip renders when running headless
if not policy.needs_obs_next_step() and not env.sim.has_gui():
    ...  # skip rendering

This ensures:

  • Headless evaluation/benchmarking gets the full performance benefit
  • Interactive/GUI sessions always render for visual feedback
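The guard reduces to a single boolean expression. This is a small sketch with a hypothetical helper name; the two inputs correspond to policy.needs_obs_next_step() and env.sim.has_gui() from the snippets above.

```python
def should_render(needs_obs_next_step: bool, has_gui: bool) -> bool:
    """Render if the policy needs a fresh frame or a GUI viewport is attached."""
    return needs_obs_next_step or has_gui


# Full truth table: (needs_obs_next_step, has_gui) -> render?
table = {
    (needs_obs, gui): should_render(needs_obs, gui)
    for needs_obs in (False, True)
    for gui in (False, True)
}
print(table)
```

Only the headless chunk-replay case skips the render; every other combination renders as before.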

internal analysis documentation

perf doc
