Commit cdd540d

pkooij and claude committed
feat(envs): add LIBERO-plus robustness benchmark integration

- Add import fallback in libero.py for LIBERO-plus nested package structure
  (github.com/sylvestf/LIBERO-plus installs under a deeper module path than
  the original hf-libero wheel)
- Register LiberoPlusEnv config subclass (inherits LiberoEnv fully; only the
  env type name and default suite differ)
- Add libero_plus optional dep group in pyproject.toml pointing to the
  LIBERO-plus GitHub repo
- Add docs/source/libero_plus.mdx with install guide, task suite table,
  perturbation dimensions, eval commands, and dataset reference
- Add docker/Dockerfile.benchmark.libero_plus for isolated CI image
- Add libero-plus-integration-test CI job to benchmark_tests.yml

Dataset: pepijn223/libero_plus_lerobot is already v3.0 (no conversion needed).
Dataset card is missing and should be added separately on the Hub.

Eval smoke-test (requires Linux + GPU):

  lerobot-eval \
    --policy.path=pepijn223/smolvla_libero \
    --env.type=libero_plus \
    --env.task=libero_spatial \
    --eval.batch_size=1 --eval.n_episodes=1 \
    --eval.use_async_envs=false --policy.device=cuda \
    '--env.camera_name_mapping={"agentview_image":"camera1","robot0_eye_in_hand_image":"camera2"}' \
    --policy.empty_cameras=1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

1 parent 3534331 commit cdd540d
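
The commit message describes `LiberoPlusEnv` as a config subclass that inherits everything from `LiberoEnv` and changes only the env type name and the default suite. The sketch below illustrates that pattern with a toy registry. It is not LeRobot's code: `ENV_REGISTRY`, `register_env`, the field names, and both default `task` values are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical registry mapping --env.type names to config classes.
ENV_REGISTRY: dict[str, type] = {}


def register_env(name: str):
    """Class decorator that records a config class under a type name."""

    def decorator(cls):
        cls.type_name = name
        ENV_REGISTRY[name] = cls
        return cls

    return decorator


@register_env("libero")
@dataclass
class LiberoEnv:
    task: str = "libero_10"  # assumed default, for illustration only
    control_mode: str = "relative"


@register_env("libero_plus")
@dataclass
class LiberoPlusEnv(LiberoEnv):
    # Only the default suite is overridden; every other field is inherited.
    task: str = "libero_spatial"  # assumed default, for illustration only
```

With this shape, looking up `ENV_REGISTRY["libero_plus"]` yields a config that behaves like the base one apart from its name and default task, which is the property the commit message claims for the real subclass.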

7 files changed: +393 -2 lines changed

.github/workflows/benchmark_tests.yml

Lines changed: 94 additions & 0 deletions
@@ -162,6 +162,100 @@ jobs:
           path: /tmp/libero-artifacts/metrics.json
           if-no-files-found: warn
 
+  # ── LIBERO-plus ───────────────────────────────────────────────────────────
+  # Isolated image: lerobot[libero_plus] only (LIBERO-plus from GitHub, mujoco)
+  libero-plus-integration-test:
+    name: LIBERO-plus — build image + 1-episode eval
+    runs-on:
+      group: aws-g6-4xlarge-plus
+    env:
+      HF_USER_TOKEN: ${{ secrets.LEROBOT_HF_USER }}
+
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          persist-credentials: false
+          lfs: true
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3 # zizmor: ignore[unpinned-uses]
+        with:
+          cache-binary: false
+
+      - name: Build LIBERO-plus benchmark image
+        uses: docker/build-push-action@v6 # zizmor: ignore[unpinned-uses]
+        with:
+          context: .
+          file: docker/Dockerfile.benchmark.libero_plus
+          push: false
+          load: true
+          tags: lerobot-benchmark-libero-plus:ci
+          cache-from: type=local,src=/tmp/.buildx-cache-libero-plus
+          cache-to: type=local,dest=/tmp/.buildx-cache-libero-plus,mode=max
+
+      - name: Login to Hugging Face
+        if: env.HF_USER_TOKEN != ''
+        run: |
+          docker run --rm \
+            -e HF_HOME=/tmp/hf \
+            lerobot-benchmark-libero-plus:ci \
+            bash -c "hf auth login --token '$HF_USER_TOKEN' --add-to-git-credential && hf auth whoami"
+
+      - name: Run LIBERO-plus smoke eval (1 episode)
+        run: |
+          docker run --name libero-plus-eval --gpus all \
+            --shm-size=4g \
+            -e HF_HOME=/tmp/hf \
+            -e HF_USER_TOKEN="${HF_USER_TOKEN}" \
+            -e HF_HUB_DOWNLOAD_TIMEOUT=300 \
+            lerobot-benchmark-libero-plus:ci \
+            bash -c "
+              hf auth login --token \"\$HF_USER_TOKEN\" --add-to-git-credential 2>/dev/null || true
+              lerobot-eval \
+                --policy.path=pepijn223/smolvla_libero \
+                --env.type=libero_plus \
+                --env.task=libero_spatial \
+                --eval.batch_size=1 \
+                --eval.n_episodes=1 \
+                --eval.use_async_envs=false \
+                --policy.device=cuda \
+                '--env.camera_name_mapping={\"agentview_image\": \"camera1\", \"robot0_eye_in_hand_image\": \"camera2\"}' \
+                --policy.empty_cameras=1 \
+                --output_dir=/tmp/eval-artifacts
+            "
+
+      - name: Copy LIBERO-plus artifacts from container
+        if: always()
+        run: |
+          mkdir -p /tmp/libero-plus-artifacts
+          docker cp libero-plus-eval:/tmp/eval-artifacts/. /tmp/libero-plus-artifacts/ 2>/dev/null || true
+          docker rm -f libero-plus-eval || true
+
+      - name: Parse LIBERO-plus eval metrics
+        if: always()
+        run: |
+          python3 scripts/ci/parse_eval_metrics.py \
+            --artifacts-dir /tmp/libero-plus-artifacts \
+            --env libero_plus \
+            --task libero_spatial \
+            --policy pepijn223/smolvla_libero
+
+      - name: Upload LIBERO-plus rollout video
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: libero-plus-rollout-video
+          path: /tmp/libero-plus-artifacts/videos/
+          if-no-files-found: warn
+
+      - name: Upload LIBERO-plus eval metrics
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: libero-plus-metrics
+          path: /tmp/libero-plus-artifacts/metrics.json
+          if-no-files-found: warn
+
   # ── METAWORLD ─────────────────────────────────────────────────────────────
   # Isolated image: lerobot[metaworld] only (metaworld==3.0.0, mujoco>=3 chain)
   metaworld-integration-test:
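
The smoke-eval step passes a JSON value for `--env.camera_name_mapping` through two layers of shell quoting (the outer `bash -c "..."` and the inner single-quoted argument), with the double quotes escaped by hand. As a hedged illustration of the same idea, the sketch below builds that one argument programmatically with the standard-library `json` and `shlex` modules. The helper name `camera_mapping_flag` is hypothetical and is not part of the workflow or of LeRobot.

```python
import json
import shlex


def camera_mapping_flag(mapping: dict[str, str]) -> str:
    """Return a shell-safe --env.camera_name_mapping=<json> argument."""
    payload = json.dumps(mapping)
    # shlex.quote wraps the whole argument so a POSIX shell passes it through
    # as a single word, with the JSON double quotes left intact.
    return shlex.quote(f"--env.camera_name_mapping={payload}")


mapping = {"agentview_image": "camera1", "robot0_eye_in_hand_image": "camera2"}
flag = camera_mapping_flag(mapping)

# Round trip: splitting the way a POSIX shell would yields exactly one argument.
(parsed,) = shlex.split(flag)
recovered = json.loads(parsed.split("=", 1)[1])
```

Generating the argument this way covers one level of quoting only; nesting it inside another `bash -c` string, as the workflow does, would need a second `shlex.quote` pass over the full inner command.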

docker/Dockerfile.benchmark.libero_plus

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Isolated benchmark image for LIBERO-plus integration tests.
+# Installs only lerobot[libero_plus] (LIBERO-plus from GitHub, dm-control, mujoco).
+#
+# Build: docker build -f docker/Dockerfile.benchmark.libero_plus -t lerobot-benchmark-libero-plus .
+# Run: docker run --gpus all --rm lerobot-benchmark-libero-plus lerobot-eval ...
+
+ARG CUDA_VERSION=12.4.1
+ARG OS_VERSION=22.04
+FROM nvidia/cuda:${CUDA_VERSION}-base-ubuntu${OS_VERSION}
+
+ARG PYTHON_VERSION=3.12
+
+ENV DEBIAN_FRONTEND=noninteractive \
+    MUJOCO_GL=egl \
+    PATH=/lerobot/.venv/bin:$PATH \
+    CUDA_VISIBLE_DEVICES=0 \
+    DEVICE=cuda
+
+# System deps — same set as Dockerfile.internal plus LIBERO-plus extras
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    software-properties-common build-essential git curl \
+    libglib2.0-0 libgl1-mesa-glx libegl1-mesa ffmpeg \
+    libusb-1.0-0-dev speech-dispatcher libgeos-dev portaudio19-dev \
+    cmake pkg-config ninja-build \
+    libexpat1 libfontconfig1-dev libmagickwand-dev \
+    && add-apt-repository -y ppa:deadsnakes/ppa \
+    && apt-get update \
+    && apt-get install -y --no-install-recommends \
+       python${PYTHON_VERSION} \
+       python${PYTHON_VERSION}-venv \
+       python${PYTHON_VERSION}-dev \
+    && curl -LsSf https://astral.sh/uv/install.sh | sh \
+    && mv /root/.local/bin/uv /usr/local/bin/uv \
+    && useradd --create-home --shell /bin/bash user_lerobot \
+    && usermod -aG sudo user_lerobot \
+    && apt-get clean && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /lerobot
+RUN chown -R user_lerobot:user_lerobot /lerobot
+USER user_lerobot
+
+ENV HOME=/home/user_lerobot \
+    HF_HOME=/home/user_lerobot/.cache/huggingface \
+    HF_LEROBOT_HOME=/home/user_lerobot/.cache/huggingface/lerobot \
+    TORCH_HOME=/home/user_lerobot/.cache/torch \
+    TRITON_CACHE_DIR=/home/user_lerobot/.cache/triton
+
+RUN uv venv --python python${PYTHON_VERSION}
+
+# Install only lerobot[libero_plus] — isolated from hf-libero and metaworld dep trees
+COPY --chown=user_lerobot:user_lerobot setup.py pyproject.toml uv.lock README.md MANIFEST.in ./
+COPY --chown=user_lerobot:user_lerobot src/ src/
+
+RUN uv sync --extra libero_plus --extra smolvla --no-cache
+
+# Pre-download libero assets so nothing is fetched at runtime (CI timeout risk).
+# libero/libero/__init__.py prompts with input() when ~/.libero/config.yaml is
+# missing; write the config first so any import is non-interactive.
+RUN LIBERO_DIR=$(python${PYTHON_VERSION} -c \
+        "import importlib.util, os; s=importlib.util.find_spec('libero'); \
+         print(os.path.join(os.path.dirname(s.origin), 'libero'))") && \
+    mkdir -p /home/user_lerobot/.libero && \
+    python${PYTHON_VERSION} -c "\
+from huggingface_hub import snapshot_download; \
+snapshot_download(repo_id='lerobot/libero-assets', repo_type='dataset', \
+    local_dir='/home/user_lerobot/.libero/assets')" && \
+    printf "assets: /home/user_lerobot/.libero/assets\nbddl_files: ${LIBERO_DIR}/bddl_files\ndatasets: ${LIBERO_DIR}/../datasets\ninit_states: ${LIBERO_DIR}/init_files\n" \
+        > /home/user_lerobot/.libero/config.yaml
+
+RUN chmod +x /lerobot/.venv/lib/python${PYTHON_VERSION}/site-packages/triton/backends/nvidia/bin/ptxas
+
+COPY --chown=user_lerobot:user_lerobot . .
+
+CMD ["/bin/bash"]
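
The Dockerfile writes `~/.libero/config.yaml` with `printf` so that importing `libero` never reaches the interactive `input()` prompt. The sketch below reproduces that step in Python as an illustration only. The four keys (`assets`, `bddl_files`, `datasets`, `init_states`) and their relative locations come from the `printf` line above; the function name `write_libero_config` and its parameters are hypothetical.

```python
from pathlib import Path


def write_libero_config(config_dir: Path, assets: Path, package_dir: Path) -> Path:
    """Write config.yaml with the same four keys the Dockerfile's printf emits.

    package_dir plays the role of LIBERO_DIR in the Dockerfile.
    """
    entries = {
        "assets": assets,
        "bddl_files": package_dir / "bddl_files",
        # The Dockerfile writes "${LIBERO_DIR}/../datasets"; this is the
        # normalized form of the same location.
        "datasets": package_dir.parent / "datasets",
        "init_states": package_dir / "init_files",
    }
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.yaml"
    config_path.write_text(
        "".join(f"{key}: {value}\n" for key, value in entries.items())
    )
    return config_path
```

Calling it with `config_dir=Path.home() / ".libero"` before the first `import libero` would mirror what the image build does, assuming the package reads its paths from that file as the Dockerfile comment describes.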

docs/source/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -77,6 +77,8 @@
     title: Adding a New Benchmark
   - local: libero
     title: LIBERO
+  - local: libero_plus
+    title: LIBERO-plus
   - local: metaworld
     title: Meta-World
   - local: envhub_isaaclab_arena
- local: envhub_isaaclab_arena

docs/source/libero_plus.mdx

Lines changed: 171 additions & 0 deletions
@@ -0,0 +1,171 @@
+# LIBERO-plus
+
+LIBERO-plus is a **robustness benchmark** for Vision-Language-Action (VLA) models built on top of [LIBERO](./libero). It systematically stress-tests policies by applying **seven independent perturbation dimensions** to the original LIBERO task set, exposing failure modes that standard benchmarks miss.
+
+- Paper: [LIBERO-plus: A Robustness Benchmark for VLA Models](https://github.com/sylvestf/LIBERO-plus)
+- GitHub: [sylvestf/LIBERO-plus](https://github.com/sylvestf/LIBERO-plus)
+- Dataset: [pepijn223/libero_plus_lerobot](https://huggingface.co/datasets/pepijn223/libero_plus_lerobot)
+
+## Perturbation dimensions
+
+LIBERO-plus creates ~10 000 task variants by perturbing each original LIBERO task along these axes:
+
+| Dimension             | What changes                                          |
+| --------------------- | ----------------------------------------------------- |
+| Objects layout        | Target position, presence of confounding objects      |
+| Camera viewpoints     | Camera position, orientation, field-of-view           |
+| Robot initial states  | Manipulator start pose                                |
+| Language instructions | LLM-rewritten task description (paraphrase / synonym) |
+| Light conditions      | Intensity, direction, color, shadow                   |
+| Background textures   | Scene surface and object appearance                   |
+| Sensor noise          | Photometric distortions and image degradation         |
+
+## Available task suites
+
+LIBERO-plus covers the same five suites as LIBERO:
+
+| Suite          | CLI name         | Tasks | Max steps |
+| -------------- | ---------------- | ----- | --------- |
+| LIBERO-Spatial | `libero_spatial` | 10    | 280       |
+| LIBERO-Object  | `libero_object`  | 10    | 280       |
+| LIBERO-Goal    | `libero_goal`    | 10    | 300       |
+| LIBERO-90      | `libero_90`      | 90    | 400       |
+| LIBERO-Long    | `libero_10`      | 10    | 520       |
+
+## Installation
+
+### System dependencies (Linux only)
+
+```bash
+sudo apt install libexpat1 libfontconfig1-dev libmagickwand-dev
+```
+
+### Python package
+
+```bash
+pip install -e ".[libero_plus]"
+```
+
+This installs LIBERO-plus directly from its GitHub repository. Because MuJoCo is required, only Linux is supported.
+
+<Tip>
+Set the MuJoCo rendering backend before running evaluation:
+
+```bash
+export MUJOCO_GL=egl   # headless / HPC / cloud
+```
+
+</Tip>
+
+### Download LIBERO-plus assets
+
+LIBERO-plus ships its extended asset pack separately. Download `assets.zip` from the [Hugging Face dataset](https://huggingface.co/datasets/Sylvest/LIBERO-plus/tree/main) and extract it into the LIBERO-plus package directory:
+
+```bash
+# After installing the package, find where it was installed:
+python -c "import libero; print(libero.__file__)"
+# Then extract assets.zip into <package_root>/libero/assets/
+```
+
+## Evaluation
+
+### Minimal smoke-test (1 episode, no async)
+
+```bash
+lerobot-eval \
+  --policy.path=pepijn223/smolvla_libero \
+  --env.type=libero_plus \
+  --env.task=libero_spatial \
+  --eval.batch_size=1 \
+  --eval.n_episodes=1 \
+  --eval.use_async_envs=false \
+  --policy.device=cuda \
+  --env.camera_name_mapping='{"agentview_image": "camera1", "robot0_eye_in_hand_image": "camera2"}' \
+  --policy.empty_cameras=1
+```
+
+### Full robustness benchmark (recommended)
+
+```bash
+lerobot-eval \
+  --policy.path=<your-policy-id> \
+  --env.type=libero_plus \
+  --env.task=libero_spatial,libero_object,libero_goal,libero_10 \
+  --eval.batch_size=1 \
+  --eval.n_episodes=10 \
+  --env.max_parallel_tasks=1
+```
+
+### Key CLI flags
+
+| Flag                        | Description                                                      |
+| --------------------------- | ---------------------------------------------------------------- |
+| `--env.type=libero_plus`    | Selects LIBERO-plus environment (same gym interface as `libero`) |
+| `--env.task`                | Suite name(s), comma-separated                                   |
+| `--env.task_ids`            | Restrict to specific task indices, e.g. `[0,1,2]`                |
+| `--env.camera_name_mapping` | JSON dict remapping raw camera names to policy input keys        |
+| `--env.control_mode`        | `relative` (default) or `absolute`                               |
+| `--eval.use_async_envs`     | `true` for parallel rollouts (default), `false` for debugging    |
+| `--policy.empty_cameras`    | Number of camera slots without observations (policy-specific)    |
+
+### Camera name mapping
+
+By default, LIBERO cameras are mapped as:
+
+| Raw camera name            | LeRobot key                 |
+| -------------------------- | --------------------------- |
+| `agentview_image`          | `observation.images.image`  |
+| `robot0_eye_in_hand_image` | `observation.images.image2` |
+
+If your policy was trained with different key names, pass a JSON remapping:
+
+```bash
+--env.camera_name_mapping='{"agentview_image": "camera1", "robot0_eye_in_hand_image": "camera2"}'
+```
+
+## Policy inputs and outputs
+
+**Observations (after `LiberoProcessorStep`):**
+
+- `observation.state` — 8-dim proprioceptive vector: `[eef_pos(3), eef_axis_angle(3), gripper_qpos(2)]`
+- `observation.images.<name>` — camera image(s), flipped 180° to match VLA convention
+
+**Actions:**
+
+- `Box(-1, 1, shape=(7,))` — 6D end-effector delta + 1D gripper
+
+## Dataset
+
+A LeRobot-format training dataset for LIBERO-plus is available at:
+
+- [pepijn223/libero_plus_lerobot](https://huggingface.co/datasets/pepijn223/libero_plus_lerobot)
+
+### Example training command
+
+```bash
+lerobot-train \
+  --policy.type=smolvla \
+  --policy.repo_id=${HF_USER}/smolvla_libero_plus \
+  --policy.load_vlm_weights=true \
+  --dataset.repo_id=pepijn223/libero_plus_lerobot \
+  --env.type=libero_plus \
+  --env.task=libero_spatial \
+  --output_dir=./outputs/ \
+  --steps=100000 \
+  --batch_size=4 \
+  --eval.batch_size=1 \
+  --eval.n_episodes=1 \
+  --eval_freq=1000
+```
+
+## Relationship to LIBERO
+
+LIBERO-plus is a drop-in extension of LIBERO:
+
+- Same Python gym interface (`LiberoEnv`, `LiberoProcessorStep`)
+- Same camera names and observation/action format
+- Same task suite names
+- Installs under the same `libero` Python package name (different GitHub repo)
+- The only code difference in LeRobot is a try/except import fallback in `libero.py` that handles the slightly different package nesting in LIBERO-plus
+
+To use the original LIBERO benchmark, see [LIBERO](./libero) and use `--env.type=libero`.
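
The docs describe the only LeRobot-side code change as a try/except import fallback in `libero.py`, but the diff for that file is not reproduced on this page. The sketch below shows the general technique under that caveat. `import_first_available` is a hypothetical helper, and the module paths in the commented usage line are illustrative guesses, not the paths the commit actually uses.

```python
import importlib
from types import ModuleType


def import_first_available(candidates: list[str]) -> ModuleType:
    """Return the first importable module from candidates, tried in order."""
    missing = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError too; fall through to the next layout.
            missing.append(name)
    raise ImportError(f"none of these modules could be imported: {missing}")


# Hypothetical usage: try one package layout first, then the other.
# benchmark = import_first_available(["libero.libero.benchmark", "libero.benchmark"])
```

One trade-off of this pattern is that catching `ImportError` broadly can also mask a genuine failure inside the first candidate (for example a missing transitive dependency), so the final error message lists every path that was tried.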

pyproject.toml

Lines changed: 6 additions & 0 deletions
@@ -175,6 +175,11 @@ video_benchmark = ["scikit-image>=0.23.2,<0.26.0", "pandas>=2.2.2,<2.4.0"]
 aloha = ["gym-aloha>=0.1.2,<0.2.0", "lerobot[scipy-dep]"]
 pusht = ["gym-pusht>=0.1.5,<0.2.0", "pymunk>=6.6.0,<7.0.0"] # TODO: Fix pymunk version in gym-pusht instead
 libero = ["lerobot[transformers-dep]", "hf-libero>=0.1.3,<0.2.0; sys_platform == 'linux'", "lerobot[scipy-dep]"]
+libero_plus = [
+    "lerobot[transformers-dep]",
+    "libero @ git+https://github.com/sylvestf/LIBERO-plus.git@main ; sys_platform == 'linux'",
+    "lerobot[scipy-dep]",
+]
 metaworld = ["metaworld==3.0.0", "lerobot[scipy-dep]"]
 
 # All
@@ -205,6 +210,7 @@ all = [
     "lerobot[pusht]",
     "lerobot[phone]",
     "lerobot[libero]; sys_platform == 'linux'",
+    "lerobot[libero_plus]; sys_platform == 'linux'",
     "lerobot[metaworld]",
     "lerobot[sarm]",
     "lerobot[peft]",
