Commit b674671

Improves Isaac Lab Mimic tutorial documentation and flow (#5283)
# Description

This PR updates workflows and documentation for Isaac Lab Mimic to improve ease of use and code legibility. The changes include:

1. Add an option for a full sim buffer reset in the HDF5 replay script when using single envs. Copy and move a large chunk of the `main()` function into a separate helper function.
2. Mark optional methods in `ManagerBasedRLMimicEnv`.
3. Refactor Franka IK Rel envs to inherit directly from the Stack Env base. Eliminate the illogical inheritance from the Franka direct joint pose env.
4. Change the idle action in pick-and-place envs from a torch tensor to a standard Python list to allow for env serialization.
5. Refactor the Isaac Lab Mimic documentation for better clarity and flow.
6. Let uv override `numpy<2` dependency requirements to avoid downgrading numpy for the SRL usd-to-urdf-converter.

## Type of change

- New feature (non-breaking change which adds functionality)
- Documentation update

## Checklist

- [x] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there
1 parent 80e48f7 commit b674671

25 files changed

Lines changed: 950 additions & 467 deletions
3 binary files changed (116 KB, 82.7 KB, 349 KB — not rendered)

docs/source/overview/imitation-learning/augmented_imitation.rst

Lines changed: 4 additions & 0 deletions

```diff
@@ -5,6 +5,10 @@ Augmented Imitation Learning
 
 This section describes how to use Isaac Lab's imitation learning capabilities with the visual augmentation capabilities of `Cosmos <https://www.nvidia.com/en-us/ai/cosmos/>`_ models to generate demonstrations at scale to train visuomotor policies robust against visual variations.
 
+
+.. important::
+   The `Cosmos Transfer1 <https://github.com/nvidia-cosmos/cosmos-transfer1/tree/e4055e39ee9c53165e85275bdab84ed20909714a>`_ model used in this tutorial is `supported <https://huggingface.co/nvidia/Cosmos-Transfer1-7B#software-integration>`_ on Ampere and Hopper GPUs.
+
 Generating Demonstrations
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 
```

docs/source/overview/imitation-learning/humanoids_imitation.rst

Lines changed: 2 additions & 2 deletions

```diff
@@ -11,7 +11,7 @@ This page covers data generation and imitation learning workflows for humanoid r
 
 .. important::
 
-   Complete the tutorial in :ref:`Teleoperation and Imitation Learning with Isaac Lab Mimic <teleoperation-imitation-learning>`
+   Complete the tutorial in :ref:`Synthetic Data Generation and Imitation Learning with Isaac Lab Mimic <teleoperation-imitation-learning>`
    before proceeding with the following demonstrations to
    understand the data collection, annotation, and generation steps of Isaac Lab Mimic.
 
@@ -114,7 +114,7 @@ You can replay the collected demonstrations by running the following command:
 Annotate the demonstrations
 """""""""""""""""""""""""""
 
-Unlike the :ref:`Franka stacking task <generating-additional-demonstrations>`, the GR-1 pick and place task uses manual annotation to define subtasks.
+Unlike the :ref:`Franka stacking task <generate-additional-demonstrations>`, the GR-1 pick and place task uses manual annotation to define subtasks.
 
 The pick and place task has one subtask for the left arm (pick) and two subtasks for the right arm (idle, place).
 Annotations denote the end of a subtask. For the pick and place task, this means there are no annotations for the left arm and one annotation for the right arm (the end of the final subtask is always implicit).
```

docs/source/overview/imitation-learning/teleop_imitation.rst

Lines changed: 315 additions & 171 deletions
Large diffs are not rendered by default.

pyproject.toml

Lines changed: 1 addition & 0 deletions

```diff
@@ -154,6 +154,7 @@ explicit = false
 [tool.uv]
 index-strategy = "unsafe-best-match"
 prerelease = "allow"
+override-dependencies = ["numpy>=2"]
 
 [tool.uv.pip]
 index-strategy = "unsafe-best-match"
```
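The `override-dependencies` entry makes uv resolve `numpy>=2` even when a transitive dependency (here, the SRL usd-to-urdf-converter) pins `numpy<2`. A quick runtime sanity check could look like the sketch below; `satisfies_min_major` is a hypothetical helper, not part of this PR:

```python
def satisfies_min_major(version: str, min_major: int) -> bool:
    """Return True if a version string like '2.1.0' has at least the given major version."""
    major = int(version.split(".")[0])
    return major >= min_major


# After `uv sync`, one could confirm the override took effect with:
#   import numpy; assert satisfies_min_major(numpy.__version__, 2)
print(satisfies_min_major("2.1.0", 2))   # → True
print(satisfies_min_major("1.26.4", 2))  # → False
```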

scripts/tools/replay_demos.py

Lines changed: 126 additions & 83 deletions

```diff
@@ -38,6 +38,15 @@
     default=False,
     help="Validate the replay success rate using the task environment termination criteria",
 )
+parser.add_argument(
+    "--reset_sim_buffer_each_episode",
+    action="store_true",
+    default=False,
+    help=(
+        "Before loading each episode's initial state, call env.sim.reset() to clear"
+        " simulation buffers. Only valid with --num_envs 1."
+    ),
+)
 
 # append AppLauncher cli args
 AppLauncher.add_app_launcher_args(parser)
```
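The new flag only makes sense for a single environment, and the script enforces this after parsing. A minimal, self-contained sketch of that cross-flag check (the argument names mirror the diff; the real parser has many more options):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--num_envs", type=int, default=1)
parser.add_argument("--reset_sim_buffer_each_episode", action="store_true", default=False)


def validate_reset_flag(args: argparse.Namespace) -> None:
    # Mirrors the check in the script: the buffer reset touches global sim
    # state, so it is only supported when exactly one environment is replayed.
    if args.reset_sim_buffer_each_episode and args.num_envs != 1:
        raise ValueError("--reset_sim_buffer_each_episode is only supported with --num_envs 1.")


ok = parser.parse_args(["--reset_sim_buffer_each_episode", "--num_envs", "1"])
validate_reset_flag(ok)  # passes silently

bad = parser.parse_args(["--reset_sim_buffer_each_episode", "--num_envs", "4"])
try:
    validate_reset_flag(bad)
except ValueError as err:
    print(err)  # → --reset_sim_buffer_each_episode is only supported with --num_envs 1.
```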
```diff
@@ -106,86 +115,27 @@ def compare_states(state_from_dataset, runtime_state, runtime_env_index) -> (boo
     return states_matched, output_log
 
 
-def main():
-    """Replay episodes loaded from a file."""
-    global is_paused
-
-    # Load dataset
-    if not os.path.exists(args_cli.dataset_file):
-        raise FileNotFoundError(f"The dataset file {args_cli.dataset_file} does not exist.")
-    dataset_file_handler = HDF5DatasetFileHandler()
-    dataset_file_handler.open(args_cli.dataset_file)
-    env_name = dataset_file_handler.get_env_name()
-    episode_count = dataset_file_handler.get_num_episodes()
-
-    if episode_count == 0:
-        print("No episodes found in the dataset.")
-        exit()
-
-    episode_indices_to_replay = args_cli.select_episodes
-    if len(episode_indices_to_replay) == 0:
-        episode_indices_to_replay = list(range(episode_count))
-
-    if args_cli.task is not None:
-        env_name = args_cli.task.split(":")[-1]
-    if env_name is None:
-        raise ValueError("Task/env name was not specified nor found in the dataset.")
-
-    num_envs = args_cli.num_envs
-
-    env_cfg = parse_env_cfg(env_name, device=args_cli.device, num_envs=num_envs)
-
-    # extract success checking function to invoke in the main loop
-    success_term = None
-    if args_cli.validate_success_rate:
-        if hasattr(env_cfg.terminations, "success"):
-            success_term = env_cfg.terminations.success
-            env_cfg.terminations.success = None
-        else:
-            print(
-                "No success termination term was found in the environment."
-                " Will not be able to mark recorded demos as successful."
-            )
-
-    # Disable all recorders and terminations
-    env_cfg.recorders = {}
-    env_cfg.terminations = {}
-
-    # create environment from loaded config
-    env = gym.make(args_cli.task, cfg=env_cfg).unwrapped
-
-    teleop_interface = Se3Keyboard(Se3KeyboardCfg(pos_sensitivity=0.1, rot_sensitivity=0.1))
-    teleop_interface.add_callback("N", play_cb)
-    teleop_interface.add_callback("B", pause_cb)
-    print('Press "B" to pause and "N" to resume the replayed actions.')
-
-    # Determine if state validation should be conducted
-    state_validation_enabled = False
-    if args_cli.validate_states and num_envs == 1:
-        state_validation_enabled = True
-    elif args_cli.validate_states and num_envs > 1:
-        print("Warning: State validation is only supported with a single environment. Skipping state validation.")
-
-    # Get idle action (idle actions are applied to envs without next action)
-    if hasattr(env_cfg, "idle_action"):
-        idle_action = env_cfg.idle_action.repeat(num_envs, 1)
-    else:
-        idle_action = torch.zeros(env.action_space.shape)
-
-    # reset before starting
-    env.reset()
-    teleop_interface.reset()
-
-    # simulate environment -- run everything in inference mode
-    episode_names = list(dataset_file_handler.get_episode_names())
+def replay_episodes_loop(  # noqa: C901
+    env,
+    dataset_file_handler: HDF5DatasetFileHandler,
+    episode_names: list[str],
+    episode_count: int,
+    episode_indices_to_replay: list[int],
+    num_envs: int,
+    success_term,
+    state_validation_enabled: bool,
+    idle_action: torch.Tensor,
+    reset_sim_buffer_each_episode: bool,
+) -> tuple[int, int, list[int]]:
+    """Run the replay loop until all selected episodes finish or the app exits.
+
+    Returns:
+        Tuple of (replayed_episode_count, recorded_episode_count, failed_demo_ids).
+    """
     replayed_episode_count = 0
     recorded_episode_count = 0
-
-    # Track current episode indices for each environment
-    current_episode_indices = [None] * num_envs
-
-    # Track failed demo IDs
-    failed_demo_ids = []
+    current_episode_indices: list[int | None] = [None] * num_envs
+    failed_demo_ids: list[int] = []
 
     with contextlib.suppress(KeyboardInterrupt) and torch.inference_mode():
         while simulation_app.is_running() and not simulation_app.is_exiting():
@@ -195,7 +145,7 @@ def main():
             episode_ended = [False] * num_envs
             while has_next_action:
                 # initialize actions with idle action so those without next action will not move
-                actions = idle_action
+                actions = idle_action.clone()
                 has_next_action = False
                 for env_id in range(num_envs):
                     env_next_action = env_episode_data_map[env_id].get_next_action()
@@ -216,11 +166,9 @@ def main():
                         )
                     else:
                         # if not successful, add to failed demo IDs list
-                        if (
-                            current_episode_indices[env_id] is not None
-                            and current_episode_indices[env_id] not in failed_demo_ids
-                        ):
-                            failed_demo_ids.append(current_episode_indices[env_id])
+                        cid = current_episode_indices[env_id]
+                        if cid is not None and cid not in failed_demo_ids:
+                            failed_demo_ids.append(cid)
 
                         episode_ended[env_id] = True
 
@@ -243,6 +191,8 @@ def main():
                     env_episode_data_map[env_id] = episode_data
                     # Set initial state for the new episode
                     initial_state = episode_data.get_initial_state()
+                    if reset_sim_buffer_each_episode:
+                        env.sim.reset()
                     env.reset_to(initial_state, torch.tensor([env_id], device=env.device), is_relative=True)
                     # Get the first action for the new episode
                     env_next_action = env_episode_data_map[env_id].get_next_action()
@@ -275,6 +225,99 @@ def main():
                         print("\t- mismatched.")
                         print(comparison_log)
                         break
+
+    return replayed_episode_count, recorded_episode_count, failed_demo_ids
+
+
+def main():
+    """Replay episodes loaded from a file."""
+    global is_paused
+
+    # Load dataset
+    if not os.path.exists(args_cli.dataset_file):
+        raise FileNotFoundError(f"The dataset file {args_cli.dataset_file} does not exist.")
+    dataset_file_handler = HDF5DatasetFileHandler()
+    dataset_file_handler.open(args_cli.dataset_file)
+    env_name = dataset_file_handler.get_env_name()
+    episode_count = dataset_file_handler.get_num_episodes()
+
+    if episode_count == 0:
+        print("No episodes found in the dataset.")
+        exit()
+
+    episode_indices_to_replay = list(args_cli.select_episodes)
+    if len(episode_indices_to_replay) == 0:
+        episode_indices_to_replay = list(range(episode_count))
+
+    if args_cli.task is not None:
+        env_name = args_cli.task.split(":")[-1]
+    if env_name is None:
+        raise ValueError("Task/env name was not specified nor found in the dataset.")
+
+    num_envs = args_cli.num_envs
+    if args_cli.reset_sim_buffer_each_episode and num_envs != 1:
+        raise ValueError(
+            "--reset_sim_buffer_each_episode is only supported with a single environment (--num_envs 1). "
+            f"Got num_envs={num_envs}. Use --num_envs 1 or disable --reset_sim_buffer_each_episode."
+        )
+
+    env_cfg = parse_env_cfg(env_name, device=args_cli.device, num_envs=num_envs)
+
+    # extract success checking function to invoke in the main loop
+    success_term = None
+    if args_cli.validate_success_rate:
+        if hasattr(env_cfg.terminations, "success"):
+            success_term = env_cfg.terminations.success
+            env_cfg.terminations.success = None
+        else:
+            print(
+                "No success termination term was found in the environment."
+                " Will not be able to mark recorded demos as successful."
+            )
+
+    # Disable all recorders and terminations
+    env_cfg.recorders = {}
+    env_cfg.terminations = {}
+
+    # create environment from loaded config
+    env = gym.make(args_cli.task, cfg=env_cfg).unwrapped
+
+    teleop_interface = Se3Keyboard(Se3KeyboardCfg(pos_sensitivity=0.1, rot_sensitivity=0.1))
+    teleop_interface.add_callback("N", play_cb)
+    teleop_interface.add_callback("B", pause_cb)
+    print('Press "B" to pause and "N" to resume the replayed actions.')
+
+    # Determine if state validation should be conducted
+    state_validation_enabled = False
+    if args_cli.validate_states and num_envs == 1:
+        state_validation_enabled = True
+    elif args_cli.validate_states and num_envs > 1:
+        print("Warning: State validation is only supported with a single environment. Skipping state validation.")
+
+    # Get idle action (idle actions are applied to envs without next action)
+    if hasattr(env_cfg, "idle_action"):
+        idle_action = torch.tensor(env_cfg.idle_action, device=env.unwrapped.device).repeat(num_envs, 1)
+    else:
+        idle_action = torch.zeros(env.action_space.shape)
+
+    # reset before starting
+    env.reset()
+    teleop_interface.reset()
+
+    episode_names = list(dataset_file_handler.get_episode_names())
+    replayed_episode_count, recorded_episode_count, failed_demo_ids = replay_episodes_loop(
+        env,
+        dataset_file_handler,
+        episode_names,
+        episode_count,
+        episode_indices_to_replay,
+        num_envs,
+        success_term,
+        state_validation_enabled,
+        idle_action,
+        args_cli.reset_sim_buffer_each_episode,
+    )
+
     # Close environment after replay in complete
     plural_trailing_s = "s" if replayed_episode_count > 1 else ""
     print(f"Finished replaying {replayed_episode_count} episode{plural_trailing_s}.")
```
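PR item 4 (storing the idle action as a plain Python list rather than a torch tensor) is what makes the env config serializable. A small sketch of the difference, using `json` and illustrative values (the real configs live in the pick-and-place env definitions):

```python
import json

# A plain list round-trips through standard serializers without custom encoders;
# a torch.Tensor in the same field would raise
# "Object of type Tensor is not JSON serializable".
env_cfg = {"idle_action": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]}  # illustrative values

restored = json.loads(json.dumps(env_cfg))
assert restored["idle_action"] == env_cfg["idle_action"]

# At runtime the replay script rebuilds the tensor per environment, roughly:
#   idle_action = torch.tensor(env_cfg["idle_action"], device=env.unwrapped.device).repeat(num_envs, 1)
```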

source/isaaclab/config/extension.toml

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,7 +1,7 @@
 [package]
 
 # Note: Semantic Versioning is used: https://semver.org/
-version = "4.6.14"
+version = "4.6.15"
 
 # Description
 title = "Isaac Lab framework for Robot Learning"
```

source/isaaclab/docs/CHANGELOG.rst

Lines changed: 11 additions & 0 deletions

```diff
@@ -1,6 +1,17 @@
 Changelog
 ---------
 
+4.6.15 (2026-04-24)
+~~~~~~~~~~~~~~~~~~~
+
+Changed
+^^^^^^^
+
+* Marked :meth:`~isaaclab.envs.manager_based_rl_mimic_env.ManagerBasedRLMimicEnv.get_subtask_start_signals` and
+  :meth:`~isaaclab.envs.manager_based_rl_mimic_env.ManagerBasedRLMimicEnv.get_subtask_term_signals` with
+  ``@optional_method``.
+
+
 4.6.14 (2026-04-24)
 ~~~~~~~~~~~~~~~~~~~
```
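The changelog entry refers to an `@optional_method` marker. Its actual implementation lives in Isaac Lab and may differ; as an illustration only, a minimal decorator of this kind just tags the function so tooling and subclasses can detect that overriding is optional. The attribute name below is invented:

```python
def optional_method(func):
    """Mark a method as optional to override (illustrative stand-in, not Isaac Lab's implementation)."""
    func.__optional_method__ = True  # hypothetical marker attribute
    return func


class ManagerBasedRLMimicEnvSketch:
    """Toy stand-in for the Mimic env base class, for illustration only."""

    @optional_method
    def get_subtask_term_signals(self):
        raise NotImplementedError("Override in task-specific subclasses if subtask signals are needed.")


# Tooling can now distinguish optional from required overrides:
print(getattr(ManagerBasedRLMimicEnvSketch.get_subtask_term_signals, "__optional_method__", False))  # → True
```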
