[Feature] Surface Isaac Lab per-index reset and reset_to via IsaacLabWrapper #3781

Open
vmoens wants to merge 1 commit into pytorch:main from vmoens:feature/isaaclab-per-env-reset-bridge

Collaborator

@vmoens vmoens commented May 19, 2026

Summary

Isaac Lab's ManagerBasedEnv and DirectRLEnv environments let the caller reset an arbitrary subset of sub-environments without disturbing the others. Until now, torchrl had no supported path for this: users had to call env.base_env._env.unwrapped.reset(env_ids=...) by hand, which bypasses spec validation, tensordict bookkeeping, and the transform stack.

This PR plumbs Isaac's per-index reset through the standard torchrl "_reset" boolean mask: when the tensordict passed to env.reset carries a partial _reset mask, only the masked sub-envs are reset, and the transform stack (RewardSum, InitTracker, recurrent primers, VecNormV2, ...) fires on the reset rows only -- exactly like a normal reset.
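To make the mask semantics concrete, here is a minimal pure-Python sketch. It uses neither torchrl nor Isaac Lab: `ToyVecEnv`, `mask_to_env_ids`, and `steps` are hypothetical stand-ins, with `steps` playing the role of a per-env step counter such as `episode_length_buf`.

```python
from typing import List, Sequence


def mask_to_env_ids(reset_mask: Sequence[bool]) -> List[int]:
    """Convert a boolean reset mask into the indices of the envs to reset."""
    return [i for i, flag in enumerate(reset_mask) if flag]


class ToyVecEnv:
    """Hypothetical stand-in for a batched env that supports per-index reset."""

    def __init__(self, num_envs: int) -> None:
        self.steps = [0] * num_envs  # per-env step counter

    def step(self) -> None:
        self.steps = [s + 1 for s in self.steps]

    def reset(self, env_ids: Sequence[int]) -> None:
        # Only the listed sub-envs are touched; the others keep their state.
        for i in env_ids:
            self.steps[i] = 0


env = ToyVecEnv(num_envs=4)
for _ in range(3):
    env.step()

# Partial mask: reset sub-envs 0 and 2 only.
env.reset(mask_to_env_ids([True, False, True, False]))
print(env.steps)  # [0, 3, 0, 3]
```

The point of the bridge described above is that the same selection happens inside `env.reset`, so callers pass a mask instead of reaching into the unwrapped env.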

What's bridged

  • Per-index reset via _reset mask -- IsaacLabWrapper._reset now extracts the boolean mask from the tensordict and forwards it as env_ids to the underlying Isaac env (ManagerBasedEnv.reset(env_ids=...) on the manager-based path, DirectRLEnv._reset_idx(env_ids) + scene.write_data_to_sim() + sim.forward() + _get_observations() on the Direct path).
  • reset_to_state(state, td=...) + get_state() -- deterministic snapshot/branch using Isaac's reset_to (manager-based only; Direct envs do not expose reset_to).
  • IsaacLabWrapper._supports_native_autoreset(env) classmethod -- the single source of truth for which Isaac Lab classes the wrapper can bridge. Now includes ManagerBasedEnv, DirectRLEnv and DirectMARLEnv (previously the metaclass only matched ManagerBasedRLEnv). This fixes the Direct-env regression where step_and_maybe_reset would fire a synthetic full reset on top of Isaac's internal auto-reset.
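The "single source of truth" idea in the last bullet can be sketched as one class-level tuple consulted by one classmethod. The classes below are empty stand-ins so the snippet runs without Isaac Lab; the subclass relationship between `ManagerBasedRLEnv` and `ManagerBasedEnv`, and the names `ToyWrapper` and `_supported_env_classes`, are assumptions made for illustration, not the PR's code.

```python
# Empty stand-ins for the Isaac Lab classes named above (assumption: the RL
# variant subclasses the manager-based base class).
class ManagerBasedEnv: ...
class ManagerBasedRLEnv(ManagerBasedEnv): ...
class DirectRLEnv: ...
class DirectMARLEnv: ...


class ToyWrapper:
    """Hypothetical wrapper keeping the supported classes in one place."""

    _supported_env_classes = (ManagerBasedEnv, DirectRLEnv, DirectMARLEnv)

    @classmethod
    def supports_native_autoreset(cls, env: object) -> bool:
        # Any other component (e.g. a metaclass) can delegate to this check
        # instead of hard-coding its own isinstance test.
        return isinstance(env, cls._supported_env_classes)


print(ToyWrapper.supports_native_autoreset(ManagerBasedRLEnv()))  # True
print(ToyWrapper.supports_native_autoreset(DirectRLEnv()))        # True
print(ToyWrapper.supports_native_autoreset(object()))             # False
```

Matching on the base classes rather than on `ManagerBasedRLEnv` alone is what lets Direct envs take the same path in this sketch.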

Gating

The per-index reset path is gated on native_autoreset=True. With the default native_autoreset=False, partial-mask resets are issued by EnvBase.maybe_reset on every "done" row, and the VecGymEnvTransform obs-swap path already handles them implicitly, so firing unwrapped.reset(env_ids=...) there would double-reset the affected envs.

Backwards compatibility:

  • Full reset (no mask, all-True mask): unchanged path through super()._reset.
  • All-False mask: returns the input tensordict unchanged (matches historical no-op).
  • native_autoreset=False + partial mask: historical no-op preserved.
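The gating and compatibility cases above amount to a small decision table. The function below is a sketch of that table only: `choose_reset_path` and its return labels are hypothetical, and the order of the checks is an assumption rather than the wrapper's actual control flow.

```python
from typing import Optional, Sequence


def choose_reset_path(
    reset_mask: Optional[Sequence[bool]], native_autoreset: bool
) -> str:
    """Return which reset path applies for a given mask and gating flag."""
    if reset_mask is None or all(reset_mask):
        return "full_reset"       # unchanged path through the parent _reset
    if not any(reset_mask):
        return "no_op"            # all-False mask: input returned unchanged
    if not native_autoreset:
        return "no_op"            # partial mask without the gate: historical no-op
    return "per_index_reset"      # partial mask forwarded as env_ids


print(choose_reset_path(None, native_autoreset=False))           # full_reset
print(choose_reset_path([True, True], native_autoreset=True))    # full_reset
print(choose_reset_path([False, False], native_autoreset=True))  # no_op
print(choose_reset_path([True, False], native_autoreset=False))  # no_op
print(choose_reset_path([True, False], native_autoreset=True))   # per_index_reset
```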

Files

  • torchrl/envs/libs/isaac_lab.py -- the new _reset, reset_to_state, get_state, _supports_native_autoreset, and helper plumbing.
  • torchrl/envs/libs/gym.py -- _GymAsyncMeta and _is_batched now delegate Isaac class detection to IsaacLabWrapper._supports_native_autoreset / _supported_isaac_env_classes; the metaclass mirrors _torchrl_native_autoreset onto the inner wrapper.
  • torchrl/testing/env_helper.py -- make_isaac_env resolves the env config dynamically via gymnasium.spec(env_name).kwargs["env_cfg_entry_point"], so it works for Direct envs (e.g. Isaac-Cartpole-Direct-v0); added a num_envs knob.
  • knowledge_base/ISAACLAB.md -- new "Per-index reset and reset_to_state" section.
  • test/libs/test_isaac.py -- 6 new tests (details below).
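For the `make_isaac_env` change, the dynamic lookup hinges on turning an entry-point value into a config object. The sketch below shows one way to do that with the standard library, assuming the value is either an already-imported object or a `"module:attribute"` string in the style of Gymnasium entry points. `resolve_entry_point` is a hypothetical helper, and the demo resolves a stdlib name rather than a real Isaac Lab config.

```python
import collections
import importlib
from typing import Any


def resolve_entry_point(entry_point: Any) -> Any:
    """Resolve a 'module:attribute' string; pass other objects through."""
    if not isinstance(entry_point, str):
        return entry_point  # already a class or callable
    module_name, _, attr = entry_point.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr) if attr else module


# Stand-in for the value read from the env spec's kwargs.
cfg_cls = resolve_entry_point("collections:OrderedDict")
print(cfg_cls is collections.OrderedDict)  # True
```

Reading the entry point from the registered spec, instead of hard-coding a config class per task, is what lets a single helper serve both manager-based and Direct envs.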

Test plan

New tests (all skipped if isaaclab is unavailable):

  • test_isaaclab_partial_reset_via_reset_mask -- episode_length_buf is 0 only for the masked envs; the rest keep their prior step counters.
  • test_isaaclab_partial_reset_triggers_transforms -- RewardSum.episode_reward zeroed only for masked envs in a TransformedEnv(env, RewardSum()) stack.
  • test_isaaclab_full_reset_unchanged_by_bridge -- full reset and all-True mask still zero every env.
  • test_isaaclab_reset_to_state_roundtrip -- snapshot, step, then reset(td, isaac_reset_state=snapshot) for half the envs; masked envs land back on the snapshot, others keep going.
  • test_isaaclab_get_state_returns_scene_state -- env.get_state() mirrors InteractiveScene.get_state().
  • test_isaaclab_direct_env_native_autoreset[Isaac-Cartpole-Direct-v0] -- runs in a worker process (mirrors the existing _isaaclab_rollout pattern); asserts that with native_autoreset=True on a Direct env, step_and_maybe_reset skips the synthetic reset, marks the terminal ("next", obs) with NaN at done rows, and the next root tensordict has finite obs at the same rows.
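The snapshot round trip checked by test_isaaclab_reset_to_state_roundtrip follows a simple pattern: take a snapshot, step, then restore only some envs. The toy below illustrates that pattern in plain Python; `ToySim`, its `pos` field, and the `reset_to(state, env_ids)` signature are hypothetical and do not reproduce Isaac Lab's API.

```python
from typing import Dict, List, Sequence


class ToySim:
    """Hypothetical batched simulator with snapshot and per-index restore."""

    def __init__(self, num_envs: int) -> None:
        self.pos = [0.0] * num_envs

    def step(self) -> None:
        self.pos = [p + 1.0 for p in self.pos]

    def get_state(self) -> Dict[str, List[float]]:
        # Copy the list so later steps do not mutate the snapshot.
        return {"pos": list(self.pos)}

    def reset_to(self, state: Dict[str, List[float]], env_ids: Sequence[int]) -> None:
        # Restore only the selected envs; the others keep going.
        for i in env_ids:
            self.pos[i] = state["pos"][i]


sim = ToySim(num_envs=4)
sim.step()
snapshot = sim.get_state()             # every env at 1.0
sim.step()
sim.step()                             # every env at 3.0
sim.reset_to(snapshot, env_ids=[0, 1])
print(sim.pos)  # [1.0, 1.0, 3.0, 3.0]
```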

Regression checks:

  • pytest test/envs/test_auto_reset.py test/envs/test_env_base.py -- 77 passed, 7 skipped, no regressions.
  • pytest test/libs/test_isaac.py --collect-only -- all old + new tests collect cleanly.
  • Pre-commit (ufmt, flake8, pydocstyle, pyupgrade, codespell, autoflake) passes on the commit.
  • Must-not-regress invariants from the original ticket are covered by the pre-existing test_isaaclab_native_autoreset_rollout_seeded, test_isaaclab_native_autoreset_rollout_matches_default_signals, and test_isaaclab_native_autoreset_rollout_reset_obs_continuity tests -- still untouched.

Made with Cursor

…Wrapper

IsaacLab's ManagerBased / Direct envs let the caller reset an arbitrary
subset of sub-environments without disturbing the others. Until now,
torchrl had no supported path for this: users had to reach
`env.base_env._env.unwrapped.reset(env_ids=...)` by hand, bypassing
spec validation, tensordict bookkeeping, and the transform stack.

This change plumbs Isaac's per-index reset through the standard torchrl
`"_reset"` boolean mask: when the tensordict passed to `env.reset` carries
a partial `_reset` mask, only the masked sub-envs are reset, and the
transform stack (RewardSum, InitTracker, recurrent primers, VecNormV2,
...) fires on the reset rows only -- exactly like a normal reset.

Also bridged:
- `reset_to_state(state, td=...)` / `get_state()` for deterministic
  branching from a snapshot (manager-based envs only).
- `IsaacLabWrapper._supports_native_autoreset` classmethod -- the single
  source of truth for which Isaac Lab classes the wrapper can bridge.
  This now includes `ManagerBasedEnv`, `DirectRLEnv` and `DirectMARLEnv`
  (previously only `ManagerBasedRLEnv`), fixing the Direct-env regression
  where `step_and_maybe_reset` would fire a synthetic full reset on top
  of Isaac's internal auto-reset.

The per-index reset path is gated on `native_autoreset=True`: with the
default `native_autoreset=False`, partial-mask resets are issued by
`EnvBase.maybe_reset` on every "done" row, and the `VecGymEnvTransform`
obs-swap path already handles them implicitly -- firing
`unwrapped.reset(env_ids=...)` there would double-reset the affected envs.

Tests cover partial reset semantics on `Isaac-Ant-v0` (manager-based),
the Direct-env native_autoreset bridge on `Isaac-Cartpole-Direct-v0`
(in a worker process, mirroring the existing rollout test pattern),
transform-state preservation under partial reset, full-reset
backwards-compat, and the `reset_to_state` round trip.

Authored with Claude.

Co-authored-by: Cursor <cursoragent@cursor.com>
pytorch-bot Bot commented May 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3781

Note: Links to docs will display an error until the docs builds have been completed.

❗ 2 Active SEVs

There are 2 currently active SEVs. If your PR is affected, please view them below:

✅ You can merge normally! (1 Unrelated Failure)

As of commit 1eb439c with merge base 68f1ba5:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label May 19, 2026
@github-actions github-actions Bot added the Environments, Environments/gym, Environments/isaaclab, and Feature labels May 19, 2026