Commit 9e3dc50

Merge branch 'develop' into fix/fabric-prepare-for-reuse
2 parents d547182 + a6e7577 commit 9e3dc50

32 files changed

Lines changed: 1531 additions & 130 deletions

File tree

.github/workflows/docs.yaml

Lines changed: 0 additions & 1 deletion
@@ -131,4 +131,3 @@ jobs:
           github_token: ${{ secrets.GITHUB_TOKEN }}
           publish_dir: ./docs/_build
           keep_files: false
-          force_orphan: true
Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
+# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+# Multi-GPU distributed training validation
+#
+# This workflow validates that multi-GPU training works correctly across:
+# - Physics backends: PhysX, Newton
+# - Rendering backends: none (physics-only), Isaac RTX, Newton Warp
+#
+# Runs on a dedicated multi-GPU runner (separate from standard CI) to minimize costs.
+# Only triggered on PRs that touch distributed training code paths.
+
+name: Multi-GPU Training Tests
+
+on:
+  pull_request:
+    paths:
+      - "source/isaaclab/isaaclab/app/app_launcher.py"
+      - "source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py"
+      - "scripts/reinforcement_learning/**/train.py"
+      - ".github/workflows/test-multi-gpu.yaml"
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  test-multi-gpu:
+    name: Multi-GPU (${{ matrix.physics }}, ${{ matrix.renderer }})
+    # Use dedicated multi-GPU runner to avoid blocking standard CI resources
+    # Configure this label on a runner with 2+ GPUs (e.g., g5.12xlarge with 4x A10G)
+    runs-on: [self-hosted, linux, x64, gpu, multi-gpu]
+    timeout-minutes: 30
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          # PhysX physics-only
+          - physics: physx
+            renderer: none
+            task: Isaac-Cartpole-Direct-v0
+            extra_args: ""
+
+          # PhysX + Isaac RTX renderer
+          - physics: physx
+            renderer: isaac-rtx
+            task: Isaac-Cartpole-Camera-Presets-Direct-v0
+            extra_args: ""
+            trainer: skrl
+
+          # PhysX + Newton Warp renderer (hybrid)
+          - physics: physx
+            renderer: newton-warp
+            task: Isaac-Cartpole-Camera-Presets-Direct-v0
+            extra_args: "env.tiled_camera.renderer_cfg=newton_renderer"
+            trainer: skrl
+
+          # Newton physics-only
+          - physics: newton
+            renderer: none
+            task: Isaac-Cartpole-Direct-v0
+            extra_args: "+sim=newton"
+
+          # Newton + Newton Warp renderer
+          - physics: newton
+            renderer: newton-warp
+            task: Isaac-Cartpole-Camera-Presets-Direct-v0
+            extra_args: "+sim=newton env.tiled_camera.renderer_cfg=newton_renderer"
+            trainer: skrl
+
+          # Newton + Isaac RTX renderer (hybrid)
+          - physics: newton
+            renderer: isaac-rtx
+            task: Isaac-Cartpole-Camera-Presets-Direct-v0
+            extra_args: "+sim=newton"
+            trainer: skrl
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.10"
+
+      - name: Install Isaac Lab
+        run: |
+          ./isaaclab.sh --install
+
+      - name: Verify multi-GPU availability
+        run: |
+          echo "=== GPU Info ==="
+          nvidia-smi --query-gpu=index,name,memory.total --format=csv
+
+          GPU_COUNT=$(python -c "import torch; print(torch.cuda.device_count())")
+          echo "Detected $GPU_COUNT GPU(s)"
+
+          if [ "$GPU_COUNT" -lt 2 ]; then
+            echo "::error::At least 2 GPUs required for multi-GPU tests, found $GPU_COUNT"
+            exit 1
+          fi
+
+      - name: Run multi-GPU training (${{ matrix.physics }}, ${{ matrix.renderer }})
+        env:
+          NCCL_DEBUG: WARN
+        run: |
+          TRAINER="${{ matrix.trainer || 'rsl_rl' }}"
+
+          echo "=========================================="
+          echo "Physics: ${{ matrix.physics }}"
+          echo "Renderer: ${{ matrix.renderer }}"
+          echo "Task: ${{ matrix.task }}"
+          echo "Trainer: $TRAINER"
+          echo "Extra args: ${{ matrix.extra_args }}"
+          echo "=========================================="
+
+          # Run 2-GPU distributed training for 3 iterations
+          ./isaaclab.sh -p -m torch.distributed.run --nproc_per_node=2 \
+            scripts/reinforcement_learning/${TRAINER}/train.py \
+            --task=${{ matrix.task }} \
+            --headless \
+            --distributed \
+            --max_iterations=3 \
+            --num_envs=16 \
+            ${{ matrix.extra_args }}
+
+      - name: Verify training completed
+        run: |
+          # Find the most recent log directory
+          LATEST_LOG=$(ls -td logs/*/*/*/ 2>/dev/null | head -1)
+
+          if [ -z "$LATEST_LOG" ]; then
+            echo "::error::No training log directory found"
+            exit 1
+          fi
+
+          echo "Log directory: $LATEST_LOG"
+          ls -la "$LATEST_LOG"
+
+          # Check for model checkpoints
+          MODELS=$(find "$LATEST_LOG" -name "*.pt" | wc -l)
+          echo "Model checkpoints found: $MODELS"
+
+          if [ "$MODELS" -lt 1 ]; then
+            echo "::error::No model checkpoints found - training may have failed"
+            exit 1
+          fi
+
+          echo "✅ Multi-GPU training completed successfully (${{ matrix.physics }}, ${{ matrix.renderer }})"
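
Review note: the "Verify training completed" step above picks the newest `logs/*/*/*/` directory and fails when it contains no `*.pt` file. The following is a minimal Python sketch of that same check, useful for reasoning about it outside of CI. The function names `find_latest_run_dir` and `count_checkpoints` are hypothetical and not part of this commit; the three-level log layout is taken from the glob in the workflow.

```python
from pathlib import Path
from typing import Optional


def find_latest_run_dir(log_root: Path) -> Optional[Path]:
    """Return the most recently modified <root>/*/*/* directory, or None.

    Mirrors `ls -td logs/*/*/*/ | head -1` from the workflow step.
    """
    run_dirs = [p for p in log_root.glob("*/*/*") if p.is_dir()]
    if not run_dirs:
        return None
    return max(run_dirs, key=lambda p: p.stat().st_mtime)


def count_checkpoints(run_dir: Path) -> int:
    """Count *.pt entries anywhere below run_dir, like `find -name "*.pt" | wc -l`."""
    return sum(1 for _ in run_dir.rglob("*.pt"))


def training_produced_checkpoint(log_root: Path) -> bool:
    """True when a run directory exists and holds at least one checkpoint."""
    run_dir = find_latest_run_dir(log_root)
    return run_dir is not None and count_checkpoints(run_dir) >= 1
```

As in the shell version, the check only proves that some checkpoint exists in the newest run directory. It does not prove that both ranks finished, so a stale `logs/` tree on a self-hosted runner could still satisfy it.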

docs/source/experimental-features/newton-physics-integration/solver-transitioning.rst

Lines changed: 24 additions & 9 deletions
@@ -2,7 +2,16 @@ Solver Transitioning
 ====================
 
 Transitioning to the Newton physics engine introduces new physics solvers that handle simulation using different numerical approaches.
-While Newton supports several different solvers, our initial focus for Isaac Lab is on using the MuJoCo-Warp solver from Google DeepMind.
+While Newton supports several different solvers, our initial focus for Isaac Lab is on using the
+MuJoCo-Warp solver from Google DeepMind. Isaac Lab also includes beta support for the Kamino
+solver on selected classic tasks. Kamino is selected through a physics preset rather than as a
+separate backend; see :ref:`hydra-backend-solver-presets`.
+
+.. note::
+
+    Kamino support is experimental and currently depends on assets being structured
+    in a way that Kamino can consume. Assets that work with MuJoCo-Warp or PhysX
+    may still require model-structure updates before they work with Kamino.
 
 The way the physics scene itself is defined does not change - we continue to use USD as the primary way to set basic parameters of objects and robots in the scene,
 and for current environments, the exact same USD files used for the PhysX-based Isaac Lab are used.
@@ -12,15 +21,18 @@ What does require change is the way that some solver-specific settings are confi
 Tuning these parameters can have a significant impact on both simulation performance and behaviour.
 
 For now, we will show an example of setting these parameters to help provide a feel for these changes.
-Note that the :class:`~isaaclab.sim.NewtonCfg` replaces the :class:`~isaaclab.sim.PhysxCfg` and is used to set everything related to the physical simulation parameters except for the ``dt``:
+Note that the :class:`~isaaclab_newton.physics.NewtonCfg` replaces
+:class:`~isaaclab_physx.physics.PhysxCfg` and is used to set everything related to the physical
+simulation parameters except for the ``dt``:
 
 .. code-block:: python
 
-    from isaaclab.sim._impl.newton_manager_cfg import NewtonCfg
-    from isaaclab.sim._impl.solvers_cfg import MJWarpSolverCfg
+    from isaaclab.sim import SimulationCfg
+    from isaaclab_newton.physics import MJWarpSolverCfg, NewtonCfg
 
     solver_cfg = MJWarpSolverCfg(
-        nefc_per_env=35,
+        njmax=35,
+        nconmax=20,
         ls_iterations=10,
         cone="pyramidal",
         ls_parallel=True,
@@ -31,14 +43,17 @@ Note that the :class:`~isaaclab.sim.NewtonCfg` replaces the :class:`~isaaclab.si
         num_substeps=1,
         debug_mode=False,
     )
-    sim: SimulationCfg = SimulationCfg(dt=1 / 120, render_interval=decimation, newton_cfg=newton_cfg)
+    sim: SimulationCfg = SimulationCfg(dt=1 / 120, render_interval=decimation, physics=newton_cfg)
 
 
 Here is a very brief explanation of some of the key parameters above:
 
-* ``nefc_per_env``: This is the size of the buffer constraints we want MuJoCo warp to
-  pre-allocate for a given environment. A large value will slow down the simulation,
-  while a too small value may lead to some contacts being missed.
+* ``njmax``: This is the number of constraint rows MuJoCo-Warp pre-allocates for a
+  given environment. A large value will slow down the simulation, while a too small
+  value may lead to missing constraints.
+
+* ``nconmax``: This is the maximum number of contact points MuJoCo-Warp pre-allocates
+  for a given environment. Set it high enough for the expected contact count.
 
 * ``ls_iterations``: The number of line searches performed by the MuJoCo Warp solver.
   Line searches are used to find an optimal step size, and for each solver step,

docs/source/features/hydra.rst

Lines changed: 68 additions & 0 deletions
@@ -242,6 +242,74 @@ disabled unless explicitly selected:
     python train.py --task=Isaac-Reach-Franka-v0 env.scene.camera=large
 
 
+.. _hydra-backend-solver-presets:
+
+Backend and Solver Presets
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Physics backend selection uses the same preset system. A task can define a
+``PresetCfg`` whose entries replace the complete physics config:
+
+.. code-block:: python
+
+    from isaaclab.utils import configclass
+    from isaaclab_newton.physics import KaminoSolverCfg, MJWarpSolverCfg, NewtonCfg
+    from isaaclab_physx.physics import PhysxCfg
+    from isaaclab_tasks.utils import PresetCfg
+
+    @configclass
+    class CartpolePhysicsCfg(PresetCfg):
+        default: PhysxCfg = PhysxCfg()
+        physx: PhysxCfg = PhysxCfg()
+        newton: NewtonCfg = NewtonCfg(
+            solver_cfg=MJWarpSolverCfg(njmax=5, nconmax=3),
+            num_substeps=1,
+        )
+        kamino: NewtonCfg = NewtonCfg(
+            solver_cfg=KaminoSolverCfg(
+                integrator="moreau",
+                use_collision_detector=True,
+                sparse_jacobian=True,
+                padmm_max_iterations=100,
+            ),
+            num_substeps=1,
+            debug_mode=False,
+            use_cuda_graph=True,
+        )
+
+The ``newton`` and ``kamino`` entries both select the Newton physics backend because
+both entries are :class:`~isaaclab_newton.physics.NewtonCfg` objects. The difference
+is the solver configuration: ``newton`` uses
+:class:`~isaaclab_newton.physics.MJWarpSolverCfg`, while ``kamino`` uses
+:class:`~isaaclab_newton.physics.KaminoSolverCfg`.
+
+Kamino is therefore a solver preset, not a separate Isaac Lab backend. The same
+Newton assets, sensors, renderers, and visualizers are used after the preset is
+resolved. It is a Proximal Alternating Direction Method of Multipliers (P-ADMM)
+based solver for constrained rigid multi-body dynamics, and its Isaac Lab support
+is currently beta.
+
+.. note::
+
+    Kamino support is experimental and currently depends on the asset being
+    structured in a way that Kamino can consume. Assets that work with the
+    MuJoCo-Warp or PhysX presets may still require model-structure updates before
+    they work with ``presets=kamino``.
+
+.. code-block:: bash
+
+    # Select the Kamino solver preset everywhere it is defined
+    python train.py --task=Isaac-Cartpole-v0 presets=kamino
+
+    # Select the Kamino solver preset for a specific physics config path
+    python train.py --task=Isaac-Cartpole-v0 env.sim.physics=kamino
+
+The ``kamino`` preset is currently defined for ``Isaac-Cartpole-Direct-v0``,
+``Isaac-Ant-Direct-v0``, ``Isaac-Cartpole-v0``, and ``Isaac-Ant-v0``. Passing
+``presets=kamino`` to a task without a ``kamino`` preset does not enable Kamino;
+add and validate a task-specific preset first.
+
+
 Inline Presets with preset()
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
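
Review note on the "Backend and Solver Presets" text added in this hunk: the distinction between backend and solver can be illustrated with a plain-Python sketch. This is not Isaac Lab's resolution code. `PhysicsChoice`, `CARTPOLE_PRESETS`, and `resolve_preset` are hypothetical names, and the fallback to the `default` entry is an assumption drawn only from the sentence that `presets=kamino` "does not enable Kamino" on tasks that lack the preset.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PhysicsChoice:
    backend: str  # "physx" or "newton"
    solver: str   # "physx", "mjwarp", or "kamino"


# Hypothetical table mirroring the entries of CartpolePhysicsCfg in the docs.
CARTPOLE_PRESETS: Dict[str, PhysicsChoice] = {
    "default": PhysicsChoice(backend="physx", solver="physx"),
    "physx": PhysicsChoice(backend="physx", solver="physx"),
    "newton": PhysicsChoice(backend="newton", solver="mjwarp"),
    "kamino": PhysicsChoice(backend="newton", solver="kamino"),
}


def resolve_preset(presets: Dict[str, PhysicsChoice], name: str) -> PhysicsChoice:
    """Return the named entry; fall back to "default" when it is not defined.

    The fallback is an assumption for illustration, not confirmed behavior.
    """
    return presets.get(name, presets["default"])
```

With this table, `newton` and `kamino` resolve to the same backend and differ only in the solver field, which is the point the documentation makes about Kamino being a solver preset.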

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+.. note::
+
+    The bare ``isaaclab`` install ships only the core extension. To run
+    the bundled training scripts under ``scripts/reinforcement_learning/``
+    you must install with the ``[all]`` extras (or the per-framework
+    extras ``[skrl]`` / ``[sb3]`` / ``[rsl-rl]``); otherwise commands such
+    as ``python scripts/reinforcement_learning/skrl/train.py ...`` fail
+    at import time with ``ModuleNotFoundError: No module named 'skrl'``.
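
Review note: the failure mode described in this note can be detected before launching a training script. The sketch below uses only the standard library. The mapping from extras to import names is partly an assumption: `skrl` comes from the error message quoted in the note, while `stable_baselines3` and `rsl_rl` are assumed import names for the `[sb3]` and `[rsl-rl]` extras. `missing_frameworks` is a hypothetical helper, not part of this commit.

```python
import importlib.util
from typing import Dict, List

# Extra name -> top-level module it is expected to provide.
# Only "skrl" is confirmed by the note; the other two are assumptions.
FRAMEWORK_MODULES: Dict[str, str] = {
    "skrl": "skrl",
    "sb3": "stable_baselines3",
    "rsl-rl": "rsl_rl",
}


def missing_frameworks(modules: Dict[str, str] = FRAMEWORK_MODULES) -> List[str]:
    """Return the extras whose top-level module cannot be found, sorted by name."""
    return sorted(
        extra
        for extra, module in modules.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_frameworks()
    if missing:
        print("Missing RL framework extras:", ", ".join(missing))
```

`importlib.util.find_spec` only locates the module without importing it, so the check stays cheap even for heavy packages.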

docs/source/setup/installation/isaaclab_pip_installation.rst

Lines changed: 4 additions & 0 deletions
@@ -71,6 +71,8 @@ Isaac Lab sub-packages:
          # Isaac Lab + Isaac Sim + all sub-packages
          uv pip install "isaaclab[isaacsim,all]" --extra-index-url https://pypi.nvidia.com --index-strategy unsafe-best-match --prerelease=allow
 
+      .. include:: include/pip_extras_note.rst
+
    .. tab-item:: pip
 
       .. code-block:: bash
@@ -90,6 +92,8 @@ Isaac Lab sub-packages:
          # Isaac Lab + Isaac Sim + all Isaac Lab sub-packages
          pip install "isaaclab[isaacsim,all]" --extra-index-url https://pypi.nvidia.com --pre
 
+      .. include:: include/pip_extras_note.rst
+
 Installing dependencies
 ~~~~~~~~~~~~~~~~~~~~~~~
 

docs/source/tutorials/01_assets/run_deformable_object.rst

Lines changed: 1 addition & 1 deletion
@@ -163,7 +163,7 @@ Now that we have gone through the code, let's run the script and see the result:
    ./isaaclab.sh -p scripts/tutorials/01_assets/run_deformable_object.py --visualizer kit
 
 
-This should open a stage with a ground plane, lights, and several green cubes. Two of the four cubes must be dropping
+This should open a stage with a ground plane, lights, and several cubes. Two of the four cubes must be dropping
 from a height and settling on to the ground. Meanwhile the other two cubes must be moving along the z-axis. You
 should see a marker showing the kinematic target position for the nodes at the bottom-left corner of the cubes.
 To stop the simulation, you can either close the window, or press ``Ctrl+C`` in the terminal

scripts/benchmarks/benchmark_non_rl.py

Lines changed: 7 additions & 2 deletions
@@ -120,7 +120,12 @@ def main(
 
     # override configurations with non-hydra CLI arguments
     env_cfg.scene.num_envs = args_cli.num_envs if args_cli.num_envs is not None else env_cfg.scene.num_envs
-    env_cfg.sim.device = args_cli.device if args_cli.device is not None else env_cfg.sim.device
+    # For distributed training, launch_simulation() already resolved the
+    # correct per-rank device; only apply a CLI --device override for
+    # non-distributed runs (the default "cuda:0" would clobber the
+    # per-rank device otherwise).
+    if not args_cli.distributed:
+        env_cfg.sim.device = args_cli.device if args_cli.device is not None else env_cfg.sim.device
     env_cfg.seed = args_cli.seed
 
     # check for invalid combination of CPU device with distributed training
@@ -131,10 +136,10 @@ def main(
         )
 
     # process distributed
+    # env_cfg.sim.device is already resolved by launch_simulation().
     world_size = 1
     world_rank = 0
     if args_cli.distributed:
-        env_cfg.sim.device = f"cuda:{int(os.getenv('LOCAL_RANK', '0'))}"
         world_size = int(os.getenv("WORLD_SIZE", 1))
         world_rank = int(os.getenv("RANK", "0"))
 
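
Review note: the device handling changed in this file can be isolated into two small pure functions, which makes the intended behavior easy to check without a simulator. This is a sketch under stated assumptions, not code from the commit. `resolve_device` and `world_info` are hypothetical names, and the sketch assumes, as the new comments state, that the per-rank device is already set before this point.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_device(current: str, cli_device: Optional[str], distributed: bool) -> str:
    """Pick the simulation device.

    In distributed runs the already-resolved per-rank device is kept, so a
    default CLI value such as "cuda:0" cannot overwrite it. Otherwise the CLI
    value wins when it is given.
    """
    if distributed:
        return current
    return cli_device if cli_device is not None else current


def world_info(distributed: bool, env: Optional[Mapping[str, str]] = None) -> Tuple[int, int]:
    """Return (world_size, world_rank), reading WORLD_SIZE and RANK when distributed."""
    if not distributed:
        return 1, 0
    env = os.environ if env is None else env
    return int(env.get("WORLD_SIZE", 1)), int(env.get("RANK", "0"))
```

For example, rank 1 of a two-process run keeps `cuda:1` even when `--device` defaults to `cuda:0`, which is the clobbering case the new comment describes.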
