Enable all rough terrain envs on Newton backend#5248
Enable all rough terrain envs on Newton backend#5248hujc7 wants to merge 27 commits intoisaac-sim:developfrom
Conversation
Greptile SummaryThis testing branch adds
Confidence Score: 4/5Safe to continue testing; one P1 bug in CassieRoughEnvCfg_PLAY should be fixed before final merge. One P1 defect (Cassie PLAY retains live push events), plus a DRY concern with the duplicated RoughPhysicsCfg. Since this is an acknowledged DO NOT MERGE testing branch, score is 4 rather than lower, but the PLAY bug is a real behavioral issue. source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/velocity/config/cassie/rough_env_cfg.py — missing event disables in PLAY class. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[LocomotionVelocityRoughEnvCfg\nbase class] --> B[RobotRoughEnvCfg\ne.g. AnymalBRoughEnvCfg]
B --> C[RobotRoughEnvCfg_PLAY]
D[RoughPhysicsCfg\nPresetCfg] -->|physics preset| E[SimulationCfg]
E --> B
D -->|default| F[PhysxCfg]
D -->|newton| G[NewtonCfg\nMJWarpSolverCfg\nnjmax=200, nconmax=100]
D -->|physx| F
C -->|sets to None| H[events.push_robot]
C -->|sets to None| I[events.base_external_force_torque]
J[CassieRoughEnvCfg_PLAY\n⚠ Missing disables] -.->|NOT set to None| H
J -.->|NOT set to None| I
|
| class RoughPhysicsCfg(PresetCfg): | ||
| default = PhysxCfg(gpu_max_rigid_patch_count=10 * 2**15) | ||
| newton = NewtonCfg( | ||
| solver_cfg=MJWarpSolverCfg( | ||
| njmax=200, | ||
| nconmax=100, | ||
| cone="pyramidal", | ||
| impratio=1.0, | ||
| integrator="implicitfast", | ||
| use_mujoco_contacts=False, | ||
| ), | ||
| collision_cfg=NewtonCollisionPipelineCfg(max_triangle_pairs=2_500_000), | ||
| num_substeps=1, | ||
| debug_mode=False, | ||
| ) | ||
| physx = default | ||
|
|
There was a problem hiding this comment.
RoughPhysicsCfg duplicated across all 9 configs
The identical RoughPhysicsCfg class body is copy-pasted verbatim into every rough terrain config: anymal_b, anymal_c, anymal_d, a1, go1, go2, h1, g1, cassie, and digit. Changing Newton solver settings (e.g. raising njmax/nconmax for G1) will require editing 9+ files. Consider extracting this to a shared module (e.g. velocity_env_cfg.py or a new rough_physics_cfg.py) and importing it in each file.
scripts/test_rough_envs_newton.sh
Outdated
| if conda run -n env_isaaclab_develop python \ | ||
| "${REPO_ROOT}/${TRAIN_SCRIPT}" \ | ||
| --task "${TASK}" \ | ||
| --num_envs "${NUM_ENVS}" \ | ||
| --max_iterations "${MAX_ITERS}" \ | ||
| --headless \ | ||
| presets=newton \ | ||
| > "${LOG_FILE}" 2>&1; then |
There was a problem hiding this comment.
Hardcoded conda environment name
conda run -n env_isaaclab_develop hardcodes the author's local environment name. Anyone else running this script needs to edit the file. Consider reading the name from an environment variable with a fallback, or using ./isaaclab.sh -p (the project's standard wrapper) as the invocation instead of a bare python call.
Add RoughPhysicsCfg(PresetCfg) with Newton solver settings to all 9 non-Anymal-D rough terrain env configs. Remove broken StartupEventsCfg inheritance and move event customizations to __post_init__. Newton solver config (shared across all robots): - pyramidal friction cone, njmax=200, nconmax=100 - implicitfast integrator, max_triangle_pairs=2.5M - Newton collision pipeline (use_mujoco_contacts=False) Digit-specific fixes for MuJoCo ball-joint DoF inflation: - Split actuator config to explicit LEG+ARM joint names - Use exact body names in feet_slide reward Add batch test script (scripts/test_rough_envs_newton.sh) for running all rough envs sequentially in tmux.
The base_com event in velocity_env_cfg.py hardcodes body_names="base", but 6 rough envs use different base body names. Override per-env: - a1, go1: set body_names to "trunk" - digit: set body_names to "torso_base" - g1, h1, cassie: disable base_com (consistent with add_base_mass=None) Verified with 300-iter PhysX/Newton parity tests on all 7 working envs.
710e99d to
293cadc
Compare
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR enables all rough terrain environments on the Newton backend by (1) adding RoughPhysicsCfg(PresetCfg) with shared Newton solver settings to 10 robot env configs, (2) merging the StartupEventsCfg into the base EventsCfg, (3) fixing base_com body name mismatches for non-base robots, and (4) adding body velocity terminations for NaN/explosion detection. It also includes the XformPrimView → FrameView refactoring from PR #5179 (new BaseFrameView ABC, UsdFrameView, FabricFrameView, NewtonSiteFrameView), ray caster sensor spawner improvements, and a terrain border mesh fix for Newton compatibility.
The approach is sound — unifying events into the base class and using presets for backend-specific behavior is the right direction. However, several Newton-specific event customizations were silently removed during the refactoring, which constitutes a behavioral change that should be documented.
Design Assessment
Design is sound, with one structural concern. The move from per-robot *NewtonEventsCfg / *PhysxEventsCfg / *EventsCfg(PresetCfg) classes to a unified EventsCfg with preset() fields and __post_init__ overrides is a significant simplification. The FrameView abstraction hierarchy (BaseFrameView → UsdFrameView / FabricFrameView / NewtonSiteFrameView with FrameView factory) is well-designed. The main structural concern is RoughPhysicsCfg being copy-pasted identically across 10 files — this should be defined once and imported.
Findings
🟡 Warning: RoughPhysicsCfg is duplicated identically across 10 env config files — source/isaaclab_tasks/.../config/*/rough_env_cfg.py
All 10 robot env configs define an identical RoughPhysicsCfg(PresetCfg) class with the same Newton solver parameters (njmax=200, nconmax=100, pyramidal, implicitfast, max_triangle_pairs=2_500_000). If any Newton parameter needs tuning, all 10 files must be updated in sync. Consider defining this once in velocity_env_cfg.py (or a shared module) and importing it. Robot-specific overrides (e.g., if G1 eventually needs higher njmax for its NaN issue) can still be done via __post_init__.
🟡 Warning: Newton-specific event customizations were silently removed — Multiple env config files
The old per-robot *NewtonEventsCfg classes contained Newton-specific overrides that are no longer present in the new code:
push_robot = Nonewas set for all 6 non-Anymal robots (A1, Go1, Go2, G1, H1, Cassie) on Newton. Nowpush_robotis active on Newton for all robots.reset_robot_joints.params["position_range"] = (1.0, 1.0)constrained joint resets to exact default positions on Newton. This override is now removed.reset_base.paramszeroed out velocity randomization on Newton. This override is now removed.
These are intentional simplifications backed by the parity test results (6/7 envs show parity), but they change training dynamics on Newton. Recommend documenting these behavioral changes in the PR description so downstream users know the Newton event configuration has changed.
🟡 Warning: base_com randomization is newly enabled for several robots on PhysX — velocity_env_cfg.py:183
Previously, A1, Go1, Go2, Digit, Cassie, G1, and H1 all set base_com = None in their PhysX event configs (disabling COM randomization). The new code enables it for A1, Go1, Go2, and Digit (with corrected body names) and keeps it disabled for Cassie, G1, H1 (bipeds). This is a training-behavior change for the quadrupeds — the PR description mentions the anymal_c parity gap is related to this, but the change applies to other robots too. Worth noting in the PR description.
🟡 Warning: New collider_offsets startup event added to all rough terrain envs — velocity_env_cfg.py:199
A new randomize_rigid_body_collider_offsets startup event is added to the base EventsCfg with contact_offset_distribution_params=(0.01, 0.01). This event was not present in any of the old configurations. While the identical min/max suggests it applies a fixed offset (not truly random), this is a new event that affects all rough terrain environments. Should be mentioned in the PR description / changelog.
🔵 Suggestion: body_lin_vel_out_of_limit checks all bodies every step — velocity_env_cfg.py:298
The new body_lin_vel termination uses body_names=".*" which checks the linear velocity of every body in the articulation every simulation step. For robots with many bodies (e.g., Digit with 30+ bodies), this computes a norm per body per env. While 20 m/s is a reasonable threshold for catching NaN/explosions, consider whether checking only a subset of bodies (e.g., the base body) would be sufficient and more performant. The NaN check is the most valuable part — an Inf/NaN in any body would propagate to the base anyway.
🔵 Suggestion: Terrain border mesh generation could be pre-validated — source/isaaclab/isaaclab/terrains/terrain_generator.py
The new _add_terrain_border implementation using subdivided grid strips is a good fix for Newton collision detection with large triangles. However, when horizontal_scale is very small relative to border_width, this could generate a very large number of vertices ((border_width/hs + 1)^2 per strip). Consider adding a sanity check or logging a warning when the generated mesh exceeds a threshold vertex count.
Test Coverage
- New code: The FrameView refactoring has good test coverage with
test_frame_view_contract.py(359 lines), backend-specific tests for Newton and Fabric, and a ray caster regression test (test_raycaster_offset_does_not_affect_pos_w). - Env configs: The env config changes are tested via the author's parity test (7 envs × 2 backends × 300 iterations), documented in the PR description. No automated regression tests for env config behavioral changes, but this is consistent with the project's testing approach.
- Gaps: The removed Newton event customizations (push_robot, reset params) and new events (collider_offsets, body_lin_vel) lack unit tests, but these are config-level changes exercised by integration tests.
- Feature PR: Coverage adequate? Yes for the framework changes, Partial for the env config behavioral changes.
CI Status
Only a labeler check is visible and passing. No lint, build, or test CI results are shown — this may be expected for this repo's CI configuration or the checks may be running elsewhere.
Verdict
Minor fixes needed
The core approach — unifying events, adding Newton solver configs, and the FrameView abstraction — is well-executed with demonstrated parity results. The main concerns are:
- The 10× code duplication of
RoughPhysicsCfg(easy to fix by extracting to a shared module) - Silent behavioral changes from removed Newton event customizations should be documented
- New events (
collider_offsets,body_lin_vel) should be mentioned in the PR description
None of these are blocking, but addressing them would make the PR cleaner and help downstream users understand what changed. The parity test results provide confidence that the changes work correctly at a training level.
| class A1EventsCfg(PresetCfg): | ||
| default = A1PhysxEventsCfg() | ||
| newton = A1NewtonEventsCfg() | ||
| class RoughPhysicsCfg(PresetCfg): |
There was a problem hiding this comment.
🟡 Warning: Duplicated RoughPhysicsCfg — This exact class is copy-pasted identically across all 10 robot env configs. Consider defining it once in a shared module (e.g., velocity_env_cfg.py) and importing it. This would make future Newton solver parameter changes a single-file edit.
| newton=None, | ||
| ) | ||
|
|
||
| collider_offsets = EventTerm( |
There was a problem hiding this comment.
🟡 Warning: New event not present in old configs. collider_offsets is a new startup event added to all rough terrain environments. The identical min/max (0.01, 0.01) makes this a fixed offset rather than randomized. Was this intentional? If so, worth noting in the PR description as a behavioral change.
| func=mdp.illegal_contact, | ||
| params={"sensor_cfg": SceneEntityCfg("contact_forces", body_names="base"), "threshold": 1.0}, | ||
| ) | ||
| body_lin_vel = DoneTerm( |
There was a problem hiding this comment.
🔵 Suggestion: body_names=".*" checks every body in the articulation each step. For robots with many bodies this computes num_bodies × num_envs norms per step. Consider whether checking just the base body would be sufficient — NaN in any body will eventually propagate to the root state. The NaN detection is the most valuable part of this termination.
| rewards: RewardsCfg = RewardsCfg() | ||
| terminations: TerminationsCfg = TerminationsCfg() | ||
| events: EventsCfg = MISSING | ||
| events: EventsCfg = EventsCfg() |
There was a problem hiding this comment.
🔵 Note: Changed from events: EventsCfg = MISSING to events: EventsCfg = EventsCfg(). This means per-robot env configs no longer must provide events — they inherit the base class defaults. Good simplification, but means any robot that previously relied on MISSING to force explicit configuration will now silently get defaults.
Move duplicated RoughPhysicsCfg(PresetCfg) from 10 per-env configs into velocity_env_cfg.py. The base LocomotionVelocityRoughEnvCfg now sets sim: SimulationCfg = SimulationCfg(physics=RoughPhysicsCfg()) as default, eliminating 247 lines of duplication. Per-env configs inherit it.
Local testing script not needed in the final PR.
…es (isaac-sim#5251) # Description Adds fix for Cassie with ability to force newton to create joint drives ## Type of change - Bug fix (non-breaking change which fixes an issue) - New feature (non-breaking change which adds functionality) ## Checklist - [ ] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html) - [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format` - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file - [ ] I have added my name to the `CONTRIBUTORS.md` or my name already exists there
Go2 overflows the default njmax=200 constraint budget on rough terrain, causing solver failures. Increase to njmax=400 and nconmax=200 for this environment only.
Go1 ActuatorNetMLP benefits from armature=0.02 on Newton rough terrain. Add a Newton-only preset and a hydra test to prevent regression. Bump isaaclab_tasks to 1.5.22 with changelog entries for the Go1 armature, Go2 njmax override, and the planned action_scale preset.
Summary
Enable all rough terrain locomotion environments on the Newton physics backend.
Root Cause Identified
Newton's MuJoCo contact solver produces extreme forces on lightweight robots.
Contact spring time constant (
solref[0]=0.02) is very stiff — when lightweight feet (0.04 kg)strike rough terrain features, Newton generates forces up to 17,000 N on a 16 kg robot. These
forces flip the robot sideways within a single solver step. Joint velocity limits are also not
enforced as physics constraints (130 rad/s observed vs 21 rad/s configured limit).
This does NOT affect heavy robots (Anymal B/C/D at 30+ kg) where the same forces produce
manageable velocities.
Current Fix Direction: Contact Softening (
solref)Softening the MuJoCo contact spring (
solref=(0.05, 1.0)instead of default(0.02, 1.0))at original action scale shows promising results — Go2 ep_len improved from 11 to 276 at 300
iterations. Full 1500-iteration training in progress.
Note: Reducing
action_scalewas tested but rejected — the robot survives but cannot walkat meaningful speed.
Changes
RoughPhysicsCfgwith Newton solver settings (pyramidal cone, implicitfast integrator)solref/solimpparameters toMJWarpSolverCfgfor contact stiffness tuningarmature=0.02)StartupEventsCfgremoval from PR Supports Anymal-D Rough terrain training in newton #5225 for all envsbase_combody name mismatch for non-"base" robotsTraining Results
Dependencies
Test plan
🤖 Generated with Claude Code