Commit ccbd6f0
Bump G1 max_iterations to 6000 on Newton for rough-terrain parity
Newton converges ~2x slower than PhysX on G1 rough terrain. Both backends reach the same reward plateau (r ≈ +17 to +19): PhysX near iteration 3000, Newton near iteration 6000. Ablations (armature 0.01/0.03, damping tuning, Newton a27277 upstream) did not close the gap; the cost is sample efficiency, not a reward ceiling. Bump max_iterations via the framework preset rather than tuning physics or reward terms, keeping the env config engine-agnostic. Precedent: Allegro Hand (5000), Spot (20000).
1 parent a2998e5 commit ccbd6f0
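
For context, a minimal sketch of how an engine-keyed preset helper like the ``preset`` imported from isaaclab_tasks.utils in the diff below could work. Only the call signature preset(default=3000, newton=6000) comes from this commit; the resolution mechanism (an ISAACLAB_PHYSICS_ENGINE environment variable) is an assumption for illustration, not Isaac Lab's actual implementation.

    # Hypothetical sketch of an engine-keyed preset resolver. Only the call
    # signature preset(default=3000, newton=6000) is taken from this commit;
    # the ISAACLAB_PHYSICS_ENGINE environment variable is an assumption.
    import os
    from typing import TypeVar

    T = TypeVar("T")


    def preset(default: T, **per_engine: T) -> T:
        """Return the value keyed to the active physics engine, else the default.

        The active engine is read from an assumed environment variable here;
        the real helper would more likely query the simulation context.
        """
        engine = os.environ.get("ISAACLAB_PHYSICS_ENGINE", "physx").lower()
        return per_engine.get(engine, default)

Resolving at the config layer is what keeps the env config engine-agnostic: only the training budget varies per backend, and every other value falls through to the shared default.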

3 files changed

Lines changed: 19 additions & 2 deletions


source/isaaclab_tasks/config/extension.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 [package]

 # Note: Semantic Versioning is used: https://semver.org/
-version = "1.5.23"
+version = "1.5.24"

 # Description
 title = "Isaac Lab Environments"

source/isaaclab_tasks/docs/CHANGELOG.rst

Lines changed: 13 additions & 0 deletions
@@ -1,6 +1,19 @@
 Changelog
 ---------

+1.5.24 (2026-04-22)
+~~~~~~~~~~~~~~~~~~~
+
+Added
+^^^^^
+
+* Added Newton rough terrain support for the G1 biped locomotion velocity
+  env. The only engine-specific change is a 2x ``max_iterations`` preset on
+  :class:`~isaaclab_tasks.manager_based.locomotion.velocity.config.g1.agents.rsl_rl_ppo_cfg.G1RoughPPORunnerCfg`
+  (Newton = 6000, PhysX = 3000). Both backends converge to the same reward
+  plateau (r≈+17-19); Newton is phase-shifted ~2000 iter later, so the
+  iteration budget is bumped rather than tuning physics or reward terms.
+
 1.5.23 (2026-04-22)
 ~~~~~~~~~~~~~~~~~~~


source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/velocity/config/g1/agents/rsl_rl_ppo_cfg.py

Lines changed: 5 additions & 1 deletion
@@ -6,12 +6,16 @@
 from isaaclab.utils import configclass

 from isaaclab_rl.rsl_rl import RslRlOnPolicyRunnerCfg, RslRlPpoActorCriticCfg, RslRlPpoAlgorithmCfg
+from isaaclab_tasks.utils import preset


 @configclass
 class G1RoughPPORunnerCfg(RslRlOnPolicyRunnerCfg):
     num_steps_per_env = 24
-    max_iterations = 3000
+    # Newton needs ~2x the PPO iterations to reach the same reward plateau as PhysX on G1.
+    # Both backends converge to r≈+17-19; PhysX hits it near iter 3000, Newton near iter 6000.
+    # The gap is sample-efficiency, not a ceiling — no physics or reward tuning closes it.
+    max_iterations = preset(default=3000, newton=6000)
     save_interval = 50
     experiment_name = "g1_rough"
     policy = RslRlPpoActorCriticCfg(
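
A quick usage check exercising the hypothetical preset() from the sketch near the top of this page under both backends; the engine names are the preset keys, and only the 3000/6000 values come from this commit.

    import os

    # Exercise the hypothetical preset() defined in the sketch above.
    for engine in ("physx", "newton"):
        os.environ["ISAACLAB_PHYSICS_ENGINE"] = engine
        print(engine, preset(default=3000, newton=6000))
    # physx 3000
    # newton 6000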
