
Commit e1b7bd6

Bump G1 max_iterations to 5000 on Newton for rough-terrain parity
PhysX G1 saturates near iter 3000: reward ≈ +18, ep_len ≈ 980. Past iter 3000 PhysX does not meaningfully improve on either metric — reward oscillates between +16 and +19 through iter 7500, and ep_len stays flat. Newton vanilla reaches matching (reward, ep_len) = (+16, 984) at iter 5000 and equals or exceeds PhysX by iter 6000 (+18.9 / 996). The gap is sample efficiency, not a ceiling. Ablations (armature 0.01/0.03, damping 5→20, finger removal from the action space, Newton upstream a27277) did not change the shape of Newton's curve. Use the framework preset on max_iterations rather than tuning physics or reward terms, keeping the env config engine-agnostic. Precedent: Allegro Hand (5000), Spot (20000).
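The parity criterion above (match the PhysX plateau on both reward and episode length, not reward alone) can be sketched as a small check. The helper name and tolerance are illustrative; only the Newton data points ((+16, 984) at iter 5000, (+18.9, 996) at iter 6000) and the PhysX plateau (≈ +18, ≈ 980) come from this commit message:

```python
def reaches_parity(curve, plateau_reward=18.0, plateau_ep_len=980, tol=2.0):
    # curve: list of (iteration, mean_reward, mean_episode_length) tuples.
    # Parity means matching the PhysX plateau on BOTH metrics: reward alone
    # oscillates, so episode length is needed to confirm the robot is stable.
    for it, reward, ep_len in curve:
        if reward >= plateau_reward - tol and ep_len >= plateau_ep_len:
            return it
    return None

# Newton data points quoted in the commit message:
newton = [(5000, 16.0, 984), (6000, 18.9, 996)]
print(reaches_parity(newton))  # -> 5000
```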
1 parent b6fd87d commit e1b7bd6

3 files changed

Lines changed: 25 additions & 2 deletions


source/isaaclab_tasks/config/extension.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 [package]
 
 # Note: Semantic Versioning is used: https://semver.org/
-version = "1.5.26"
+version = "1.5.27"
 
 # Description
 title = "Isaac Lab Environments"

source/isaaclab_tasks/docs/CHANGELOG.rst

Lines changed: 16 additions & 0 deletions
@@ -1,6 +1,22 @@
 Changelog
 ---------
 
+1.5.27 (2026-04-24)
+~~~~~~~~~~~~~~~~~~~
+
+Added
+^^^^^
+
+* Added Newton rough terrain support for the G1 biped locomotion velocity
+  env. The only engine-specific change is a ~1.7x ``max_iterations`` preset on
+  :class:`~isaaclab_tasks.manager_based.locomotion.velocity.config.g1.agents.rsl_rl_ppo_cfg.G1RoughPPORunnerCfg`
+  (Newton = 5000, PhysX = 3000). PhysX saturates near iter 3000 on both
+  reward (≈ +18) and episode length (≈ 980) and does not meaningfully
+  improve further; Newton reaches the same (reward, ep_len) quality at
+  iter 5000. The iteration budget is bumped rather than tuning physics
+  or reward terms.
+
+
 1.5.26 (2026-04-24)
 ~~~~~~~~~~~~~~~~~~~

source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/velocity/config/g1/agents/rsl_rl_ppo_cfg.py

Lines changed: 8 additions & 1 deletion
@@ -6,12 +6,19 @@
 from isaaclab.utils import configclass
 
 from isaaclab_rl.rsl_rl import RslRlOnPolicyRunnerCfg, RslRlPpoActorCriticCfg, RslRlPpoAlgorithmCfg
+from isaaclab_tasks.utils import preset
 
 
 @configclass
 class G1RoughPPORunnerCfg(RslRlOnPolicyRunnerCfg):
     num_steps_per_env = 24
-    max_iterations = 3000
+    # Newton needs ~1.7x the PPO iterations to match PhysX on G1. PhysX saturates near iter 3000
+    # (reward ≈ +18, ep_len ≈ 980) and does not meaningfully improve on either metric past that —
+    # reward oscillates +16 to +19 through iter 7500, ep_len stays flat. Newton reaches the same
+    # (reward, ep_len) quality at iter 5000 (+16 / 984). Comparing reward alone is misleading:
+    # ep_len confirms the robot is stable in both cases. The gap is sample-efficiency, not a
+    # ceiling — no physics or reward tuning closes it.
+    max_iterations = preset(default=3000, newton=5000)
     save_interval = 50
     experiment_name = "g1_rough"
     policy = RslRlPpoActorCriticCfg(
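The ``preset`` helper keeps the env config engine-agnostic by resolving the value per physics engine at config time. Its real implementation in ``isaaclab_tasks.utils`` is not shown in this diff; below is a minimal sketch, assuming the active engine is advertised through an environment variable (the ``PHYSICS_ENGINE`` name is hypothetical — only the call signature ``preset(default=..., newton=...)`` comes from the diff):

```python
import os

def preset(default, **per_engine):
    """Resolve a config value by the active physics engine (sketch only).

    Assumes the engine name is in the hypothetical PHYSICS_ENGINE env var;
    unknown or unset engines fall back to `default`.
    """
    engine = os.environ.get("PHYSICS_ENGINE", "").lower()
    return per_engine.get(engine, default)

# Mirrors the config change: PhysX (the default) trains 3000 iterations,
# Newton gets the larger 5000-iteration budget.
os.environ["PHYSICS_ENGINE"] = "newton"
print(preset(default=3000, newton=5000))  # -> 5000
```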
