AMP for Locomotion-Aliengo-Flat

Hi! First of all, thank you for this repo! I've been training the `Locomotion-Aliengo-Flat` task (and the `Locomotion-Go2-Flat` task) with the `AMP_PPO` algorithm. In general, `AMP_PPO` performs worse than pure `PPO`: as I gradually decrease `reward_scale` from 1.0 to 0.0, the rewards increase. I am using `IsaacLab 2.3.2`, `rsl-rl-lib 3.1.2`, and `amp-rsl-rl 1.2.0`.

For instance, keeping the PPO and environment hyperparameters the same as the repo's and adding the following discriminator config:
```
hidden_dims = [128, 128]
empirical_normalization = False
loss_type = "BCEWithLogits"
```
I get the following results, where performance improves as `reward_scale` decreases, that is, as the discriminator contributes less to the reward. The videos I generated with the `play_amp.py` script also look better at lower `reward_scale`.

<img width="1113" height="330" alt="Image" src="https://github.com/user-attachments/assets/9720a620-b22a-4fad-bf59-a2375d76562d" />

rew_scale=0.005:
<img src="https://github.com/user-attachments/assets/07aae1d5-db62-4140-ae56-b391da456fc2" width="400"/>
rew_scale=0.2:
<img src="https://github.com/user-attachments/assets/c9b4e8de-7ff6-45cf-9f54-ef22d961df3e" width="400"/>
rew_scale=0.5:
<img src="https://github.com/user-attachments/assets/de49044d-54cd-42e9-b4fc-d2d368b530ca" width="400"/>
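
To make the role of `reward_scale` concrete, here is a minimal sketch under stated assumptions. It is not taken from `amp-rsl-rl`, whose actual reward formula may differ. It assumes (1) the style reward is `-log(1 - sigmoid(logit))`, one common choice for a discriminator trained with a BCE-with-logits loss, and (2) the scaled style reward is simply added to the task reward. The function names `style_reward_from_logit`, `combined_reward`, and `style_share` are hypothetical.

```python
import math


def style_reward_from_logit(logit: float) -> float:
    """Assumed style reward: -log(1 - sigmoid(logit)), which equals softplus(logit)."""
    # Numerically stable softplus: max(x, 0) + log(1 + exp(-|x|)).
    return max(logit, 0.0) + math.log1p(math.exp(-abs(logit)))


def combined_reward(task_reward: float, logit: float, reward_scale: float) -> float:
    """Assumed additive mix of the task reward and the scaled style reward."""
    return task_reward + reward_scale * style_reward_from_logit(logit)


def style_share(task_reward: float, logit: float, reward_scale: float) -> float:
    """Fraction of the combined reward that comes from the discriminator term."""
    total = combined_reward(task_reward, logit, reward_scale)
    if total == 0.0:
        return 0.0
    return reward_scale * style_reward_from_logit(logit) / total


if __name__ == "__main__":
    # Illustrative inputs only: a task reward of 1.0 and an undecided discriminator (logit 0).
    for scale in (0.005, 0.2, 0.5):
        print(scale, style_share(task_reward=1.0, logit=0.0, reward_scale=scale))
```

Under these assumptions, a smaller `reward_scale` shrinks the discriminator's share of the per-step reward, so logging the task and style terms separately could help show which one dominates during training.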


I was wondering if you have any insights or tips on using `AMP_PPO` successfully, from debugging approaches to hyperparameter suggestions. Thank you so much for reading this and for your work!
