✨ add step-wise intermediate rewards #670

Open

Opened on May 11, 2026

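The RL agent currently learns only from terminal rewards, so it receives no feedback until an episode ends. Step-wise intermediate rewards give it a learning signal after each action, which should lead to more efficient policies.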

Mostly implemented in #526
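
To make the difference concrete, here is a minimal sketch of one common way to derive step-wise rewards: reward each action by the change it causes in some quality score. This is not taken from #526; the function name `stepwise_rewards`, the `scores` input, and the choice of "final score minus initial score" as the terminal reward are all assumptions for illustration.

```python
from typing import Sequence


def stepwise_rewards(scores: Sequence[float], terminal_only: bool = False) -> list[float]:
    """Return one reward per action for a trajectory of state scores.

    `scores[i]` is a hypothetical quality metric of the state reached after
    i actions, so a trajectory with n actions has n + 1 scores.
    """
    if len(scores) < 2:
        # No action was taken, so there is nothing to reward.
        return []
    n_steps = len(scores) - 1
    if terminal_only:
        # Assumed stand-in for the current behaviour: zero reward on every
        # step except the last, which carries the whole improvement.
        return [0.0] * (n_steps - 1) + [scores[-1] - scores[0]]
    # Step-wise variant: each action is rewarded by the improvement it caused.
    return [scores[i + 1] - scores[i] for i in range(n_steps)]


if __name__ == "__main__":
    trajectory = [0.0, 0.25, 0.5, 1.0]
    print(stepwise_rewards(trajectory, terminal_only=True))
    print(stepwise_rewards(trajectory))
```

In this sketch the per-step differences telescope, so the undiscounted return of an episode is the same in both modes; only the timing of the feedback changes.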

Metadata

Assignees

Shaobo-Zhou

Labels

No labels
