Action space mismatch across paper / dataset / code (20 vs 21 vs 24 vs 25) #43

@mxlin043

Hi, I’m confused by inconsistent action-space definitions:

  • Paper body (“Unified observation and action space”): 16 binary buttons + 4 joystick axes = 20 dims
  • Paper appendix (A.1): action chunk a ∈ R^(16×24) (24 dims per step)
  • Dataset parquet: 17 boolean button columns (includes guide) + (j_left, j_right) each with (x,y) = 21 dims
  • Code/checkpoint: model uses 25 dims = 21 button tokens + 4 joystick axes, where the 4 button tokens not present in the dataset are:
    RIGHT_UP, RIGHT_BOTTOM, RIGHT_LEFT, RIGHT_RIGHT
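
To make the arithmetic behind the four numbers explicit, here is a small tally. It only restates the counts listed above; the dictionary keys are labels made up for this sketch, and it assumes the 21 button tokens are the 17 dataset button columns plus the 4 RIGHT_* tokens.

```python
# Per-step action dimensions as listed above, one entry per source.
# The key names are hypothetical labels, not identifiers from the repo.
JOYSTICK_AXES = 4  # (j_left, j_right), each with (x, y)

# Extra button tokens seen in the code/checkpoint but not in the dataset.
extra_tokens = ["RIGHT_UP", "RIGHT_BOTTOM", "RIGHT_LEFT", "RIGHT_RIGHT"]

sources = {
    "paper_body": 16 + JOYSTICK_AXES,        # 16 binary buttons + 4 axes
    "paper_appendix_a1": 24,                 # a in R^(16x24): 24 dims per step
    "dataset_parquet": 17 + JOYSTICK_AXES,   # 17 boolean columns (incl. guide) + 4 axes
    "code_checkpoint": 17 + len(extra_tokens) + JOYSTICK_AXES,  # 21 tokens + 4 axes
}

for name, dims in sources.items():
    print(f"{name}: {dims}")
```

Under that reading, the dataset and the checkpoint differ exactly by the 4 RIGHT_* tokens, while the appendix figure of 24 does not match any of the other three counts.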

Questions:

  1. What is the canonical per-step action dim used by NitroGen (20/21/24/25)?
  2. How is the 21-dim dataset action mapped to the 25-dim model action? In particular, how is the right stick discretized into the RIGHT_* tokens (thresholds, dead zone)?
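
For concreteness, this is the kind of mapping question 2 is asking about, written as a guess rather than as NitroGen's actual behavior. Everything here is an assumption: the function name `to_model_action`, the 0.5 dead zone, the sign convention for "up", the token order, and the idea that the 4 continuous axes are kept alongside the RIGHT_* tokens.

```python
from typing import List, Tuple

# Token names taken from the issue; their order in the model is an assumption.
RIGHT_TOKENS = ["RIGHT_UP", "RIGHT_BOTTOM", "RIGHT_LEFT", "RIGHT_RIGHT"]


def to_model_action(
    buttons: List[bool],
    j_left: Tuple[float, float],
    j_right: Tuple[float, float],
    dead_zone: float = 0.5,       # hypothetical threshold, not from the repo
    y_up_positive: bool = True,   # hypothetical sign convention for the y axis
) -> List[float]:
    """Map a 21-dim dataset action to a 25-dim vector (hypothetical layout)."""
    if len(buttons) != 17:
        raise ValueError("expected 17 boolean button columns")
    lx, ly = j_left
    rx, ry = j_right

    # Discretize the right stick: a direction fires only outside the dead zone.
    if y_up_positive:
        up, bottom = ry > dead_zone, ry < -dead_zone
    else:
        up, bottom = ry < -dead_zone, ry > dead_zone
    left, right = rx < -dead_zone, rx > dead_zone

    button_tokens = [float(b) for b in buttons] + [
        float(t) for t in (up, bottom, left, right)
    ]
    # 21 button tokens followed by the 4 continuous joystick axes = 25 dims.
    return button_tokens + [lx, ly, rx, ry]


action = to_model_action([False] * 17, (0.0, 0.0), (0.9, -0.1))
print(len(action), dict(zip(RIGHT_TOKENS, action[17:21])))
```

If the real mapping differs (for example a different threshold, diagonal handling, or the right-stick axes being dropped once discretized), those are exactly the details the question is asking the maintainers to confirm.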
