Refactor train and play scripts for single entry point by StafaH · Pull Request #5554 · isaac-sim/IsaacLab

StafaH · 2026-05-08T21:09:30Z

Description

This PR refactors the reinforcement learning train/play scripts into unified entrypoints while preserving the existing library folder structure and adding a lightweight uv workflow for fresh source checkouts.

The main changes are:

Added unified RL entrypoints:
- scripts/reinforcement_learning/train.py --library <library>
- scripts/reinforcement_learning/play.py --library <library>
Added library-specific implementation files under the existing library folders, for example:
- scripts/reinforcement_learning/rsl_rl/train_rsl_rl.py
- scripts/reinforcement_learning/rsl_rl/play_rsl_rl.py
Kept the old per-library train.py and play.py scripts intact, with deprecation warnings and migration examples.
Added shared RL entrypoint utilities in scripts/reinforcement_learning/common.py.
Added direct Isaac Lab CLI commands:
- ./isaaclab.sh train --library <library> ...
- ./isaaclab.sh play --library <library> ...
Kept bare script aliases for ./isaaclab.sh -p train.py ... and ./isaaclab.sh -p play.py ....
Added Python package entry points so installed environments can run:
- train --library <library> ...
- play --library <library> ...
Added a root source-checkout pyproject.toml project so a fresh clone can run kitless Newton training with:
- uv run train --library rsl_rl --task Isaac-Cartpole-Direct-v0 presets=newton_mjwarp --num_envs 4096
Pinned the source-checkout uv environment to the same PyTorch family used by the Isaac Lab installer to avoid CUDA stack churn when users switch between uv run and ./isaaclab.sh.
Updated docs, tests, tools, and pretrained checkpoint helpers to use the unified train/play entrypoints.
Fixed CLI Python discovery so ./isaaclab.sh prefers an active or repo-local virtual environment before falling back to system Python.
Fixed ./isaaclab.sh --install in uv-created virtual environments that do not include the pip module by using uv pip for venv-targeted pip operations.

Motivation:

The previous RL scripts duplicated substantial train/play setup logic across libraries. This made behavior harder to keep consistent and increased maintenance cost when updating shared functionality. The new structure keeps library-specific logic in each library folder while centralizing shared dispatch and common helpers.

The uv workflow gives users a fast path from a fresh clone to kitless Newton training without manually creating an environment first. Isaac Sim / Kit workflows, including PhysX, continue to use the existing full installation path.

Dependencies:

No new required runtime dependencies are added to Isaac Lab packages. The root source-checkout pyproject.toml describes the local development environment used by uv run.

Fixes # N/A

Type of change

New feature (non-breaking change which adds functionality)
Documentation update

Screenshots

Not applicable.

Validation

Ran:

uv lock
uv lock --check
uv run --frozen train --help
./isaaclab.sh --install
./isaaclab.sh -p -m pytest source/isaaclab/test/cli/test_install.py -q
./isaaclab.sh -p -m pytest source/isaaclab/test/cli/test_install.py::TestGetPipCommand source/isaaclab/test/cli/test_install.py::TestExtractPythonExe -q
Help smoke tests for unified train/play entrypoints and ./isaaclab.sh -p train.py --help / ./isaaclab.sh -p play.py --help

Checklist

I have read and understood the contribution guidelines
I have run the pre-commit checks with ./isaaclab.sh --format
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have updated the changelog
I have added my name to the CONTRIBUTORS.md or my name already exists there

greptile-apps · 2026-05-08T21:17:39Z

Greptile Summary

This PR centralizes the previously duplicated RL training/play entrypoints across six libraries (rl_games, rlinf, rsl_rl, sb3, skrl) into unified train.py / play.py scripts under scripts/reinforcement_learning/. Library-specific logic moves to new train_<lib>.py / play_<lib>.py files, the old per-library scripts get deprecation warnings, and the isaaclab.sh -p CLI gains train.py / play.py shorthands.

Unified dispatch (common.py + dispatch_library_entrypoint): train scripts are loaded via import_local_module and call run(argv), while play scripts are executed with runpy.run_path because they still carry module-level argparse code. This asymmetry is functional but undocumented.
CLI alias in cli/utils.py maps the bare names \"train.py\" / \"play.py\" to absolute paths; pretrained_checkpoint.py constants are updated to use the new unified paths with --library <lib> args injected by WORKFLOW_TRAINER_ARGS / WORKFLOW_PLAYER_ARGS.

Confidence Score: 4/5

The refactor is primarily additive — existing per-library scripts remain functional with deprecation warnings and the new unified entry-points replicate original logic faithfully.

The core dispatch logic, pretrained-checkpoint constant updates, and CLI alias tests are all consistent. The heavy module-level imports in common.py and the undocumented run_as_script API contract are design-level concerns rather than functional regressions on the standard train/play workflows.

scripts/reinforcement_learning/common.py for module-level import overhead and undocumented run_as_script contract; source/isaaclab/isaaclab/cli/utils.py for the Path normalization edge case in the alias lookup.

Important Files Changed

Filename	Overview
scripts/reinforcement_learning/common.py	New shared helper module; imports heavy packages (gymnasium, isaaclab.envs, isaaclab_tasks) at module level, adding startup overhead even for --help or bare library-dispatch paths.
scripts/reinforcement_learning/train.py	Unified train entrypoint; delegates to library-specific run() functions via dispatch_library_entrypoint (run_as_script=False).
scripts/reinforcement_learning/play.py	Unified play entrypoint; dispatches via runpy.run_path (run_as_script=True) to preserve module-level argparse in each play script.
source/isaaclab/isaaclab/cli/utils.py	Adds _PYTHON_SCRIPT_ALIASES dict and resolves bare train.py/play.py to full paths; Path normalization can cause Path('./train.py') to match the alias unexpectedly.
scripts/reinforcement_learning/rsl_rl/train_rsl_rl.py	New library-specific train implementation; correctly exposes run(argv) and replicates all original rsl_rl/train.py functionality including external_callback, distributed, and IO descriptor support.
source/isaaclab_rl/isaaclab_rl/utils/pretrained_checkpoint.py	WORKFLOW_TRAINER/PLAYER updated to unified script paths; new WORKFLOW_TRAINER_ARGS/PLAYER_ARGS dicts correctly inject --library flag for each workflow.
source/isaaclab/test/cli/test_install.py	Well-structured tests for train.py/play.py shorthand resolution and explicit relative path passthrough; covers the key alias behaviors.

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as "isaaclab.sh -p"
    participant utils as "cli/utils.py"
    participant train as "train.py"
    participant common as "common.py"
    participant lib as "train_lib.py"

    User->>CLI: "./isaaclab.sh -p train.py --library rsl_rl --task X"
    CLI->>utils: "run_python_command(train.py, args)"
    utils->>utils: "_PYTHON_SCRIPT_ALIASES lookup -> absolute path"
    utils->>train: "python .../train.py --library rsl_rl --task X"
    train->>common: "dispatch_library_entrypoint(argv, entrypoints)"
    common->>common: "parse --library rsl_rl"
    common->>lib: "import_local_module(train_rsl_rl, path)"
    lib-->>common: "module loaded (top-level code runs)"
    common->>lib: "module.run(library_args)"
    lib->>lib: "_parse_args + launch_simulation + runner.learn()"
    lib-->>User: "training complete"
    Note over common,lib: "Play path uses runpy.run_path() instead of import+run()"

_{Reviews (1): Last reviewed commit: "Refactor play scripts and create alias" | Re-trigger Greptile}

greptile-apps · 2026-05-08T21:17:42Z

+from __future__ import annotations
+
+import argparse
+import importlib.util
+import logging
+import os
+import runpy
+import sys
+from pathlib import Path
+from types import ModuleType
+from typing import Any
+
+import gymnasium as gym
+
+from isaaclab.envs import DirectMARLEnvCfg, ManagerBasedRLEnvCfg
+from isaaclab.utils.dict import print_dict
+from isaaclab.utils.io import dump_yaml
+


Heavy package imports at module level inflate startup time

common.py imports gymnasium, isaaclab.envs (two env base classes), isaaclab.utils.dict, isaaclab.utils.io, and isaaclab_tasks.utils unconditionally at import time. Since train.py and play.py both do from common import dispatch_library_entrypoint, these imports fire even when the user just runs train.py --help (no --library selected) or when any test imports the dispatch helper. Only dispatch_library_entrypoint itself is needed for routing; the remaining helpers (create_isaaclab_env, wrap_record_video, etc.) are only called from individual library scripts. Moving those heavyweight imports to be lazy (inside the functions that need them) or splitting dispatch_library_entrypoint into its own thin module would keep the help/routing path fast.

greptile-apps · 2026-05-08T21:17:43Z

+def dispatch_library_entrypoint(
+    argv: list[str] | None,
+    entrypoints: dict[str, Path],
+    *,
+    action: str,
+    description: str,
+    library_help: str,
+    run_as_script: bool = False,
+) -> int:
+    """Dispatch a unified entrypoint to a library-specific implementation.
+
+    Args:
+        argv: Command-line arguments, excluding the script path.
+        entrypoints: Mapping from library name to implementation path.
+        action: Action name used to create a unique module name.
+        description: Top-level parser description.
+        library_help: Help text for the ``--library`` argument.
+        run_as_script: Whether to execute the selected implementation as a script.
+
+    Returns:
+        Process exit code.
+    """
+    if argv is None:
+        argv = sys.argv[1:]
+
+    parser = argparse.ArgumentParser(add_help=False)
+    parser.add_argument("--library", choices=sorted(entrypoints))
+    args_cli, library_args = parser.parse_known_args(argv)
+
+    if args_cli.library is None:
+        help_parser = argparse.ArgumentParser(description=description)
+        help_parser.add_argument("--library", choices=sorted(entrypoints), required=True, help=library_help)
+        help_parser.add_argument("args", nargs=argparse.REMAINDER, help="Arguments forwarded to the selected library.")
+        help_parser.print_help()
+        return 0 if "-h" in argv or "--help" in argv else 2
+
+    module_path = entrypoints[args_cli.library]
+    if run_as_script:
+        original_argv = sys.argv
+        original_path = list(sys.path)
+        try:
+            sys.argv = [str(module_path)] + library_args
+            sys.path.insert(0, str(module_path.parent))
+            runpy.run_path(str(module_path), run_name="__main__")
+        finally:
+            sys.argv = original_argv
+            sys.path[:] = original_path
+        return 0
+
+    module = import_local_module(f"isaaclab_rl_{action}_{args_cli.library}", module_path)
+    module.run(library_args)
+    return 0
+
+
+def add_common_train_args(
+    parser: argparse.ArgumentParser,
+    *,
+    agent_default: str | None,
+    agent_help: str,
+    include_agent: bool = True,
+    include_distributed: bool = True,
+) -> None:


Undocumented API contract between dispatch strategy and script structure

dispatch_library_entrypoint supports two incompatible modes keyed on run_as_script:

False (train): loads the target file as a module and calls module.run(argv) — the file must expose a run() function and must not parse sys.argv at module level.

True (play): uses runpy.run_path(..., run_name="__main__") and sets sys.argv itself — the file must not have a run() function and must parse sys.argv at module level.

Neither mode is validated at call time. If a new contributor creates a play script that exposes run() and uses run_as_script=True, the dispatch silently ignores run(). Conversely, if a new train script parses sys.argv at module level, it will read the wrong argv. Adding an assertion or a clear note in the docstring about the run(argv) requirement for run_as_script=False would prevent future breakage.

greptile-apps · 2026-05-08T21:17:44Z

+    else:
+        script_or_module = _PYTHON_SCRIPT_ALIASES.get(str(script_or_module), script_or_module)


Path("./train.py") silently matches the alias while "./train.py" string does not

str(Path("./train.py")) normalises to "train.py" on all platforms, so a caller passing Path("./train.py") would be redirected to the global script, whereas the string "./train.py" correctly passes through. The test covers the string form but not the Path form. Guarding on len(Path(_key).parts) == 1 would make the intent explicit and consistent.

Suggested change

else:

script_or_module = _PYTHON_SCRIPT_ALIASES.get(str(script_or_module), script_or_module)

else:

# Only resolve bare filenames (e.g. "train.py"), not explicit paths such as

# "./train.py" or "/abs/path/train.py". Using Path(x).parts ensures a leading

# "./" or directory component prevents a match.

_key = str(script_or_module)

if len(Path(_key).parts) == 1:

script_or_module = _PYTHON_SCRIPT_ALIASES.get(_key, script_or_module)

isaaclab-review-bot

Review Summary

This is a substantial quality-of-life improvement for the RL workflow. The unified train/play entrypoints with --library dispatch remove significant duplication, and the uv run path for fresh clones is a great developer experience improvement.

Strengths

Clean dispatch pattern via dispatch_library_entrypoint — extensible for new libraries
common.py consolidates duplicated arg definitions and env creation logic well
Deprecation warnings on old scripts with clear migration guidance
_PYTHON_SCRIPT_ALIASES is a nice touch for isaaclab.sh -p train.py shorthand
Python discovery improvements (active venv → repo-local → kit) are sensible prioritization
Root pyproject.toml with uv sources/overrides is well-structured

Concerns

See inline comments for specific issues. High-level:

The common.py relative import (from common import ...) requires the script's parent dir to be on sys.path — this works when run directly but may fail in certain packaging scenarios.
Hard-pinning torch==2.10.0 in the root pyproject.toml will need frequent updates.
The run_as_script path in dispatch_library_entrypoint mutates sys.argv and sys.path which could have side effects if the dispatched script doesn't clean up properly.

isaaclab-review-bot · 2026-05-09T02:47:17Z

+
+from pathlib import Path
+
+from common import dispatch_library_entrypoint


🟡 Relative import fragility: from common import dispatch_library_entrypoint relies on the script being run from its parent directory or with sys.path containing the scripts/reinforcement_learning/ directory. This works for ./isaaclab.sh -p and python scripts/... but would break if someone tries to import this module from elsewhere.

Consider using a relative path-based import pattern:

import sys from pathlib import Path sys.path.insert(0, str(Path(__file__).resolve().parent)) from common import dispatch_library_entrypoint

Or make common.py a proper package (scripts/reinforcement_learning/__init__.py).

isaaclab-review-bot · 2026-05-09T02:47:17Z

+        return 0 if "-h" in argv or "--help" in argv else 2
+
+    module_path = entrypoints[args_cli.library]
+    if run_as_script:


🟡 sys.argv / sys.path mutation in run_as_script: Mutating global state (sys.argv, sys.path) and relying on runpy.run_path is fragile. If the dispatched script raises an exception, the finally block restores state, but if it calls sys.exit(), the restore runs but the process is exiting anyway.

More concerning: runpy.run_path executes in the caller's process — any global state changes in the dispatched script persist. This is fine for train scripts that exit, but if used in a library context it could leak state.

Consider documenting this behavior or adding a comment that run_as_script=True is only safe for scripts that are expected to terminate the process.

isaaclab-review-bot · 2026-05-09T02:47:17Z

+    """
+    python_path = Path(python_exe)
+    return (python_path.parent.parent / "pyvenv.cfg").is_file()
+


🟡 Side effect: os.environ["UV_PYTHON"] set globally: Setting UV_PYTHON as an environment variable affects all subsequent subprocess.run calls that use uv. If get_pip_command is called multiple times with different python_exe values, only the last one wins globally.

Consider returning the env var as part of the command context rather than setting it globally:

# Return (cmd, env_override) tuple, or document the mutation

Or at minimum, document that this function has a global side effect.

isaaclab-review-bot · 2026-05-09T02:47:17Z


+def train(args: list[str] | None = None) -> None:
+    """Run the unified reinforcement learning training script."""
+    if args is None:


💡 CLI dispatch pattern: The if sys.argv[1] == "train" / "play" check before argparse means these sub-commands bypass the normal argparse help system. Running isaaclab.sh --help would show the epilog mentioning train and play, but isaaclab.sh train --help delegates directly to the train script.

This is fine for now, but if more commands are added, consider using argparse subparsers or a proper CLI framework (click, typer) for the top-level dispatch.

isaaclab-review-bot · 2026-05-09T02:47:17Z


 """Script to train RL agent with RL-Games."""

+import warnings


💡 Deprecation warning fires at import time: Because warnings.warn(...) is at module level, it fires even if someone imports a function from this module without intending to run it as a script. This is probably fine since the old scripts are meant to be run, not imported — but worth noting for anything that does from scripts.reinforcement_learning.rl_games.train import main.

isaaclab-review-bot

🤖 Isaac Lab Review Bot

Summary

Well-structured refactoring that unifies RL train/play scripts behind a single dispatch entry point (train.py --library <lib> / play.py --library <lib>). The architecture is sound — clean separation between shared utilities (common.py), unified dispatchers, and library-specific implementations. Old scripts are preserved with deprecation warnings for backwards compatibility. A few design concerns around import fragility, namespace collision, and test coverage should be addressed.

Design Assessment

Acceptable, but import resolution relies on an implicit invariant. The unified dispatch pattern is the right approach. The from common import ... in library-specific implementations works because Python adds the script's directory to sys.path[0] when running train.py as a subprocess — but this invariant isn't explicitly enforced by import_local_module() or dispatch_library_entrypoint(). If these modules are ever imported from a different context (tests, REPL, another script), they'll fail with ModuleNotFoundError.

Findings

🟡 Warning: Generic train/play console_script names risk collision — source/isaaclab/setup.py

The entry points "train=isaaclab.cli:train" and "play=isaaclab.cli:play" are extremely generic names that could conflict with other ML packages (fairseq, lightning, etc.). Since isaaclab train / isaaclab play already works via the CLI dispatcher, the bare train/play entry points are redundant. Consider namespacing them to isaaclab-train/isaaclab-play, or removing them entirely.

🟡 Warning: Import resolution relies on implicit sys.path[0] invariant — scripts/reinforcement_learning/common.py

When dispatch_library_entrypoint() calls import_local_module() to load a library module (e.g., train_rsl_rl.py), the from common import ... statement in that module resolves ONLY because scripts/reinforcement_learning/ is already on sys.path[0] (set by Python when train.py runs as a subprocess). This is fragile — if dispatch_library_entrypoint is ever called from a different context (tests, REPL), it fails. Consider adding explicit path management:

rl_scripts_dir = str(Path(__file__).resolve().parent)
if rl_scripts_dir not in sys.path:
    sys.path.insert(0, rl_scripts_dir)

🟡 Warning: common.py has no unit tests — 12+ shared utility functions (dispatch_library_entrypoint, import_local_module, apply_env_overrides, validate_distributed_device, etc.) have zero direct test coverage. These are the foundation used by all 5 library implementations. A lightweight test_common.py with edge-case tests (missing --library, invalid module path, CPU+distributed rejection) would significantly reduce regression risk.

🔵 Suggestion: env.close() skipped on KeyboardInterrupt — scripts/reinforcement_learning/rsl_rl/train_rsl_rl.py

The except KeyboardInterrupt: pass at the end of training skips env.close(). Consider using finally to ensure cleanup (logging final metrics, releasing GPU resources):

try:
    runner.learn(...)
    print(f"Training time: ...")
except KeyboardInterrupt:
    pass
finally:
    env.close()

Test Coverage

CLI wiring (alias resolution, command dispatch): ✅ Well-tested with mocks
common.py utility functions (12+ functions): ❌ No unit tests
Dispatch mechanism edge cases: ❌ Not directly tested
Deprecated script backwards compatibility: ❌ Not tested
Integration tests: ⚠️ Present but currently skipped

CI Status

Build Wheel: ✅ pass
Most checks: ⏳ pending

Verdict

Minor fixes needed

The architecture and implementation are solid for all standard execution paths. The dispatch pattern is clean, backwards compatibility is preserved, and the code is well-organized. Address the namespace collision concern (train/play console_scripts) and consider adding explicit sys.path management for robustness. Unit tests for common.py would be a valuable addition but aren't strictly blocking.

StafaH added 2 commits May 8, 2026 13:42

Refactor train and play scripts

be0b208

Refactor play scripts and create alias

a3e7889

StafaH requested review from Mayankm96, hhansen-bdai, jtigue-bdai, kellyguo11 and ooctipus as code owners May 8, 2026 21:09

github-actions Bot added documentation Improvements or additions to documentation isaac-lab Related to Isaac Lab team infrastructure labels May 8, 2026

greptile-apps Bot reviewed May 8, 2026

View reviewed changes

StafaH changed the title ~~Train script~~ Refactor train and play scripts for single entry point May 8, 2026

StafaH added 2 commits May 8, 2026 15:35

Refactor to use ./isaaclab train

fe6582c

Add uv run train entrypoint

05ef839

isaaclab-review-bot Bot reviewed May 9, 2026

View reviewed changes

Merge branch 'develop' into train_script

efa81f7

isaaclab-review-bot Bot reviewed May 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor train and play scripts for single entry point#5554

Refactor train and play scripts for single entry point#5554
StafaH wants to merge 5 commits intoisaac-sim:developfrom
StafaH:train_script

StafaH commented May 8, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 8, 2026

Uh oh!

greptile-apps Bot May 8, 2026

Uh oh!

greptile-apps Bot May 8, 2026

Uh oh!

greptile-apps Bot May 8, 2026

Uh oh!

isaaclab-review-bot Bot left a comment

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Uh oh!

isaaclab-review-bot Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		else:
		script_or_module = _PYTHON_SCRIPT_ALIASES.get(str(script_or_module), script_or_module)

-    else:
-        script_or_module = _PYTHON_SCRIPT_ALIASES.get(str(script_or_module), script_or_module)
+    else:
+        # Only resolve bare filenames (e.g. "train.py"), not explicit paths such as
+        # "./train.py" or "/abs/path/train.py".  Using Path(x).parts ensures a leading
+        # "./" or directory component prevents a match.
+        _key = str(script_or_module)
+        if len(Path(_key).parts) == 1:
+            script_or_module = _PYTHON_SCRIPT_ALIASES.get(_key, script_or_module)


		from pathlib import Path

		from common import dispatch_library_entrypoint


		"""Script to train RL agent with RL-Games."""

		import warnings

Conversation

StafaH commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Screenshots

Validation

Checklist

Uh oh!

greptile-apps Bot commented May 8, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

Review Summary

Strengths

Concerns

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

🤖 Isaac Lab Review Bot

Summary

Design Assessment

Findings

Test Coverage

CI Status

Verdict

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

StafaH commented May 8, 2026 •

edited

Loading