Add OSMO AMR Navigation test case #1018
Adds a complete NVIDIA OSMO test case for warehouse AMR (Autonomous Mobile Robot) navigation using a MobilityGen-style pipeline on Amazon EKS. The pipeline includes 6 stages: scene setup, occupancy mapping, trajectory generation, multi-modal rendering (parallel RGB/depth/segmentation), domain augmentation, and X-Mobility foundation model training. Uses OSMO DAG orchestration, KAI Scheduler, and heterogeneous compute (G-series for rendering, P-series for training).
bluecrayon52
left a comment
I'm getting a page not found error in the NVIDIA OSMO installation guide link provided in the AMRNavigation/kubernetes/README.md.
It looks like NVIDIA provides an AWS Infra with Terraform guide. Did you use something like this to deploy the EKS cluster?
I'm ok with the Terraform infrastructure being outside of the scope of the test case and externally referenceable to a stack that Nvidia maintains.
Does it make sense to also include:
- Links or Helm commands for installing the in-cluster prerequisites (GPU operator, KAI scheduler, Karpenter, OSMO)
- YAML manifests for the required Karpenter NodePools that the 4.verify-osmo.sh script checks for (osmo-rendering, osmo-gpu-training, osmo-cpu-batch, osmo-cpu-system), with instructions on how to apply them
- Instructions for loading the nvidia/X-Mobility dataset from HuggingFace to S3
If you plan to cut another PR to provide architecture setup automations, then perhaps we can link to those resources in the prerequisites section of this test case?
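As an illustration of the NodePool prerequisite above, here is a minimal Python sketch that reports which of the four required NodePools are missing. It assumes the input is the text printed by `kubectl get nodepools -o name`; the function name `missing_nodepools` is hypothetical, and this is not necessarily how 4.verify-osmo.sh performs its check.

```python
# Names taken from the list of NodePools the verify script is said to check for.
REQUIRED_NODEPOOLS = (
    "osmo-rendering",
    "osmo-gpu-training",
    "osmo-cpu-batch",
    "osmo-cpu-system",
)


def missing_nodepools(kubectl_output, required=REQUIRED_NODEPOOLS):
    """Return the required NodePool names absent from `kubectl ... -o name` output.

    Each non-empty line is expected to look like "<resource>/<name>";
    only the part after the last "/" is compared.
    """
    present = {
        line.strip().rsplit("/", 1)[-1]
        for line in kubectl_output.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in present]


if __name__ == "__main__":
    sample = "nodepool.karpenter.sh/osmo-rendering\nnodepool.karpenter.sh/osmo-cpu-system\n"
    print(missing_nodepools(sample))
```

A prerequisites section could pair a check like this with the manifests themselves, so a failed verification points directly at the NodePool to apply.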
| args: | ||
| - | | ||
| /scripts/stage6_train_evaluate.py \ | ||
| --data_dir {{output}}/data \ |
| --data_dir {{output}}/data \ | |
| --dataset_dir {{output}}/data \ |
| ### OSMO Control Plane Prerequisites | ||
| The pipeline requires OSMO control plane components. Follow the [NVIDIA OSMO installation guide](https://docs.nvidia.com/osmo/) to deploy OSMO on your EKS cluster, then verify readiness: |
| The pipeline requires OSMO control plane components. Follow the [NVIDIA OSMO installation guide](https://docs.nvidia.com/osmo/) to deploy OSMO on your EKS cluster, then verify readiness: | |
| The pipeline requires OSMO control plane components. Follow the [NVIDIA OSMO installation guide](https://nvidia.github.io/OSMO/main/deployment_guide/index.html) to deploy OSMO on your EKS cluster, then verify readiness: |
yoosful
left a comment
Reproduced on EKS (OSMO v6.2.10, L4 GPU); a few issues inline.
| --scene_dir {{input:0}}/scene \ | ||
| --trajectory_dir {{input:1}}/trajectories \ | ||
| --output_dir {{output}}/rgb \ | ||
| --render_mode rgb \ |
stage4_render.py has no --render_mode arg. This task fails with unrecognized arguments: --render_mode rgb. Same for render-depth and render-segmentation.
The script uses BasicWriter which emits rgb+depth+seg in one pass, so splitting into 3 tasks re-renders the same frames 3x for identical output. Either add real --render_mode filtering to the script, or collapse Group 4 to a single render task.
Minimal unblocker (just removes the unknown arg; doesn't fix the 3x re-render):
| --render_mode rgb \ |
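If the three-task split is kept, the "add real --render_mode filtering" option could look roughly like the sketch below. This is not the code in stage4_render.py: `build_parser`, `enabled_outputs`, and the output names are hypothetical, and mapping the resulting flags onto the actual writer keyword arguments is left to the script.

```python
import argparse

# Hypothetical output names; the real script would map these to writer kwargs.
ALL_MODES = ("rgb", "depth", "segmentation")


def build_parser():
    """Parser that accepts --render_mode instead of rejecting it as unknown."""
    parser = argparse.ArgumentParser(description="Render stage (sketch)")
    parser.add_argument("--output_dir", required=True)
    parser.add_argument(
        "--render_mode",
        choices=ALL_MODES + ("all",),
        default="all",  # keeps the single-pass behavior when the flag is omitted
    )
    return parser


def enabled_outputs(render_mode):
    """Return a flag per output, enabling only the requested one (or all)."""
    if render_mode == "all":
        return {mode: True for mode in ALL_MODES}
    return {mode: mode == render_mode for mode in ALL_MODES}


if __name__ == "__main__":
    args = build_parser().parse_args(["--output_dir", "/tmp/rgb", "--render_mode", "rgb"])
    print(enabled_outputs(args.render_mode))
```

With a default of `all`, collapsing Group 4 to a single render task remains possible without further script changes.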
| args: | ||
| - | | ||
| /isaac-sim/scripts/stage1_scene_setup.py \ | ||
| --output_dir {{output}}/scene \ |
--s3_bucket / --run_id missing on every stage, but all stage*.py scripts declare them required=True. Workflow argparse-fails on stage 1.
| --output_dir {{output}}/scene \ | |
| --s3_bucket ${S3_BUCKET} \ | |
| --run_id ${RUN_ID} \ | |
| --output_dir {{output}}/scene \ |
Same missing args on stages 2-5; apply the pattern there too.
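A small pre-submit check can catch this class of failure before the workflow reaches the cluster. The sketch below scans a task's command string for the two required flags; `missing_flags` is a hypothetical helper, not part of the PR, and it only inspects the text of the command (it does not import the stage scripts or expand `${...}` or `{{...}}` placeholders).

```python
import shlex

# Flags the stage scripts are reported to declare with required=True.
REQUIRED_FLAGS = ("--s3_bucket", "--run_id")


def missing_flags(command, required=REQUIRED_FLAGS):
    """Return required flags that do not appear in a multi-line command string."""
    # Join backslash-continued lines so each flag becomes its own token.
    flattened = command.replace("\\\n", " ")
    tokens = shlex.split(flattened)
    # Accept both "--flag value" and "--flag=value" spellings.
    present = {tok.split("=", 1)[0] for tok in tokens if tok.startswith("--")}
    return [flag for flag in required if flag not in present]


if __name__ == "__main__":
    before = "/isaac-sim/scripts/stage1_scene_setup.py \\\n  --output_dir {{output}}/scene"
    print(missing_flags(before))
```

Running this over every task's args in the workflow file would flag all affected stages at once rather than one failed submission at a time.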
| writer.initialize( | ||
| output_dir=output_dir, | ||
| rgb=True, | ||
| depth=True, | ||
| semantic_segmentation=True, | ||
| ) |
On Replicator 1.11, CosmosWriter takes output_dir only. rgb / depth / semantic_segmentation kwargs raise TypeError. The except Exception silently falls back to BasicWriter every run, so this branch is dead. Either fix the call or drop the Cosmos path.
| writer.initialize( | |
| output_dir=output_dir, | |
| rgb=True, | |
| depth=True, | |
| semantic_segmentation=True, | |
| ) | |
| writer.initialize(output_dir=output_dir) |
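Independent of which call signature is used, the silent fallback is what hid this bug. The sketch below shows one way to keep a fallback while making it visible in the logs. The `Fake*` classes are stand-ins written for this example and are not the Replicator API; `init_writer` is a hypothetical helper.

```python
import logging

logger = logging.getLogger("stage4_render")


class FakeCosmosWriter:
    """Stand-in that accepts output_dir only, as described in the comment above."""

    def initialize(self, output_dir):
        self.output_dir = output_dir


class FakeUnavailableWriter:
    """Stand-in for a primary writer that cannot be initialized."""

    def initialize(self, output_dir):
        raise RuntimeError("writer not available")


class FakeBasicWriter:
    """Stand-in fallback writer that takes per-output flags."""

    def initialize(self, output_dir, rgb=False, depth=False, semantic_segmentation=False):
        self.output_dir = output_dir


def init_writer(output_dir, make_primary, make_fallback):
    """Try the primary writer; on failure, log the traceback and fall back."""
    try:
        writer = make_primary()
        writer.initialize(output_dir=output_dir)
        return writer, "primary"
    except Exception:
        # Logging the exception keeps a TypeError from a bad call visible.
        logger.exception("primary writer failed, falling back")
        writer = make_fallback()
        writer.initialize(
            output_dir=output_dir, rgb=True, depth=True, semantic_segmentation=True
        )
        return writer, "fallback"


if __name__ == "__main__":
    print(init_writer("/tmp/out", FakeCosmosWriter, FakeBasicWriter)[1])
```

Returning which path was taken also lets a test assert that the primary branch is actually exercised.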
| --input_dir {{input:0}}/rgb \ | ||
| --output_dir {{output}}/augmented |
stage5_domain_augment.py reads raw-v1/ and writes augmented-v2/ (via make_stage_path), not rgb/ / augmented/. README's S3 layout example has the same mismatch.
| --input_dir {{input:0}}/rgb \ | |
| --output_dir {{output}}/augmented | |
| --input_dir {{input:0}}/raw-v1 \ | |
| --output_dir {{output}}/augmented-v2 |
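Mismatches like this tend to recur when directory names live in three places (script, workflow YAML, README). One option is a single table of names that all three are generated from or checked against. The sketch below is hypothetical: `STAGE_DIRS` and `stage_dir` are illustrative names and do not reproduce the PR's make_stage_path; only the two directory names come from the comment above.

```python
# Directory names as reported in the review; the keys are illustrative.
STAGE_DIRS = {
    "raw": "raw-v1",
    "augmented": "augmented-v2",
}


def stage_dir(root, key):
    """Join a root (placeholder or path) with the directory name for `key`."""
    return f"{root.rstrip('/')}/{STAGE_DIRS[key]}"


if __name__ == "__main__":
    print(stage_dir("{{input:0}}", "raw"))
    print(stage_dir("{{output}}", "augmented"))
```

A unit test comparing these values against the paths in the workflow file and the README's S3 layout would catch the drift early.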
Summary
3.test_cases/osmo/AMRNavigation/ for warehouse AMR (Autonomous Mobile Robot) navigation synthetic data generation and training
Contents
src/): Full Python implementation for each stage
kubernetes/): setup, build, verify, submit
Key OSMO features demonstrated
inputs:
Test plan