Deploy trained OpenPi policies on ARX-X5 dual-arm robots (RealSense cameras, ROS2). Inference runs by sending observations to a policy server (on a GPU host) and applying the returned action chunks on the robot, with optional temporal smoothing [2] or RTC (real-time chunking) [1].
Follow the official ARX-X5 setup so arms and cameras work correctly:
- ARX_X5 (ARX Robotics)
Clone the repo and follow its docs for:
- ROS2 workspace (e.g. ROS2/X5_ws), building packages
- Arm control and feedback topics
- RealSense camera setup
After the ARX-X5 and robot hardware are set up, install the Python 3.10 inference environment (below) and build the bimanual package in this repo.
On the machine that runs the inference client (IPC or same host as ROS2), use a dedicated Python 3.10 environment.
conda create -n kai0_inference python=3.10
conda activate kai0_inference
Install PyTorch (if needed for local use), OpenCV, NumPy, and other deps. From the repository root you can reuse the Agilex IPC requirements if present:
pip install -r train_deploy_alignment/inference/agilex/requirements_inference_ipc.txt
(Or install opencv-python, numpy, pyrealsense2, etc. as needed.)
From the repository root:
cd packages/openpi-client
pip install -e .
cd ../..
The inference scripts use openpi_client to talk to the policy server.
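The client-side flow (send an observation, receive an action chunk, apply the actions one by one) can be sketched as follows. This is a minimal illustration, not the code in the inference scripts: the `infer` method returning a dict with an `"actions"` key is an assumed interface, `StubPolicy` is a hypothetical stand-in for the remote server so the example runs offline, and the observation key `"state"` and the action dimension of 14 are placeholders.

```python
from typing import Any, Callable, Dict, List


class StubPolicy:
    """Hypothetical stand-in for the remote policy server (runs offline)."""

    def __init__(self, chunk_size: int, action_dim: int) -> None:
        self.chunk_size = chunk_size
        self.action_dim = action_dim
        self.calls = 0

    def infer(self, obs: Dict[str, Any]) -> Dict[str, Any]:
        self.calls += 1
        # Every action in the chunk is filled with the call index, for easy inspection.
        chunk = [[float(self.calls)] * self.action_dim for _ in range(self.chunk_size)]
        return {"actions": chunk}


def run_sync(
    policy: Any,
    get_obs: Callable[[], Dict[str, Any]],
    apply_action: Callable[[List[float]], None],
    total_steps: int,
) -> int:
    """Blocking loop: request a chunk, execute it step by step, repeat."""
    executed = 0
    while executed < total_steps:
        chunk = policy.infer(get_obs())["actions"]  # blocks until the chunk arrives
        for action in chunk:
            if executed >= total_steps:
                break
            apply_action(action)
            executed += 1
    return executed


if __name__ == "__main__":
    sent: List[List[float]] = []
    policy = StubPolicy(chunk_size=50, action_dim=14)
    steps = run_sync(policy, lambda: {"state": [0.0] * 14}, sent.append, total_steps=120)
    print(f"executed {steps} actions using {policy.calls} inference calls")
```

In a real run the stub would be replaced by the client object from openpi_client, and `apply_action` would publish a command to the arms. The smoothing and RTC variants differ mainly in overlapping inference with execution rather than blocking on each chunk.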
From the ARX inference directory:
cd train_deploy_alignment/inference/arx
./build.sh
This builds the bimanual package (e.g. cd bimanual && ./build.sh). Ensure your environment can load the built libraries (e.g. LD_LIBRARY_PATH as in setup.sh).
Source your ROS2 workspace and ensure the arx5_arm_msg (or equivalent) package is built and sourced so that RobotStatus / RobotCmd are available. The inference scripts import arx5_arm_msg.msg.
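A quick way to confirm that the environment is sourced correctly is to check that the modules the scripts import can be found before launching anything. The sketch below uses only the standard library; the module names listed in the usage part are taken from the text above, and the helper name `missing_modules` is hypothetical.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the subset of module names that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when the parent package of a dotted name is absent or malformed.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(["openpi_client", "arx5_arm_msg.msg"])
    if absent:
        print("Not importable (source ROS2 / install the client first):", ", ".join(absent))
    else:
        print("All checked modules are importable.")
```

If `arx5_arm_msg.msg` is reported missing, the ROS2 workspace is most likely not sourced in the current shell.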
Inference uses two machines: a GPU host (policy server) and the client machine (ROS2 + inference script). Follow the order below.
From the repository root on the GPU machine:
uv run scripts/serve_policy.py policy:checkpoint --policy.config=<train_config> --policy.dir=<checkpoint_dir> [--port=8000]
Use the same training config and checkpoint as your trained model. For RTC inference, use an RTC config (e.g. pi0_rtc_aloha_sim or pi05_rtc_flatten_fold_inference); see RTC mode below.
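Before starting the client, it can help to verify from the client machine that the server port is reachable. The following is a minimal sketch using a plain TCP connection attempt; it only shows that something is listening on the port, not that the policy has finished loading. The function name `server_reachable` is hypothetical, and the default port 8000 follows the command above.

```python
import socket


def server_reachable(host: str, port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; use the GPU host IP in practice.
    print("policy server reachable:", server_reachable("127.0.0.1", 8000))
```

A `False` result usually points to a firewall, a wrong IP, or a server that has not started yet.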
Before running inference (or DAGGER), CAN must be configured and up (per the ARX official repo; this repo does not provide CAN scripts), and you must enable both master and slave arms. Order:
- Ensure CAN is configured and up (follow ARX official ARX_X5 / ARX_CAN setup).
- Enable both master and slave arms. Start the master and slave controller nodes (e.g. in separate terminals, or use the DAGGER arx_start.sh in dagger/arx to start both):

      ros2 launch arx_x5_controller open_remote_master.launch.py
      ros2 launch arx_x5_controller open_remote_slave.launch.py

  Wait until nodes are up (e.g. ros2 node list shows the arm nodes).
- Source ROS2 and (if needed) your conda env:

      source /path/to/ros2_ws/install/setup.bash
      conda activate <your_inference_env>

- Source ARX setup (for LD_LIBRARY_PATH so bimanual libs load). From the arx directory:

      cd train_deploy_alignment/inference/arx
      source setup.sh

- Run the inference script from the arx directory, with --host set to the GPU host IP:

      cd train_deploy_alignment/inference/arx
      source setup.sh
      cd inference
      python arx_openpi_inference_rtc.py --host <policy_server_ip> --port 8000 --rtc_mode --chunk_size 50

  Or run another script (see Inference scripts below). Replace <policy_server_ip> with your policy server IP.
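The "wait until nodes are up" step in the list above can be automated by polling `ros2 node list`. The sketch below is one possible way to do it: the helper names are hypothetical, and the node names in the usage part are placeholders, since the actual names of the ARX controller nodes are not given here.

```python
import subprocess
import time
from typing import Callable, Iterable, Set


def list_ros2_nodes() -> Set[str]:
    """Return the node names printed by `ros2 node list` (one per line)."""
    result = subprocess.run(
        ["ros2", "node", "list"], capture_output=True, text=True, check=True
    )
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def wait_for_nodes(
    required: Iterable[str],
    list_nodes: Callable[[], Set[str]] = list_ros2_nodes,
    timeout_s: float = 30.0,
    poll_s: float = 1.0,
) -> bool:
    """Poll until every required node is listed; return False on timeout."""
    wanted = set(required)
    deadline = time.monotonic() + timeout_s
    while True:
        if wanted <= list_nodes():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


if __name__ == "__main__":
    # Placeholder node names; replace with the names shown by `ros2 node list`.
    fake_nodes = lambda: {"/master_node", "/slave_node"}
    print(wait_for_nodes(["/master_node", "/slave_node"], list_nodes=fake_nodes))
```

Passing `list_nodes` as a parameter keeps the polling logic testable without a running ROS2 system; with the default argument it calls the real CLI.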
RTC stands for real-time chunking [1]. For RTC inference, the policy server must load the RTC model (Pi0RTC). Start the server with an RTC config, e.g.:
uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_rtc_aloha_sim --policy.dir=<path_to_jax_checkpoint> [--port=8000]
Then run the ARX inference script with --rtc_mode (e.g. arx_openpi_inference_rtc.py --rtc_mode). RTC uses JAX checkpoints only.
Set the language prompt in the inference script to match training. In the scripts, set the global lang_embeddings at the top (e.g. lang_embeddings = "hang the cloth"). For AWBC-trained models, use the same advantage format as in stage_advantage (e.g. "<task>, Advantage: positive").
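To keep the prompt format consistent between plain and AWBC-trained models, the string can be built by a small helper. This is only a sketch: `build_prompt` is a hypothetical function, and the `", Advantage: <label>"` suffix simply mirrors the example format quoted above; check it against the format actually used in stage_advantage.

```python
from typing import Optional


def build_prompt(task: str, advantage: Optional[str] = None) -> str:
    """Return the task prompt, optionally with an AWBC-style advantage suffix."""
    task = task.strip()
    if advantage is None:
        return task
    return f"{task}, Advantage: {advantage}"


# Plain model: the prompt is just the task description used in training.
lang_embeddings = build_prompt("hang the cloth")

# AWBC-trained model: same task with the advantage label appended.
lang_embeddings_awbc = build_prompt("hang the cloth", advantage="positive")

if __name__ == "__main__":
    print(lang_embeddings)
    print(lang_embeddings_awbc)
```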
Run from train_deploy_alignment/inference/arx after source setup.sh, then cd inference:
| Script | Description | Example command |
|---|---|---|
| inference/arx_openpi_inference_rtc.py | RTC [1] with --rtc_mode; without it, same as temporal smoothing. Server must use RTC config for RTC. | python arx_openpi_inference_rtc.py --host <IP> --port 8000 --rtc_mode --chunk_size 50 |
| inference/arx_openpi_inference_temporal_smooth.py | Temporal smoothing; async inference + stream buffer. | python arx_openpi_inference_temporal_smooth.py --host <IP> --port 8000 |
| inference/arx_openpi_inference_sync.py | Sync: blocking infer every chunk, then execute step-by-step (like Agilex sync). | python arx_openpi_inference_sync.py --host <IP> --port 8000 |
| inference/arx_openpi_inference_temporal_ensembling.py | Temporal ensembling [2]: --smooth_method naive_async or temporal_ensembling, --exp_weight_m for aggregation. | python arx_openpi_inference_temporal_ensembling.py --host <IP> --smooth_method temporal_ensembling --exp_weight_m 0.01 |
- --host: GPU host IP.
- --port: server port (default 8000).
- lang_embeddings: Set in the script (or in arx_openpi_inference_rtc for scripts that import it) to match training.
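For intuition about the `--exp_weight_m` option, the temporal ensembling scheme described in [2] averages all predictions made for the same timestep with weights w_i = exp(-m * i), where i = 0 is the oldest prediction. The sketch below implements that weighting in plain Python. It is an illustration of the scheme from [2], not the code in the script, and whether the script's flag maps exactly onto `m` is an assumption.

```python
import math
from typing import List, Sequence


def ensemble_weights(num_predictions: int, m: float = 0.01) -> List[float]:
    """Normalized weights exp(-m * i); index 0 is the oldest prediction."""
    raw = [math.exp(-m * i) for i in range(num_predictions)]
    total = sum(raw)
    return [w / total for w in raw]


def ensemble_action(predictions: Sequence[Sequence[float]], m: float = 0.01) -> List[float]:
    """Weighted average of overlapping predictions for one timestep (oldest first)."""
    if not predictions:
        raise ValueError("need at least one prediction")
    weights = ensemble_weights(len(predictions), m)
    dim = len(predictions[0])
    return [sum(w * p[j] for w, p in zip(weights, predictions)) for j in range(dim)]


if __name__ == "__main__":
    # Three overlapping chunks each predicted an action for the current step.
    overlapping = [[0.10, 0.50], [0.12, 0.48], [0.20, 0.40]]
    print(ensemble_action(overlapping, m=0.01))
```

A smaller `m` spreads the weight more evenly across old and new predictions (smoother motion, slower reaction); a larger `m` concentrates weight on the oldest prediction.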
- [1] Black, K., Galliker, M. Y., & Levine, S. (2025). Real-Time Execution of Action Chunking Flow Policies. arXiv preprint arXiv:2506.07339.
- [2] Zhao, T. Z., Kumar, V., Levine, S., & Finn, C. (2023). Learning fine-grained bimanual manipulation with low-cost hardware. arXiv preprint arXiv:2304.13705.
BibTeX:
@misc{black2025realtime,
author = {Black, Kevin and Galliker, Manuel Y. and Levine, Sergey},
title = {Real-Time Execution of Action Chunking Flow Policies},
year = {2025},
eprint = {2506.07339},
archivePrefix = {arXiv},
primaryClass = {cs}
}
@misc{zhao2023learning,
author = {Zhao, Tony Z. and Kumar, V. and Levine, Sergey and Finn, Chelsea},
title = {Learning fine-grained bimanual manipulation with low-cost hardware},
year = {2023},
eprint = {2304.13705},
archivePrefix = {arXiv},
primaryClass = {cs}
}