Commit ca8fa07

[CI] Add ARM/Spark CI workflow
Mirrors build.yaml's spirit but stays minimal for the aarch64 path:

  Tier 1 (gates none — continue-on-error): general-arm, install-arm, kit-launch-arm
  Tier 2 (meaningful, marker-filtered): kitless-arm, determinism-arm

Every job sets continue-on-error: true while the aarch64 runner setup stabilizes. Every pytest invocation passes --timeout=N --timeout-method=signal so a single hung test cannot consume the whole job slot. Inline scripts use set -e to fail on the first nonzero return.

Tags three test_rendering_*_kitless.py files plus test_differential_ik.py and test_operational_space.py with the arm_ci marker so the Tier 2 jobs can select them via pytest -m arm_ci.
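The commit message relies on `--timeout-method=signal`, which the workflow comments describe as killing hung tests with SIGALRM. The mechanism can be sketched in plain Python as below. This is only an illustration of an alarm-based timeout, not pytest-timeout's actual code; `HungCallError` and `run_with_alarm` are hypothetical names, and the approach assumes a Unix host and the main thread.

```python
import signal
import time


class HungCallError(Exception):
    """Raised when the guarded call exceeds its time budget (hypothetical name)."""


def run_with_alarm(func, seconds):
    """Run func() and raise HungCallError if it takes longer than `seconds`.

    Illustrative helper, not part of pytest-timeout. It relies on SIGALRM,
    so it works only on Unix and only when called from the main thread.
    """

    def _on_alarm(signum, frame):
        # Raising here interrupts whatever the main thread is doing.
        raise HungCallError(f"call exceeded {seconds}s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func()
    finally:
        # Always cancel the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
```

A call such as `run_with_alarm(lambda: time.sleep(5), 0.2)` would be interrupted after roughly 0.2 seconds instead of blocking for the full five. Code blocked inside a native extension that never returns to the interpreter may not be interruptible this way, which is one reason the jobs also set `timeout-minutes`.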
1 parent 9242498 commit ca8fa07

7 files changed

Lines changed: 358 additions & 3 deletions

.github/workflows/arm-ci.yaml

Lines changed: 350 additions & 0 deletions
@@ -0,0 +1,350 @@
+# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+# ARM/Spark CI — exercises Isaac Lab on aarch64 Linux self-hosted runners
+# (NVIDIA DGX Spark). Mirrors the spirit of build.yaml but stays minimal:
+# Tier 1 (smoke + install): general-arm, install-arm, kit-launch-arm
+# Tier 2 (meaningful, marker-filtered): kitless-arm, determinism-arm
+#
+# Every job sets `continue-on-error: true` until the aarch64 runner setup
+# is bulletproof. Every pytest invocation passes `--timeout=N` (pytest-timeout
+# plugin) so a single hung test cannot consume the whole 120-minute job slot.
+
+name: ARM CI
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened]
+    branches:
+      - main
+      - develop
+      - 'release/**'
+  push:
+    branches:
+      - main
+      - develop
+      - 'release/**'
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+
+permissions:
+  contents: read
+  pull-requests: write
+  checks: write
+
+jobs:
+  changes:
+    name: Detect Changes
+    runs-on: ubuntu-latest
+    outputs:
+      run_arm_ci: ${{ steps.detect.outputs.run_arm_ci }}
+    steps:
+      - id: detect
+        env:
+          GH_TOKEN: ${{ github.token }}
+          PR_NUMBER: ${{ github.event.pull_request.number }}
+          EVENT_NAME: ${{ github.event_name }}
+          REPO: ${{ github.repository }}
+        run: |
+          set -euo pipefail
+          # ARM CI runs only when paths in the patterns table change.
+          # Otherwise jobs skip via `if:` and report green to branch protection.
+          patterns=(
+            $'^source/\tLibrary source code'
+            $'^tools/\tBuild tooling'
+            $'^apps/\tStandalone apps'
+            $'(^|/)pyproject\\.toml$\tPython project metadata'
+            $'^\\.github/workflows/arm-ci\\.yaml$\tThis workflow file'
+            $'^VERSION$\tVersion file'
+          )
+          any_match() {
+            local files="$1" entry regex
+            for entry in "${patterns[@]}"; do
+              IFS=$'\t' read -r regex _ <<< "$entry"
+              if grep -qE "$regex" <<< "$files"; then
+                return 0
+              fi
+            done
+            return 1
+          }
+          if [ "$EVENT_NAME" != "pull_request" ]; then
+            echo "run_arm_ci=true" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+          changed_files="$(gh api --paginate "repos/$REPO/pulls/$PR_NUMBER/files" --jq '.[].filename' || true)"
+          if [ -z "$changed_files" ] || any_match "$changed_files"; then
+            echo "run_arm_ci=true" >> "$GITHUB_OUTPUT"
+          else
+            echo "run_arm_ci=false" >> "$GITHUB_OUTPUT"
+          fi
+
+  # Tier 1: dependency smoke. No isaaclab install, just torch + scipy import + simple ops.
+  general-arm:
+    name: general-arm
+    needs: [changes]
+    if: needs.changes.outputs.run_arm_ci == 'true'
+    runs-on: [self-hosted, arm64]
+    timeout-minutes: 30
+    continue-on-error: true
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+          lfs: false
+
+      - name: Setup env
+        shell: bash
+        run: |
+          set -euo pipefail
+          [ -d env_isaaclab ] || python3 -m venv env_isaaclab
+          # shellcheck disable=SC1091
+          source env_isaaclab/bin/activate
+          python -m pip install --upgrade pip
+          pip install pytest pytest-timeout scipy numpy
+          pip install torch torchvision --index-url https://download.pytorch.org/whl/cu130
+
+      - name: Run smoke tests
+        shell: bash
+        run: |
+          set -e
+          # shellcheck disable=SC1091
+          source env_isaaclab/bin/activate
+          mkdir -p reports
+          export PYTHONUNBUFFERED=1
+          # --timeout=60: each test must complete in 60s (smoke tests are tiny).
+          # --timeout-method=signal: kill hung tests with SIGALRM, no waiting.
+          # --continue-on-collection-errors: broken imports in unrelated files do not
+          # poison the job; pytest still runs the arm_ci-tagged tests it found.
+          # Marker-driven discovery: any test under source/isaaclab/test/deps tagged
+          # with arm_ci is auto-picked. Adding a new smoke dep test = tag it, no yaml edit.
+          # We do NOT use -x: want full coverage even on partial failure.
+          python -m pytest \
+            source/isaaclab/test/deps \
+            --ignore=tools/conftest.py \
+            -m arm_ci \
+            --continue-on-collection-errors \
+            --timeout=60 \
+            --timeout-method=signal \
+            -v \
+            --junitxml=reports/general-arm.xml
+
+      - name: Upload results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: general-arm-report
+          path: reports/general-arm.xml
+          retention-days: 7
+
+  # Tier 1: install probe. Uses uv to create env, install isaaclab editable, smoke-import.
+  install-arm:
+    name: install-arm
+    needs: [changes]
+    if: needs.changes.outputs.run_arm_ci == 'true'
+    runs-on: [self-hosted, arm64]
+    timeout-minutes: 45
+    continue-on-error: true
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+          lfs: false
+
+      - name: Install uv
+        shell: bash
+        run: |
+          set -euo pipefail
+          if ! command -v uv >/dev/null 2>&1; then
+            curl -LsSf https://astral.sh/uv/install.sh | sh
+          fi
+          echo "$HOME/.local/bin" >> "$GITHUB_PATH"
+
+      - name: uv venv + editable install + smoke import
+        shell: bash
+        timeout-minutes: 30
+        run: |
+          set -e
+          uv venv --python 3.12 env_isaaclab_uv
+          # shellcheck disable=SC1091
+          source env_isaaclab_uv/bin/activate
+          uv pip install --no-build-isolation -e source/isaaclab
+          uv pip install --no-build-isolation -e source/isaaclab_assets
+          uv pip install --no-build-isolation -e source/isaaclab_tasks
+          # 30 sec import smoke; if hung, the timeout-minutes above will kill it.
+          timeout 30 python -c "import isaaclab, isaaclab_assets, isaaclab_tasks; print('imports ok')"
+
+  # Tier 1: Kit launch. Validates aarch64 Isaac Sim wheels load Kit cleanly.
+  kit-launch-arm:
+    name: kit-launch-arm
+    needs: [changes]
+    if: needs.changes.outputs.run_arm_ci == 'true'
+    runs-on: [self-hosted, arm64]
+    timeout-minutes: 30
+    continue-on-error: true
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+          lfs: false
+
+      - name: Install uv
+        shell: bash
+        run: |
+          set -euo pipefail
+          if ! command -v uv >/dev/null 2>&1; then
+            curl -LsSf https://astral.sh/uv/install.sh | sh
+          fi
+          echo "$HOME/.local/bin" >> "$GITHUB_PATH"
+
+      - name: Install isaacsim + isaaclab and boot Kit headless
+        shell: bash
+        timeout-minutes: 20
+        run: |
+          set -e
+          uv venv --python 3.12 env_isaaclab_uv
+          # shellcheck disable=SC1091
+          source env_isaaclab_uv/bin/activate
+          uv pip install --no-build-isolation -e source/isaaclab
+          # aarch64 Sim wheel from pypi.nvidia.com (6.0.0.0-cp312-aarch64).
+          uv pip install --extra-index-url https://pypi.nvidia.com isaacsim
+          # Boot Kit headless and exit cleanly. 90 sec hard cap.
+          # Any non-zero return code (crash, hang killed by timeout, etc.) fails the step.
+          timeout 90 python - <<'EOF'
+          from isaaclab.app import AppLauncher
+          import sys
+
+          app_launcher = AppLauncher(headless=True)
+          sim = app_launcher.app
+          assert sim is not None, "AppLauncher did not return a SimulationApp"
+          sim.close()
+          sys.exit(0)
+          EOF
+
+  # Tier 2: kitless rendering tests. Exercises Warp aarch64 codegen + OvRTX-on-GPU.
+  kitless-arm:
+    name: kitless-arm
+    needs: [changes]
+    if: needs.changes.outputs.run_arm_ci == 'true'
+    runs-on: [self-hosted, arm64]
+    timeout-minutes: 45
+    continue-on-error: true
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+          lfs: true
+
+      - name: Install uv
+        shell: bash
+        run: |
+          set -euo pipefail
+          if ! command -v uv >/dev/null 2>&1; then
+            curl -LsSf https://astral.sh/uv/install.sh | sh
+          fi
+          echo "$HOME/.local/bin" >> "$GITHUB_PATH"
+
+      - name: Install isaaclab + run kitless rendering tests
+        shell: bash
+        timeout-minutes: 35
+        run: |
+          set -e
+          uv venv --python 3.12 env_isaaclab_uv
+          # shellcheck disable=SC1091
+          source env_isaaclab_uv/bin/activate
+          uv pip install --no-build-isolation -e source/isaaclab
+          uv pip install --no-build-isolation -e source/isaaclab_assets
+          uv pip install --no-build-isolation -e source/isaaclab_tasks
+          uv pip install pytest pytest-timeout
+          mkdir -p reports
+          export PYTHONUNBUFFERED=1
+          # --timeout=300: 5 min per test; renders should complete fast on Spark GPU.
+          # --continue-on-collection-errors: a broken import in an unrelated test
+          # file does not poison the job; pytest still runs the arm_ci-tagged tests.
+          # Marker-driven discovery: any test under isaaclab_tasks/test tagged with
+          # arm_ci is picked up automatically. Adding new aarch64-safe rendering
+          # tests later requires only tagging them with arm_ci, no yaml edit.
+          python -m pytest \
+            source/isaaclab_tasks/test \
+            --ignore=tools/conftest.py \
+            -m arm_ci \
+            --continue-on-collection-errors \
+            --timeout=300 \
+            --timeout-method=signal \
+            -v \
+            --junitxml=reports/kitless-arm.xml
+
+      - name: Upload results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: kitless-arm-report
+          path: reports/kitless-arm.xml
+          retention-days: 7
+
+  # Tier 2: numerical determinism on controllers. aarch64 FP rounding surface.
+  determinism-arm:
+    name: determinism-arm
+    needs: [changes]
+    if: needs.changes.outputs.run_arm_ci == 'true'
+    runs-on: [self-hosted, arm64]
+    timeout-minutes: 30
+    continue-on-error: true
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+          lfs: false
+
+      - name: Install uv
+        shell: bash
+        run: |
+          set -euo pipefail
+          if ! command -v uv >/dev/null 2>&1; then
+            curl -LsSf https://astral.sh/uv/install.sh | sh
+          fi
+          echo "$HOME/.local/bin" >> "$GITHUB_PATH"
+
+      - name: Install isaaclab + run controller determinism tests
+        shell: bash
+        timeout-minutes: 20
+        run: |
+          set -e
+          uv venv --python 3.12 env_isaaclab_uv
+          # shellcheck disable=SC1091
+          source env_isaaclab_uv/bin/activate
+          uv pip install --no-build-isolation -e source/isaaclab
+          uv pip install pytest pytest-timeout
+          mkdir -p reports
+          export PYTHONUNBUFFERED=1
+          # --timeout=180: 3 min per controller test (IK solves are fast).
+          # --continue-on-collection-errors: tolerate broken neighbors during collection.
+          # Marker-driven discovery: tag a controller / math / utils test with arm_ci
+          # anywhere under source/isaaclab/test and it gets picked up here.
+          python -m pytest \
+            source/isaaclab/test \
+            --ignore=tools/conftest.py \
+            --ignore=source/isaaclab/test/deps \
+            -m arm_ci \
+            --continue-on-collection-errors \
+            --timeout=180 \
+            --timeout-method=signal \
+            -v \
+            --junitxml=reports/determinism-arm.xml
+
+      - name: Upload results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: determinism-arm-report
+          path: reports/determinism-arm.xml
+          retention-days: 7
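The decision made by the `changes` job in the workflow above can be restated as a small Python sketch, which may be easier to reason about or unit-test than the inline bash. The regexes are copied from the workflow's `patterns` table with the descriptions dropped. The function name `should_run_arm_ci` and its list-based input are assumptions for illustration and do not appear in the commit.

```python
import re

# Regexes copied from the workflow's `patterns` table (descriptions omitted).
PATTERNS = [
    r"^source/",
    r"^tools/",
    r"^apps/",
    r"(^|/)pyproject\.toml$",
    r"^\.github/workflows/arm-ci\.yaml$",
    r"^VERSION$",
]


def should_run_arm_ci(event_name, changed_files):
    """Return True when the ARM jobs should run (hypothetical helper).

    Mirrors the bash logic: non-PR events always run, an empty file list
    (for example a failed API lookup) fails open, and otherwise at least
    one changed path must match one of the patterns.
    """
    if event_name != "pull_request":
        return True
    if not changed_files:
        return True
    return any(re.search(p, f) for p in PATTERNS for f in changed_files)
```

For example, a pull request touching only `docs/index.rst` would skip the ARM jobs, while one that also touches `source/isaaclab/setup.py` would run them. Failing open on an empty list matches the `|| true` and `-z "$changed_files"` handling in the workflow step.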
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Skip changelog: CI-infrastructure only (no user-facing API change). Adds .github/workflows/arm-ci.yaml carrying the ARM/Spark CI pipeline against self-hosted [self-hosted, arm64] runners. Tier 1 (smoke, install probe, Kit launch) plus Tier 2 (kitless rendering, controller determinism). All jobs use continue-on-error: true and pytest --timeout to fail fast on hangs. Tags three test_rendering_*_kitless.py files plus test_differential_ik.py / test_operational_space.py with arm_ci so the Tier 2 jobs can select them.

source/isaaclab/test/controllers/test_differential_ik.py

Lines changed: 2 additions & 0 deletions
@@ -15,6 +15,8 @@
 import pytest
 import torch
 
+pytestmark = pytest.mark.arm_ci
+
 import isaaclab.sim as sim_utils
 from isaaclab import cloner
 from isaaclab.assets import Articulation

source/isaaclab/test/controllers/test_operational_space.py

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@
 import torch
 from flaky import flaky
 
+pytestmark = pytest.mark.arm_ci
+
 import isaaclab.envs.mdp as mdp
 import isaaclab.sim as sim_utils
 from isaaclab import cloner

source/isaaclab_tasks/test/test_rendering_cartpole_kitless.py

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
     rendering_test_cartpole,
 )
 
-pytestmark = pytest.mark.isaacsim_ci
+pytestmark = [pytest.mark.isaacsim_ci, pytest.mark.arm_ci]
 
 _COMPARISON_SCORES: list[dict] = []

source/isaaclab_tasks/test/test_rendering_dexsuite_kuka_kitless.py

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
     rendering_test_dexsuite_kuka,
 )
 
-pytestmark = pytest.mark.isaacsim_ci
+pytestmark = [pytest.mark.isaacsim_ci, pytest.mark.arm_ci]
 
 _COMPARISON_SCORES: list[dict] = []

source/isaaclab_tasks/test/test_rendering_shadow_hand_kitless.py

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
     rendering_test_shadow_hand,
 )
 
-pytestmark = pytest.mark.isaacsim_ci
+pytestmark = [pytest.mark.isaacsim_ci, pytest.mark.arm_ci]
 
 _COMPARISON_SCORES: list[dict] = []
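The test-file hunks apply `arm_ci` through module-level `pytestmark`, either as a single marker or as a list of markers. The diff does not show where `arm_ci` is registered, so it may already be declared elsewhere in the repository. If it were not, a registration hook along these lines in a `conftest.py` is one common option. The marker description text and the `ARM_CI_MARKER` constant are assumptions, while `config.addinivalue_line("markers", ...)` is the standard pytest call for registering a marker programmatically.

```python
# Hypothetical conftest.py fragment; the description text is an assumption.
ARM_CI_MARKER = "arm_ci: test is selected by the ARM/Spark CI workflow (aarch64 runners)"


def pytest_configure(config):
    """Register the arm_ci marker so `pytest -m arm_ci` does not warn about an unknown mark."""
    config.addinivalue_line("markers", ARM_CI_MARKER)
```

With the marker registered, `pytest -m arm_ci` selects every test in a module whose `pytestmark` includes `pytest.mark.arm_ci`, which is how the Tier 2 jobs pick up the files tagged in this commit.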
