[CI] Cross-platform — Part 3: Windows workflow#5700
Conversation
There was a problem hiding this comment.
Review Summary
This PR adds experimental Windows CI jobs for perception smoke testing, building on the uv-based installation path. The approach is well-structured, with incremental jobs that progressively test more functionality.
✅ Strengths
- Incremental approach: The workflow builds from simple torch/scipy tests → IsaacLab core install → full perception smoke
- Cross-platform fix in
AssetConverterBase: Moving from hardcoded/tmp/IsaacLabtotempfile.gettempdir()is the correct POSIX/Windows-agnostic solution - Proper marker recognition in AppLauncher: The
windows_ciandarm_cimarkers are now correctly filtered fromsys.argv - Good use of
continue-on-error: true: Appropriate for experimental jobs that should not gate merges
🔍 Findings
1. Path handling in GITHUB_PATH append (Low Priority)
echo "$HOME\\.local\\bin" | Out-File -FilePath $env:GITHUB_PATH -Append -Encoding utf8The $HOME variable in PowerShell uses forward slashes internally, but .local\\bin uses backslashes. While PowerShell is generally tolerant, consider using Join-Path for consistency:
Join-Path $HOME ".local\\bin" | Out-File -FilePath $env:GITHUB_PATH -Append -Encoding utf82. Potential silent failure in perception smoke (Medium Priority)
The inline Python smoke script does not capture or report specific failure modes:
python perception_smoke.pyIf the script fails during Kit initialization (e.g., GPU detection issues), the error may be buried in the output. Consider adding explicit error handling:
import sys
try:
# ... existing code ...
print("perception smoke: PASS")
except Exception as e:
print(f"perception smoke: FAIL - {type(e).__name__}: {e}", file=sys.stderr)
sys.exit(1)This would make CI failure reasons more immediately visible.
3. Artifact retention note
7-day retention for artifacts is appropriate for experimental CI. Once this stabilizes, consider if longer retention is needed for debugging flaky Windows-specific failures.
📝 Documentation
The PR description and commit messages are excellent - each iteration is well-documented with clear rationale for changes. The .skip changelog file appropriately notes this is CI-only.
✅ Verdict
LGTM - The changes are well-structured for an experimental CI addition. The continue-on-error: true at both job and step level appropriately prevents this from blocking the PR workflow while still providing visibility into Windows perception test stability.
The AssetConverterBase fix for cross-platform temp directories is a clean, correct change that should be merged regardless of the CI experiment outcome.
Update (6811028): PR rebased and squashed to two clean commits:
9242498— Adds Windows + ARM pytest markers,AppLauncherargv filtering, andAssetConverterBasecross-platform temp dir fix6811028— Full Windows CI workflow with the 5-job pipeline
Content is unchanged from prior review. All previous findings remain valid optional recommendations. Approval stands.
Foundation for cross-platform CI. Registers four pytest markers (windows, windows_ci, arm, arm_ci), teaches AppLauncher to recognize them in argv so they do not leak into Isaac Sim's argparse, and moves the AssetConverterBase USD scratch directory from a hardcoded /tmp/IsaacLab to tempfile.gettempdir() for cross-platform compatibility. Tags source/isaaclab/test/deps/test_torch.py and test_scipy.py with the new markers so they are selectable by future cross-platform jobs. Workflow files (arm-ci.yaml, windows-ci.yaml) ship in follow-up PRs.
d9d7dac to
d25f9bb
Compare
Same shape as arm-ci.yaml but the install path is native pip + uv on the Windows host (no Docker for Linux-based Isaac Sim wheels). Jobs (all continue-on-error: true): Tier 1 — general-windows, install-windows, kit-launch-windows Tier 2 — path-io-windows, perception-windows Every pytest invocation passes --timeout=N + --timeout-method=thread (signal is unavailable on Windows) plus --continue-on-collection-errors so a hung test cannot consume the full job slot and a broken neighbor file does not poison the marker-driven discovery. perception-windows wraps the cartpole-camera smoke in an inline Python script with explicit assertions and an inner watchdog thread that aborts the process after 180s. This replaces the previous pattern where Vulkan init failures hung the job instead of erroring. Tags four path-IO test files (test_configclass, test_dict, test_episode_data, test_hdf5_dataset_file_handler) with the windows_ci marker so path-io-windows picks them up via marker-driven discovery.
d25f9bb to
6811028
Compare
Summary
Part 3 of the cross-platform CI series. Adds
.github/workflows/windows-ci.yaml— CI pipeline for Windows GPU self-hosted runners. Same shape asarm-ci.yaml(Part 2, #5698) but native install path instead of Docker.general-windows,install-windows(uv install + wheel build + reinstall),kit-launch-windows.path-io-windows,perception-windows, ...continue-on-error: truewhile runners stabilize.--timeout=N+--timeout-method=signalso hung tests fail fast (fixes the previous perception-windows pattern where Vulkan failures hung the job instead of erroring).pytest <root> -m windows_ci --continue-on-collection-errors. Adding a new Windows-safe test requires only tagging itwindows_ci, no yaml edit.Series
PRs prefixed with
[CI] Cross-platform —. Current siblings:Depends on Part 1 (#5695). Independent of Part 2 (#5698).
Test plan
./isaaclab.sh -f(pre-commit) passes.