[CI] Cross-platform — Part 3: Windows workflow#5700

Draft
hujc7 wants to merge 2 commits into isaac-sim:develop from hujc7:jichuanh/windows-spark-ci-perception

Collaborator

@hujc7 hujc7 commented May 20, 2026

Summary

Part 3 of the cross-platform CI series. Adds .github/workflows/windows-ci.yaml — CI pipeline for Windows GPU self-hosted runners. Same shape as arm-ci.yaml (Part 2, #5698) but native install path instead of Docker.

  • Tier 1 (smoke + install): general-windows, install-windows (uv install + wheel build + reinstall), kit-launch-windows.
  • Tier 2 (meaningful, marker-driven): path-io-windows, perception-windows, ...
  • All jobs continue-on-error: true while runners stabilize.
  • All pytest invocations use --timeout=N + --timeout-method=thread (pytest-timeout's signal method needs SIGALRM, which is unavailable on Windows) so hung tests fail fast. This fixes the previous perception-windows pattern where Vulkan failures hung the job instead of erroring.
  • Marker-driven discovery: pytest <root> -m windows_ci --continue-on-collection-errors. Adding a new Windows-safe test requires only tagging it windows_ci, no yaml edit.

Series

PRs prefixed with [CI] Cross-platform —. Current siblings:

Depends on Part 1 (#5695). Independent of Part 2 (#5698).

Test plan

  • ./isaaclab.sh -f (pre-commit) passes.
  • windows-ci.yaml triggers on PR push; all jobs run on Windows runners.
  • perception-windows fails fast on Vulkan/runtime errors instead of hanging.

@github-actions (bot) added the isaac-lab (Related to Isaac Lab team) and infrastructure labels on May 20, 2026
@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Review Summary

This PR adds experimental Windows CI jobs for perception smoke testing, building on the uv-based installation path. The approach is well-structured, with incremental jobs that progressively test more functionality.

✅ Strengths

  1. Incremental approach: The workflow builds from simple torch/scipy tests → IsaacLab core install → full perception smoke
  2. Cross-platform fix in AssetConverterBase: Moving from hardcoded /tmp/IsaacLab to tempfile.gettempdir() is the correct POSIX/Windows-agnostic solution
  3. Proper marker recognition in AppLauncher: The windows_ci and arm_ci markers are now correctly filtered from sys.argv
  4. Good use of continue-on-error: true: Appropriate for experimental jobs that should not gate merges
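
The temp-directory change in item 2 can be illustrated with a minimal sketch. The helper name `usd_scratch_dir` is hypothetical, and the `IsaacLab` subdirectory name is only carried over from the old hardcoded path; the actual AssetConverterBase code may be organized differently.

```python
import os
import tempfile


def usd_scratch_dir(subdir: str = "IsaacLab") -> str:
    """Return a scratch directory under the platform temp dir, creating it if needed.

    Hypothetical helper for illustration only; not the AssetConverterBase API.
    """
    # tempfile.gettempdir() picks a platform-appropriate location (typically
    # /tmp on Linux, the user's Temp folder on Windows), so nothing is hardcoded.
    path = os.path.join(tempfile.gettempdir(), subdir)
    os.makedirs(path, exist_ok=True)
    return path
```

Calling `usd_scratch_dir()` on either platform returns an existing directory, whereas a literal `/tmp/IsaacLab` is only meaningful on POSIX systems.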

🔍 Findings

1. Path handling in GITHUB_PATH append (Low Priority)

echo "$HOME\\.local\\bin" | Out-File -FilePath $env:GITHUB_PATH -Append -Encoding utf8

In a double-quoted PowerShell string the backslash is a literal character (the escape character is the backtick), so each \\ here produces two backslashes in the path written to GITHUB_PATH. Windows generally tolerates doubled separators, but consider Join-Path with single backslashes for consistency:

Join-Path $HOME ".local\bin" | Out-File -FilePath $env:GITHUB_PATH -Append -Encoding utf8

2. Potential silent failure in perception smoke (Medium Priority)

The inline Python smoke script does not capture or report specific failure modes:

python perception_smoke.py

If the script fails during Kit initialization (e.g., GPU detection issues), the error may be buried in the output. Consider adding explicit error handling:

import sys
try:
    # ... existing code ...
    print("perception smoke: PASS")
except Exception as e:
    print(f"perception smoke: FAIL - {type(e).__name__}: {e}", file=sys.stderr)
    sys.exit(1)

This would make CI failure reasons more immediately visible.

3. Artifact retention note

7-day retention for artifacts is appropriate for experimental CI. Once this stabilizes, consider if longer retention is needed for debugging flaky Windows-specific failures.

📝 Documentation

The PR description and commit messages are excellent - each iteration is well-documented with clear rationale for changes. The .skip changelog file appropriately notes this is CI-only.

✅ Verdict

LGTM - The changes are well-structured for an experimental CI addition. The continue-on-error: true at both job and step level appropriately prevents this from blocking the PR workflow while still providing visibility into Windows perception test stability.

The AssetConverterBase fix for cross-platform temp directories is a clean, correct change that should be merged regardless of the CI experiment outcome.


Update (6811028): PR rebased and squashed to two clean commits:

  1. 9242498 — Adds Windows + ARM pytest markers, AppLauncher argv filtering, and AssetConverterBase cross-platform temp dir fix
  2. 6811028 — Full Windows CI workflow with the 5-job pipeline

Content is unchanged from prior review. All previous findings remain valid optional recommendations. Approval stands.

Foundation for cross-platform CI. Registers four pytest markers
(windows, windows_ci, arm, arm_ci), teaches AppLauncher to recognize
them in argv so they do not leak into Isaac Sim's argparse, and moves
the AssetConverterBase USD scratch directory from a hardcoded
/tmp/IsaacLab to tempfile.gettempdir() for cross-platform compatibility.
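
One common way to register such markers is the `pytest_configure` hook in a `conftest.py`. The sketch below assumes that location and uses placeholder descriptions; this PR may register the markers elsewhere (for example in a pytest configuration file), so treat it as an illustration rather than the actual change.

```python
# Marker names come from the commit message; the descriptions are placeholders.
MARKERS = {
    "windows": "test is expected to work on Windows",
    "windows_ci": "test is selected by the Windows CI workflow",
    "arm": "test is expected to work on ARM",
    "arm_ci": "test is selected by the ARM CI workflow",
}


def pytest_configure(config):
    """pytest hook: register custom markers so `-m windows_ci` is a known marker."""
    for name, description in MARKERS.items():
        # Each entry becomes a "name: description" line in the `markers` ini option.
        config.addinivalue_line("markers", f"{name}: {description}")
```

With the markers registered, `pytest -m windows_ci` selects only tests carrying that marker, and strict-marker runs do not reject the new names.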

Tags source/isaaclab/test/deps/test_torch.py and test_scipy.py with
the new markers so they are selectable by future cross-platform jobs.

Workflow files (arm-ci.yaml, windows-ci.yaml) ship in follow-up PRs.
@hujc7 hujc7 changed the title [CI] Cartpole-camera perception smoke on Windows (experimental) [CI] Add Windows CI workflow May 20, 2026
@hujc7 hujc7 changed the title [CI] Add Windows CI workflow [CI] Cross-platform CI 3/3: Windows workflow May 20, 2026
@hujc7 hujc7 changed the title [CI] Cross-platform CI 3/3: Windows workflow [CI] Cross-platform: Windows workflow May 20, 2026
@hujc7 hujc7 changed the title [CI] Cross-platform: Windows workflow [CI] Cross-platform — Part 3: Windows workflow May 20, 2026
@hujc7 hujc7 force-pushed the jichuanh/windows-spark-ci-perception branch 2 times, most recently from d9d7dac to d25f9bb Compare May 20, 2026 23:37
Same shape as arm-ci.yaml but the install path is native pip + uv on
the Windows host (no Docker for Linux-based Isaac Sim wheels).

Jobs (all continue-on-error: true):
  Tier 1 — general-windows, install-windows, kit-launch-windows
  Tier 2 — path-io-windows, perception-windows

Every pytest invocation passes --timeout=N + --timeout-method=thread
(signal is unavailable on Windows) plus --continue-on-collection-errors
so a hung test cannot consume the full job slot and a broken neighbor
file does not poison the marker-driven discovery.
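
The flag combination above can be sketched as a small command builder. The function name, the test root, and the timeout value are hypothetical; the workflow itself passes these flags directly in YAML. The platform check reflects that pytest-timeout's signal method depends on SIGALRM, which Python does not expose on Windows.

```python
import signal
import sys


def build_pytest_cmd(root: str, marker: str, timeout_s: int) -> list[str]:
    """Assemble a pytest command line with a platform-appropriate timeout method."""
    # "signal" needs SIGALRM; where it is missing (Windows), use "thread".
    method = "signal" if hasattr(signal, "SIGALRM") else "thread"
    return [
        sys.executable, "-m", "pytest", root,
        "-m", marker,                       # marker-driven discovery
        f"--timeout={timeout_s}",           # per-test limit (pytest-timeout)
        f"--timeout-method={method}",
        "--continue-on-collection-errors",  # one broken file does not stop the run
    ]
```

For example, `build_pytest_cmd("source", "windows_ci", 300)` yields a list suitable for `subprocess.run`, with `--timeout-method=thread` on a Windows runner.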

perception-windows wraps the cartpole-camera smoke in an inline Python
script with explicit assertions and an inner watchdog thread that aborts
the process after 180s. This replaces the previous pattern where Vulkan
init failures hung the job instead of erroring.
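
A watchdog of this kind can be sketched with a daemon timer. This is a minimal illustration under assumptions, not the inline script from the workflow: the name `start_watchdog` and the injectable `on_timeout` callback are hypothetical, added so the behavior can be exercised without killing the process.

```python
import os
import threading


def start_watchdog(timeout_s: float, on_timeout=None) -> threading.Timer:
    """Start a daemon timer that fires `on_timeout` unless cancelled first."""
    # Default action: os._exit terminates immediately without running cleanup
    # handlers, so a main thread stuck in native code cannot block the exit.
    callback = on_timeout if on_timeout is not None else (lambda: os._exit(1))
    timer = threading.Timer(timeout_s, callback)
    timer.daemon = True  # do not keep the interpreter alive on a normal exit
    timer.start()
    return timer
```

A smoke script would call `watchdog = start_watchdog(180)` before launching the app and `watchdog.cancel()` after the assertions pass, so only a genuine hang triggers the abort.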

Tags four path-IO test files (test_configclass, test_dict,
test_episode_data, test_hdf5_dataset_file_handler) with the windows_ci
marker so path-io-windows picks them up via marker-driven discovery.
@hujc7 hujc7 force-pushed the jichuanh/windows-spark-ci-perception branch from d25f9bb to 6811028 Compare May 20, 2026 23:52
Labels

infrastructure, isaac-lab (Related to Isaac Lab team)
