
[ET-VK][CI] Add macOS MoltenVK Vulkan model tests#18723

Open
SS-JIA wants to merge 1 commit into main from pr18723

Conversation

Contributor

@SS-JIA SS-JIA commented Apr 6, 2026

Summary:
Add a test-vulkan-models-macos job to the pull CI workflow that tests
the Vulkan backend on macOS Apple Silicon using MoltenVK. This mirrors
the existing test-vulkan-models-linux job (which uses SwiftShader) but
runs on a macOS M1 runner with the LunarG Vulkan SDK providing MoltenVK
as the Vulkan ICD.

The new setup-vulkan-macos-deps.sh script downloads the macOS Vulkan
SDK and configures VK_ICD_FILENAMES to point at MoltenVK.

Generated with Claude.

@pytorch-bot pytorch-bot bot added the module: vulkan Issues related to the Vulkan delegate and code under backends/vulkan/ label Apr 6, 2026

pytorch-bot bot commented Apr 6, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18723

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job, 3 Unrelated Failures

As of commit d036931 with merge base 3d2c853 (image):

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 6, 2026

github-actions bot commented Apr 6, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Contributor Author

SS-JIA commented Apr 6, 2026

@claude analyse the failing test-vulkan-models-macos job.


claude bot commented Apr 6, 2026

Claude finished @SS-JIA's task in 7m 58s.


Analysis of failing test-vulkan-models-macos job

  • Read the changed files for context
  • Fetch CI job logs
  • Analyze the failure and provide diagnosis

Root Cause

The job fails during pip install . --no-build-isolation (the ExecuTorch wheel build) with a CMake error at backends/vulkan/cmake/ShaderLibrary.cmake:30:

CMake Error at backends/vulkan/cmake/ShaderLibrary.cmake:30 (message):
  glslc from the Vulkan SDK must be installed to build the Vulkan backend.
  Please install the Vulkan SDK 1.4.321.0 or newer from
  https://vulkan.lunarg.com/sdk/home and ensure that the glslc binary is in
  your PATH.

CMake's find_program(GLSLC_PATH glslc PATHS $ENV{PATH}) cannot locate glslc.
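
The same lookup can be reproduced outside CMake to see whether any PATH entry actually contains the binary. The following is a minimal sketch for a POSIX-compatible shell; the function name `find_on_path` is hypothetical and is not part of the repository, and it only approximates what `find_program` does with the PATH entries.

```shell
# find_on_path NAME: print the first executable called NAME found in a
# PATH directory, then return 0. Returns 1 if no PATH entry contains it.
find_on_path() {
  fop_name="$1"
  fop_old_ifs="$IFS"
  IFS=':'
  for fop_dir in $PATH; do
    # An empty PATH entry means the current directory.
    [ -n "$fop_dir" ] || fop_dir="."
    if [ -x "${fop_dir}/${fop_name}" ] && [ ! -d "${fop_dir}/${fop_name}" ]; then
      IFS="$fop_old_ifs"
      printf '%s\n' "${fop_dir}/${fop_name}"
      return 0
    fi
  done
  IFS="$fop_old_ifs"
  return 1
}

# Usage: find_on_path glslc || echo "glslc is not in any PATH directory"
```

If this returns non-zero inside the same environment that runs the build, the directory on PATH exists in name only or does not contain glslc, which matches the CMake error.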

Analysis

The setup-vulkan-macos-deps.sh script:

  1. Downloads vulkansdk-macos-1.4.321.0.zip from LunarG (succeeds — no curl/unzip errors)
  2. Extracts to /tmp/vulkansdk/
  3. Sets VULKAN_SDK=/tmp/vulkansdk/VulkanSDK/1.4.321.0/macOS
  4. Adds ${VULKAN_SDK}/bin to PATH → /tmp/vulkansdk/VulkanSDK/1.4.321.0/macOS/bin

The logs confirm PATH is propagated correctly through the entire chain: outer shell → conda run → setup-macos.sh → install_executorch.sh → pip install . --no-build-isolation → CMake. At line 236988 in the log, inside conda run, PATH includes /tmp/vulkansdk/VulkanSDK/1.4.321.0/macOS/bin.

The problem is that glslc does not exist at /tmp/vulkansdk/VulkanSDK/1.4.321.0/macOS/bin/glslc.

The macOS Vulkan SDK zip likely has a different directory structure than the script assumes. The Linux setup script (.ci/scripts/setup-vulkan-linux-deps.sh) uses ${_vulkan_sdk_dir}/${VULKAN_SDK_VERSION}/x86_64/bin/, so the macOS SDK zip may similarly place binaries under an architecture-specific path rather than macOS/bin/.

Suggested Fix

Add a diagnostic step to setup-vulkan-macos-deps.sh after extracting the zip to verify the actual directory layout and confirm glslc is present. For example:

# After unzip, verify the SDK structure
ls -la "${VULKAN_SDK}/bin/" || echo "WARNING: ${VULKAN_SDK}/bin/ does not exist"
command -v glslc || echo "WARNING: glslc not found in PATH"
# Search the whole extracted tree for glslc (file or symlink) to discover the actual layout
find /tmp/vulkansdk -name "glslc" \( -type f -o -type l \) 2>/dev/null

The macOS Vulkan SDK zip may place glslc under a path like:

  • /tmp/vulkansdk/VulkanSDK/1.4.321.0/macOS/Bin/glslc (capital B)
  • /tmp/vulkansdk/VulkanSDK/1.4.321.0/arm64/bin/glslc
  • /tmp/vulkansdk/VulkanSDK/1.4.321.0/bin/glslc

Once you identify where glslc actually lives in the extracted zip, update the PATH accordingly in setup-vulkan-macos-deps.sh.
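
Alternatively, the script could locate glslc after extraction and derive PATH from wherever it is found, so it does not depend on a guessed layout. This is a sketch only, assuming the SDK is extracted under /tmp/vulkansdk as described above; the function name `add_glslc_to_path` is hypothetical and does not exist in setup-vulkan-macos-deps.sh.

```shell
# add_glslc_to_path ROOT: search ROOT for a glslc binary and prepend its
# directory to PATH. Returns non-zero if no glslc is found under ROOT.
add_glslc_to_path() {
  sdk_root="$1"
  # Accept regular files and symlinks; take the first match.
  glslc_bin="$(find "$sdk_root" -name glslc \( -type f -o -type l \) 2>/dev/null | head -n 1)"
  if [ -z "$glslc_bin" ]; then
    echo "ERROR: glslc not found under ${sdk_root}" >&2
    return 1
  fi
  PATH="$(dirname "$glslc_bin"):${PATH}"
  export PATH
  echo "Using glslc at ${glslc_bin}"
}

# Usage (extraction root taken from the analysis above):
# add_glslc_to_path /tmp/vulkansdk || exit 1
```

Failing fast here would surface the problem in the setup step rather than much later inside the wheel build.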

Secondary Note

There is also a minor earlier error at log line 236561:

/Users/ec2-user/runner/_work/_temp/c2e4b4fd-da9c-4cd9-834e-3b0480d21c0e.sh: line 10: 
  /Users/ec2-user/runner/_work/_temp/github_env_24048813425: No such file or directory

This appears to be a non-fatal issue in the macos_job.yml reusable workflow infrastructure, not directly related to the build failure.

