Vulkan: support Linux/Windows desktop GPUs and opt-in wheel builds (#20138)
## Summary
We already provide good desktop support for NVIDIA GPUs through our CUDA
backend, which works well on both Linux and Windows. However, our Vulkan
backend only provides solid support for Android, leaving AMD GPUs
without sufficient support. That's a shame since Vulkan is well
supported on every operating system and with every major GPU
manufacturer.
This PR gets the Vulkan backend building and running correctly on Linux
and Windows desktop GPUs (NVIDIA, AMD, Intel), and adds an opt-in way to
build pre-built Vulkan binaries. It leaves everything Android related
the same so that we don't regress anything for that platform.
Most of the backend was already portable, so this is mostly build fixes,
a few small fixes for desktop GPUs, and packaging/CI plumbing.
Fixes #20140
## Changes
- Picks the right exception flag for MSVC vs GCC/Clang, finds glslc on
Windows, suppresses third-party (VMA) warnings on GCC/MSVC, and gives a
clear error if the Vulkan submodules aren't checked out.
- Compiles the cooperative-matrix shader. It needs a newer Vulkan target
than the default, so that one shader now builds against Vulkan 1.3.
- Fixes correctness on desktop GPUs. Turns on the device features the
shaders actually use (int16/int64/float64), which were never enabled
before; prefers a discrete GPU, when one exists, over the first device
found (with an `ETVK_DEVICE_INDEX` override for multi-GPU machines);
avoids an invalid image-copy on compute-only queues; and fixes a
buffer-size check that compared the wrong units.
- Makes the `small_texture_limits` export option work as intended. It
was being silently dropped; now it round-trips so you can target GPUs
with smaller texture limits. Adds a small unit test for it.
- Adds opt-in packaging and CI. Behind the `EXECUTORCH_BUILD_VULKAN`
flag, the wheel build can include Vulkan; adds a real-GPU (NVIDIA) test
job and a Windows/MSVC build job. The default wheel and other backends
are untouched.
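
The device-selection rule described above (honor `ETVK_DEVICE_INDEX` if set, otherwise prefer a discrete GPU, otherwise fall back to the first device) can be sketched roughly as follows. This is an illustrative Python model only, not the backend's actual C++ code: the `Gpu` type, the `pick_device` function, and the choice to raise on an out-of-range index are all assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass
class Gpu:
    """Hypothetical stand-in for what a Vulkan physical device reports."""
    name: str
    is_discrete: bool


def pick_device(gpus, env=None):
    """Return the index of the device to use (illustrative sketch only)."""
    env = os.environ if env is None else env
    if not gpus:
        raise RuntimeError("no Vulkan-capable device found")

    # An explicit override wins, which is useful on multi-GPU machines.
    override = env.get("ETVK_DEVICE_INDEX")
    if override is not None:
        index = int(override)
        # Assumption: an out-of-range index is treated as an error here.
        if not 0 <= index < len(gpus):
            raise ValueError(f"ETVK_DEVICE_INDEX={index} is out of range")
        return index

    # Otherwise prefer the first discrete GPU, if the machine has one.
    for index, gpu in enumerate(gpus):
        if gpu.is_discrete:
            return index

    # No discrete GPU: keep the old behavior and use the first device.
    return 0
```

With an integrated GPU listed first and a discrete GPU second, this sketch returns index 1 by default and index 0 when `ETVK_DEVICE_INDEX=0` is set.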
## Android Safety
Every change is gated so Android behaves exactly as before: build
differences are behind compiler/OS checks, the new export and runtime
options are opt-in and default to today's behavior, and the
device-selection / feature changes are based on what the GPU reports.
The existing SwiftShader CI job is unchanged.
## Testing
Built with the Vulkan SDK (glslc on PATH) and run on an NVIDIA A100
(driver 580.126.09, Vulkan 1.4.312):
```bash
# Build the backend + a runner (the Vulkan SDK's glslc must be on PATH)
cmake -B cmake-out -S . -GNinja -DCMAKE_BUILD_TYPE=Release \
  -DEXECUTORCH_BUILD_VULKAN=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -DEXECUTORCH_BUILD_EXECUTOR_RUNNER=ON
cmake --build cmake-out --target vulkan_backend executor_runner # 402/402, 0 errors
# Export a model to Vulkan and run it on the GPU
python -m examples.vulkan.export -m mv2 -o .
./cmake-out/executor_runner --model_path mv2.pte
```
I ran a small fp32 model and an int8 model on the A100 and matched the
reference output (fp32 to 5 decimals, int8 to 4 decimals). The int8 run
exercises the integer-dot-product / int16 shaders that SwiftShader can't
run.
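
One way to make the "matched to N decimals" comparison concrete is shown below. It is a sketch under an assumed definition, not the script used for this PR: the helper name `matches_to_decimals` is hypothetical, and it treats two values as matching to N decimals when they differ by at most half a unit in the Nth decimal place.

```python
def matches_to_decimals(actual, reference, decimals):
    """Check element-wise agreement to a given number of decimal places.

    Assumption: "matches to N decimals" means |a - r| <= 0.5 * 10**-N.
    """
    if len(actual) != len(reference):
        return False
    tolerance = 0.5 * 10 ** -decimals
    return all(abs(a - r) <= tolerance for a, r in zip(actual, reference))
```

Under that definition, the fp32 run would be checked with `decimals=5` and the int8 run with `decimals=4`, each against the flattened reference output.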
All the shaders compiled fine and the new unit tests added here passed.
I haven't tested the Windows build yet, so I'm relying on CI for that.
## TODO
We also need to publish a Vulkan wheel to PyPI. The build supports it
(`EXECUTORCH_BUILD_VULKAN=1` + glslc), but we need a Vulkan entry added
to the shared `build-wheels-*.yml` workflows.
cc @SS-JIA @manuelcandales @digantdesai @cbilgin
---------
Co-authored-by: Sicheng Stephen Jia <ssjia@meta.com>