docs(voxtral_realtime): document CUDA Windows workflow (#17993)
Add CUDA-Windows instructions to the Voxtral Realtime README, including
export prerequisites and an example command.
Document Windows build steps via CMake workflow presets and add
PowerShell run examples with and without the .ptd data file.
Note recommended CUDA architectures for int4 kernels, and reformat
voxtral_realtime CMake presets without changing behavior.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This builds ExecuTorch with Metal backend support. The runner binary is at
the same path as above. Metal exports can only run on macOS with Apple Silicon.
### CUDA-Windows
On Windows (PowerShell), run the CMake workflow presets directly from the executorch root directory. If you exported the model with 4-bit quantization, you may need to set `CMAKE_CUDA_ARCHITECTURES` so that it includes your GPU's compute capability (e.g., `80;86;89;90;120` covers Ampere, Ada Lovelace, Hopper, and Blackwell). The `int4mm` kernels require SM 80 or newer, and a build that omits your GPU's architecture can fail at runtime with "invalid device function" errors.
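```powershell
# Target GPU architectures for the CUDA kernels (int4mm needs SM 80 or newer)
$env:CMAKE_CUDA_ARCHITECTURES="80;86;89;90;120"
# Build ExecuTorch with the CUDA release preset
cmake --workflow --preset llm-release-cuda
# Build the Voxtral Realtime runner from its example directory
Push-Location examples/models/voxtral_realtime
cmake --workflow --preset voxtral-realtime-cuda
Pop-Location
```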
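
If you are unsure which value to put in `CMAKE_CUDA_ARCHITECTURES`, the following is a minimal sketch (not part of the ExecuTorch tooling) that converts compute capabilities such as `8.6` into the CMake form `86` and drops anything below SM 80. The helper names `to_cmake_arch` and `arch_list` are hypothetical, and the input strings are assumed to come from your own lookup of the GPU's compute capability.

```python
MIN_INT4_SM = 80  # per the note above: int4mm kernels need SM 80 or newer


def to_cmake_arch(compute_cap: str) -> str:
    """Convert a compute capability such as "8.6" to the CMake form "86"."""
    major, minor = compute_cap.strip().split(".")
    return f"{int(major)}{int(minor)}"


def arch_list(compute_caps):
    """Build a CMAKE_CUDA_ARCHITECTURES value, keeping only SM >= 80."""
    # De-duplicate and sort numerically so "120" sorts after "90".
    archs = sorted({int(to_cmake_arch(c)) for c in compute_caps})
    supported = [a for a in archs if a >= MIN_INT4_SM]
    return ";".join(str(a) for a in supported)
```

For example, `arch_list(["8.6", "7.5", "8.9"])` returns `"86;89"`; an empty result means none of the listed GPUs meets the SM 80 requirement for 4-bit exports.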
## Run
The runner requires:
@@ -237,35 +298,49 @@ The runner requires:
- A 16kHz mono WAV audio file (or live audio via `--mic`)
- For CUDA: `aoti_cuda_blob.ptd` — delegate data file (pass via `--data_path`)
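
Before invoking the runner, it can help to confirm that an input file really is 16kHz mono. The sketch below uses Python's standard `wave` module; the function name `check_wav` is hypothetical and is not part of the runner or of ExecuTorch.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return (ok, details) for a WAV file against the 16kHz mono requirement."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    ok = rate == expected_rate and channels == expected_channels
    return ok, {"sample_rate": rate, "channels": channels}
```

If the check fails, the returned details show which property (sample rate or channel count) needs to be converted before passing the file to the runner.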