Name and Version
llama version
b9631-6e14286ed
llama-server --version
version: 9692 (f3e1828)
built with Clang 20.1.8 for Windows x86_64
llama-cli --version
version: 9692 (f3e1828)
built with Clang 20.1.8 for Windows x86_64
Operating systems
Windows
Which llama.cpp modules do you know to be affected?
Other (Please specify in the next section)
Command line
Problem description & steps to reproduce
When installing llama.cpp on Windows from a CUDA release zip, the resulting llama.exe includes a llama update subcommand. Looking at the source in app/llama.cpp, on Windows this simply runs:
irm https://llama.app/install.ps1 | iex
However, install.ps1 only probes for Vulkan and CPU, unlike install.sh,
curl -fsSL https://llama.app/install.sh
which has a probe_cuda() function.
As a result, running llama update on Windows appears to replace a CUDA build with a Vulkan or CPU binary, silently losing CUDA acceleration.
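To make the expected behavior concrete, here is a minimal Python sketch of the probe order I would expect from an updater. It is not taken from install.ps1 or install.sh. The names has_nvidia_driver, pick_backend and would_downgrade are hypothetical, and both the use of nvidia-smi on PATH as a CUDA signal and the ranking of CUDA above Vulkan above CPU are assumptions.

```python
import shutil

# Assumed preference ranking; higher means more capable for this sketch.
BACKEND_RANK = {"cpu": 0, "vulkan": 1, "cuda": 2}


def has_nvidia_driver() -> bool:
    # Assumption: finding nvidia-smi on PATH indicates an NVIDIA driver is installed.
    return shutil.which("nvidia-smi") is not None


def pick_backend(cuda_ok: bool, vulkan_ok: bool) -> str:
    # Prefer CUDA, then Vulkan, then fall back to CPU.
    if cuda_ok:
        return "cuda"
    if vulkan_ok:
        return "vulkan"
    return "cpu"


def would_downgrade(installed: str, selected: str) -> bool:
    # True when the update would replace the installed backend with a lower-ranked one.
    return BACKEND_RANK[selected] < BACKEND_RANK[installed]


if __name__ == "__main__":
    selected = pick_backend(cuda_ok=has_nvidia_driver(), vulkan_ok=True)
    print("selected backend:", selected)
```

With a check like would_downgrade("cuda", selected), an updater could at least warn before replacing a CUDA build instead of doing so silently.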
First Bad Commit
5a46b46 introduced the update command. I am not sure of the dates of the two scripts, install.ps1 and install.sh.
Relevant log output
llama update
Version: b9631
Probing Vulkan...
Downloading vulkan-probe.exe...
Downloading unzstd.exe...
Downloading featcode.exe...
Found: bmi2
Found: avxvnni
Found: avx512vl
Found: avx512cd
Found: avx512dq
Found: avx512vnni
Found: avx512vbmi
Found: avx512bf16
Downloading llama.exe...
Installation completed successfully
Please run the following command to start it:
llama.exe serve