bug: versions >= v3.15.0 do not support CUDA v13.1 & v13.2 #577

@xinxi053


Issue description

Windows 11, node-llama-cpp >= 3.15.0 fails with the error: CUDA is detected, but using it failed

Expected Behavior

node-llama-cpp >= 3.15.0 is expected to work correctly.

Actual Behavior

PS C:\Users\Administrator> npx --yes node-llama-cpp@3.15.0 inspect gpu
OS: Windows 10.0.26200 (x64)
Node: 24.13.0 (x64)

node-llama-cpp: 3.15.0
Prebuilt binaries: b7698

CUDA: CUDA is detected, but using it failed
To resolve errors related to CUDA, see the CUDA guide: https://node-llama-cpp.withcat.ai/guide/CUDA
Vulkan: Vulkan is detected, but using it failed
To resolve errors related to Vulkan, see the Vulkan guide: https://node-llama-cpp.withcat.ai/guide/vulkan

CPU model: 13th Gen Intel(R) Core(TM) i9-13900K
Math cores: 1.16985510423e-311
Used RAM: 39.94% (12.68GB/31.75GB)
Free RAM: 60.05% (19.07GB/31.75GB)
Used swap: 42.5% (17.1GB/40.25GB)
Max swap size: 40.25GB
mmap: supported

Steps to reproduce

npx --yes node-llama-cpp@3.15.0 inspect gpu does not work.
npx --yes node-llama-cpp@3.14.5 inspect gpu works correctly; all versions <= 3.14.5 work.

My Environment

PS C:\Users\Administrator> npx --yes node-llama-cpp@3.14.4 inspect gpu
npm warn deprecated npmlog@6.0.2: This package is no longer supported.
npm warn deprecated are-we-there-yet@3.0.1: This package is no longer supported.
npm warn deprecated gauge@4.0.4: This package is no longer supported.
OS: Windows 10.0.26200 (x64)
Node: 24.13.0 (x64)

node-llama-cpp: 3.14.4
Prebuilt binaries: b7324

CUDA: available
Vulkan: available

CUDA device: NVIDIA GeForce RTX 4090
CUDA used VRAM: 6.21% (1.49GB/23.99GB)
CUDA free VRAM: 93.78% (22.5GB/23.99GB)

Vulkan device: NVIDIA GeForce RTX 4090
Vulkan used VRAM: 4.56% (1.08GB/23.57GB)
Vulkan free VRAM: 95.43% (22.5GB/23.57GB)

CPU model: 13th Gen Intel(R) Core(TM) i9-13900K
Math cores: 8.728467834306e-312
Used RAM: 42.02% (13.34GB/31.75GB)
Free RAM: 57.97% (18.41GB/31.75GB)
Used swap: 44.5% (17.91GB/40.25GB)
Max swap size: 40.25GB
mmap: supported

PS C:\Users\Administrator> npx --yes node-llama-cpp@3.15.0 inspect gpu
OS: Windows 10.0.26200 (x64)
Node: 24.13.0 (x64)

node-llama-cpp: 3.15.0
Prebuilt binaries: b7698

CUDA: CUDA is detected, but using it failed
To resolve errors related to CUDA, see the CUDA guide: https://node-llama-cpp.withcat.ai/guide/CUDA
Vulkan: Vulkan is detected, but using it failed
To resolve errors related to Vulkan, see the Vulkan guide: https://node-llama-cpp.withcat.ai/guide/vulkan

CPU model: 13th Gen Intel(R) Core(TM) i9-13900K
Math cores: 1.16985510423e-311
Used RAM: 39.94% (12.68GB/31.75GB)
Free RAM: 60.05% (19.07GB/31.75GB)
Used swap: 42.5% (17.1GB/40.25GB)
Max swap size: 40.25GB
mmap: supported

PS C:\Users\Administrator> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2026 NVIDIA Corporation
Built on Mon_Mar__2_21:54:11_Pacific_Standard_Time_2026
Cuda compilation tools, release 13.2, V13.2.51
Build cuda_13.2.r13.2/compiler.37434383_0

PS C:\Users\Administrator> nvidia-smi
Tue Mar 10 15:24:36 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 595.71 Driver Version: 595.71 CUDA Version: 13.2 |
+-----------------------------------------+------------------------+----------------------+
| GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4090 WDDM | 00000000:01:00.0 On | Off |
| 0% 41C P8 28W / 450W | 1890MiB / 24564MiB | 1% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

Additional Context

I saw this in the release notes:
v3.15.0
support new CUDA 13.1 archs (#538) (734693d)
build the prebuilt binaries with CUDA 13.1 instead of 13.0 (#538) (734693d)
but in practice, it does not work for me.
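As a temporary workaround until this is fixed (an assumption on my part, not a confirmed fix), pinning node-llama-cpp to the last known-good release in package.json avoids the broken 3.15.0 prebuilt binaries:

```json
{
  "dependencies": {
    "node-llama-cpp": "3.14.5"
  }
}
```

Using an exact version (no ^ range) prevents npm from resolving to a 3.15.x release on install.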

Relevant Features Used

  • Metal support
  • CUDA support
  • Vulkan support
  • Grammar
  • Function calling

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, and I know how to start.

Metadata

Assignees

Labels

bug (Something isn't working), released

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
