Describe the Bug
The following message is logged at pod startup:
dynamo.nixl_connect: Failed to load CuPy for GPU acceleration, utilizing numpy to provide CPU based operations.
Steps to Reproduce
- Deploy a workload with the vllm-runtime:1.1.0-cuda13 container
- Notice the pod startup logs containing: Failed to load CuPy for GPU acceleration
- Shell into the pod and run: pip list | grep cupy
- Notice that cupy-cuda12x is installed instead of cupy-cuda13x
Expected Behavior
CuPy should be able to initialize.
Actual Behavior
Due to a CUDA version mismatch, CuPy is unable to initialize, and dynamo.nixl_connect falls back to NumPy on the CPU.
Environment
Container image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.1.0-cuda13
GPU driver: 580.126.20
CUDA version of GPU driver: 13.0
However, the CuPy wheel installed in the vllm-runtime container targets CUDA 12:
dynamo@qwen-3-32b-disagg-mn-0-vllmdecodeworker-0:/workspace/examples/backends/vllm$ pip list | grep cupy
cupy-cuda12x 14.0.1
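The manual pip check above could also be scripted. The following is a minimal sketch, not part of Dynamo: the helper names mismatched_cupy_wheels and installed_distribution_names are hypothetical, and it assumes CuPy wheels follow the cupy-cudaNNx naming seen in this report (older wheel names without the trailing x are not matched).

```python
import re
from importlib import metadata

# Matches wheel names such as "cupy-cuda12x" or "cupy-cuda13x".
_CUPY_WHEEL = re.compile(r"^cupy-cuda(\d+)x$")


def mismatched_cupy_wheels(dist_names, cuda_major):
    """Return CuPy wheel names whose CUDA major version differs from cuda_major.

    Hypothetical helper for illustration only.
    """
    mismatched = []
    for name in dist_names:
        # Normalize underscores and case before matching the wheel name.
        match = _CUPY_WHEEL.match(name.lower().replace("_", "-"))
        if match and int(match.group(1)) != cuda_major:
            mismatched.append(name)
    return mismatched


def installed_distribution_names():
    """List the names of distributions installed in the current environment."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            names.append(name)
    return names
```

Run inside the container, mismatched_cupy_wheels(installed_distribution_names(), cuda_major=13) would be expected to return the cupy-cuda12x wheel reported here, and an empty list once a CUDA 13 wheel is installed instead.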
Additional Context
No response
Screenshots
No response