[BUG]: Failed to load CuPy for GPU acceleration due to CUDA version mismatch #9312

@kreeuwijk

Description

Describe the Bug

The following message is logged at pod startup:

dynamo.nixl_connect: Failed to load CuPy for GPU acceleration, utilizing numpy to provide CPU based operations.

Steps to Reproduce

  1. Deploy a workload with the vllm-runtime:1.1.0-cuda13 container
  2. Notice the pod startup logs containing Failed to load CuPy for GPU acceleration
  3. Shell into the pod and run pip list | grep cupy
  4. Notice that cupy-cuda12x is installed instead of cupy-cuda13x

Expected Behavior

CuPy should be able to initialize.

Actual Behavior

Due to a CUDA version mismatch, CuPy is unable to initialize.
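
The log line above reports only that CuPy failed to load, not the underlying exception. A minimal sketch for surfacing that exception from inside the pod might look like the following. The helper name `probe_import` is hypothetical and is not part of dynamo or CuPy. The report does not say whether the failure occurs at import time or on the first GPU call, so this only covers the import-time case.

```python
import importlib


def probe_import(module_name):
    """Try to import a module.

    Returns (module, None) on success or (None, exception) on failure.
    Hypothetical helper for diagnosis only.
    """
    try:
        return importlib.import_module(module_name), None
    except Exception as exc:
        # ImportError is the usual case, but native-library loading
        # can raise other exception types, so catch broadly here.
        return None, exc


if __name__ == "__main__":
    module, error = probe_import("cupy")
    if error is not None:
        print(f"cupy import failed: {type(error).__name__}: {error}")
    else:
        print(f"cupy {module.__version__} imported")
```

If the import succeeds, the mismatch may only show up once a CUDA call is made, which this sketch does not exercise.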

Environment

Container image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.1.0-cuda13
GPU driver: 580.126.20
CUDA version of GPU driver: 13.0

However, the CuPy package installed in the vllm-runtime container is built for CUDA 12:

dynamo@qwen-3-32b-disagg-mn-0-vllmdecodeworker-0:/workspace/examples/backends/vllm$ pip list | grep cupy
cupy-cuda12x                             14.0.1
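
The manual `pip list | grep cupy` check can also be sketched as a small script that compares the CUDA major version encoded in the installed CuPy wheel name with the one expected for the image. This is an illustrative sketch, not part of the container: the function names are hypothetical, the expected value 13 is taken from the `cuda13` image tag in this report, and only wheel names of the form `cupy-cudaNNx` are recognized.

```python
import re
from importlib import metadata


def cupy_wheel_cuda_major(dist_name):
    """Return the CUDA major version in a CuPy wheel name.

    'cupy-cuda12x' -> 12; names that do not match return None.
    """
    normalized = dist_name.lower().replace("_", "-")
    match = re.fullmatch(r"cupy-cuda(\d+)x", normalized)
    return int(match.group(1)) if match else None


def find_cupy_mismatches(dist_names, expected_cuda_major):
    """List CuPy wheels whose CUDA major version differs from the expected one."""
    mismatches = []
    for name in dist_names:
        major = cupy_wheel_cuda_major(name)
        if major is not None and major != expected_cuda_major:
            mismatches.append(name)
    return mismatches


def installed_distribution_names():
    """Collect the names of all distributions visible to this interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            names.append(name)
    return names


if __name__ == "__main__":
    # Assumption: CUDA 13 is expected because the image tag ends in -cuda13.
    print(find_cupy_mismatches(installed_distribution_names(), 13))
```

In the environment described above, where `cupy-cuda12x` is installed, the returned list would contain that package name.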

Additional Context

No response

Screenshots

No response
