Commit 4c399af

Added fallback to preload cudnn dlls from nvidia cudnn venv package or torch venv package (#1135)
### What does this PR do?

Type of change: Bug fix

While testing the modelopt 0.43 release, the QA team pointed out that we could install the nvidia-cudnn PyPI packages and call ort.preload_dlls() to load the DLLs from the Python venv instead of searching only the system path.

Here is the info about the onnxruntime.preload_dlls() function:

<img width="1478" height="414" alt="image" src="https://github.com/user-attachments/assets/e43ecbe3-ba52-4dd8-b2a2-e825d013205b" />

This change adds preload_dlls() as a fallback to the system-path cuDNN search. An exception is raised only if the preload also fails.

### Testing

Tested quantization by installing the nvidia-cudnn-cu12 package and removing the cuDNN DLLs from the system path. Working as expected.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **Bug Fixes**
  * Improved startup handling when CUDA/cuDNN libraries are missing: the app now attempts a conditional preload from installed Python packages (when supported), logs captured preload output for diagnostics, warns on preload errors, and only raises an error if preload ultimately fails.
* **Documentation**
  * Error messages now better explain missing-library issues, note platform/version considerations, and recommend installing a cuDNN pip package (e.g., nvidia-cudnn-cu12) or setting the appropriate environment variable.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Hrishith Thadicherla <hthadicherla@nvidia.com>
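The fallback order described in this message can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code from the commit: `ensure_cudnn`, `find_on_system_path`, and `preload_from_site_packages` are hypothetical names, with the two callables standing in for the environment-variable search and for a preload from pip packages.

```python
from collections.abc import Callable


def ensure_cudnn(
    find_on_system_path: Callable[[], bool],
    preload_from_site_packages: Callable[[], bool],
) -> str:
    """Return which strategy located cuDNN, or raise if both fail.

    Both callables are hypothetical stand-ins: the first for the
    PATH / LD_LIBRARY_PATH search, the second for a preload from
    pip-installed packages.
    """
    if find_on_system_path():
        return "system-path"
    # Only try the second strategy when the first one found nothing.
    if preload_from_site_packages():
        return "site-packages"
    raise FileNotFoundError("cuDNN not found on the system path or in site-packages")
```

Passing the strategies in as callables keeps the ordering logic testable without a GPU or any cuDNN installation.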
1 parent f1beaba commit 4c399af

1 file changed

Lines changed: 50 additions & 4 deletions

modelopt/onnx/quantization/ort_utils.py

@@ -16,9 +16,12 @@
 """Provides basic ORT inference utils, should be replaced by modelopt.torch.ort_client."""
 
 import glob
+import io
 import os
 import platform
+import sys
 from collections.abc import Sequence
+from contextlib import redirect_stderr, redirect_stdout
 
 import onnxruntime as ort
 from onnxruntime.quantization.operators.qdq_base_operator import QDQOperatorBase
@@ -70,11 +73,54 @@ def _check_for_libcudnn():
             f" for your ORT version at https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements."
         )
     else:
-        logger.error(f"cuDNN library not found in {env_variable}")
+        # Fallback: ORT >=1.20 ships a preload_dlls() helper that loads CUDA/cuDNN
+        # DLLs bundled inside pip packages (e.g. nvidia-cudnn-cu12) so they don't
+        # need to be on the system PATH / LD_LIBRARY_PATH.
+        # However, preload_dlls() is broken on Python 3.10 (missing os.add_dll_directory
+        # behaviour), so we skip it for that version.
+        if hasattr(ort, "preload_dlls") and sys.version_info[:2] != (3, 10):
+            logger.warning(
+                f"cuDNN not found in {env_variable}. "
+                "Attempting onnxruntime.preload_dlls() to load from site-packages..."
+            )
+            # preload_dlls() does not raise on failure — it silently prints
+            # "Failed to load ..." messages. Capture its output and check
+            # whether the key cuDNN DLL actually loaded.
+            cudnn_dll = "cudnn" if platform.system() == "Windows" else "libcudnn_adv"
+            captured = io.StringIO()
+            try:
+                with redirect_stdout(captured), redirect_stderr(captured):
+                    ort.preload_dlls()
+            except Exception as e:
+                logger.warning(f"onnxruntime.preload_dlls() raised an exception: {e}")
+
+            preload_output = captured.getvalue()
+            if preload_output:
+                logger.debug(f"preload_dlls() output:\n{preload_output}")
+
+            if f"Failed to load {cudnn_dll}" in preload_output:
+                logger.error(
+                    f"onnxruntime.preload_dlls() was called but {cudnn_dll} failed to load. "
+                    "cuDNN DLLs were NOT successfully loaded from site-packages."
+                )
+            else:
+                logger.info(
+                    "onnxruntime.preload_dlls() succeeded — CUDA/cuDNN DLLs loaded"
+                    " from site-packages. Verify version compatibility at"
+                    " https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements."
+                )
+                return True
+
         raise FileNotFoundError(
-            f"{lib_pattern} is not accessible in {env_variable}! Please make sure that the path to that library"
-            f" is in the env var to use the CUDA or TensorRT EP and ensure that the correct version is available."
-            f" Versioning compatibility can be checked at https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements."
+            f"{lib_pattern} is not accessible via {env_variable} or site-packages.\n"
+            f"To fix this, either:\n"
+            f"  1. Add the directory containing {lib_pattern} to your"
+            f" {env_variable} env var, or\n"
+            f"  2. Install the cuDNN pip package (Python>=3.11 only):"
+            f" pip install nvidia-cudnn-cu12 (or nvidia-cudnn-cu13)\n"
+            f"This is required for the CUDA / TensorRT execution provider.\n"
+            f"Check version compatibility at"
+            f" https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements."
         )
     return found
 
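The diff detects failure by capturing what `preload_dlls()` prints and searching it for a "Failed to load" message. The sketch below isolates that capture-and-inspect pattern so it can run without onnxruntime. `run_and_capture`, `preload_succeeded`, and `fake_preload_missing` are hypothetical names introduced for illustration, and the printed message format is an assumption modelled on the comment in the diff.

```python
from __future__ import annotations

import io
from contextlib import redirect_stderr, redirect_stdout


def run_and_capture(func) -> tuple[str, Exception | None]:
    """Call func(), returning everything it printed and any exception it raised."""
    buffer = io.StringIO()
    error = None
    try:
        with redirect_stdout(buffer), redirect_stderr(buffer):
            func()
    except Exception as exc:  # report the failure instead of propagating it
        error = exc
    return buffer.getvalue(), error


def preload_succeeded(func, library: str) -> bool:
    """Treat an exception or a 'Failed to load <library>' message as failure."""
    output, error = run_and_capture(func)
    return error is None and f"Failed to load {library}" not in output


def fake_preload_missing() -> None:
    # Hypothetical stand-in that mimics a loader reporting failure via print().
    print("Failed to load cudnn64_9.dll")
```

Two caveats apply. `redirect_stdout` and `redirect_stderr` only capture writes that go through Python's `sys.stdout` and `sys.stderr`, so messages written directly to the file descriptors by native code would not appear in the buffer. Also, the diff only logs a warning when `preload_dlls()` raises, whereas this sketch counts an exception as a failure. That is a design choice of the example, not something the commit does.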