Fix tensor bridge DLL import failure on Windows (NVIDIA#1988)
* Fix tensor bridge DLL import failure on Windows

aoti_torch_get_current_cuda_stream lives in torch_cuda.dll, not
torch_cpu.dll. The stub import library pointed at the wrong DLL,
causing "The specified procedure could not be found" on Windows.

- Move aoti_torch_get_current_cuda_stream from aoti_shim.def
  (torch_cpu.dll) to new aoti_shim_cuda.def (torch_cuda.dll)
- Update build_hooks.py to generate stub libs for both DLLs
  via a loop
- Add torch_cuda.dll to delvewheel exclude list
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add SPDX headers to aoti_shim_cuda.def
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Resolve aoti_torch_get_current_cuda_stream lazily at runtime

The symbol lives in torch_cuda (not torch_cpu), so linking against it
at build time breaks CPU-only PyTorch installs and requires a second
stub import library on Windows.

Instead, resolve it lazily on first use via dlsym (Linux) /
LoadLibrary+GetProcAddress (Windows). The cached function pointer
keeps subsequent calls fully in C with zero Python overhead.

This reverts the two-def-file approach from the previous commit and
replaces it with a self-contained inline C helper that handles both
platforms.
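
For illustration, a minimal sketch of what such a lazy resolver could look
like. This is not the helper from this commit: the names
resolve_get_cuda_stream and current_cuda_stream_lazy are hypothetical, the
function signature is assumed from the AOTI C shim convention (int32_t
error code, 0 on success), and the Linux library name libtorch_cuda.so is
assumed. The one-time lookup is not guarded by a lock, so the sketch
assumes the first call is serialized (for example by the GIL).

```c
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <dlfcn.h>
#endif

/* Assumed signature of aoti_torch_get_current_cuda_stream. */
typedef int32_t (*get_cuda_stream_fn)(int32_t device_index, void **ret_stream);

/* Hypothetical helper: look the symbol up once and cache the result.
   Returns NULL when torch_cuda is not available (CPU-only install). */
static get_cuda_stream_fn resolve_get_cuda_stream(void)
{
    static get_cuda_stream_fn cached = NULL;
    static int attempted = 0;

    if (attempted)
        return cached;          /* fast path: no lookup after first call */
    attempted = 1;

#ifdef _WIN32
    HMODULE lib = LoadLibraryA("torch_cuda.dll");
    if (lib != NULL)
        cached = (get_cuda_stream_fn)(void *)GetProcAddress(
            lib, "aoti_torch_get_current_cuda_stream");
#else
    /* If PyTorch already loaded the library, this returns the same handle. */
    void *lib = dlopen("libtorch_cuda.so", RTLD_LAZY);
    if (lib != NULL)
        cached = (get_cuda_stream_fn)dlsym(
            lib, "aoti_torch_get_current_cuda_stream");
#endif
    return cached;
}

/* Hypothetical wrapper: -1 means the symbol could not be resolved. */
static int32_t current_cuda_stream_lazy(int32_t device_index, void **ret_stream)
{
    get_cuda_stream_fn fn = resolve_get_cuda_stream();
    if (fn == NULL)
        return -1;
    return fn(device_index, ret_stream);
}
```

Because the failure is also cached, a CPU-only install pays for the failed
lookup once and then gets an immediate error code on every later call.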
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>