Fix: Native CUDA 13 support and dynamic CUDA_HOME fetching for CuPy #136
wzgrx wants to merge 10 commits into
And... it is still not fixed or updated in the main release. I also posted this exact fix in the issues section. @wzgrx

To this:

That will update film_net_fp32.pt to v1.0.2 and get rid of this warning that we always get with v1.0.0:

Here are the details for the updated model:

v1.0.2 (Latest): Newer torch version, improved support for dt argument
Includes minor fixes:
Fix UserWarning: Using padding='same' with even kernel lengths and odd dilation may require a zero-padded copy of the input be created

https://github.com/dajes/frame-interpolation-pytorch/releases
Fix: Native CUDA 13 support and dynamic CUDA_HOME fetching for CuPy
Description
Motivation & Context
With the release of the new RTX 50-series GPUs and the adoption of CUDA 13.x, users are experiencing installation and runtime failures with CuPy. Previously, the installation script either attempted a risky fallback to CUDA 12.x or forced a source compilation that often failed. Additionally, CuPy's JIT compiler (RawModule) frequently failed to locate the necessary CUDA headers in CUDA 13 environments when CUDA_HOME was not explicitly set.
Since CuPy now officially provides cupy-cuda13x wheels, this PR updates the installation logic and JIT compiler paths to natively support CUDA 13.
Changes Included:
install.py:
Updated get_cuda_ver_from_dir to correctly detect CUDA 13 and return '13x', so that the installer fetches the official cupy-cuda13x wheel instead of falling back to 12.x.
Added cupy-cuda13x to the pip uninstall cleanup list to prevent environment conflicts.
requirements-with-cupy.txt:
Replaced the generic cupy-wheel (which acts as an empty shell) with cupy-cuda13x to streamline manual installations on modern setups.
vfi_models/ops/cupy_ops/utils.py:
Enhanced cuda_launch to dynamically fetch and inject CUDA_HOME and CUDA_PATH using get_cuda_home_path(). This ensures that CuPy's NVRTC compiler can successfully locate the PyTorch-bundled or system CUDA components, preventing RawModule runtime crashes on unconfigured environments.
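The two ideas above (mapping a detected CUDA version to a wheel name, and injecting CUDA_HOME / CUDA_PATH before a kernel is compiled) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `cupy_wheel_for` and `inject_cuda_env` are hypothetical names, and the real `get_cuda_ver_from_dir` and `get_cuda_home_path()` may work differently. In particular, the sketch only fills in variables that are unset, which is an assumption about the intended behavior.

```python
import os
import re
from typing import MutableMapping, Optional


def cupy_wheel_for(cuda_version: str) -> Optional[str]:
    """Map a CUDA version string such as '13.0' to a CuPy wheel name.

    Hypothetical helper: returns None when no matching wheel is known,
    so the caller can decide how to handle an unsupported toolkit.
    """
    match = re.match(r"\s*(\d+)", cuda_version)
    if match is None:
        return None
    major = int(match.group(1))
    if major in (11, 12, 13):
        return f"cupy-cuda{major}x"
    return None


def inject_cuda_env(
    cuda_home: Optional[str],
    env: MutableMapping[str, str] = os.environ,
) -> bool:
    """Set CUDA_HOME and CUDA_PATH to cuda_home when they are not already set.

    Hypothetical helper: existing values are left untouched (an assumption),
    and nothing is written if cuda_home is missing or not a directory.
    Returns True when cuda_home was usable.
    """
    if not cuda_home or not os.path.isdir(cuda_home):
        return False
    for name in ("CUDA_HOME", "CUDA_PATH"):
        env.setdefault(name, cuda_home)
    return True
```

In a launcher like cuda_launch, a call such as `inject_cuda_env(get_cuda_home_path())` would presumably run before the first RawModule is compiled, so that the environment is in place when NVRTC looks for headers.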
How to Test
Run install.py on a machine with CUDA 13.x (e.g., equipped with an RTX 5090).
Verify that cupy-cuda13x is installed automatically without triggering a source build.
Run a frame interpolation node requiring CuPy (e.g., RIFE) and verify that the CUDA kernels compile and launch successfully without missing header errors.
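For the second step, one way to check which CuPy package ended up installed is to list the CuPy distributions in the active environment. The helper below is a small sketch using only the standard library; `installed_cupy_distributions` is a hypothetical name and is not part of this PR.

```python
from importlib import metadata
from typing import Dict


def installed_cupy_distributions() -> Dict[str, str]:
    """Return {distribution name: version} for every installed CuPy package."""
    found: Dict[str, str] = {}
    for dist in metadata.distributions():
        # Normalize the name so 'cupy_cuda13x' and 'cupy-cuda13x' compare equal.
        name = (dist.metadata.get("Name") or "").lower().replace("_", "-")
        if name == "cupy" or name.startswith("cupy-"):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    found = installed_cupy_distributions()
    print(found or "no CuPy distribution installed")
    if len(found) > 1:
        # Several CuPy wheels in one environment can conflict with each other.
        print("warning: more than one CuPy package is installed")
```

On a CUDA 13.x machine, the expectation after running install.py would be a single cupy-cuda13x entry.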