
Commit 9bb917d

[BUG6108338] Update windows documentation for onnxruntime quantization with Cuda13.x (#1368)
### What does this PR do?

Type of change: documentation

<!-- Details about the change. -->

Update windows documentation for onnxruntime quantization with Cuda13.x

### Before your PR is "*Ready for review*"

Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md) and your commits are signed (`git commit -s -S`).

Make sure you read and follow the [Security Best Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors) (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).

- Is this change backward compatible?: N/A <!--- If ❌, explain why. -->
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in `CONTRIBUTING.md`: N/A <!--- Mandatory -->
- Did you write any new necessary tests?: N/A <!--- Mandatory for new features or examples. -->
- Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?: N/A <!--- Only for new features, API changes, critical bug fixes or backward incompatible changes. -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **Documentation**
  * Updated Windows installation guide with CUDA 13.x-specific setup instructions for GPU-accelerated dependencies, including CuPy and ONNX Runtime configuration with nightly builds.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: ynankani <ynankani@nvidia.com>
1 parent 077e29a commit 9bb917d

1 file changed

Lines changed: 16 additions & 0 deletions


docs/source/getting_started/windows/_installation_standalone.rst

@@ -64,6 +64,22 @@ If you need to use any other EP for calibration, you can uninstall the existing
 
 By default, ModelOpt-Windows utilizes the `cupy-cuda12x <https://cupy.dev//>`_ tool for GPU acceleration during the INT4 ONNX quantization process. This is compatible with CUDA 12.x.
 
+If you are using CUDA 13.x, update CUDA-dependent packages manually:
+
+For official ONNX Runtime guidance, see `Nightly builds for CUDA 13.x <https://onnxruntime.ai/docs/install/#nightly-for-cuda-13x>`_.
+
+1. Uninstall ``cupy-cuda12x`` and install ``cupy-cuda13x``.
+2. Uninstall ``onnxruntime-genai-cuda`` and ``onnxruntime-gpu``.
+3. Install ONNX Runtime CUDA 13 nightly and the pre-release ``onnxruntime-genai-cuda`` package.
+
+.. code-block:: bash
+
+    pip uninstall -y cupy-cuda12x onnxruntime-genai-cuda onnxruntime-gpu
+    pip install cupy-cuda13x
+    pip install coloredlogs flatbuffers numpy packaging protobuf sympy
+    pip install --pre --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ort-cuda-13-nightly/pypi/simple/ onnxruntime-gpu
+    pip install --pre onnxruntime-genai-cuda
+
 **6. Verify Installation**
 
 Ensure the following steps are verified:
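
Separately from the checklist in the documentation, the CUDA 13.x package swap added by this change can be sanity-checked from Python. The following is a minimal sketch, not part of the commit: the helper names `installed_version` and `check_cuda13_packages` are hypothetical, and the package names are copied from the pip commands in the hunk. It only uses the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Distribution names copied from the pip commands in this change.
CUDA13_PACKAGES = ("cupy-cuda13x", "onnxruntime-gpu", "onnxruntime-genai-cuda")
# Package that the change says to uninstall when moving to CUDA 13.x.
CUDA12_LEFTOVERS = ("cupy-cuda12x",)


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_cuda13_packages():
    """Hypothetical helper: list expected packages that are missing and
    CUDA 12.x packages that are still installed."""
    missing = [p for p in CUDA13_PACKAGES if installed_version(p) is None]
    leftover = [p for p in CUDA12_LEFTOVERS if installed_version(p) is not None]
    return {"missing": missing, "leftover": leftover}


if __name__ == "__main__":
    result = check_cuda13_packages()
    for name in CUDA13_PACKAGES:
        print(f"{name}: {installed_version(name) or 'not installed'}")
    if result["leftover"]:
        print("Still installed (CUDA 12.x):", ", ".join(result["leftover"]))
```

Note that `onnxruntime-gpu` uses the same distribution name for CUDA 12.x and CUDA 13.x builds, so this check confirms only that a package is present, not which CUDA version it was built against.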
