### What does this PR do?
Type of change: new feature
Upgrade the ONNX dependency from `~=1.19.0` to `~=1.21.0`. ONNX 1.20+ removed several deprecated helper functions (`float32_to_bfloat16`, `float32_to_float8e4m3`, `pack_float32_to_4bit`) that `onnx_graphsurgeon` 0.5.x still references at import time. This PR adds a compatibility shim (`modelopt/onnx/_onnx_compat.py`) that restores these functions using `ml_dtypes` before any `onnx_graphsurgeon` import occurs. It supersedes the partial inline fix from #1204 by also handling `float32_to_float8e4m3`.
Changes:
- Bump `onnx~=1.19.0` to `onnx~=1.21.0` in `pyproject.toml`
- Add `modelopt/onnx/_onnx_compat.py` compatibility shim for removed
ONNX APIs
- Import shim in `modelopt/onnx/__init__.py` and
`tests/unit/onnx/conftest.py`
- Remove usage of removed `onnx.helper.pack_float32_to_4bit` in
`test_quant_utils.py`
- Update example requirements (`genai_llm`, `whisper`) to `onnx==1.21.0`
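To illustrate the shim approach, here is a minimal, self-contained sketch of restoring two of the removed helpers. This is not the actual `_onnx_compat.py` code: the real shim uses `ml_dtypes`, while this sketch uses pure-NumPy bit manipulation so it runs without extra dependencies, and the nibble ordering in `pack_float32_to_4bit` is an assumption.

```python
# Hypothetical sketch of an ONNX compatibility shim (NOT the actual
# modelopt/onnx/_onnx_compat.py implementation; signatures are assumptions).
import numpy as np

def float32_to_bfloat16(fval: float) -> int:
    """Return the 16-bit encoding of fval rounded to bfloat16
    (round-to-nearest-even on the truncated mantissa bits)."""
    bits = int(np.asarray(fval, dtype=np.float32).view(np.uint32))
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF

def pack_float32_to_4bit(values, signed: bool = True) -> bytes:
    """Clip floats to the 4-bit integer range and pack two nibbles per
    byte (low nibble first -- ordering here is an assumption)."""
    lo, hi = (-8, 7) if signed else (0, 15)
    ints = [int(np.clip(round(v), lo, hi)) & 0xF for v in values]
    if len(ints) % 2:
        ints.append(0)  # pad odd-length input with a zero nibble
    return bytes((ints[i + 1] << 4) | ints[i] for i in range(0, len(ints), 2))

# Attach the helpers only when the installed ONNX no longer provides
# them, so older ONNX versions keep their original implementations.
try:
    import onnx.helper
    for _name, _fn in (("float32_to_bfloat16", float32_to_bfloat16),
                       ("pack_float32_to_4bit", pack_float32_to_4bit)):
        if not hasattr(onnx.helper, _name):
            setattr(onnx.helper, _name, _fn)
except ImportError:
    pass  # onnx not installed; nothing to patch
```

The `hasattr` guard is the key design point: the shim is a no-op on ONNX < 1.20, so the same code path works across the supported version range.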
**TensorRT Compatibility:** TRT 10.16-GA supports opsets 9–24. ModelOpt quantization modes use opsets 19–23, all within range. ONNX 1.21 does not force opset 26.
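The compatibility claim above reduces to a simple range check; the ranges below are taken from this paragraph, not re-verified against TRT documentation:

```python
# Opset ranges as stated above (assumed, not independently verified).
TRT_SUPPORTED_OPSETS = range(9, 25)      # TRT 10.16-GA: opsets 9-24 inclusive
MODELOPT_EMITTED_OPSETS = range(19, 24)  # quantization modes emit opsets 19-23

# Every opset ModelOpt emits must fall within what TRT accepts.
assert all(v in TRT_SUPPORTED_OPSETS for v in MODELOPT_EMITTED_OPSETS)
```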
### Usage
```python
# No API changes — the upgrade is transparent to users.
# The compatibility shim is applied automatically on import.
import modelopt.onnx
```
### Testing
- 469/470 ONNX unit tests pass inside
`nvcr.io/nvidia/tensorrt:25.06-py3` (1 pre-existing ORT
`CopyTensorAsync` EP issue, not ONNX-related)
- 6/6 `torch_onnx` integration tests pass (fp8, int8, nvfp4, mxfp8,
int4_awq, auto)
- ViT FP8 quantization via `torch_onnx` → TRT engine build → ImageNet
eval: **85.3% top-1, 97.8% top-5**
- ViT FP8 quantization via `onnx_ptq` → TRT engine build succeeds
- All pre-commit hooks pass (ruff, mypy, bandit, license headers)
### Before your PR is "*Ready for review*"
Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)
and your commits are signed (`git commit -s -S`).
Make sure you read and follow the [Security Best
Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors)
(e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(...,
weights_only=False)`, `pickle`, etc.).
- Is this change backward compatible?: ✅
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: ✅
- Did you write any new necessary tests?: ✅ (updated existing tests,
added conftest.py for compat shim)
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
❌ (dependency upgrade, no API change)
### Additional Information
Related: #1204 (partial fix for `float32_to_bfloat16` only — this PR
supersedes it with full coverage)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Dependencies**
* Removed unpinned ONNX from example requirement files and updated the
ONNX optional dependency to ~=1.21.0.
* **Refactor**
* Centralized an ONNX compatibility shim to restore missing helper APIs
when needed.
* **Tests**
* Added tests for the compatibility shim, adjusted quantization tests to
remove reliance on removed ONNX helpers, and ensured shim runs before
related tests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>