[5525939] Allow user to select target opset in MOQ (#809)
## What does this PR do?
**Type of change:** new feature
**Overview:**
- Allow the user to select the target opset
- The minimum opset is determined by the quantization mode
- Add tests in `tests/unit/onnx/test_quantize_api.py`
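
The selection rule described above can be sketched as follows. This is an illustrative sketch, not the actual Model-Optimizer implementation: the function name and the per-mode minimum opsets are assumptions (per-axis INT8 Q/DQ requires opset 13, FP8 tensor types first appear in opset 19, and INT4 types in opset 21).

```python
# Illustrative sketch of the opset-selection rule; names and the per-mode
# minimums below are assumptions, not the Model-Optimizer source.
import warnings

# Assumed minimum opsets per quantization mode.
MIN_OPSET = {"int8": 13, "fp8": 19, "int4": 21}

def resolve_opset(requested, original, quantize_mode):
    """Return the target opset: never below the mode's minimum and never
    below the model's original opset; warn when the request is adjusted."""
    target = original if requested is None else requested
    minimum = MIN_OPSET[quantize_mode]
    if target < minimum:
        warnings.warn(
            f"Opset {target} is below the minimum for {quantize_mode}; "
            f"upgrading to {minimum}."
        )
        target = minimum
    if target < original:
        warnings.warn(
            f"Opset {target} is below the model's original opset; "
            f"keeping {original}."
        )
        target = original
    return target
```

For example, requesting opset 12 in `fp8` mode for a model exported at opset 17 would resolve to 19 under this rule.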
## Testing
Added unit tests in `tests/unit/onnx/test_quantize_api.py`:

```text
test_opset_below_minimum_upgrades_to_minimum[int8]   PASSED [ 11%]
test_opset_below_minimum_upgrades_to_minimum[fp8]    PASSED [ 22%]
test_opset_below_minimum_upgrades_to_minimum[int4]   PASSED [ 33%]
test_opset_below_original_uses_original[int8]        PASSED [ 44%]
test_opset_below_original_uses_original[fp8]         PASSED [ 55%]
test_opset_below_original_uses_original[int4]        PASSED [ 66%]
test_opset_above_minimum[int8]                       PASSED [ 77%]
test_opset_above_minimum[fp8]                        PASSED [ 88%]
test_opset_above_minimum[int4]                       PASSED [100%]
```
## Before your PR is "*Ready for review*"
- **Make sure you read and follow the [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and that your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: Yes
- **Did you add or update any necessary documentation?**: Yes, the docs auto-update from the argparse help text.
- **Did you update the [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes
## Additional Information
Requested as a workaround (WAR) for a Windows onnxruntime issue tracked in 5525939; regardless, it is a useful feature in its own right.
## Summary by CodeRabbit
* **New Features**
  * Added an `--opset` CLI option enabling users to specify the target ONNX opset version when quantizing models.
  * Automatic validation ensures the opset version is compatible with quantization requirements, with warnings when adjustments are made.
* **Tests**
  * Added comprehensive test coverage for opset version handling across quantization workflows.
---------
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
Signed-off-by: Gal Hubara-Agam <96368689+galagam@users.noreply.github.com>
Co-authored-by: Gwena Cunha <4861122+gcunhase@users.noreply.github.com>
**CHANGELOG.rst** (1 addition, 0 deletions)

```diff
@@ -14,6 +14,7 @@ NVIDIA Model Optimizer Changelog (Linux)
 - Add support for Kimi K2 Thinking model quantization from the original int4 checkpoint.
 - Add support for ``params`` constraint based automatic neural architecture search in Minitron pruning (``mcore_minitron``) as an alternative to manual pruning (using ``export_config``). See `examples/pruning/README.md <https://github.com/NVIDIA/Model-Optimizer/tree/main/examples/pruning>`_ for more details on its usage.
 - Add support for calibration data with multiple samples in ``npz`` format in the ONNX Autocast workflow.
+- Add ``--opset`` option to ONNX quantization CLI to specify the target opset version for the quantized model.
```