Commit f99d83e
[Auto-23/24][ONNX][Autocast] Clear stale Cast-output type metadata before ORT InferenceSession load (#1565)
### What does this PR do?
Type of change: Bug fix
**Error:**
```
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Type Error: Type (tensor(float16)) of output arg (node_5bc985fa) of node (node_5bc985fa) does not match expected type (tensor(float)).
```
**Root cause:**
- Some ONNX exporters emit `graph.output` / `value_info` entries whose
dtype disagrees with the upstream `Cast` node's `to` attribute.
- ORT's type checker rejects such models on session load.
**Fix:**
- New helper `modelopt.onnx.utils.clear_stale_value_info()` reconciles
each `graph.output` elem_type to its producing Cast's `to`, then clears
`value_info` so ORT recomputes intermediate types.
- Called from `autocast/referencerunner.py` and
`quantization/quantize.py::_preprocess_onnx`.
### Usage
```python
# Internal fix; no new flag introduced. Generic CLI to exercise the affected Autocast path:
$ python -m modelopt.onnx.autocast --onnx=model.onnx
```
### Testing
```
pytest tests/unit/onnx/test_onnx_utils.py::test_clear_stale_value_info
```
### Before your PR is "*Ready for review*"
Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)
and your commits are signed (`git commit -s -S`).
Make sure you read and follow the [Security Best
Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors)
(e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(...,
weights_only=False)`, `pickle`, etc.).
- Is this change backward compatible?: ✅
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: N/A
- Did you write any new necessary tests?: ✅
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
❌
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Automatic cleaning and reconciliation of stale ONNX type metadata
before runtime and quantization; reconciled model files are produced and
used when inconsistencies are found.
* **Tests**
* New unit tests covering metadata-cleaning behavior across cast/type
scenarios to ensure correctness and prevent regressions.
<!-- review_stack_entry_start -->
[](https://app.coderabbit.ai/change-stack/NVIDIA/Model-Optimizer/pull/1565?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)
<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
Co-authored-by: modelopt-fix-agent-bot (Claude Opus 4.7) <noreply@anthropic.com>1 parent dd8314b commit f99d83e
4 files changed
Lines changed: 79 additions & 0 deletions
File tree
- modelopt/onnx
- autocast
- quantization
- tests/unit/onnx
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
295 | 295 | | |
296 | 296 | | |
297 | 297 | | |
| 298 | + | |
| 299 | + | |
298 | 300 | | |
299 | 301 | | |
300 | 302 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
73 | 73 | | |
74 | 74 | | |
75 | 75 | | |
| 76 | + | |
76 | 77 | | |
77 | 78 | | |
78 | 79 | | |
| |||
118 | 119 | | |
119 | 120 | | |
120 | 121 | | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
121 | 129 | | |
122 | 130 | | |
123 | 131 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1860 | 1860 | | |
1861 | 1861 | | |
1862 | 1862 | | |
| 1863 | + | |
| 1864 | + | |
| 1865 | + | |
| 1866 | + | |
| 1867 | + | |
| 1868 | + | |
| 1869 | + | |
| 1870 | + | |
| 1871 | + | |
| 1872 | + | |
| 1873 | + | |
| 1874 | + | |
| 1875 | + | |
| 1876 | + | |
| 1877 | + | |
| 1878 | + | |
| 1879 | + | |
| 1880 | + | |
| 1881 | + | |
| 1882 | + | |
| 1883 | + | |
| 1884 | + | |
| 1885 | + | |
| 1886 | + | |
| 1887 | + | |
| 1888 | + | |
| 1889 | + | |
| 1890 | + | |
| 1891 | + | |
| 1892 | + | |
| 1893 | + | |
| 1894 | + | |
| 1895 | + | |
| 1896 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
| 33 | + | |
33 | 34 | | |
34 | 35 | | |
35 | 36 | | |
| |||
329 | 330 | | |
330 | 331 | | |
331 | 332 | | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
| 345 | + | |
| 346 | + | |
| 347 | + | |
| 348 | + | |
| 349 | + | |
| 350 | + | |
| 351 | + | |
| 352 | + | |
| 353 | + | |
| 354 | + | |
| 355 | + | |
| 356 | + | |
| 357 | + | |
| 358 | + | |
| 359 | + | |
| 360 | + | |
| 361 | + | |
| 362 | + | |
| 363 | + | |
| 364 | + | |
| 365 | + | |
| 366 | + | |
0 commit comments