Commit 2a9f431
Add Nemotron parse PTQ support (#786)
## What does this PR do?
**Type of change:** New model support <!-- Use one of the following: Bug
fix, new feature, new example, new tests, documentation. -->
**Overview:** Add post-training quantization (PTQ) support for
[NVIDIA-Nemotron-Parse-v1.1](https://huggingface.co/nvidia/NVIDIA-Nemotron-Parse-v1.1).
## Usage
<!-- You can potentially add a usage example below. -->
```bash
python3 hf_ptq.py \
  --pyt_ckpt_path /home/omniml_data_3/models/NVIDIA-Nemotron-Parse-v1.1 \
  --qformat fp8 \
  --export_path /home/omniml_data_3/zhiyuc/checkpoints/NVIDIA-Nemotron-Parse-v1.1-FP8 \
  --trust_remote_code \
  --kv_cache_qformat none \
  --attn_implementation eager
```
By default, image-text data will be used in calibration for VLMs.
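For scripted or batch runs, the invocation above can also be assembled in Python. This is a minimal sketch, assuming `hf_ptq.py` is in the working directory; the helper name `build_ptq_command` and the `/path/to/...` placeholders are hypothetical, while the flags are the ones shown in the usage command.

```python
import shlex


def build_ptq_command(ckpt_path, export_path, qformat="fp8"):
    """Return the argv list for the hf_ptq.py invocation shown in the usage section."""
    return [
        "python3", "hf_ptq.py",
        "--pyt_ckpt_path", ckpt_path,
        "--qformat", qformat,
        "--export_path", export_path,
        "--trust_remote_code",
        "--kv_cache_qformat", "none",
        "--attn_implementation", "eager",
    ]


# Placeholder paths (assumptions); substitute your own checkpoint locations.
cmd = build_ptq_command(
    "/path/to/NVIDIA-Nemotron-Parse-v1.1",
    "/path/to/NVIDIA-Nemotron-Parse-v1.1-FP8",
)

# Print the shell-quoted command for inspection.
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell-quoting problems if a path contains spaces.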
## Testing
<!-- Mention how have you tested your change if applicable. -->
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes <!--- If No, explain why.
-->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Not yet <!--- Only for new features, API changes, critical bug fixes or
bw breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * Added support for Nemotron-Parse multimodal models, including proper device mapping, processor loading, and generation handling.
* **Improvements**
  * Enhanced quantization robustness with safer handling of quantization attributes and fallback logic.
  * Improved model loading with better device placement and encoder buffer management for vision-language models.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
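The summary does not spell out what "safer handling of quantization attributes and fallback logic" looks like in the code. One common pattern that fits the description is guarded attribute access with a default. The sketch below illustrates that pattern only; it is not the code from this PR, and `read_quantizer_attr`, `weight_quantizer`, and `num_bits` are illustrative names used on stand-in objects.

```python
from types import SimpleNamespace


def read_quantizer_attr(module, quantizer_name, attr, default=None):
    """Return module.<quantizer_name>.<attr>, or `default` if either is missing."""
    quantizer = getattr(module, quantizer_name, None)
    if quantizer is None:
        # The module carries no such quantizer: fall back instead of raising.
        return default
    return getattr(quantizer, attr, default)


# Stand-in objects for illustration only; these are not ModelOpt classes.
quantized_layer = SimpleNamespace(weight_quantizer=SimpleNamespace(num_bits=8))
plain_layer = SimpleNamespace()

print(read_quantizer_attr(quantized_layer, "weight_quantizer", "num_bits"))
print(read_quantizer_attr(plain_layer, "weight_quantizer", "num_bits", default=16))
```

The point of the pattern is that layers without a quantizer (for example, parts of a vision encoder left unquantized) yield a default value rather than an `AttributeError`.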
---------
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>

1 parent d7f62d3 commit 2a9f431
6 files changed, +145 -73 lines changed
- examples/llm_ptq
- modelopt/torch/export