Commit 9adc057
Track global_amax for weight FP4 MSE sweep; Refactor to NVFP4StaticQuantizer, NVFP4MSECalibrator (#849)
## Summary by CodeRabbit
## Release Notes
* **New Features**
  * Added NVFP4StaticQuantizer for improved 4-bit quantization with enhanced precision control
  * Introduced NVFP4MSECalibrator with flexible candidate generation for calibration optimization
* **Improvements**
  * Optimized GPU kernels for Hopper and newer GPUs
  * Extended Triton support to a broader range of GPUs
  * Enhanced backward compatibility for restoring previously quantized models
* **Tests**
  * Added comprehensive test coverage for new quantizers and calibration methods
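The diff content is not reproduced on this page, so the following is only a minimal sketch of what an MSE sweep over amax candidates for FP4 weights can look like; it is not the code from this commit. It assumes the standard E2M1 value grid (largest magnitude 6.0) and keeps scales in full precision. The helper names `fake_quant_fp4` and `mse_sweep`, the candidate fractions, and the way `global_amax` is computed are all hypothetical and do not reflect the NVFP4MSECalibrator API.

```python
# Hypothetical sketch: an MSE sweep over amax candidates for one FP4 block.
# E2M1 (FP4) representable magnitudes; the largest is 6.0.
FP4_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
FP4_MAX = 6.0


def fake_quant_fp4(values, amax):
    """Quantize-dequantize values onto the FP4 grid, mapping amax to 6.0."""
    if amax <= 0.0:
        return [0.0 for _ in values]
    scale = amax / FP4_MAX  # kept in full precision (assumption)
    out = []
    for v in values:
        mag = min(abs(v) / scale, FP4_MAX)  # clip to the representable range
        nearest = min(FP4_GRID, key=lambda g: abs(g - mag))
        out.append((nearest if v >= 0 else -nearest) * scale)
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mse_sweep(block, fractions=(0.5, 0.6, 0.7, 0.8, 0.9, 1.0)):
    """Return (best_amax, best_error) over candidate amax values for a block."""
    block_amax = max(abs(v) for v in block)
    best = (block_amax, float("inf"))
    for f in fractions:
        cand = f * block_amax
        err = mse(block, fake_quant_fp4(block, cand))
        if err < best[1]:
            best = (cand, err)
    return best


weights = [0.1, -0.2, 0.15, 0.05, -0.12, 0.18, 0.07, 2.0]  # one outlier
# Tensor-wide amax, tracked alongside the sweep (illustrative only).
global_amax = max(abs(w) for w in weights)
best_amax, best_err = mse_sweep(weights)
```

Because the unclipped amax (fraction 1.0) is always among the candidates, the selected error can never be worse than plain max calibration in this sketch.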
---------
Signed-off-by: realAsma <akuriparambi@nvidia.com>
1 parent eef96cb, commit 9adc057
13 files changed
Lines changed: 685 additions & 454 deletions
File tree
- modelopt/torch/quantization
  - calib
  - nn/modules
  - triton
- tests
  - _test_utils/torch/quantization
  - gpu/torch/quantization
  - unit/torch/quantization