Commit d7e72f4
Refine static NVFP4 MSE calibration (#1536)
### What does this PR do?
Type of change: Bug fix
Refines static NVFP4 MSE calibration and forces static NVFP4 amax state
to stay FP32 across calibration loading, quantizer promotion, dtype
casts, and restore paths.
Main changes:
- Tighten max/MSE calibration bootstrap and static NVFP4 quantizer
promotion.
- Keep static NVFP4 `_amax` and `_global_amax` in FP32.
- Update focused GPU/unit coverage for FP8 sweep calibration, promotion,
restore, and FP32 amax preservation.
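The PR description does not spell out why the amax state must stay in FP32. One plausible reason, shown here as a minimal standard-library sketch rather than ModelOpt code, is that a round trip through a 16-bit dtype perturbs the stored value. The `amax` value and the `round_trip` helper are hypothetical.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store `value` in the given struct format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


amax = 0.1234567                         # hypothetical calibrated amax
amax_fp32 = round_trip(amax, "f")        # value as stored in FP32
amax_fp16 = round_trip(amax_fp32, "e")   # value after a cast to FP16 and back

# Relative error of each stored value against the original Python float.
rel_err_fp32 = abs(amax_fp32 - amax) / amax
rel_err_fp16 = abs(amax_fp16 - amax) / amax

print(f"FP32 relative error: {rel_err_fp32:.2e}")
print(f"FP16 relative error: {rel_err_fp16:.2e}")
```

Any scale derived from the amax inherits this error, which would explain why a model-wide dtype cast should skip these buffers.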
### Usage
```yaml
algorithm:
  method: mse
  fp8_scale_sweep: true
```
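As a rough illustration of what MSE calibration does (a sketch under stated assumptions, not ModelOpt's implementation), the code below sweeps candidate amax values for a single block and keeps the one with the lowest quantize-dequantize error on the FP4 E2M1 grid. The helper names, the linear candidate grid, and the step count are all hypothetical. The FP8 scale sweep enabled above presumably searches FP8-representable scales rather than a linear grid.

```python
import math

# Representable magnitudes of the FP4 E2M1 element format.
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def fake_quantize(block, amax):
    """Quantize-dequantize `block` so that `amax` maps to the largest E2M1 value."""
    if amax <= 0:
        return [0.0 for _ in block]
    scale = amax / E2M1_VALUES[-1]
    out = []
    for x in block:
        # Clip to the representable range, then snap to the nearest grid point.
        mag = min(abs(x) / scale, E2M1_VALUES[-1])
        q = min(E2M1_VALUES, key=lambda v: abs(v - mag))
        out.append(math.copysign(q * scale, x))
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mse_calibrate_amax(block, num_steps=20, start=0.25, stop=1.0):
    """Pick the candidate amax with the lowest error (needs num_steps >= 2)."""
    max_amax = max(abs(x) for x in block)
    # Bootstrap with plain max calibration, then try smaller candidates.
    best_amax = max_amax
    best_err = mse(block, fake_quantize(block, max_amax))
    for i in range(num_steps):
        cand = max_amax * (start + (stop - start) * i / (num_steps - 1))
        err = mse(block, fake_quantize(block, cand))
        if err < best_err:
            best_amax, best_err = cand, err
    return best_amax, best_err


# Hypothetical block with one outlier that dominates the max-based amax.
block = [0.1, -0.2, 0.3, 0.25, -0.15, 0.05, 0.2, 4.0]
max_amax = max(abs(x) for x in block)
best_amax, best_err = mse_calibrate_amax(block)
print("max-calibration amax:", max_amax)
print("MSE-calibrated amax:", best_amax, "error:", best_err)
```

Because the search starts from the max-calibration result, the returned error can never be worse than plain max calibration for the same block.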
### Testing
```bash
pre-commit run --files modelopt/torch/quantization/nn/modules/tensor_quantizer.py tests/gpu/torch/quantization/test_nvfp4_static_quantizer_cuda.py
pytest_pwd tests/gpu/torch/quantization/test_nvfp4_static_quantizer_cuda.py -q
```
### Before your PR is "*Ready for review*"
Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)
and your commits are signed (`git commit -s -S`).
Make sure you read and follow the [Security Best
Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors).
- Is this change backward compatible?: Yes
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: N/A
- Did you write any new necessary tests?: Yes
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
N/A
- Did you get Claude approval on this PR?: Yes
### Additional Information
N/A
Signed-off-by: realAsma <akuriparambi@nvidia.com>
1 parent 7aa0c95 commit d7e72f4
12 files changed
Lines changed: 695 additions & 306 deletions
File tree
- examples/llm_ptq
- modelopt/torch/quantization
  - calib
  - nn/modules
  - utils
- tests
  - gpu/torch/quantization
    - plugins
  - unit/torch/quantization
    - plugins