Commit dedd0a0
Fix a nvfp4 weight amax attribute issue during export (#785)
## What does this PR do?
**Type of change:** Bugfix <!-- Use one of the following: Bug fix, new
feature, new example, new tests, documentation. -->
**Overview:** Fix a nvfp4 weight amax attribute issue during export,
especially when calibration size is small. Context:
sgl-project/sglang#14677 (comment)
## Usage
<!-- You can potentially add a usage example below. -->
```python
python3 hf_ptq.py --pyt_ckpt_path /home/scratch.jingyux_coreai/kimi-k2/models/Kimi-K2-Thinking-BF16 --qformat nvfp4_mlp_only --export_path /home/omniml_data_3/zhiyuc/checkpoints/Kimi-K2-Thinking-NVFP4 --kv_cache_qformat none --calib_size 20 --trust_remote_code --dataset cnn_dailymail
```
## Testing
<!-- Mention how have you tested your change if applicable. -->
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes <!--- If No, explain why.
-->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No <!--- Only for new features, API changes, critical bug fixes or
bw breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Bug Fixes
* Improved weight quantizer calibration to ensure quantizers are
properly initialized with calibration statistics before computing
scaling factors.
* Enhanced reliability and consistency of quantized model exports.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>1 parent 6d35ffe commit dedd0a0
1 file changed
Lines changed: 44 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
236 | 236 | | |
237 | 237 | | |
238 | 238 | | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
239 | 264 | | |
240 | 265 | | |
241 | 266 | | |
| |||
279 | 304 | | |
280 | 305 | | |
281 | 306 | | |
| 307 | + | |
| 308 | + | |
| 309 | + | |
| 310 | + | |
282 | 311 | | |
283 | 312 | | |
284 | 313 | | |
| |||
307 | 336 | | |
308 | 337 | | |
309 | 338 | | |
310 | | - | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
| 345 | + | |
| 346 | + | |
| 347 | + | |
| 348 | + | |
| 349 | + | |
| 350 | + | |
| 351 | + | |
| 352 | + | |
311 | 353 | | |
312 | 354 | | |
313 | 355 | | |
314 | 356 | | |
315 | 357 | | |
316 | | - | |
| 358 | + | |
317 | 359 | | |
318 | 360 | | |
319 | 361 | | |
| |||
0 commit comments