SVDQuant Hugging Face checkpoint export support (#754)
## What does this PR do?
**Type of change:** new feature
**Overview:** Adds Hugging Face checkpoint export support for models quantized with SVDQuant.
## Usage
```bash
cd ./examples/llm_ptq/
python hf_ptq.py \
--pyt_ckpt_path Qwen/Qwen3-4B \
--export_path /home/scratch.shiychen_coreai/quantized_models/Qwen3-4B-svdq \
--qformat nvfp4_awq_svdquant --kv_cache_qformat none --sparsity_fmt dense --calib_size 8
```
## Testing
Exported a checkpoint with the command above and verified that it loads.
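A quick way to sanity-check an exported checkpoint directory before loading it is to list its files and parse whatever JSON configs it contains. The sketch below is only an illustration: `summarize_export` is a hypothetical helper, not part of Model-Optimizer, and it makes no assumption about which files the exporter writes.

```python
import json
import sys
from pathlib import Path


def summarize_export(export_dir: str) -> dict:
    """Return the file names and parsed top-level JSON files in export_dir.

    Hypothetical helper for illustration; it does not load any weights.
    """
    root = Path(export_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"export directory not found: {root}")

    files = sorted(p.name for p in root.iterdir() if p.is_file())

    # Parse every top-level JSON file; which ones exist depends on the exporter.
    configs = {}
    for name in files:
        if name.endswith(".json"):
            with open(root / name, encoding="utf-8") as fh:
                configs[name] = json.load(fh)

    return {"files": files, "configs": configs}


if __name__ == "__main__" and len(sys.argv) > 1:
    summary = summarize_export(sys.argv[1])
    print("\n".join(summary["files"]))
```

Run it with the value passed to `--export_path` as the only argument. Inspecting the parsed configs for the expected quantization settings is left to the reader, since the exact keys are not stated in this PR.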
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items, you can still open
a `Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes/No <!--- If No, explain
why. -->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No <!--- Only for new features, API changes, critical bug fixes, or
backward-incompatible changes. -->
## Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
* Added `nvfp4_svdquant` as a new quantization format option for LLM
quantization workflows.
* **Limitations**
* Multi-GPU export configurations using tensor or pipeline parallelism
are not supported with `nvfp4_svdquant` quantization.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
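The limitation noted in the release notes (no tensor- or pipeline-parallel export with SVDQuant) could be enforced with an early guard along these lines. This is a minimal sketch under stated assumptions: `check_export_supported`, `tp_size`, and `pp_size` are hypothetical names chosen for illustration and do not reflect how this PR implements the check.

```python
def check_export_supported(qformat: str, tp_size: int = 1, pp_size: int = 1) -> None:
    """Raise if an SVDQuant format is combined with multi-GPU export.

    Hypothetical guard for illustration only; all names are assumptions.
    """
    uses_parallelism = tp_size > 1 or pp_size > 1
    if "svdquant" in qformat and uses_parallelism:
        raise NotImplementedError(
            f"{qformat} export does not support tensor or pipeline parallelism "
            f"(got tp_size={tp_size}, pp_size={pp_size})"
        )


# Single-GPU export passes silently.
check_export_supported("nvfp4_svdquant", tp_size=1, pp_size=1)
```

Failing before any weights are processed keeps the error close to the misconfiguration instead of surfacing it partway through the export.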
---------
Signed-off-by: Shiyang Chen <shiychen@nvidia.com>