Commit 6ffe4a5
Add nvfp4_local_hessian to QUANT_CFG_CHOICES (#1065)
### What does this PR do?
Type of change: New feature
Wire up `NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG` (from PR #788) to the
`hf_ptq.py` CLI so it can be used via `--qformat nvfp4_local_hessian`.
This is a one-line addition to the `QUANT_CFG_CHOICES` dict.
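The wiring described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `hf_ptq.py`: the config values are placeholder dicts standing in for the real config objects, and `EXISTING_CFG`, the `"existing_format"` key, and the `resolve_qformat` helper are hypothetical names introduced only for this example.

```python
from typing import Any

# Placeholder stand-ins (assumption): in the real script these would be
# config objects provided by the quantization library, not literal dicts.
EXISTING_CFG: dict[str, Any] = {"name": "EXISTING_CFG"}
NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG: dict[str, Any] = {
    "name": "NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG"
}

# Registry mapping each --qformat string to a quantization config.
QUANT_CFG_CHOICES: dict[str, dict[str, Any]] = {
    "existing_format": EXISTING_CFG,  # hypothetical pre-existing entry
    # The kind of one-line entry this PR adds:
    "nvfp4_local_hessian": NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG,
}


def resolve_qformat(qformat: str) -> dict[str, Any]:
    """Hypothetical helper: look up the config for a --qformat value."""
    try:
        return QUANT_CFG_CHOICES[qformat]
    except KeyError:
        valid = ", ".join(sorted(QUANT_CFG_CHOICES))
        raise ValueError(
            f"Unsupported qformat {qformat!r}; choose from: {valid}"
        ) from None
```

Because the CLI flag is resolved through a dict lookup, exposing an existing config only requires a new key; no other code path needs to change, which is consistent with the change being backward compatible.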
### Usage
```bash
python examples/llm_ptq/hf_ptq.py \
    --model Qwen/Qwen3-8B \
    --qformat nvfp4_local_hessian \
    --kv_cache_qformat fp8 \
    --export_fmt hf
```
### Testing
Tested via the modelopt-quantization CI pipeline (`quant_flow`) on GB200
(`oci-hsg` launcher) with Qwen3-8B. The PTQ stage completed successfully.
### Before your PR is "*Ready for review*"
- Is this change backward compatible?: ✅
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: N/A
- Did you write any new necessary tests?: N/A (wiring existing config to
existing CLI)
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
N/A
### Additional Information
- `NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG` was added in PR #788 but not
exposed via the CLI.
- The config is also used in the modelopt-quantization CI (`quant_flow`) for
automated NVFP4 scale-setting sweeps.
## Summary by CodeRabbit
* **New Features**
* Added a new quantization format option, `nvfp4_local_hessian`, to the
choices accepted by `--qformat` in the `hf_ptq.py` CLI. Existing options
and behavior are unchanged.
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Parent commit: 1dc890d
1 file changed: 1 addition, 0 deletions (one line added at line 110 of the changed file).