
Add nvfp4_local_hessian to QUANT_CFG_CHOICES #1065

Merged
realAsma merged 1 commit into NVIDIA:main from sungsooha:sungsooh/add-nvfp4-local-hessian-qformat
Mar 18, 2026

Conversation

@sungsooha
Contributor

@sungsooha sungsooha commented Mar 18, 2026

What does this PR do?

Type of change: New feature

Wire up NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG (from PR #788) to the hf_ptq.py CLI so it can be used via --qformat nvfp4_local_hessian.

One-line addition to QUANT_CFG_CHOICES dict.
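
The one-line addition can be sketched as below. This is a minimal, self-contained illustration: the real file imports `modelopt.torch.quantization as mtq`, and the neighboring dict entry and config values here are placeholders, not copied from `hf_ptq.py`.

```python
from types import SimpleNamespace

# Stand-in for the mtq module; the attribute names match the PR, but the
# config values are dummies for illustration only.
mtq = SimpleNamespace(
    NVFP4_DEFAULT_CFG={"algorithm": "max"},
    NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG={"algorithm": "local_hessian"},
)

QUANT_CFG_CHOICES = {
    "nvfp4": mtq.NVFP4_DEFAULT_CFG,  # pre-existing entry (illustrative)
    # The one-line addition: expose the PR #788 config to --qformat.
    "nvfp4_local_hessian": mtq.NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG,
}

# hf_ptq.py resolves the --qformat argument by a lookup like this:
cfg = QUANT_CFG_CHOICES["nvfp4_local_hessian"]
print(cfg["algorithm"])  # local_hessian
```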

Usage

```bash
python examples/llm_ptq/hf_ptq.py \
  --model Qwen/Qwen3-8B \
  --qformat nvfp4_local_hessian \
  --kv_cache_qformat fp8 \
  --export_fmt hf
```

Testing

Tested via modelopt-quantization CI pipeline (quant_flow) on GB200 (oci-hsg launcher) with Qwen3-8B. PTQ stage completed successfully.

Before your PR is "Ready for review"

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: N/A (wiring existing config to existing CLI)
  • Did you update Changelog?: N/A

Additional Information

  • NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG was added in PR #788 (add local hessian calibration) but not exposed via the CLI.
  • Also used in modelopt-quantization CI (quant_flow) for automated NVFP4 scale-setting sweeps.

Summary by CodeRabbit

  • New Features
    • Added a new quantization configuration option (--qformat nvfp4_local_hessian), expanding the available quantization choices for users. This provides an additional quantization mode to select from in configuration UIs and CLIs while preserving existing behavior and compatibility.

@sungsooha sungsooha requested a review from a team as a code owner March 18, 2026 05:29
@sungsooha sungsooha requested a review from realAsma March 18, 2026 05:29
@copy-pr-bot

copy-pr-bot bot commented Mar 18, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Mar 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 79683ca3-507f-4dec-a0ce-e5f84df3f73b

📥 Commits

Reviewing files that changed from the base of the PR and between dce46d2dd7aac943fde29a5ac0ff87a72ecd6279 and d871382.

📒 Files selected for processing (1)
  • examples/llm_ptq/hf_ptq.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/llm_ptq/hf_ptq.py

📝 Walkthrough

Walkthrough

Added a new quantization configuration key "nvfp4_local_hessian" to the example script, mapping it to mtq.NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG in the QUANT_CFG_CHOICES dictionary.

Changes

Cohort / File(s) Summary
Quantization Configuration
examples/llm_ptq/hf_ptq.py
Added nvfp4_local_hessian option to QUANT_CFG_CHOICES, mapped to mtq.NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main change: adding a new quantization configuration option to QUANT_CFG_CHOICES.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Security Anti-Patterns ✅ Passed The pull request introduces no security anti-patterns as defined in SECURITY.md. The change is a minimal dictionary entry that exposes an existing, well-tested quantization configuration.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@sungsooha sungsooha force-pushed the sungsooh/add-nvfp4-local-hessian-qformat branch from b322b8e to dce46d2 on March 18, 2026 05:30
@codecov

codecov bot commented Mar 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.30%. Comparing base (7c33d85) to head (d871382).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1065   +/-   ##
=======================================
  Coverage   70.29%   70.30%           
=======================================
  Files         227      227           
  Lines       25860    25854    -6     
=======================================
- Hits        18179    18176    -3     
+ Misses       7681     7678    -3     

☔ View full report in Codecov by Sentry.

@realAsma realAsma requested a review from cjluo-nv March 18, 2026 16:01
Wire up NVFP4_W4A4_WEIGHT_LOCAL_HESSIAN_CFG (from PR NVIDIA#788) to the
hf_ptq.py CLI so it can be used via --qformat nvfp4_local_hessian.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
@sungsooha sungsooha force-pushed the sungsooh/add-nvfp4-local-hessian-qformat branch from dce46d2 to d871382 on March 18, 2026 17:32
@realAsma realAsma merged commit 6ffe4a5 into NVIDIA:main Mar 18, 2026
39 checks passed
kevalmorabia97 pushed a commit that referenced this pull request Mar 26, 2026