
Commit 18ddcb7

[fix] AutoQuant: clamp instead of use fp64 in auto quant score (#1156)
### What does this PR do?

**Type of change:** ?

### Usage

```python
# Add a code snippet demonstrating how to use this
```

### Testing

### Before your PR is "*Ready for review*"

Make sure you read and follow the [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md) and your commits are signed (`git commit -s -S`). Make sure you read and follow the [Security Best Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors) (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).

- Is this change backward compatible?: ✅ / ❌ / N/A
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in `CONTRIBUTING.md`?: ✅ / ❌ / N/A
- Did you write any new necessary tests?: ✅ / ❌ / N/A
- Did you update the [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?: ✅ / ❌ / N/A

### Additional Information

## Summary by CodeRabbit

* **Bug Fixes**
  * Improved numeric stability of the quantization score calculation by introducing saturation bounds, preventing overflow in intermediate values across a wider range of inputs.

Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
1 parent: 87ea8ba

1 file changed: 1 addition, 1 deletion

**modelopt/torch/quantization/algorithms.py**
```diff
@@ -764,7 +764,7 @@ def run_search(self):

         def _get_auto_quantize_score(grad_output, output_diff):
             x = grad_output.float() * output_diff.float()
-            return x.to(torch.float64).square().sum()
+            return x.clamp(-1e10, 1e10).square().sum()

         def _add_auto_quantize_score(grad_output, output_diff, score_tensor):
```
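The patch clamps the float32 product to `[-1e10, 1e10]` before squaring, rather than upcasting to float64. Since float32 tops out around 3.4e38, squaring any value above roughly 1.8e19 overflows to `inf`, while a clamped value squares to at most 1e20, which is safely representable. A minimal NumPy sketch of this idea (the function name and NumPy usage are illustrative; the actual code operates on torch tensors):

```python
import numpy as np

def auto_quantize_score(grad_output, output_diff):
    """Sketch of the clamped score: saturate to [-1e10, 1e10] before
    squaring so float32 intermediates cannot overflow
    (float32 max ~3.4e38; (1e10)**2 = 1e20 is well within range)."""
    x = (grad_output * output_diff).astype(np.float32)
    x = np.clip(x, -1e10, 1e10)
    return np.square(x).sum()

# Squaring an unclamped float32 of 1e20 overflows to inf:
with np.errstate(over="ignore"):
    unclamped = np.square(np.float32(1e20))

# The clamped score stays finite for the same input:
clamped = auto_quantize_score(np.array([1e20]), np.array([1.0]))
```

The trade-off is that very large gradients saturate rather than contribute their exact magnitude, but for a relative sensitivity score this is an acceptable loss of precision in exchange for avoiding `inf`/`nan` scores.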
