Commit d39cf45
[NVBug 5702186] Fix awq model export for Gemma3 (#793)
## What does this PR do?
**Type of change:** Bug fix
**Overview:** Norm layers in Gemma use `(1 + weight)` in their forward
pass. For these layers, we fold `pre_quant_scale` into the effective weight.
That is, we find the folded weight `w'` such that `1 + w' = (1 + w) * s`,
which gives `w' = (1 + w) * s - 1`.
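
The fold above can be checked numerically. The following is a minimal, dependency-free sketch, not the code from this PR: `rmsnorm_offset` and `fold_pre_quant_scale` are hypothetical helper names, and the input values are made up for illustration. It assumes a Gemma-style RMSNorm that scales the normalized input by `(1 + w)` and a per-channel scale `s`.

```python
import math


def rmsnorm_offset(x, w, eps=1e-6):
    """Gemma-style RMSNorm (assumed form): normalize x, then scale by (1 + w)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms * (1.0 + wi) for v, wi in zip(x, w)]


def fold_pre_quant_scale(w, s):
    """Return w' such that 1 + w' == (1 + w) * s, element-wise."""
    return [(1.0 + wi) * si - 1.0 for wi, si in zip(w, s)]


x = [0.5, -1.0, 2.0, 0.25]
w = [0.1, -0.2, 0.0, 0.3]
s = [2.0, 0.5, 1.5, 1.0]  # hypothetical per-channel pre_quant_scale

# Reference: apply the scale explicitly to the norm output.
expected = [y * si for y, si in zip(rmsnorm_offset(x, w), s)]

# Folded: the scale is absorbed into the norm weight via w' = (1 + w) * s - 1.
folded = rmsnorm_offset(x, fold_pre_quant_scale(w, s))

# Naive fold (w * s), which is only valid for norms that use the weight directly.
naive = rmsnorm_offset(x, [wi * si for wi, si in zip(w, s)])

assert all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12) for a, b in zip(expected, folded))
assert not all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12) for a, b in zip(expected, naive))
```

The second assertion shows why the plain `w * s` fold is not sufficient for `(1 + weight)` norms: it scales only the learned offset and leaves the implicit `1` unscaled.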
## Usage
<!-- You can potentially add a usage example below. -->
```python
# Add a code snippet demonstrating how to use this
```
## Testing
<!-- Mention how have you tested your change if applicable. -->
`./scripts/huggingface_example.sh --model google/gemma-3-1b-it --quant int4_awq`
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes/No <!--- If No, explain
why. -->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No <!--- Only for new features, API changes, critical bug fixes or
bw breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Improvements**
  * Enhanced quantization utilities to better handle various LayerNorm
    variants and normalization patterns, including support for weight-offset
    variants and zero-centered gamma configurations.
  * Optimized pre-quantization layer normalization fusion to apply
    conditional weight scaling strategies based on normalization type.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Signed-off-by: weimingc <17592131+meenchen@users.noreply.github.com>

1 parent 304e81f commit d39cf45
1 file changed
Lines changed: 19 additions & 5 deletions