Commit b0eadcd

Skip quant for linear_attn.in_proj_a/b
Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
1 parent dec2952 commit b0eadcd

1 file changed, +2 -0 lines changed

modelopt/torch/quantization/config.py

Lines changed: 2 additions & 0 deletions
@@ -228,6 +228,8 @@ def find_quant_cfg_entry_by_path(
         "enable": False,
     },  # Skip the MOE router
     {"quantizer_name": "*linear_attn.conv1d*", "enable": False},
+    {"quantizer_name": "*linear_attn.in_proj_a*", "enable": False},
+    {"quantizer_name": "*linear_attn.in_proj_b*", "enable": False},
     {"quantizer_name": "*mixer.conv1d*", "enable": False},  # Skip mamba conv1d
     {"quantizer_name": "*output_layer*", "enable": False},
     {"quantizer_name": "output.*", "enable": False},
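
To illustrate what the two added entries are likely to cover, here is a minimal sketch of wildcard resolution. It assumes the `quantizer_name` patterns are glob-style and behave like Python's `fnmatch`, and that a later matching entry overrides an earlier one; the library's actual resolution logic may differ. The helper `is_enabled` and the module names in `names` are hypothetical and chosen only for illustration.

```python
from fnmatch import fnmatchcase

# Entries copied from the diff above (the two new ones plus the existing conv1d rule).
SKIP_ENTRIES = [
    {"quantizer_name": "*linear_attn.conv1d*", "enable": False},
    {"quantizer_name": "*linear_attn.in_proj_a*", "enable": False},
    {"quantizer_name": "*linear_attn.in_proj_b*", "enable": False},
]


def is_enabled(quantizer_name, entries, default=True):
    """Hypothetical helper: return the 'enable' flag of the last matching entry.

    Assumption: patterns are shell-style globs and later entries win.
    """
    enabled = default
    for entry in entries:
        if fnmatchcase(quantizer_name, entry["quantizer_name"]):
            enabled = entry["enable"]
    return enabled


# Hypothetical quantizer names, not taken from any real model.
names = [
    "model.layers.0.linear_attn.in_proj_a.weight_quantizer",
    "model.layers.0.linear_attn.in_proj_b.input_quantizer",
    "model.layers.0.linear_attn.in_proj_qkv.weight_quantizer",
]

for name in names:
    print(name, "->", "quantized" if is_enabled(name, SKIP_ENTRIES) else "skipped")
```

Under these assumptions, the `in_proj_a` and `in_proj_b` quantizers are skipped, while a sibling projection such as the hypothetical `in_proj_qkv` is left untouched because neither pattern matches it.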
