Commit 44970f8

Dedupe disabled_quantizers units by importing default_disabled_quantizers
Before: huggingface/{nemotron_vl,phi4mm}/ptq/disabled_quantizers.yaml each duplicated the 14-entry default_disabled_quantizers list verbatim and then appended the model-specific exclusions.

After: both units use multi-document YAML to declare an `imports:` section, `$import: default_disabled_quantizers` as the first list entry, and only the model-specific exclusions explicitly.

Recipes that import these units are unaffected; the resolved quant_cfg is unchanged.

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>
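The claim that the resolved quant_cfg is unchanged can be illustrated with a minimal sketch of list splicing. This is not the ModelOpt loader: `resolve_imports` is a hypothetical helper, the `imports` mapping here points at an already-loaded list rather than a path such as `configs/ptq/units/default_disabled_quantizers`, and only two of the 14 default entries are reproduced.

```python
from __future__ import annotations

from typing import Any

# Stand-in for the shared unit; only two of its 14 entries are shown.
DEFAULT_DISABLED = [
    {"quantizer_name": "*lm_head*", "enable": False},
    {"quantizer_name": "*router*", "enable": False},
]


def resolve_imports(
    imports: dict[str, list[dict[str, Any]]],
    entries: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Expand each {"$import": name} entry into the list it names, preserving order.

    Hypothetical helper: an assumption about how a splice could behave.
    """
    resolved: list[dict[str, Any]] = []
    for entry in entries:
        if set(entry) == {"$import"}:
            # Splice the imported entries in at this position.
            resolved.extend(dict(e) for e in imports[entry["$import"]])
        else:
            resolved.append(dict(entry))
    return resolved


# Second YAML document of the new unit, as it might look once parsed.
new_unit = [
    {"$import": "default_disabled_quantizers"},
    {"quantizer_name": "*vision*", "enable": False},
]

# What the old unit spelled out verbatim (defaults first, then model-specific).
old_unit = DEFAULT_DISABLED + [{"quantizer_name": "*vision*", "enable": False}]

resolved = resolve_imports({"default_disabled_quantizers": DEFAULT_DISABLED}, new_unit)
print(resolved == old_unit)  # True
```

Because `$import` is the first list entry, the defaults land ahead of the model-specific exclusions, which is the same order the duplicated lists used.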
1 parent 41e1788 commit 44970f8

2 files changed

Lines changed: 18 additions & 73 deletions

modelopt_recipes/huggingface/nemotron_vl/ptq/disabled_quantizers.yaml

Lines changed: 9 additions & 36 deletions
@@ -13,45 +13,18 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-# QuantizerCfgList snippet of disabled quantizers for Nemotron VL. Merges the
-# standard `default_disabled_quantizers` exclusions with Nemotron-VL-specific
-# ones (only the decoder is quantized; vision/encoder branches, including the
-# Nemotron-Parse radio/model_encoder modules, are skipped). Recipes that
+# QuantizerCfgList snippet of disabled quantizers for Nemotron VL. Splices in
+# the standard `default_disabled_quantizers` exclusions and appends
+# Nemotron-VL-specific ones so that only the decoder (text-generation
+# component) is quantized; vision/encoder branches, including the
+# Nemotron-Parse radio/model_encoder modules, are skipped. Recipes that
 # import this should NOT also import `default_disabled_quantizers`.
 
 # modelopt-schema: modelopt.torch.quantization.config.QuantizerCfgListConfig
-- quantizer_name: '*block_sparse_moe.gate*'
-  enable: false
-- quantizer_name: '*linear_attn.conv1d*'
-  enable: false
-- quantizer_name: '*lm_head*'
-  enable: false
-- quantizer_name: '*mixer.conv1d*'
-  enable: false
-- quantizer_name: '*mlp.gate.*'
-  enable: false
-- quantizer_name: '*mlp.shared_expert_gate.*'
-  enable: false
-- quantizer_name: '*output_layer*'
-  enable: false
-- quantizer_name: '*proj_out.*'
-  enable: false
-- quantizer_name: '*router*'
-  enable: false
-- quantizer_name: 'output.*'
-  enable: false
-- parent_class: 'nn.BatchNorm1d'
-  quantizer_name: '*'
-  enable: false
-- parent_class: 'nn.BatchNorm2d'
-  quantizer_name: '*'
-  enable: false
-- parent_class: 'nn.BatchNorm3d'
-  quantizer_name: '*'
-  enable: false
-- parent_class: 'nn.LeakyReLU'
-  quantizer_name: '*'
-  enable: false
+imports:
+  default_disabled_quantizers: configs/ptq/units/default_disabled_quantizers
+---
+- $import: default_disabled_quantizers
 - quantizer_name: '*vision*'
   enable: false
 - quantizer_name: '*image*'

modelopt_recipes/huggingface/phi4mm/ptq/disabled_quantizers.yaml

Lines changed: 9 additions & 37 deletions
@@ -13,45 +13,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-# QuantizerCfgList snippet of disabled quantizers for Phi-4-Multimodal. Merges
-# the standard `default_disabled_quantizers` exclusions with Phi-4-MM-specific
-# ones (only the language model is quantized; speech/audio/image/vision
-# branches are skipped). Recipes that import this should NOT also import
-# `default_disabled_quantizers`.
+# QuantizerCfgList snippet of disabled quantizers for Phi-4-Multimodal.
+# Splices in the standard `default_disabled_quantizers` exclusions and appends
+# Phi-4-MM-specific ones so that only the language model is quantized;
+# speech/audio/image/vision branches are skipped. Recipes that import this
+# should NOT also import `default_disabled_quantizers`.
 
 # modelopt-schema: modelopt.torch.quantization.config.QuantizerCfgListConfig
-- quantizer_name: '*block_sparse_moe.gate*'
-  enable: false
-- quantizer_name: '*linear_attn.conv1d*'
-  enable: false
-- quantizer_name: '*lm_head*'
-  enable: false
-- quantizer_name: '*mixer.conv1d*'
-  enable: false
-- quantizer_name: '*mlp.gate.*'
-  enable: false
-- quantizer_name: '*mlp.shared_expert_gate.*'
-  enable: false
-- quantizer_name: '*output_layer*'
-  enable: false
-- quantizer_name: '*proj_out.*'
-  enable: false
-- quantizer_name: '*router*'
-  enable: false
-- quantizer_name: 'output.*'
-  enable: false
-- parent_class: 'nn.BatchNorm1d'
-  quantizer_name: '*'
-  enable: false
-- parent_class: 'nn.BatchNorm2d'
-  quantizer_name: '*'
-  enable: false
-- parent_class: 'nn.BatchNorm3d'
-  quantizer_name: '*'
-  enable: false
-- parent_class: 'nn.LeakyReLU'
-  quantizer_name: '*'
-  enable: false
+imports:
+  default_disabled_quantizers: configs/ptq/units/default_disabled_quantizers
+---
+- $import: default_disabled_quantizers
 - quantizer_name: '*speech*'
   enable: false
 - quantizer_name: '*audio*'
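
The `quantizer_name` values in both units are wildcard patterns. The commit does not show how ModelOpt evaluates them, so the sketch below assumes glob-style matching comparable to Python's `fnmatch`. The quantizer names are hypothetical and `is_disabled` is an illustrative helper, not a ModelOpt function.

```python
from fnmatch import fnmatch

# One default exclusion plus the two Phi-4-MM-specific patterns visible in the diff.
disabled_patterns = ["*lm_head*", "*speech*", "*audio*"]

# Hypothetical quantizer names; real module paths depend on the checkpoint.
quantizer_names = [
    "model.layers.0.self_attn.qkv_proj.weight_quantizer",
    "model.embed_tokens_extend.audio_embed.encoder.weight_quantizer",
    "lm_head.input_quantizer",
]


def is_disabled(name: str, patterns: list) -> bool:
    """Return True if any glob pattern matches the quantizer name (assumed semantics)."""
    return any(fnmatch(name, pattern) for pattern in patterns)


status = {name: is_disabled(name, disabled_patterns) for name in quantizer_names}
for name, disabled in status.items():
    print(f"{name}: {'disabled' if disabled else 'quantized'}")
```

Under this assumption the decoder projection stays quantized, while the audio branch and `lm_head` are skipped.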
