Commit 3f9f709

new updates
Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>
1 parent f014cc8 commit 3f9f709

2 files changed

Lines changed: 30 additions & 24 deletions

docs/source/guides/_recipes.rst

Lines changed: 24 additions & 24 deletions
@@ -47,7 +47,7 @@ Recipe file format
 ==================
 
 A recipe is a YAML file with two top-level sections: ``metadata`` and a
-type-specific configuration section (currently ``ptq_cfg`` for PTQ recipes).
+type-specific configuration section (currently ``quantize`` for PTQ recipes).
 
 Single-file format
 ------------------
@@ -62,20 +62,20 @@ The simplest form is a single ``.yml`` or ``.yaml`` file:
       recipe_type: ptq
       description: FP8 per-tensor weight and activation (W8A8), FP8 KV cache, max calibration.
 
-    ptq_cfg:
+    quantize:
       algorithm: max
       quant_cfg:
-        - quantizer_path: '*'
+        - quantizer_name: '*'
           enable: false
-        - quantizer_path: '*input_quantizer'
+        - quantizer_name: '*input_quantizer'
           cfg:
             num_bits: e4m3
             axis:
-        - quantizer_path: '*weight_quantizer'
+        - quantizer_name: '*weight_quantizer'
           cfg:
             num_bits: e4m3
             axis:
-        - quantizer_path: '*[kv]_bmm_quantizer'
+        - quantizer_name: '*[kv]_bmm_quantizer'
           enable: true
           cfg:
             num_bits: e4m3
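
The hunk above renames two keys: the top-level ``ptq_cfg`` section becomes ``quantize``, and each ``quant_cfg`` entry's ``quantizer_path`` becomes ``quantizer_name``. For recipes written against the old names, a minimal migration sketch over an already-parsed recipe dict could look like the following. ``migrate_recipe`` and the two rename tables are hypothetical helpers introduced here for illustration; they are not part of this commit or of the library.

```python
from copy import deepcopy

# Key renames taken from the diff; the helper itself is hypothetical.
SECTION_RENAMES = {"ptq_cfg": "quantize"}
ENTRY_KEY_RENAMES = {"quantizer_path": "quantizer_name"}


def migrate_recipe(recipe: dict) -> dict:
    """Return a copy of a parsed recipe with the old key names replaced."""
    migrated = deepcopy(recipe)  # leave the caller's dict untouched

    # Rename the top-level section, refusing ambiguous input.
    for old, new in SECTION_RENAMES.items():
        if old in migrated:
            if new in migrated:
                raise ValueError(f"Recipe has both {old!r} and {new!r} sections.")
            migrated[new] = migrated.pop(old)

    # Rename the selector key inside every quant_cfg entry.
    section = migrated.get("quantize", {})
    for entry in section.get("quant_cfg", []):
        for old, new in ENTRY_KEY_RENAMES.items():
            if old in entry:
                entry[new] = entry.pop(old)
    return migrated


legacy = {
    "metadata": {"recipe_type": "ptq"},
    "ptq_cfg": {
        "algorithm": "max",
        "quant_cfg": [
            {"quantizer_path": "*", "enable": False},
            {"quantizer_path": "*weight_quantizer", "cfg": {"num_bits": 8, "axis": 0}},
        ],
    },
}
migrated = migrate_recipe(legacy)
```

The sketch works on plain dicts so that it stays independent of any particular YAML loader; parsing and re-serializing the file is left to the caller.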
@@ -91,7 +91,7 @@ quantization configuration, use a directory with two files:
 
     my_recipe/
       recipe.yml       # metadata section
-      ptq_cfg.yml      # ptq_cfg section (quant_cfg + algorithm)
+      quantize.yml     # quantize section (quant_cfg + algorithm)
 
 ``recipe.yml``:
 
@@ -101,19 +101,19 @@ quantization configuration, use a directory with two files:
       recipe_type: ptq
       description: My custom NVFP4 recipe.
 
-``ptq_cfg.yml``:
+``quantize.yml``:
 
 .. code-block:: yaml
 
     algorithm: max
     quant_cfg:
-      - quantizer_path: '*'
+      - quantizer_name: '*'
         enable: false
-      - quantizer_path: '*weight_quantizer'
+      - quantizer_name: '*weight_quantizer'
         cfg:
           num_bits: e2m1
           block_sizes: {-1: 16, type: dynamic, scale_bits: e4m3}
-      - quantizer_path: '*input_quantizer'
+      - quantizer_name: '*input_quantizer'
         cfg:
           num_bits: e4m3
           axis:
@@ -142,7 +142,7 @@ Every recipe file must contain a ``metadata`` mapping with at least a ``recipe_t
 PTQ configuration section
 =========================
 
-For PTQ recipes (``recipe_type: ptq``), the ``ptq_cfg`` mapping contains:
+For PTQ recipes (``recipe_type: ptq``), the ``quantize`` mapping contains:
 
 .. list-table::
    :header-rows: 1
@@ -264,16 +264,16 @@ against the built-in library first, then the filesystem:
     recipe = load_recipe("general/ptq/fp8_default-fp8_kv")
     assert isinstance(recipe, ModelOptPTQRecipe)
 
-    # The ptq_cfg dict can be passed directly to mtq.quantize()
+    # The quantize dict can be passed directly to mtq.quantize()
     import modelopt.torch.quantization as mtq
 
-    model = mtq.quantize(model, recipe.ptq_cfg, forward_loop)
+    model = mtq.quantize(model, recipe.quantize, forward_loop)
 
 .. code-block:: python
 
     # Load a custom recipe from the filesystem
     recipe = load_recipe("/path/to/my_custom_recipe.yml")
-    model = mtq.quantize(model, recipe.ptq_cfg, forward_loop)
+    model = mtq.quantize(model, recipe.quantize, forward_loop)
 
 Command-line usage
 ------------------
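
Because the attribute on the loaded recipe changes from ``ptq_cfg`` to ``quantize``, calling code that has to run against checkouts on both sides of this commit may want a small accessor. The following is a sketch only: ``get_quantize_config`` is a hypothetical helper, not a library function, and the stand-in recipe objects are ``SimpleNamespace`` instances rather than real ``ModelOptPTQRecipe`` objects.

```python
from types import SimpleNamespace


def get_quantize_config(recipe):
    """Return the PTQ config from a recipe, preferring the new attribute name.

    Hypothetical compatibility helper: tries ``quantize`` first, then the
    pre-rename ``ptq_cfg`` attribute.
    """
    for attr in ("quantize", "ptq_cfg"):
        cfg = getattr(recipe, attr, None)
        if cfg is not None:
            return cfg
    raise AttributeError("Recipe has neither a 'quantize' nor a 'ptq_cfg' attribute.")


# Stand-ins for recipe objects loaded before and after the rename.
new_style = SimpleNamespace(quantize={"algorithm": "max", "quant_cfg": []})
old_style = SimpleNamespace(ptq_cfg={"algorithm": "max", "quant_cfg": []})

new_cfg = get_quantize_config(new_style)
old_cfg = get_quantize_config(old_style)
```

Code that only targets the post-commit API can skip this and read ``recipe.quantize`` directly, as the updated documentation does.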
@@ -289,8 +289,8 @@ The ``hf_ptq.py`` example accepts a ``--recipe`` flag:
         --calib_size 512 \
         --export_fmt hf
 
-When ``--recipe`` is provided, the script loads the recipe and uses its ``ptq_cfg``
-directly, bypassing the ``--qformat`` / ``--kv_cache_qformat`` flags.
+When ``--recipe`` is provided, the script loads the recipe and uses its ``quantize``
+config directly, bypassing the ``--qformat`` / ``--kv_cache_qformat`` flags.
 
 
 Loading standalone configs
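
The precedence described in the hunk above (a recipe, when given, bypasses the format flags) can be illustrated with a small ``argparse`` sketch. This is not the code of ``hf_ptq.py``: only the three flag names come from the documentation, while the defaults, ``build_parser``, and ``select_config_source`` are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names are from the docs; the default values are assumed.
    parser = argparse.ArgumentParser()
    parser.add_argument("--recipe", default=None)
    parser.add_argument("--qformat", default="fp8")
    parser.add_argument("--kv_cache_qformat", default="fp8")
    return parser


def select_config_source(args: argparse.Namespace) -> tuple:
    """Pick where the quantization config comes from (hypothetical helper)."""
    if args.recipe is not None:
        # A recipe wins outright; the format flags are ignored.
        return ("recipe", args.recipe)
    return ("flags", (args.qformat, args.kv_cache_qformat))


parser = build_parser()
with_recipe = select_config_source(
    parser.parse_args(["--recipe", "general/ptq/fp8_default-fp8_kv", "--qformat", "int8"])
)
without_recipe = select_config_source(parser.parse_args(["--qformat", "int8"]))
```

In the first call the ``--qformat int8`` argument is parsed but has no effect, which mirrors the "bypassing" wording in the updated paragraph.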
@@ -347,22 +347,22 @@ Example -- creating an INT8 per-channel recipe:
       recipe_type: ptq
       description: INT8 per-channel weight, per-tensor activation.
 
-    ptq_cfg:
+    quantize:
       algorithm: max
       quant_cfg:
-        - quantizer_path: '*'
+        - quantizer_name: '*'
           enable: false
-        - quantizer_path: '*weight_quantizer'
+        - quantizer_name: '*weight_quantizer'
           cfg:
             num_bits: 8
             axis: 0
-        - quantizer_path: '*input_quantizer'
+        - quantizer_name: '*input_quantizer'
           cfg:
             num_bits: 8
             axis:
-        - quantizer_path: '*lm_head*'
+        - quantizer_name: '*lm_head*'
           enable: false
-        - quantizer_path: '*output_layer*'
+        - quantizer_name: '*output_layer*'
           enable: false
 
 
@@ -397,7 +397,7 @@ Recipes are validated at load time using Pydantic models:
     Base class for all recipe types. Contains ``recipe_type`` and ``description``.
 
 :class:`~modelopt.recipe.config.ModelOptPTQRecipe`
-    PTQ-specific recipe. Adds the ``ptq_cfg`` field (a dict with ``quant_cfg`` and
+    PTQ-specific recipe. Adds the ``quantize`` field (a dict with ``quant_cfg`` and
     ``algorithm``).
 
 :class:`~modelopt.recipe.config.RecipeType`

modelopt/torch/quantization/config.py

Lines changed: 6 additions & 0 deletions
@@ -1654,6 +1654,12 @@ def _dict_to_entry(key: str, value) -> list[QuantizerCfgEntry]:
             raise ValueError(f"Invalid quant_cfg entry: {raw!r}.")
 
     for entry in entries:
+        # Normalize: empty cfg (empty dict or empty list) carries no information
+        # and is equivalent to no cfg. Strip it so the validation below can
+        # detect entries that have *neither* cfg nor enable.
+        if "cfg" in entry and isinstance(entry["cfg"], (dict, list)) and len(entry["cfg"]) == 0:
+            del entry["cfg"]
+
         # Validate: must carry at least one instruction beyond the path selector.
         if "cfg" not in entry and "enable" not in entry:
             raise ValueError(
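
The effect of the added normalization step can be seen in isolation with a standalone sketch. The two ``if`` conditions below are copied from the hunk; the wrapper function ``normalize_and_validate``, the error message text (the hunk is cut off at ``raise ValueError(``), and the sample entries are assumptions made for illustration.

```python
def normalize_and_validate(entries: list[dict]) -> list[dict]:
    """Strip empty ``cfg`` values in place, then reject instruction-less entries."""
    for entry in entries:
        # Same condition as the hunk: an empty dict or empty list is treated
        # as if ``cfg`` had not been given at all.
        if "cfg" in entry and isinstance(entry["cfg"], (dict, list)) and len(entry["cfg"]) == 0:
            del entry["cfg"]

        # Same condition as the hunk; the message wording here is assumed.
        if "cfg" not in entry and "enable" not in entry:
            raise ValueError(f"Entry {entry!r} has neither 'cfg' nor 'enable'.")
    return entries


# An empty cfg next to ``enable`` is dropped and the entry stays valid.
kept = normalize_and_validate([{"quantizer_name": "*", "enable": False, "cfg": {}}])

# An empty cfg with no ``enable`` is now caught instead of slipping through.
try:
    normalize_and_validate([{"quantizer_name": "*weight_quantizer", "cfg": []}])
    rejected = False
except ValueError:
    rejected = True
```

Without the normalization step, the second entry would pass the ``"cfg" not in entry`` check even though its ``cfg`` carries no settings.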
