Commit 5901aae

license headers + more doc
Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>
1 parent 73871da commit 5901aae

7 files changed: +137 -1 lines changed

docs/source/guides/10_recipes.rst

Lines changed: 41 additions & 1 deletion
@@ -173,7 +173,16 @@ lists are authored once and referenced by name across recipes.
 
 The ``imports`` section is a dict mapping short names to config file paths.
 References use the explicit ``{$import: name}`` marker so they are never
-confused with literal values. The marker can appear anywhere in the recipe:
+confused with literal values.
+
+.. note::
+
+    ``imports`` (no ``$``) is a **top-level structural section** — like
+    ``metadata`` or ``quantize``, it declares the recipe's dependencies.
+    ``$import`` (with ``$``) is an **inline directive** that appears inside
+    data values and gets resolved at load time.
+
+The ``$import`` marker can appear anywhere in the recipe:
 
 - As a **dict value** — the marker is replaced with the snippet content.
 - As a **list element** — the snippet (which must itself be a list) is spliced
@@ -250,6 +259,35 @@ section. Each file's imports are scoped to that file — the same name can be
 used in different files without conflict. Circular imports are detected and
 raise ``ValueError``.
 
+Multi-document snippets
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Dict-valued snippets (e.g., numeric format definitions) can use ``imports``
+directly because the ``imports`` key and the snippet content are both part of
+the same YAML mapping. List-valued snippets have a problem: YAML only allows
+one root node per document, so a file cannot be both a mapping (for
+``imports``) and a list (for entries) at the same time.
+
+The solution is **multi-document YAML**: the first document holds the
+``imports``, and the second document (after ``---``) holds the list content.
+The loader parses both documents, resolves ``$import`` markers in the content,
+and returns the resolved list:
+
+.. code-block:: yaml
+
+    # configs/ptq/fp8_kv.yaml — list snippet that imports a dict snippet
+    imports:
+      fp8: configs/numerics/fp8
+    ---
+    - quantizer_name: '*[kv]_bmm_quantizer'
+      cfg:
+        $import: fp8
+
+This enables full composability — list snippets can reference dict snippets,
+dict snippets can reference other dict snippets, and recipes can reference
+any of them. All import resolution happens at load time with the same
+precedence rules.
+
 Built-in config snippets
 ^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -271,6 +309,8 @@ Reusable snippets are stored under ``modelopt_recipes/configs/``:
      - Disable all quantizers (deny-all-then-configure pattern)
    * - ``configs/ptq/default_disabled_quantizers``
      - Standard exclusions (LM head, routers, BatchNorm, etc.)
+   * - ``configs/ptq/fp8_kv``
+     - FP8 E4M3 KV cache quantization (multi-document, imports ``fp8``)
 
 
 Metadata section
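
The two marker positions described in the guide changes above (dict value and list element) can be illustrated with a minimal Python sketch. This is not the project's loader: the function names, the ``snippets`` mapping of already-loaded snippet content, the ``quant_cfg`` key, the ``disable_all`` data, and the ``KeyError``/``TypeError`` choices are all assumptions made for illustration.

```python
import copy
from typing import Any

IMPORT_KEY = "$import"  # inline directive name taken from the guide


def is_marker(node: Any) -> bool:
    """A marker is a dict whose only key is ``$import``."""
    return isinstance(node, dict) and set(node) == {IMPORT_KEY}


def lookup(name: str, snippets: dict[str, Any]) -> Any:
    """Fetch a copy of a snippet by its short name (hypothetical helper)."""
    if name not in snippets:
        # Error type is an assumption; the guide does not specify it.
        raise KeyError(f"unknown import name: {name!r}")
    return copy.deepcopy(snippets[name])


def resolve(node: Any, snippets: dict[str, Any]) -> Any:
    """Return a copy of ``node`` with every ``$import`` marker resolved.

    ``snippets`` maps names declared under ``imports`` to snippet content
    that is assumed to be fully resolved already.
    """
    if is_marker(node):
        # Dict-value position: the marker is replaced with the snippet content.
        return lookup(node[IMPORT_KEY], snippets)
    if isinstance(node, dict):
        return {key: resolve(value, snippets) for key, value in node.items()}
    if isinstance(node, list):
        out = []
        for item in node:
            if is_marker(item):
                # List-element position: the snippet must be a list and is spliced in.
                snippet = lookup(item[IMPORT_KEY], snippets)
                if not isinstance(snippet, list):
                    raise TypeError("a snippet spliced into a list must be a list")
                out.extend(snippet)
            else:
                out.append(resolve(item, snippets))
        return out
    return node


# Illustrative stand-in data, not the real snippet files.
snippets = {
    "fp8": {"num_bits": "e4m3"},
    "disable_all": [{"quantizer_name": "*"}],
}
recipe = {
    "quant_cfg": [
        {"$import": "disable_all"},
        {"quantizer_name": "*[kv]_bmm_quantizer", "cfg": {"$import": "fp8"}},
    ]
}
resolved = resolve(recipe, snippets)
```

Here the first list entry is spliced from the list snippet, and the ``cfg`` value of the second entry is replaced by the dict snippet. Per-file scoping and circular-import detection, which the guide also describes, are left out of this sketch.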
Lines changed: 15 additions & 0 deletions
@@ -1,2 +1,17 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 # FP8 E4M3 quantizer attributes (no axis — used for KV cache, etc.).
 num_bits: e4m3

modelopt_recipes/configs/numerics/nvfp4_dynamic.yml

Lines changed: 15 additions & 0 deletions
@@ -1,3 +1,18 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 # NVFP4 E2M1 blockwise with dynamic calibration and FP8 E4M3 scales.
 num_bits: e2m1
 block_sizes:

modelopt_recipes/configs/numerics/nvfp4_static.yml

Lines changed: 15 additions & 0 deletions
@@ -1,3 +1,18 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 # NVFP4 E2M1 blockwise with static calibration and FP8 E4M3 scales.
 num_bits: e2m1
 block_sizes:

modelopt_recipes/configs/ptq/base_disable_all.yaml

Lines changed: 15 additions & 0 deletions
@@ -1,3 +1,18 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 # Disable all quantizers by default (deny-all-then-configure pattern).
 
 - quantizer_name: '*'

modelopt_recipes/configs/ptq/default_disabled_quantizers.yaml

Lines changed: 15 additions & 0 deletions
@@ -1,3 +1,18 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 # Standard quantizer exclusions: layers that should not be quantized.
 
 - quantizer_name: '*block_sparse_moe.gate*'

modelopt_recipes/configs/ptq/fp8_kv.yaml

Lines changed: 21 additions & 0 deletions
@@ -1,4 +1,25 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 # FP8 E4M3 KV cache quantization.
+#
+# This snippet uses multi-document YAML (separated by ---) because it is a
+# list-valued snippet that also needs to $import another snippet. YAML only
+# allows one root node per document, so a file cannot be both a mapping
+# (for imports) and a list (for entries). The first document holds the
+# imports, the second holds the list content that references them.
 imports:
   fp8: configs/numerics/fp8
 ---
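
The two-document layout explained in the ``fp8_kv.yaml`` comment can be handled roughly as in the sketch below. It is an assumption-laden illustration, not the repository's loader: ``split_snippet_documents`` is a hypothetical name, the input is taken to be the list of already-parsed YAML documents (for example ``list(yaml.safe_load_all(text))`` with PyYAML), and raising ``ValueError`` for unexpected shapes is a choice made here.

```python
from typing import Any


def split_snippet_documents(docs: list[Any]) -> tuple[dict[str, str], Any]:
    """Separate a snippet file's ``imports`` from its content.

    ``docs`` holds the parsed YAML documents of one file, in file order.
    """
    if len(docs) == 1:
        doc = docs[0]
        if isinstance(doc, dict):
            # Single mapping: ``imports`` sits next to the snippet content.
            content = {k: v for k, v in doc.items() if k != "imports"}
            return dict(doc.get("imports") or {}), content
        # Single non-mapping document (e.g. a plain list): nothing to import.
        return {}, doc
    if len(docs) == 2 and isinstance(docs[0], dict):
        # Two documents: the first holds ``imports``, the second the content.
        return dict(docs[0].get("imports") or {}), docs[1]
    # Assumed behaviour for any other shape.
    raise ValueError("expected one document, or an imports document followed by content")


# Parsed form of the fp8_kv example from the guide (two documents).
docs = [
    {"imports": {"fp8": "configs/numerics/fp8"}},
    [{"quantizer_name": "*[kv]_bmm_quantizer", "cfg": {"$import": "fp8"}}],
]
imports, content = split_snippet_documents(docs)
```

After this split, ``imports`` names the files to load and ``content`` is the list whose ``$import`` markers still need resolving against them.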
