Commit 9f69cd0

committed
README
Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>
1 parent 65b291d commit 9f69cd0

1 file changed

Lines changed: 17 additions & 0 deletions

@@ -0,0 +1,17 @@
# PTQ Config Units

Reusable building blocks for composing PTQ quantization configurations.
Each file defines one or more `quant_cfg` entries that can be imported
into recipes or presets via `$import`.

Units are **not** standalone configs: they don't have `algorithm` or
`metadata`. They are meant to be composed into complete configs by
recipes (under `general/` or `models/`) or presets (under `presets/`).

| File | Description |
|------|-------------|
| `base_disable_all.yaml` | Deny-all entry: disables all quantizers as the first step |
| `default_disabled_quantizers.yaml` | Standard exclusions (LM head, routers, BatchNorm, etc.) |
| `fp8_kv.yaml` | FP8 E4M3 KV cache quantizer entry |
| `w8a8_fp8_fp8.yaml` | FP8 weight + activation quantizer entries (W8A8) |
| `w4a4_nvfp4_nvfp4.yaml` | NVFP4 weight + activation quantizer entries (W4A4) |
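
The way units might stack into a complete `quant_cfg` can be pictured with a small Python sketch. Almost everything in it is an assumption made for illustration: the entry shape (a glob `pattern`, an `enable` flag, an optional `format`), the `compose` and `resolve` helpers, the example patterns, and the "last matching entry wins" rule are hypothetical and are not taken from the README or from the actual loader. Only the unit names come from the table above.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory stand-ins for three of the unit files in the table.
# The real YAML schema is not shown in the README; this entry shape is assumed.
UNITS = {
    "base_disable_all": [{"pattern": "*", "enable": False}],
    "w8a8_fp8_fp8": [
        {"pattern": "*weight_quantizer", "enable": True, "format": "fp8"},
        {"pattern": "*input_quantizer", "enable": True, "format": "fp8"},
    ],
    "default_disabled_quantizers": [{"pattern": "*lm_head*", "enable": False}],
}


def compose(unit_names):
    """Concatenate the entries of the named units, preserving their order."""
    quant_cfg = []
    for name in unit_names:
        quant_cfg.extend(UNITS[name])
    return quant_cfg


def resolve(quant_cfg, quantizer_name):
    """Return the last entry whose pattern matches (assumed precedence rule)."""
    result = None
    for entry in quant_cfg:
        if fnmatchcase(quantizer_name, entry["pattern"]):
            result = entry
    return result


# A recipe-like composition: deny everything first, enable FP8 weights and
# activations, then re-apply the standard exclusions.
recipe = compose(["base_disable_all", "w8a8_fp8_fp8", "default_disabled_quantizers"])
```

Under these assumptions, `resolve(recipe, "layers.0.mlp.weight_quantizer")` yields the enabled FP8 entry, while `resolve(recipe, "lm_head.weight_quantizer")` yields the disabling entry because the exclusion is listed last. This is one reading of why a deny-all unit would come "as the first step": later units then only need to state what to turn on.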
