Commit a9c8ccf
[Quantization] Address review feedback round 3 on FP8 sweep
Three changes from realAsma's latest review:
- nvfp4_fp8_sweep kernel: use ``scale_safe`` rather than ``scale`` in the
per-candidate diff so the divisor and multiplier match. Numerically
equivalent on real inputs (the only case where ``scale_safe`` differs
from ``scale`` is ``global_amax == 0``, in which case ``w_abs`` is also
zero so the loss is zero either way), but more consistent.
- Extract ``fp8_scale_candidates`` to a triton-free module
``_fp8_scale_candidates.py`` so the calibrator's reference sweep and the
Triton kernel wrapper share one definition. Removes the duplicate copy
in ``NVFP4MSECalibrator._generate_candidates``.
- Parity test: extend ``test_parity_random_weights`` to cover bf16 and
fp16 in addition to fp32 by parametrizing on dtype, so the canonical
parity grid (3 seeds × 3 num_blocks) is now exercised on every supported
dtype. Folded the smaller ``test_parity_dtypes`` into this since it was
a strict subset.
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
1 parent 95b8a95 commit a9c8ccf
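The equivalence claimed in the first bullet can be checked with a small sketch. This is not the Triton kernel: ``candidate_loss`` is a hypothetical helper, the guard value ``1.0`` for a zero scale is an assumption, and ``round`` stands in for the real FP4 quantization step. It only illustrates why using ``scale_safe`` or ``scale`` as the multiplier gives the same loss.

```python
def candidate_loss(w_abs, scale, use_safe_multiplier):
    """Squared-error loss for one candidate scale (illustrative only)."""
    # Assumption: scale_safe swaps a zero scale for 1.0 so the division is defined.
    scale_safe = scale if scale != 0.0 else 1.0
    multiplier = scale_safe if use_safe_multiplier else scale
    loss = 0.0
    for w in w_abs:
        q = round(w / scale_safe)   # stand-in for the real FP4 rounding
        diff = w - q * multiplier   # dequantize and compare with the input
        loss += diff * diff
    return loss


# Non-zero scale: scale_safe == scale, so both variants agree.
weights = [0.3, 1.2, 2.7]
same_nonzero = candidate_loss(weights, 0.5, True) == candidate_loss(weights, 0.5, False)

# Zero scale (global_amax == 0): every weight is zero, so q is zero and the
# multiplier never contributes to the loss.
zeros = [0.0, 0.0, 0.0]
same_zero = candidate_loss(zeros, 0.0, True) == candidate_loss(zeros, 0.0, False)
```

Both comparisons hold, so the change affects consistency of the code rather than its numerical result.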
4 files changed
Lines changed: 48 additions & 43 deletions
File tree
- modelopt/torch
- kernels/quantization/gemm
- quantization/calib
- tests/gpu/torch/quantization
Lines changed: 31 additions & 0 deletions
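The body of the new ``_fp8_scale_candidates.py`` is not visible on this page. As a rough illustration of what a dependency-free candidate helper could look like, the sketch below enumerates the positive finite FP8 E4M3 values in pure Python. The names ``e4m3_positive_values`` and ``scale_candidates`` are hypothetical, and the mapping from FP8 values to scales (a fraction of ``global_amax``) is an assumption, not the actual definition of ``fp8_scale_candidates``.

```python
def e4m3_positive_values():
    """Decode every positive finite FP8 E4M3 (fn variant) bit pattern."""
    values = []
    for code in range(1, 127):  # code 0 is zero, code 127 (0x7F) is NaN
        exponent, mantissa = code >> 3, code & 0x7
        if exponent == 0:
            # Subnormal: no implicit leading one, fixed exponent of 2**-6.
            values.append((mantissa / 8.0) * 2.0 ** -6)
        else:
            # Normal: implicit leading one, exponent bias of 7.
            values.append((1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7))
    return values


def scale_candidates(global_amax):
    """Assumed mapping: each FP8 value as a fraction of global_amax."""
    fp8_max = 448.0  # largest finite E4M3 value
    return [global_amax * v / fp8_max for v in e4m3_positive_values()]


candidates = scale_candidates(2.0)
```

Keeping a helper like this free of Triton imports lets both a reference sweep and a kernel wrapper import it, which is the sharing the commit message describes.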
Lines changed: 4 additions & 11 deletions
Lines changed: 8 additions & 22 deletions