Commit 2bc8a54
[Quantization] Shrink FP8 sweep parity matrix from 27 to 12 cases
Trim the parity grid to keep all three axes but with smaller per-axis
ranges: 2 seeds × 2 num_blocks × 3 dtypes = 12 parametrized cases (down
from 3×3×3 = 27). Still exercises every supported dtype and the small/
large num_blocks extremes that drive different autotune choices, while
roughly halving the cold-compile cost on hosts where Triton compilation
is expensive.
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
1 parent 8f04a9a commit 2bc8a54
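To make the case-count arithmetic in the message concrete, here is a minimal sketch of the two grids as cross products. The axis values (seed numbers, `num_blocks` sizes, dtype names) and the helper `grid_cases` are hypothetical: the commit message gives only the per-axis sizes, not the values, and this is not the test file's actual code.

```python
from itertools import product

# Hypothetical axis values: the commit message states only the axis sizes
# (3x3x3 before, 2x2x3 after), not the values themselves.
OLD_GRID = {
    "seed": [0, 1, 2],
    "num_blocks": [1, 8, 64],
    "dtype": ["float32", "float16", "bfloat16"],
}
NEW_GRID = {
    "seed": [0, 1],
    "num_blocks": [1, 64],  # keep only the small and large extremes
    "dtype": ["float32", "float16", "bfloat16"],  # dtype axis is unchanged
}


def grid_cases(grid):
    """Return the full cross product of the axes as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


old_cases = grid_cases(OLD_GRID)
new_cases = grid_cases(NEW_GRID)
print(len(old_cases), len(new_cases))
```

Stacking one `pytest.mark.parametrize` decorator per axis yields the same cross product, so trimming an axis list shrinks the number of collected cases multiplicatively.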
1 file changed
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 86 | 86 | |
| 87 | 87 | |
| 88 | 88 | |
| 89 | | - |
| 90 | | - |
| | 89 | + |
| | 90 | + |
| 91 | 91 | |
| 92 | 92 | |
| 93 | 93 | |