Commit 3fcc5a7
Fix non-scalar input amax in preprocess_linear_fusion for MoE export
preprocess_linear_fusion unconditionally asserts
`modules[0].input_quantizer.amax.numel() == 1`, which breaks for NVFP4
quantization when the model has per-expert-decomposed MoE linears
(gate_proj/up_proj pairs per expert). NVFP4's per-channel input quantizer
produces a vector amax, not a scalar, so the assertion trips immediately
on the first expert during `export_hf_checkpoint()`.
Root cause: the function was written assuming fused linears have per-tensor
scalar input amax. That's true for dense FP8/INT8 paths but false for
NVFP4's per-channel activation statistics, which modelopt's own
NVFP4_AWQ_FULL_CFG produces.
This change:
- Keeps the existing scalar-amax path (dense + FP8/INT8 unchanged)
- Adds a non-scalar path using elementwise max (`.amax(dim=0)`) across the
stacked per-channel amax tensors of the modules being fused
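A minimal sketch of the two paths described above, not the actual patch. The helper name `unify_input_amax` is hypothetical, the scalar branch is an assumption about what the existing path does, and the non-scalar branch assumes all amax tensors share one shape:

```python
import torch


def unify_input_amax(amaxes):
    """Hypothetical helper: one shared input amax for linears about to be fused."""
    if all(a.numel() == 1 for a in amaxes):
        # Scalar (per-tensor) path: keep the largest scalar amax.
        # Assumed to mirror the pre-existing behavior.
        return torch.max(torch.stack([a.reshape(()) for a in amaxes]))
    # Non-scalar path: stack the same-shaped amax tensors along a new
    # leading dimension and take the elementwise maximum across modules.
    return torch.stack(amaxes).amax(dim=0)


gate_amax = torch.tensor([1.0, 4.0, 2.0])
up_amax = torch.tensor([3.0, 1.0, 2.0])
shared = unify_input_amax([gate_amax, up_amax])
```

In this sketch `shared` has the same shape as each input, so it can be assigned back to every module's input quantizer without changing the layout.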
Numerical correctness for the MoE case: the modules being fused here
(e.g. gate_proj and up_proj of one expert) consume the *same* input
tensor by construction, so their per-channel input amax tensors are
expected to be identical and elementwise max is a no-op. If they ever
differ slightly (e.g. due to floating-point accumulation), elementwise
max is still the correct unification rule, because the shared amax then
covers the range observed by every fused module.
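The argument can be checked on toy values. This is only an illustration with made-up amax vectors, not data from the model; the variable names are hypothetical:

```python
import torch

# Made-up per-channel amax for two linears that see the same input.
gate_amax = torch.tensor([0.5, 2.0, 1.25])
up_amax = gate_amax.clone()

# Identical inputs: the elementwise max returns the same values.
merged = torch.stack([gate_amax, up_amax]).amax(dim=0)
assert torch.equal(merged, gate_amax)

# Small drift in one channel: the merged amax bounds both tensors.
up_drift = up_amax + torch.tensor([0.0, 1e-3, 0.0])
merged_drift = torch.stack([gate_amax, up_drift]).amax(dim=0)
assert torch.all(merged_drift >= gate_amax)
assert torch.all(merged_drift >= up_drift)
```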
Validated end-to-end on SuperGemma4 26B (128-expert MoE) with
NVFP4_AWQ_FULL_CFG; export now completes and the serialized checkpoint
loads + generates correctly. Before: export failed with
`AssertionError: Only support scalar input quant amax` after 2h 24min of
successful calibration.
Signed-off-by: AEON-7 <m2vgz48wpp@privaterelay.appleid.com>
1 parent c9b1155 commit 3fcc5a7
1 file changed, +15 -5 lines changed
Diff hunk (line contents not preserved): original lines 1378-1382 removed, new lines 1378-1392 added; context lines 1375-1377 and 1383-1385 (new 1393-1395) unchanged.