Commit ada1e26
[NVBug: 6000530] Fix AWQ crash for uncalibrated MoE experts (#1142)
## Summary
- Fixes NVBugs 6000530: `AttributeError: 'float' object has no attribute
'pow'` when running AWQ lite with `moe_calib_experts_ratio < 1.0` on MoE
models (e.g. Qwen3-30B-A3B).
- **Root cause**: When `moe_calib_experts_ratio` is below 1.0 (e.g. `0.5`),
  some MoE experts receive zero tokens during the AWQ cache phase, leaving
  `act_scale` as a Python float `0.0` instead of a tensor. This causes two
  failures:
  1. **Search phase crash**: Uncalibrated experts crash in `get_scale()`
     because `float` has no `pow()` method.
  2. **Export crash**: Calibrated experts have a `pre_quant_scale` but
     uncalibrated ones don't, so `torch.stack()` fails on the mixed
     `None`/tensor values in `preprocess_linear_fusion()`.
- **Fix**: Handle uncalibrated experts (`num_cache_steps == 0`) in two
  stages:
  1. **Before search**: Disable AWQ search (`is_enabled = False`) to
     prevent the `get_scale()` crash on a float `act_scale`.
  2. **During postprocessing**: Max-calibrate the weights and apply a
     neutral (all-ones) `pre_quant_scale` so export can stack scaling
     factors consistently across all experts. The `pre_quant_scale` buffer
     must be registered outside `enable_weight_access_and_writeback`
     because HF accelerate's `post_forward` hook drops newly registered
     submodule buffers.
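
The two failure modes and the two-stage fix can be illustrated with a minimal, self-contained sketch. This is not the code from the commit: the experts are modelled as plain dictionaries, and `get_scale_sketch`, `disable_search_if_uncalibrated`, `postprocess`, and `stack_pre_quant_scales` are hypothetical helpers written for illustration. The per-tensor `amax` is likewise a simplifying assumption standing in for the real max calibration.

```python
import torch

def get_scale_sketch(act_scale, alpha=0.5):
    # Hypothetical stand-in for the real get_scale(); only the .pow() call matters.
    return act_scale.pow(alpha)

def reproduce_failures():
    """Return (search_error, stack_failed) for an uncalibrated expert."""
    try:
        get_scale_sketch(0.0)  # act_scale left as a Python float
        search_error = None
    except AttributeError as exc:
        search_error = str(exc)
    try:
        torch.stack([torch.ones(4), None])  # mixed tensor/None scales
        stack_failed = False
    except (TypeError, RuntimeError):
        stack_failed = True
    return search_error, stack_failed

def disable_search_if_uncalibrated(expert):
    # Stage 1 (before search): skip AWQ search for experts that saw no tokens.
    if expert["num_cache_steps"] == 0:
        expert["is_enabled"] = False

def postprocess(expert):
    # Stage 2 (postprocessing): max-calibrate and attach a neutral scale.
    if expert["num_cache_steps"] == 0:
        weight = expert["weight"]  # assumed layout: [out_features, in_features]
        expert["amax"] = weight.abs().amax()  # simplified per-tensor max
        expert["pre_quant_scale"] = torch.ones(weight.shape[1], dtype=weight.dtype)

def stack_pre_quant_scales(experts):
    for expert in experts:
        disable_search_if_uncalibrated(expert)
        postprocess(expert)
    # Every expert now holds a tensor, so stacking succeeds.
    return torch.stack([e["pre_quant_scale"] for e in experts])

experts = [
    {"num_cache_steps": 8, "is_enabled": True, "weight": torch.randn(3, 4),
     "pre_quant_scale": torch.full((4,), 0.5)},
    {"num_cache_steps": 0, "is_enabled": True, "weight": torch.randn(3, 4),
     "pre_quant_scale": None},
]
stacked = stack_pre_quant_scales(experts)
```

An all-ones scale is neutral because multiplying activations by it leaves them unchanged, so the uncalibrated expert behaves as if no AWQ pre-scaling had been applied while still exposing the same attributes as its calibrated siblings.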
## Test plan
- [x] Reproduce with `Qwen/Qwen3-30B-A3B`, `--qformat int4_awq`,
`--moe_calib_experts_ratio 0.5` — verify no crash during calibration and
export
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 74a8694 commit ada1e26
1 file changed
Lines changed: 37 additions & 9 deletions
Diff content was not captured; only the changed line ranges survive.

| Change | Original lines removed | New lines added |
|---|---|---|
| 1 | none | 1182-1192 (11 lines) |
| 2 | 1215-1221 (7 lines) | 1226-1241 (16 lines) |
| 3 | 1223-1224 (2 lines) | 1243-1252 (10 lines) |