Commit dc749a9
Fix stale pointer bug in batched MoE GEMM cache
Include data_ptr() values in the init cache key, not just dimensions.
CUTLASS initialize() bakes data pointers into kernel params. When
different callers (module's _forward_batched vs torch op gemm_nvfp4_moe)
use the same dimensions but different buffer addresses, the old cache
incorrectly skipped re-init, causing run() to write to stale pointers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 8bf8759 commit dc749a9
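The failure mode described in the commit message can be sketched in plain Python. This is a minimal illustration, not the code from this commit: `FakeBuffer`, `FakeGemm`, `InitCache`, and the `include_pointers` flag are hypothetical stand-ins, and the only behavior assumed from the message is that `initialize()` captures buffer addresses and `run()` then uses whatever was captured.

```python
class FakeBuffer:
    """Hypothetical stand-in for a tensor; exposes only an address."""

    def __init__(self, address):
        self.address = address

    def data_ptr(self):
        return self.address


class FakeGemm:
    """Hypothetical kernel wrapper: initialize() captures the output address."""

    def __init__(self):
        self.bound_output = None
        self.init_calls = 0

    def initialize(self, a, b, out):
        # Assumption from the commit message: pointers are baked in here.
        self.bound_output = out.data_ptr()
        self.init_calls += 1

    def run(self):
        # Returns the address the kernel would write to.
        return self.bound_output


class InitCache:
    """Skips initialize() when the cache key matches the previous launch."""

    def __init__(self, gemm, include_pointers):
        self.gemm = gemm
        self.include_pointers = include_pointers
        self.key = None

    def launch(self, dims, a, b, out):
        key = dims
        if self.include_pointers:
            # The fix: buffer addresses become part of the key.
            key = dims + (a.data_ptr(), b.data_ptr(), out.data_ptr())
        if key != self.key:
            self.gemm.initialize(a, b, out)
            self.key = key
        return self.gemm.run()


dims = (8, 128, 256, 512)  # illustrative problem shape
a, b = FakeBuffer(0x1000), FakeBuffer(0x2000)
out_module, out_op = FakeBuffer(0x3000), FakeBuffer(0x4000)

# Dimensions-only key: the second caller reuses the first caller's pointers.
old = InitCache(FakeGemm(), include_pointers=False)
old.launch(dims, a, b, out_module)
stale_target = old.launch(dims, a, b, out_op)

# Key with data_ptr() values: the second caller triggers a re-init.
new = InitCache(FakeGemm(), include_pointers=True)
new.launch(dims, a, b, out_module)
fresh_target = new.launch(dims, a, b, out_op)
```

With the dimensions-only key, `stale_target` is still the first caller's output address; with pointers in the key, `fresh_target` is the second caller's buffer, at the cost of one extra `initialize()` call whenever any address changes.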
1 file changed: +6 -1 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 1363 | 1363 | |
| 1364 | 1364 | |
| 1365 | 1365 | |
| 1366 | | - |
| | 1366 | + |
| | 1367 | + |
| | 1368 | + |
| | 1369 | + |
| | 1370 | + |
| | 1371 | + |
| 1367 | 1372 | |
| 1368 | 1373 | |
| 1369 | 1374 | |