Commit d5bf8ba
fix(qwen35moe): contiguous top_k router ids when host-read (hybrid spark crash) (#481)
Regression from #472: ggml_argsort_top_k returns a strided VIEW into the
full [n_expert, n_tokens] argsort. The hybrid hot/cold path reads the
router outputs back with a raw packed ggml_backend_tensor_get
(qwen35moe_backend.cpp prefill chunk readback), which silently yields
garbage expert ids for every token after the first -> corrupted hot/cold
dispatch -> CUDA illegal memory access on the first multi-token spark
prefill. Decode reads a single row and was unaffected, which is why #472
validation (all-hot + decode) passed.
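The failure mode described above can be reproduced without ggml. The following is a minimal sketch, not the project's code: `View2D`, `read_packed`, `read_strided`, and `make_full_argsort` are hypothetical names invented for illustration. It assumes the top-k ids are exposed as the first `top_k` entries of each `n_expert`-wide row of the full argsort, as the message describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 2-D tensor view: ne0 elements per row, ne1 rows,
// and a row stride (in elements) that may be larger than ne0.
struct View2D {
    const int32_t *data;
    size_t ne0;        // elements visible per row (top_k)
    size_t ne1;        // number of rows (n_tokens)
    size_t row_stride; // elements between row starts in the parent (n_expert)
};

// Packed read: copies ne0 * ne1 consecutive elements from the base pointer and
// ignores row_stride. Only correct when row_stride == ne0.
std::vector<int32_t> read_packed(const View2D &v) {
    return std::vector<int32_t>(v.data, v.data + v.ne0 * v.ne1);
}

// Stride-aware read: steps through the parent buffer one row_stride at a time.
std::vector<int32_t> read_strided(const View2D &v) {
    assert(v.row_stride >= v.ne0);
    std::vector<int32_t> out;
    out.reserve(v.ne0 * v.ne1);
    for (size_t r = 0; r < v.ne1; ++r) {
        const int32_t *row = v.data + r * v.row_stride;
        out.insert(out.end(), row, row + v.ne0);
    }
    return out;
}

// Toy "full argsort": row t holds the ids t, t+1, ... wrapping modulo n_expert.
std::vector<int32_t> make_full_argsort(size_t n_expert, size_t n_tokens) {
    std::vector<int32_t> full(n_expert * n_tokens);
    for (size_t t = 0; t < n_tokens; ++t)
        for (size_t e = 0; e < n_expert; ++e)
            full[t * n_expert + e] = static_cast<int32_t>((t + e) % n_expert);
    return full;
}
```

With `n_expert = 8`, `n_tokens = 3`, `top_k = 2`, the two reads agree on token 0 and diverge from token 1 onward, because the packed read keeps walking token 0's row instead of jumping to the next one. With a single row (the decode case) both reads return the same ids, which matches the observation that decode was unaffected.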
Scoped fix: build_qwen35moe_router(allow_fused_router=false) at the hybrid
export site emits contiguous ggml_top_k ids. The in-graph FFN path keeps
argsort_top_k and its CUDA topk-moe fusion (all-hot output verified
byte-identical, hash 8cda68b0ca5eb797, fused and unfused).
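The effect of the flag at the export site can be sketched in the same toy terms. This is an illustration under stated assumptions, not the body of `build_qwen35moe_router`: `ToyRouterIds`, `toy_build_router_ids`, and `toy_host_readback_packed` are hypothetical, and only the parameter name `allow_fused_router` comes from the commit message. The sketch assumes the unfused path produces ids packed at `top_k` elements per token, so a packed host read is valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical container for router ids plus the layout they are stored in.
struct ToyRouterIds {
    std::vector<int32_t> buffer; // backing storage
    size_t top_k;
    size_t n_tokens;
    size_t row_stride;           // elements between consecutive token rows
};

// Toy analogue of choosing between the strided view and contiguous ids.
ToyRouterIds toy_build_router_ids(const std::vector<int32_t> &full_argsort,
                                  size_t n_expert, size_t n_tokens,
                                  size_t top_k, bool allow_fused_router) {
    assert(top_k <= n_expert);
    assert(full_argsort.size() == n_expert * n_tokens);
    if (allow_fused_router) {
        // View-like layout: ids stay inside the full argsort, stride n_expert.
        return ToyRouterIds{full_argsort, top_k, n_tokens, n_expert};
    }
    // Contiguous layout: gather the first top_k ids of each row, stride top_k.
    std::vector<int32_t> packed;
    packed.reserve(top_k * n_tokens);
    for (size_t t = 0; t < n_tokens; ++t) {
        const int32_t *row = full_argsort.data() + t * n_expert;
        packed.insert(packed.end(), row, row + top_k);
    }
    return ToyRouterIds{std::move(packed), top_k, n_tokens, top_k};
}

// Toy analogue of a raw packed host read: takes top_k * n_tokens consecutive
// elements from the start of the buffer, regardless of row_stride.
std::vector<int32_t> toy_host_readback_packed(const ToyRouterIds &ids) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(ids.top_k * ids.n_tokens);
    return std::vector<int32_t>(ids.buffer.begin(), ids.buffer.begin() + n);
}
```

Passing `allow_fused_router=false` only where the ids are read back on the host keeps the strided layout, and whatever fusion depends on it, available to the in-graph consumer, which is the scoping the message describes.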
Validated on lucebox2 (3090): spark repro 2x clean + coherent (crashed
100% before), all-hot long prefill byte-identical, unit suite 2050
assertions 0 failures.
Co-authored-by: Claude <noreply@anthropic.com>

1 parent 13ac209 commit d5bf8ba
3 files changed
Lines changed: 20 additions & 4 deletions
Diff line content was not preserved. Hunk ranges below are recovered from the surviving line numbers; file names are unknown.

File 1 of 3: @@ -1417,7 +1417,13 @@ (1 deletion, 7 additions)
File 2 of 3: @@ -11,7 +11,8 @@ (1 deletion, 2 additions); @@ -35,7 +36,7 @@ (1 deletion, 1 addition)
File 3 of 3: @@ -15,7 +15,16 @@ (1 deletion, 10 additions)