Commit c4e397d
glm5-fp8-mi355x-sglang-disagg: bump to v0.5.12.post1 image and patch DSA state-index path
amd-master.yaml
- Image: rocm/sgl-dev:sglang-0.5.9-rocm720-mi35x-mori-0402
-> lmsysorg/sglang-rocm:v0.5.12.post1-rocm720-mi35x-20260523
(matches qwen3.5-fp8-mi355x-sglang-disagg; the older 0.5.9 image is
no longer the reference build for hybrid-attention disagg models on
MI355X.)
- Scenarios: collapse the four legacy "top/middle/bottom/small-scale"
search-spaces per ISL into a single 1P+1D TP=8 EP=1 dp-attn=false
entry with the standard conc-list [8, 16, 32, 64, 128, 256, 512]
for both 1k1k and 8k1k. dp-attn=false avoids the
fused_moe_triton/layer.py:209 shared-slot assertion that
--enable-dp-attention + --moe-a2a-backend mori triggers for GLM-5
(256 routed + 1 shared expert; (256-1) % 8 = 7 != 0). The collapsed
layout mirrors the qwen3.5-fp8-mi355x-sglang-disagg shape so the
same CI matrix-expansion logic applies to both.
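
The shared-slot arithmetic and the collapsed search space described above can be checked with a few lines of Python. This is only a sketch, not the assertion in fused_moe_triton/layer.py. The variable names are hypothetical, reading the divisor 8 as a parallel group size is an assumption, and expanding to one run per (ISL/OSL, concurrency) pair is an assumption about the CI matrix. Only the numbers themselves come from the commit message.

```python
# Numbers taken from the commit message; all names are hypothetical.
num_routed_experts = 256
num_shared_experts = 1
group_size = 8  # assumption: the "8" in "(256-1) % 8"

# The expression quoted in the commit message: (256 - 1) % 8.
remainder = (num_routed_experts - num_shared_experts) % group_size
print(remainder)  # non-zero, so the divisibility condition fails

# Standard conc-list: powers of two from 8 through 512.
conc_list = [2**k for k in range(3, 10)]

# Assumed expansion: one run per (ISL/OSL pair, concurrency) combination.
scenarios = [(seq, conc) for seq in ("1k1k", "8k1k") for conc in conc_list]
print(conc_list)
print(len(scenarios))
```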
patches/mori_conn.py
- Add patch #4: rank + length normalization in
MoriKVReceiver._send_swa_dsa_state, immediately before the
group_concurrent_contiguous call. For GLM-5 (single DSA component),
upstream hands dst_state_indices as a 2-D (1, N) array while
src_state_indices is 1-D length 1; the existing [:common_len]
slice operates only on the outer axis, leaving the rank mismatched.
np.diff then produces (1, N-1) vs (0,), which can't broadcast and
crashes with "operands could not be broadcast together with shapes
(1,12) (0,)". The fix ravels both index arrays to 1-D and re-truncates
them to their common length so that np.diff returns compatible 1-D
arrays. A one-shot flag limits the warning log to once per receiver class.
- Verified end-to-end:
glm5-fp8-mi355x-sglang-disagg gsm8k flexible-extract = 0.9704 +/- 0.0047
glm5-fp8-mi355x-sglang-disagg gsm8k strict-match = 0.9712 +/- 0.0046
qwen3.5-fp8-mi355x-sglang-disagg gsm8k (regression) = 0.9780 +/- 0.004
Patch #4 fires zero times on the Qwen3.5 Mamba path (it lives
inside _send_swa_dsa_state, which is never called for Mamba); the
behavior of patches #1-#3 is unchanged.
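
The rank mismatch that patch #4 addresses can be reproduced with NumPy alone. The following is a minimal sketch under stated assumptions, not the patched code: normalize_state_indices is a hypothetical helper, the index values are made up, and the `|` comparison only stands in for whatever group_concurrent_contiguous does with the np.diff results. The shapes (1, N) with N = 13 and length 1 match the ones in the commit message.

```python
import numpy as np

def normalize_state_indices(src, dst):
    """Hypothetical helper: ravel both arrays to 1-D, then truncate to common length."""
    src = np.asarray(src).ravel()
    dst = np.asarray(dst).ravel()
    common_len = min(len(src), len(dst))
    return src[:common_len], dst[:common_len]

# Made-up indices with the shapes from the commit message.
src_state_indices = np.array([5])                  # 1-D, length 1
dst_state_indices = np.arange(13).reshape(1, 13)   # 2-D, shape (1, 13)

# Before the fix: slicing by common length only touches the outer axis.
common_len = min(len(src_state_indices), len(dst_state_indices))
src_old = src_state_indices[:common_len]
dst_old = dst_state_indices[:common_len]
print(np.diff(dst_old).shape, np.diff(src_old).shape)

broadcast_failed = False
try:
    # Stand-in for combining the two diffs elementwise.
    (np.diff(dst_old) != 1) | (np.diff(src_old) != 1)
except ValueError as exc:
    broadcast_failed = True
    print("ValueError:", exc)

# After the fix: both arrays are 1-D and equally long.
src_new, dst_new = normalize_state_indices(src_state_indices, dst_state_indices)
mask = (np.diff(dst_new) != 1) | (np.diff(src_new) != 1)
print(src_new.shape, dst_new.shape, mask.shape)
```

Raveling before truncation is what matters here: truncating first would leave the 2-D array at shape (1, 13), because `[:common_len]` slices only the outer axis.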
patches/README.md
- Document patch #4 alongside the existing three. Cross-link the full
bug analysis at scripts/sglang_disagg/docs_glm5/01-bug-analysis.md
and the gsm8k verification at
scripts/sglang_disagg/docs_glm5/02-fix-and-verification.md.
1 parent 688ebe6 commit c4e397d
3 files changed
Lines changed: 50 additions & 107 deletions