Commit 688ebe6
qwen3.5-fp8-mi355x-sglang-disagg: bump image and disable dp-attn
Two YAML changes for this config row:
* image: lmsysorg/sglang-rocm:v0.5.11-rocm700-mi35x-20260511
-> lmsysorg/sglang-rocm:v0.5.12.post1-rocm720-mi35x-20260523
Brings this entry onto the same rocm720 / mi35x lineage that every
other mi355x sglang config in this file already uses; the image is
also proven to support disagg (it matches the rocm720 base of
dsr1-fp4-mi355x-sglang-disagg).
* 8k1k row: prefill.dp-attn / decode.dp-attn true -> false
With --enable-dp-attention + --moe-a2a-backend mori, sglang
auto-promotes moe_ep_size=tp_size=8 (log line: "MoRI MoE is
enabled. The expert parallel size is adjusted to be the same as
the tensor parallel size[8]"). is_deepep_class_backend() does NOT
include MoRI, so num_shared_slots stays at the global value (1)
rather than the per-rank num_fused_shared_experts*moe_ep_size = 8,
and the assertion
(num_experts - num_shared_slots) % self.moe_ep_size == 0
in fused_moe_triton/layer.py fires for Qwen3.5 (512 routed +
1 shared, ep=8): (512 - 1) % 8 = 7. Setting dp-attn=false leaves
moe_ep_size=1, so (512 - 1) % 1 = 0 always.
The 1k1k row was already at dp-attn=false; this aligns the 8k1k
row with it. The comment block above each row records the dependency
on the upstream sglang fix (add MoRI to is_deepep_class_backend()
or reconcile the shared-slot accounting); flip back to dp-attn=true
once that lands.
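The divisibility arithmetic above can be checked with a minimal sketch. It only re-evaluates the quoted assertion expression using the numbers given in this message; routed_expert_remainder is a hypothetical helper, not sglang code. The third case assumes num_experts stays at 512 when the per-rank slot count (8) is used, which this commit does not verify.

```python
def routed_expert_remainder(num_experts: int, num_shared_slots: int, moe_ep_size: int) -> int:
    """Return the remainder that the quoted assertion requires to be zero.

    Hypothetical helper: it mirrors the expression
    (num_experts - num_shared_slots) % moe_ep_size from the commit message.
    """
    return (num_experts - num_shared_slots) % moe_ep_size


# Value quoted in the commit message for Qwen3.5.
NUM_EXPERTS = 512

cases = {
    # dp-attn=true with MoRI: ep promoted to 8, shared slots left at the global value 1.
    "ep=8, shared_slots=1": routed_expert_remainder(NUM_EXPERTS, 1, 8),
    # dp-attn=false: ep stays at 1, so any count divides evenly.
    "ep=1, shared_slots=1": routed_expert_remainder(NUM_EXPERTS, 1, 1),
    # Assumption: per-rank slot accounting (8) with num_experts unchanged.
    "ep=8, shared_slots=8": routed_expert_remainder(NUM_EXPERTS, 8, 8),
}

for label, remainder in cases.items():
    status = "assertion passes" if remainder == 0 else "assertion fires"
    print(f"{label}: remainder {remainder} -> {status}")
```

Only the first case leaves a nonzero remainder, which matches the failure described for the 8k1k row before this change.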
Together with the MoRI conn.py overlay (commit <SHA-1>), the CI
matrix for this entry passes:
  smoke benchmark, 1k1k 1P+1D TP=8/EP=1 dp-attn=false, conc 8..256:
    request_throughput   0.85 -> 7.64 req/s
    output_throughput    787 -> 7042 tok/s
  smoke benchmark, 8k1k same topology, conc 8..256:
    request_throughput   0.84 -> 7.09 req/s
    output_throughput    774 -> 6537 tok/s
    total_throughput     6884 -> 58818 tok/s
  accuracy (gsm8k 5-shot, conc=128, 8k1k):
    exact_match (strict) 0.978 +/- 0.004 PASS
    exact_match (flex)   0.978 +/- 0.004 PASS
(conc=512 stalls in MoRI's high-concurrency tail-deadlock; tracked
separately, distinct from the registration/state-type bugs.)
1 parent 48e459b commit 688ebe6
1 file changed
Lines changed: 23 additions & 5 deletions