Commit cdf21c3

Oseltamivir and claude committed
fix: switch gb200 sglang mtp recipes to nixl transfer backend

mooncake's MNNVL handshake fails on every prefill->decode transfer on the GB200 NV cluster, producing 0 output tokens with all requests aborted (KVTransferError(...): Aborted by AbortReq, ~100K errors/job in run 25785003012).

Same dynamo hash + same SGLang container with dep8-dep8 works on GB300, isolating the break to GB200's MNNVL fabric. Switching to nixl uses dynamo's native transport, which has GB200 NVL4 support.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

1 parent 910d791 commit cdf21c3

7 files changed

Lines changed: 14 additions & 14 deletions
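
Because the commit makes the same two-line change in each of the seven recipes, a small check can confirm that no prefill or decode block is left on the old backend. The following is a minimal sketch, not part of the commit: the function names, the `SAMPLE` text, and its `prefill:`/`decode:` nesting are hypothetical, and the scan assumes each `disaggregation-transfer-backend` line follows the `disaggregation-mode` line it belongs to, as in the hunks of this diff.

```python
import re

# Match lines such as `disaggregation-mode: "prefill"` and
# `disaggregation-transfer-backend: nixl`, allowing optional quotes and comments.
MODE_RE = re.compile(
    r"^\s*disaggregation-mode:\s*[\"']?([\w-]+)[\"']?\s*(?:#.*)?$")
BACKEND_RE = re.compile(
    r"^\s*disaggregation-transfer-backend:\s*[\"']?([\w-]+)[\"']?\s*(?:#.*)?$")


def backends_by_mode(recipe_text):
    """Map each disaggregation-mode to the transfer backend listed after it."""
    result = {}
    mode = None
    for line in recipe_text.splitlines():
        mode_match = MODE_RE.match(line)
        if mode_match:
            mode = mode_match.group(1)
            continue
        backend_match = BACKEND_RE.match(line)
        if backend_match and mode is not None:
            result[mode] = backend_match.group(1)
    return result


def stale_modes(recipe_text, expected="nixl"):
    """Return the modes whose backend differs from `expected`, sorted."""
    found = backends_by_mode(recipe_text)
    return sorted(m for m, backend in found.items() if backend != expected)


# Hypothetical recipe fragment: the nesting under `backend:` is assumed,
# and the decode block is deliberately left on the old backend.
SAMPLE = """\
backend:
  prefill:
    disaggregation-mode: "prefill"
    disaggregation-transfer-backend: nixl
    tensor-parallel-size: 8
  decode:
    disaggregation-mode: "decode"
    disaggregation-transfer-backend: mooncake
    tensor-parallel-size: 8
"""

if __name__ == "__main__":
    print(backends_by_mode(SAMPLE))
    print(stale_modes(SAMPLE))
```

In practice the same functions could be run over the text of each `disagg-gb200-*-mtp.yaml` file; an empty list from `stale_modes` would indicate that both roles use nixl.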

benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb200-low-latency-1p1d-tp8-tp8-mtp.yaml

Lines changed: 2 additions & 2 deletions
@@ -83,7 +83,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "prefill"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 1
@@ -104,7 +104,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "decode"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 1

benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb200-low-latency-1p6d-dep8-tp8-mtp.yaml

Lines changed: 2 additions & 2 deletions
@@ -94,7 +94,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "prefill"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8
@@ -118,7 +118,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "decode"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 1

benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb200-mid-curve-1p1d-dep8-dep16-mtp.yaml

Lines changed: 2 additions & 2 deletions
@@ -103,7 +103,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "prefill"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8
@@ -128,7 +128,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "decode"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 16
 data-parallel-size: 16

benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb200-mid-curve-1p1d-dep8-dep8-mtp.yaml

Lines changed: 2 additions & 2 deletions
@@ -103,7 +103,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "prefill"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8
@@ -128,7 +128,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "decode"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8

benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb200-mid-curve-4p1d-dep8-dep8-mtp-c8192.yaml

Lines changed: 2 additions & 2 deletions
@@ -103,7 +103,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "prefill"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8
@@ -128,7 +128,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "decode"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8

benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb200-mid-curve-5p1d-dep8-dep8-mtp-c12288.yaml

Lines changed: 2 additions & 2 deletions
@@ -103,7 +103,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "prefill"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8
@@ -128,7 +128,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "decode"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8

benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k/disagg-gb200-mid-curve-6p1d-dep8-dep8-mtp-c16384.yaml

Lines changed: 2 additions & 2 deletions
@@ -103,7 +103,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "prefill"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8
@@ -128,7 +128,7 @@ backend:
 tool-call-parser: deepseekv4 # gates dsv4 chat-encoding spec.

 disaggregation-mode: "decode"
-disaggregation-transfer-backend: mooncake
+disaggregation-transfer-backend: nixl

 tensor-parallel-size: 8
 data-parallel-size: 8
