Commit cdf21c3
fix: switch gb200 sglang mtp recipes to nixl transfer backend
mooncake's MNNVL handshake fails on every prefill->decode transfer on the
GB200 NV cluster, producing 0 output tokens with all requests aborted
(KVTransferError(...): Aborted by AbortReq, ~100K errors/job in run
25785003012). Same dynamo hash + same SGLang container with dep8-dep8
works on GB300, isolating the break to GB200's MNNVL fabric. Switching to
nixl uses dynamo's native transport, which has GB200 NVL4 support.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>1 parent 910d791 commit cdf21c3
7 files changed
Lines changed: 14 additions & 14 deletions
File tree
- benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/8k1k
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
83 | 83 | | |
84 | 84 | | |
85 | 85 | | |
86 | | - | |
| 86 | + | |
87 | 87 | | |
88 | 88 | | |
89 | 89 | | |
| |||
104 | 104 | | |
105 | 105 | | |
106 | 106 | | |
107 | | - | |
| 107 | + | |
108 | 108 | | |
109 | 109 | | |
110 | 110 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
94 | 94 | | |
95 | 95 | | |
96 | 96 | | |
97 | | - | |
| 97 | + | |
98 | 98 | | |
99 | 99 | | |
100 | 100 | | |
| |||
118 | 118 | | |
119 | 119 | | |
120 | 120 | | |
121 | | - | |
| 121 | + | |
122 | 122 | | |
123 | 123 | | |
124 | 124 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
103 | 103 | | |
104 | 104 | | |
105 | 105 | | |
106 | | - | |
| 106 | + | |
107 | 107 | | |
108 | 108 | | |
109 | 109 | | |
| |||
128 | 128 | | |
129 | 129 | | |
130 | 130 | | |
131 | | - | |
| 131 | + | |
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
103 | 103 | | |
104 | 104 | | |
105 | 105 | | |
106 | | - | |
| 106 | + | |
107 | 107 | | |
108 | 108 | | |
109 | 109 | | |
| |||
128 | 128 | | |
129 | 129 | | |
130 | 130 | | |
131 | | - | |
| 131 | + | |
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
103 | 103 | | |
104 | 104 | | |
105 | 105 | | |
106 | | - | |
| 106 | + | |
107 | 107 | | |
108 | 108 | | |
109 | 109 | | |
| |||
128 | 128 | | |
129 | 129 | | |
130 | 130 | | |
131 | | - | |
| 131 | + | |
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
103 | 103 | | |
104 | 104 | | |
105 | 105 | | |
106 | | - | |
| 106 | + | |
107 | 107 | | |
108 | 108 | | |
109 | 109 | | |
| |||
128 | 128 | | |
129 | 129 | | |
130 | 130 | | |
131 | | - | |
| 131 | + | |
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
103 | 103 | | |
104 | 104 | | |
105 | 105 | | |
106 | | - | |
| 106 | + | |
107 | 107 | | |
108 | 108 | | |
109 | 109 | | |
| |||
128 | 128 | | |
129 | 129 | | |
130 | 130 | | |
131 | | - | |
| 131 | + | |
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
| |||
0 commit comments