Commit fa28004

cquil11 and claude committed
add dsv4-fp4-gb300-cw-dynamo-vllm-agentic — CoreWeave sibling config
Mirrors dsv4-fp4-gb300-dynamo-vllm-agentic 1:1 except that the runner switches from gb300-nv to gb300-cw. Same image, search space (conc 32/192/4096), and recipe files.

Recipe sharing works because launch_gb300-cw.sh already has the IS_AGENTIC overlay branch (mirroring launch_gb300-nv.sh) that copies benchmarks/multi_node/srt-slurm-recipes/vllm/deepseek-v4/agentic into the srt-slurm clone.

This is a separate config (not a runner-label widening on the -nv entry) so that NV and CW can be dispatched as independent sweep runs. Bundling SKUs in one `gh workflow run` causes fault cascades, per [[feedback_separate_b200_b300_runs]].

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
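The "1:1 except runner" claim is easy to check mechanically. The following is a minimal sketch, not part of the commit: `differing_keys` is a hypothetical helper, and the two dicts are trimmed, hand-written stand-ins for the -nv and -cw entries (the -nv values are assumed from the statement that the configs mirror each other, not read from nvidia-master.yaml).

```python
def differing_keys(a: dict, b: dict, prefix: str = "") -> list[str]:
    """Return dotted paths of keys whose values differ between two nested dicts."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        if key not in a or key not in b:
            # Key exists on one side only.
            diffs.append(path)
        elif isinstance(a[key], dict) and isinstance(b[key], dict):
            # Recurse into nested mappings.
            diffs.extend(differing_keys(a[key], b[key], prefix=f"{path}."))
        elif a[key] != b[key]:
            diffs.append(path)
    return diffs


# ASSUMPTION: trimmed stand-ins for the two config entries, typed by hand.
nv_entry = {
    "image": "vllm/vllm-openai:v0.21.0-ubuntu2404",
    "model": "deepseek-ai/DeepSeek-V4-Pro",
    "runner": "gb300-nv",
    "precision": "fp4",
    "framework": "dynamo-vllm",
    "scenarios": {"agentic-coding": [{"duration": 1800}]},
}
cw_entry = {**nv_entry, "runner": "gb300-cw"}

print(differing_keys(nv_entry, cw_entry))  # prints ['runner']
```

Running the same comparison against the two real entries, once loaded from the YAML file, would flag any drift between the siblings if one is edited without the other.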
1 parent f8b85c9 commit fa28004

1 file changed

Lines changed: 68 additions & 0 deletions

.github/configs/nvidia-master.yaml

@@ -8781,6 +8781,74 @@ dsv4-fp4-gb300-dynamo-vllm-agentic:
           ep: 8
           dp-attn: true
 
+# CoreWeave sibling of dsv4-fp4-gb300-dynamo-vllm-agentic — same image,
+# recipes, and search space; only `runner` differs (gb300-cw vs gb300-nv).
+# Kept as a separate config (not a label-widening on the -nv entry)
+# because we dispatch NV and CW as independent sweep runs — bundling
+# both SKUs into one `gh workflow run` invocation lets a fault on one
+# cascade-cancel the other (see prior R20–R23 outages). The two sibling
+# configs share recipe files via the same launch_gb300-cw.sh IS_AGENTIC
+# overlay (recipes/vllm/deepseek-v4/agentic/), so a change to the recipe
+# applies to both clusters with no duplication.
+dsv4-fp4-gb300-cw-dynamo-vllm-agentic:
+  image: vllm/vllm-openai:v0.21.0-ubuntu2404
+  model: deepseek-ai/DeepSeek-V4-Pro
+  model-prefix: dsv4
+  runner: gb300-cw
+  precision: fp4
+  framework: dynamo-vllm
+  multinode: true
+  disagg: true
+  scenarios:
+    agentic-coding:
+    - duration: 1800
+      search-space:
+      # Low-latency: 1p6d at conc=32.
+      - spec-decoding: none
+        conc-list: [32]
+        prefill:
+          num-worker: 1
+          tp: 4
+          ep: 4
+          dp-attn: true
+          additional-settings:
+          - "CONFIG_FILE=recipes/vllm/deepseek-v4/agentic/disagg-gb300-1p6d-dep4-tp4-agentic.yaml"
+        decode:
+          num-worker: 6
+          tp: 4
+          ep: 1
+          dp-attn: false
+      # Mid: 1p6d at conc=192.
+      - spec-decoding: none
+        conc-list: [192]
+        prefill:
+          num-worker: 1
+          tp: 4
+          ep: 4
+          dp-attn: true
+          additional-settings:
+          - "CONFIG_FILE=recipes/vllm/deepseek-v4/agentic/disagg-gb300-1p6d-dep4-tp4-agentic.yaml"
+        decode:
+          num-worker: 6
+          tp: 4
+          ep: 1
+          dp-attn: false
+      # High-throughput: 4p1d at conc=4096.
+      - spec-decoding: none
+        conc-list: [4096]
+        prefill:
+          num-worker: 4
+          tp: 4
+          ep: 4
+          dp-attn: true
+          additional-settings:
+          - "CONFIG_FILE=recipes/vllm/deepseek-v4/agentic/disagg-gb300-4p1d-dep4-dep8-24-c4096-agentic.yaml"
+        decode:
+          num-worker: 1
+          tp: 8
+          ep: 8
+          dp-attn: true
+
 dsv4-fp4-gb300-dynamo-sglang:
   image: lmsysorg/sglang-staging:deepseek-v4-grace-blackwell-dev
   model: deepseek-ai/DeepSeek-V4-Pro
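
To compare the footprint of the three search-space points in the new entry, the sketch below totals GPUs per point. It is an illustration only: `Role` and `gpus_needed` are hypothetical names, and it assumes each worker occupies exactly `tp` GPUs, which the commit does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Worker count and per-worker tp for one side (prefill or decode)."""
    num_worker: int
    tp: int


def gpus_needed(prefill: Role, decode: Role) -> int:
    # ASSUMPTION: one worker uses exactly `tp` GPUs, with no sharing between workers.
    return prefill.num_worker * prefill.tp + decode.num_worker * decode.tp


# Values copied from the search-space entries in the diff, keyed by concurrency.
search_space = {
    32: (Role(num_worker=1, tp=4), Role(num_worker=6, tp=4)),    # 1p6d
    192: (Role(num_worker=1, tp=4), Role(num_worker=6, tp=4)),   # 1p6d
    4096: (Role(num_worker=4, tp=4), Role(num_worker=1, tp=8)),  # 4p1d
}

for conc, (prefill, decode) in search_space.items():
    print(f"conc={conc}: {gpus_needed(prefill, decode)} GPUs")
```

Under that assumption the two 1p6d points need 28 GPUs each and the 4p1d point needs 24. The latter matches the `24` in the 4p1d recipe file name, though the commit does not say that is what the number means.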
