
Commit 6d10eaf

cquil11 and claude committed
b200/b300 vllm-agentic: no-offload curves vs new cc-traces 051826

Replaces the cpu-offload-only search-space on both single-node configs with
no-offload curves at the user-requested conc points, against the
freshly-bumped cc-traces-weka-no-subagents-051826 dataset (98 traces,
v5-only + CC ≥ 2.1.139).

B300 (15 shards):
- TP=8 offload=none conc=[1,2,4]
- TP=4 offload=none conc=[1,2,4,8,10,12,16]
- DEP=4 (tp4 ep4 dp-attn) offload=none conc=[16,24,32,40,48]

B200 (14 shards):
- TP=8 offload=none conc=[1,2,4,8,12,16]
- DEP=8 (tp8 ep8 dp-attn) offload=none conc=[12,16,24,32,48,64,96,128]

Dispatched as two separate workflow runs per
[[feedback_separate_b200_b300_runs]] (cascade-cancel hazard if bundled).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 21f71b6 commit 6d10eaf

1 file changed

Lines changed: 13 additions & 11 deletions

.github/configs/nvidia-master.yaml

@@ -1773,11 +1773,12 @@ dsv4-fp4-b200-vllm-agentic:
   agentic-coding:
   - duration: 1800
     search-space:
-    # cpu offload only this iteration — none entries already validated in
-    # earlier runs (B200 25332045030: TP=8 1..32 + DEP=8 16..128 all 100%).
-    # Re-add when investigating regressions in offload=none.
-    - { tp: 8, offloading: cpu, conc-list: [16, 32, 64] }
-    - { tp: 8, ep: 8, dp-attn: true, offloading: cpu, conc-list: [64, 128, 256] }
+    # no-offload curve against the new cc-traces-weka-no-subagents-051826
+    # dataset (98 traces, v5-only + CC ≥ 2.1.139). cpu-offload entries
+    # removed for this iteration; restore from prior commits if revisiting
+    # offload regressions.
+    - { tp: 8, offloading: none, conc-list: [1, 2, 4, 8, 12, 16] }
+    - { tp: 8, ep: 8, dp-attn: true, offloading: none, conc-list: [12, 16, 24, 32, 48, 64, 96, 128] }
 
 dsv4-fp4-b200-trt:
   image: ghcr.io#semianalysisai/trtllm-deepseek-v4:feat-deepseek_v4-9aa3715
@@ -3007,12 +3008,13 @@ dsv4-fp4-b300-vllm-agentic:
   agentic-coding:
   - duration: 1800
     search-space:
-    # cpu offload only this iteration — none entries already validated in
-    # earlier runs. Re-add when investigating regressions in offload=none.
-    - { tp: 4, offloading: cpu, conc-list: [16, 32, 64] }
-    - { tp: 8, offloading: cpu, conc-list: [16, 32, 64] }
-    - { tp: 4, ep: 4, dp-attn: true, offloading: cpu, conc-list: [64, 128, 256] }
-    - { tp: 8, ep: 8, dp-attn: true, offloading: cpu, conc-list: [128, 256, 512] }
+    # no-offload curve against the new cc-traces-weka-no-subagents-051826
+    # dataset (98 traces, v5-only + CC ≥ 2.1.139). cpu-offload entries
+    # removed for this iteration; restore from prior commits if revisiting
+    # offload regressions.
+    - { tp: 8, offloading: none, conc-list: [1, 2, 4] }
+    - { tp: 4, offloading: none, conc-list: [1, 2, 4, 8, 10, 12, 16] }
+    - { tp: 4, ep: 4, dp-attn: true, offloading: none, conc-list: [16, 24, 32, 40, 48] }
 
 dsv4-fp4-b300-trt:
   image: ghcr.io#semianalysisai/trtllm-deepseek-v4:feat-deepseek_v4-9aa3715
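
The shard counts quoted in the commit message (14 for B200, 15 for B300) match the added search-space entries if each (entry, concurrency) pair becomes one shard. The Python sketch below checks that arithmetic. The one-shard-per-pair rule and the `expand_shards` helper are assumptions made for illustration and are not taken from the repository's tooling; the entries themselves are copied from the added lines of the diff.

```python
# Search-space entries copied from the "+" lines of the diff.
B200 = [
    {"tp": 8, "offloading": "none", "conc-list": [1, 2, 4, 8, 12, 16]},
    {"tp": 8, "ep": 8, "dp-attn": True, "offloading": "none",
     "conc-list": [12, 16, 24, 32, 48, 64, 96, 128]},
]
B300 = [
    {"tp": 8, "offloading": "none", "conc-list": [1, 2, 4]},
    {"tp": 4, "offloading": "none", "conc-list": [1, 2, 4, 8, 10, 12, 16]},
    {"tp": 4, "ep": 4, "dp-attn": True, "offloading": "none",
     "conc-list": [16, 24, 32, 40, 48]},
]


def expand_shards(search_space):
    """Hypothetical helper: emit one shard per (entry, concurrency) pair.

    This expansion rule is an assumption inferred from the shard counts in
    the commit message, not the project's actual matrix generator.
    """
    shards = []
    for entry in search_space:
        # Everything except the concurrency list is shared by the entry's shards.
        base = {key: value for key, value in entry.items() if key != "conc-list"}
        for conc in entry["conc-list"]:
            shards.append({**base, "conc": conc})
    return shards


if __name__ == "__main__":
    print("B200 shards:", len(expand_shards(B200)))  # 6 + 8 = 14
    print("B300 shards:", len(expand_shards(B300)))  # 3 + 7 + 5 = 15
```

Under that assumption the totals agree with the commit message, which gives a quick sanity check when the conc-list values are edited again.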
