
Commit a258f90

cquil11 and claude committed
benchmarks(agentic): drop conc=96,128 from b200 dsv4 vllm agentic sweep
Removes the two highest-concurrency points from the tp=8/ep=8/dp-attn=true row in dsv4-fp4-b200-vllm-agentic. Sweep now caps at conc=64 for the EP row; tp=8 plain row already caps at 16. b300 sibling unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
1 parent 8a267b7 commit a258f90

1 file changed

Lines changed: 1 addition & 1 deletion

.github/configs/nvidia-master.yaml

@@ -1778,7 +1778,7 @@ dsv4-fp4-b200-vllm-agentic:
  # removed for this iteration; restore from prior commits if revisiting
  # offload regressions.
  - { tp: 8, offloading: none, conc-list: [1, 2, 4, 8, 12, 16] }
- - { tp: 8, ep: 8, dp-attn: true, offloading: none, conc-list: [12, 16, 24, 32, 48, 64, 96, 128] }
+ - { tp: 8, ep: 8, dp-attn: true, offloading: none, conc-list: [12, 16, 24, 32, 48, 64] }

  dsv4-fp4-b200-trt:
  image: ghcr.io#semianalysisai/trtllm-deepseek-v4:feat-deepseek_v4-9aa3715
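
To make the effect of the change concrete, here is a minimal Python sketch that expands the two rows into individual sweep points before and after the commit. The row values are copied from the diff above. The dict layout and the `sweep_points` helper are assumptions made for illustration and are not taken from the benchmark harness.

```python
# Rows copied from the dsv4-fp4-b200-vllm-agentic entry in the diff.
# The dict layout is an illustrative assumption, not the harness's data model.
PLAIN_ROW = {"tp": 8, "offloading": "none", "conc-list": [1, 2, 4, 8, 12, 16]}
EP_ROW_BEFORE = {
    "tp": 8, "ep": 8, "dp-attn": True, "offloading": "none",
    "conc-list": [12, 16, 24, 32, 48, 64, 96, 128],
}
EP_ROW_AFTER = {**EP_ROW_BEFORE, "conc-list": [12, 16, 24, 32, 48, 64]}


def sweep_points(rows):
    """Hypothetical helper: one (tp, ep, dp_attn, conc) tuple per concurrency."""
    return [
        (row["tp"], row.get("ep"), row.get("dp-attn", False), conc)
        for row in rows
        for conc in row["conc-list"]
    ]


before = sweep_points([PLAIN_ROW, EP_ROW_BEFORE])
after = sweep_points([PLAIN_ROW, EP_ROW_AFTER])
dropped = sorted(set(before) - set(after))

print(len(before), len(after))                  # 14 12
print([point[3] for point in dropped])          # [96, 128]
print(max(EP_ROW_AFTER["conc-list"]))           # 64
print(max(PLAIN_ROW["conc-list"]))              # 16
```

Under these assumptions the sweep shrinks from 14 points to 12, and the only points removed are conc=96 and conc=128 on the EP row, which matches the commit message.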
