Commit 8eec0d4

cquil11 and claude committed
benchmarks(single_node): move fixed-seq-len scripts into fixed_seq_len/ subdir
Match the existing benchmarks/single_node/agentic/ split: all 111 non-agentic per-cluster launch scripts move into benchmarks/single_node/fixed_seq_len/. chat_templates/ stays at single_node/chat_templates/ as a shared resource (referenced by both agentic and fixed_seq_len scripts).

Plumbing:
- .github/workflows/benchmark-tmpl.yml + benchmark-multinode-tmpl.yml: SCENARIO_SUBDIR default flips from '' to 'fixed_seq_len/'.
- runners/launch_mi355x-amds.sh: parameter-expansion fallback also defaults to fixed_seq_len/ so direct invocations (without the workflow setting SCENARIO_SUBDIR) still resolve.
- Each moved script's `source "$(dirname \"$0\")/../benchmark_lib.sh"` becomes `../../benchmark_lib.sh`.
- dsv4_fp4_mi355x_sglang.sh's --chat-template path becomes `../chat_templates/...` (matches the agentic copy's pattern).
- .github/configs/{nvidia,amd}-master.yaml: forward-looking comments repath to fixed_seq_len/.

perf-changelog.yaml historical entries left untouched (they describe paths at the time of the change).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
1 parent 4be3ef0 commit 8eec0d4
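
The parameter-expansion fallback mentioned in the commit message can be sketched roughly as follows. This is a minimal illustration, not the contents of runners/launch_mi355x-amds.sh: the function name `resolve_script_path` is hypothetical, and the choice of the `:-` form (which also falls back when the variable is set but empty) is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a SCENARIO_SUBDIR fallback; the real runner may differ.

# Build the path of a single-node launch script. When SCENARIO_SUBDIR is
# unset or empty, default to fixed_seq_len/ so direct invocations still
# resolve to the moved scripts.
resolve_script_path() {
  local script_name="$1"
  local subdir="${SCENARIO_SUBDIR:-fixed_seq_len/}"
  printf 'benchmarks/single_node/%s%s\n' "$subdir" "$script_name"
}

# Example call; the result depends on whether SCENARIO_SUBDIR is exported.
resolve_script_path "dsr1_fp4_mi355x.sh"
```

With the `:-` form, an empty SCENARIO_SUBDIR (the old workflow default) would also resolve into fixed_seq_len/, which seems to match the intent of the move.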

116 files changed

Lines changed: 120 additions & 120 deletions

.github/configs/amd-master.yaml

Lines changed: 1 addition & 1 deletion
@@ -1796,7 +1796,7 @@ dsv4-fp4-mi355x-sglang-agentic:
 # vLLM with AITER MLA decode for DSv4 on MI355X (vllm-project/vllm#40889,
 # stacked on #40871). Uses the ATOM MI355X image (ROCm 7.2.2, aiter with
 # MLA decode, MI355X GPU detection); vLLM is rebuilt from the PR branch
-# at runtime by benchmarks/single_node/dsv4_fp8_mi355x_vllm.sh at a
+# at runtime by benchmarks/single_node/fixed_seq_len/dsv4_fp8_mi355x_vllm.sh at a
 # pinned SHA. Once both PRs merge into a release, switch to a vLLM ROCm
 # MI355X image and remove the build step.
 dsv4-fp8-mi355x-vllm:

.github/configs/nvidia-master.yaml

Lines changed: 3 additions & 3 deletions
@@ -1704,7 +1704,7 @@ dsv4-fp4-b200-sglang:
 framework: sglang
 multinode: false
 # Two recipes from https://docs.sglang.io/cookbook/autoregressive/DeepSeek/DeepSeek-V4
-# are selected inside benchmarks/single_node/dsv4_fp4_b200.sh by DP_ATTENTION:
+# are selected inside benchmarks/single_node/fixed_seq_len/dsv4_fp4_b200.sh by DP_ATTENTION:
 # low-latency (DP_ATTENTION=false): TP-only, flashinfer_mxfp4
 # DP-attention (DP_ATTENTION=true): DP-attn + DeepEP + mega_moe opts
 # The DP-attention recipe covers both "balanced" (conc 64-128) and
@@ -1998,7 +1998,7 @@ dsv4-fp4-b300-sglang:
 framework: sglang
 multinode: false
 # Three recipes from https://docs.sglang.io/cookbook/autoregressive/DeepSeek/DeepSeek-V4
-# are selected inside benchmarks/single_node/dsv4_fp4_b300_sglang.sh by CONC:
+# are selected inside benchmarks/single_node/fixed_seq_len/dsv4_fp4_b300_sglang.sh by CONC:
 # low-latency (CONC <= 32): TP-only
 # balanced (32 < CONC <= 128): + DP-attn
 # max-throughput (CONC > 128): + DP-attn
@@ -2024,7 +2024,7 @@ dsv4-fp4-b300-sglang:
 - { tp: 8, ep: 8, dp-attn: true, conc-start: 4096, conc-end: 4096 }

 # DeepSeek-V4-Pro on B300 with EAGLE/MTP speculative decoding. Recipe is
-# selected inside benchmarks/single_node/dsv4_fp4_b300_sglang_mtp.sh by
+# selected inside benchmarks/single_node/fixed_seq_len/dsv4_fp4_b300_sglang_mtp.sh by
 # DP_ATTENTION:
 # dp-attn: false -> TP-only + flashinfer_mxfp4 + chunked-prefill 8192
 # + EAGLE (3,1,4) + mem-fraction 0.90

.github/workflows/benchmark-multinode-tmpl.yml

Lines changed: 1 addition & 1 deletion
@@ -139,7 +139,7 @@ env:
 EVAL_ONLY: ${{ inputs.eval-only }}
 EVAL_CONC: ${{ inputs.eval-conc }}
 SCENARIO_TYPE: ${{ inputs.scenario-type }}
-SCENARIO_SUBDIR: ${{ inputs.scenario-type == 'agentic-coding' && 'agentic/' || '' }}
+SCENARIO_SUBDIR: ${{ inputs.scenario-type == 'agentic-coding' && 'agentic/' || 'fixed_seq_len/' }}
 IS_AGENTIC: ${{ inputs.scenario-type == 'agentic-coding' && '1' || '0' }}
 CONC: ${{ inputs.conc }}
 DURATION: ${{ inputs.duration }}

.github/workflows/benchmark-tmpl.yml

Lines changed: 1 addition & 1 deletion
@@ -109,7 +109,7 @@ env:
 RUN_EVAL: ${{ inputs.run-eval }}
 EVAL_ONLY: ${{ inputs.eval-only }}
 SCENARIO_TYPE: ${{ inputs.scenario-type }}
-SCENARIO_SUBDIR: ${{ inputs.scenario-type == 'agentic-coding' && 'agentic/' || '' }}
+SCENARIO_SUBDIR: ${{ inputs.scenario-type == 'agentic-coding' && 'agentic/' || 'fixed_seq_len/' }}
 IS_AGENTIC: ${{ inputs.scenario-type == 'agentic-coding' && '1' || '0' }}
 OFFLOADING: ${{ inputs.offloading }}
 TOTAL_CPU_DRAM_GB: ${{ inputs.total-cpu-dram-gb }}
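
Both workflow templates use the same `cond && 'a' || 'b'` expression to pick the subdirectory. A rough shell equivalent of the selection after this commit is sketched below; the function name `scenario_subdir` is hypothetical and exists only for illustration, while the two scenario values come from the hunk above.

```shell
#!/usr/bin/env bash
# Hypothetical shell rendering of the SCENARIO_SUBDIR expression in the
# workflow templates; not code from the repository.

# Map a scenario type to the subdirectory under benchmarks/single_node/.
scenario_subdir() {
  local scenario_type="$1"
  if [ "$scenario_type" = "agentic-coding" ]; then
    printf 'agentic/\n'
  else
    # Before this commit the fallback was the empty string.
    printf 'fixed_seq_len/\n'
  fi
}

scenario_subdir "agentic-coding"
scenario_subdir "some-other-scenario"
```

Any scenario type other than 'agentic-coding' takes the fallback branch, so every non-agentic run resolves to fixed_seq_len/.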

benchmarks/single_node/dsr1_fp4_b200.sh renamed to benchmarks/single_node/fixed_seq_len/dsr1_fp4_b200.sh

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash

-source "$(dirname "$0")/../benchmark_lib.sh"
+source "$(dirname "$0")/../../benchmark_lib.sh"

 check_env_vars \
 MODEL \
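
The extra `../` follows from the script now sitting one directory deeper while benchmark_lib.sh stays where it was. The sketch below rebuilds that layout in a throwaway directory to show the resolution; the function `run_demo`, the file demo.sh, and the stand-in library contents are all hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative only: mirrors the directory layout implied by the diff in a
# temporary tree. Nothing here is taken from the real benchmark_lib.sh.

run_demo() {
  local root status
  root="$(mktemp -d)"
  mkdir -p "$root/benchmarks/single_node/fixed_seq_len"

  # Stand-in for the shared library at benchmarks/benchmark_lib.sh.
  printf 'lib_loaded() { echo "benchmark_lib loaded"; }\n' \
    > "$root/benchmarks/benchmark_lib.sh"

  # Stand-in for a moved launch script: two levels up from fixed_seq_len/
  # is benchmarks/, where the library lives.
  cat > "$root/benchmarks/single_node/fixed_seq_len/demo.sh" <<'EOF'
#!/usr/bin/env bash
source "$(dirname "$0")/../../benchmark_lib.sh"
lib_loaded
EOF

  bash "$root/benchmarks/single_node/fixed_seq_len/demo.sh"
  status=$?
  rm -rf "$root"
  return "$status"
}

run_demo
```

Because the path is built from `dirname "$0"`, the source line resolves relative to the script's own location rather than the caller's working directory.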

benchmarks/single_node/dsr1_fp4_b200_trt.sh renamed to benchmarks/single_node/fixed_seq_len/dsr1_fp4_b200_trt.sh

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash

-source "$(dirname "$0")/../benchmark_lib.sh"
+source "$(dirname "$0")/../../benchmark_lib.sh"

 check_env_vars \
 MODEL \

benchmarks/single_node/dsr1_fp4_b200_trt_mtp.sh renamed to benchmarks/single_node/fixed_seq_len/dsr1_fp4_b200_trt_mtp.sh

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash

-source "$(dirname "$0")/../benchmark_lib.sh"
+source "$(dirname "$0")/../../benchmark_lib.sh"

 check_env_vars \
 MODEL \

benchmarks/single_node/dsr1_fp4_b300.sh renamed to benchmarks/single_node/fixed_seq_len/dsr1_fp4_b300.sh

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 # does not have a B300-specific recipe, so this script reuses the existing
 # DSR1 FP4 B200 SGLang recipe as-is until B300-specific tuning is available.

-source "$(dirname "$0")/../benchmark_lib.sh"
+source "$(dirname "$0")/../../benchmark_lib.sh"

 check_env_vars \
 MODEL \

benchmarks/single_node/dsr1_fp4_mi355x.sh renamed to benchmarks/single_node/fixed_seq_len/dsr1_fp4_mi355x.sh

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash

-source "$(dirname "$0")/../benchmark_lib.sh"
+source "$(dirname "$0")/../../benchmark_lib.sh"

 check_env_vars \
 MODEL \

benchmarks/single_node/dsr1_fp4_mi355x_atom.sh renamed to benchmarks/single_node/fixed_seq_len/dsr1_fp4_mi355x_atom.sh

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash

-source "$(dirname "$0")/../benchmark_lib.sh"
+source "$(dirname "$0")/../../benchmark_lib.sh"

 check_env_vars \
 MODEL \
