Commit afb49f5

perf-changelog: re-run DSR1 SGLang agg configs (B200/B300, FP8/FP4, no-MTP/MTP) (#1502)
* perf-changelog: re-run DSR1 SGLang agg configs to pick up tokenizer fix

  Re-runs DSR1 SGLang agg configs on B200/B300 (FP8/FP4, no-MTP/MTP) to pick up the tokenizer fix from #1381.

* perf-changelog: set PR link to #1502

* launcher(b300-nv): drop nodelist pinning; restore perf-changelog entries

  - runners/launch_b300-nv.sh: remove --nodelist=b300-[001-006,008-012,017-020] from salloc so jobs can land on any healthy B300 node.
  - perf-changelog.yaml: restore ~18 entries that were unintentionally dropped during a prior rebase; net effect of this branch is now just the new DSR1 SGLang agg re-run entry.

* perf-changelog: sync to origin/main and append DSR1 re-run entry at end

* Update perf-changelog.yaml

---------

Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
1 parent cd5f6dd commit afb49f5

2 files changed

Lines changed: 12 additions & 3 deletions

perf-changelog.yaml

Lines changed: 11 additions & 0 deletions
@@ -2997,3 +2997,14 @@
   description:
     - "Add MTP/EAGLE speculative-decoding sibling for dsr1-fp8-h200-sglang (model: deepseek-ai/DeepSeek-R1-0528) on lmsysorg/sglang:v0.5.12-cu130 — TP=8, EP=1, search-space conc 4..64 on 1k1k + 8k1k"
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1523
+
+- config-keys:
+    - dsr1-fp8-b200-sglang
+    - dsr1-fp8-b300-sglang
+    - dsr1-fp4-b200-sglang
+    - dsr1-fp4-b300-sglang
+    - dsr1-fp8-b200-sglang-mtp
+    - dsr1-fp8-b300-sglang-mtp
+  description:
+    - "Re-run DSR1 SGLang agg configs (B200/B300, FP8/FP4, no-MTP/MTP) — picks up tokenizer fix from #1381"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1502

runners/launch_b300-nv.sh

Lines changed: 1 addition & 3 deletions
@@ -331,9 +331,7 @@ else
 fi
 )
 
-# Pin to one of the known-good B300 nodes; others have hardware/network
-# issues that cause benchmarks to hang or fail to start.
-salloc --partition=$SLURM_PARTITION --account=$SLURM_ACCOUNT --nodelist=b300-[001-006,008-012,017-020] -N 1 --gres=gpu:$TP --exclusive --time=180 --no-shell --job-name="$RUNNER_NAME"
+salloc --partition=$SLURM_PARTITION --account=$SLURM_ACCOUNT -N 1 --gres=gpu:$TP --exclusive --time=180 --no-shell --job-name="$RUNNER_NAME"
 JOB_ID=$(squeue --name="$RUNNER_NAME" -u "$USER" -h -o %A | head -n1)
 
 srun --jobid=$JOB_ID \
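
The change above removes the hard-coded node pin from the salloc call. As a rough illustration only, and not something this commit contains, the pin could instead be made opt-in so that it can be restored without editing the script. The function name `allocate_b300_node` and the variables `B300_NODELIST` and `SALLOC_BIN` are hypothetical names introduced for this sketch; the salloc flags are the ones already used in the launcher.

```shell
# Hypothetical wrapper, NOT part of this commit.
# Assumed names: allocate_b300_node, B300_NODELIST, SALLOC_BIN.
# SLURM_PARTITION, SLURM_ACCOUNT, TP and RUNNER_NAME are the variables
# the launcher already uses.
allocate_b300_node() {
    # Start the argument list with the flags that are always passed.
    set -- "--partition=${SLURM_PARTITION}" "--account=${SLURM_ACCOUNT}"

    # Optional pinning: when B300_NODELIST is unset or empty, Slurm is
    # free to pick any node in the partition.
    if [ -n "${B300_NODELIST:-}" ]; then
        set -- "$@" "--nodelist=${B300_NODELIST}"
    fi

    set -- "$@" -N 1 "--gres=gpu:${TP}" --exclusive --time=180 --no-shell \
        "--job-name=${RUNNER_NAME}"

    # SALLOC_BIN defaults to salloc; setting it to echo gives a dry run.
    "${SALLOC_BIN:-salloc}" "$@"
}
```

Running with `SALLOC_BIN=echo` prints the assembled arguments instead of requesting an allocation, which allows the flag list to be checked on a machine without Slurm. Setting `B300_NODELIST='b300-[001-006,008-012,017-020]'` would reproduce the pinning that this commit removes.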
