
Commit 2cd1d01

Oseltamivir, seungrokj, and claude authored
[AMD] retrigger dsv4-fp4-mi355x-atom benchmark sweep (#1817)
* [AMD] dsv4-fp4-mi355x-atom: enable DPA TBO at high concurrency, update image to atom0.1.4
  - Enable --enable-tbo for ISL=1024/OSL=1024 at CONC>=1024 and ISL=8192/OSL=1024 at CONC>=256
  - Update image to atom0.1.4_20260612
  - Update ISL=8192 search-space to start at conc=4 and use DPA from conc=128
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] perf-changelog: dsv4-fp4-mi355x-atom DPA TBO + image atom0.1.4
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] perf-changelog: add PR link #1717
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4_fp4_mi355x_atom.sh: disable prefix caching
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4-fp4-mi355x-atom: add max-model-len, eval context, extend conc range
  - Pass --max-model-len to server using SERVE_MAX_MODEL_LEN
  - Add EVAL_ONLY path: compute eval context length via compute_eval_context_length
  - Extend conc-end to 8192 (isl=1024) and 4096 (isl=8192) in amd-master.yaml
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4-fp4-mi355x-atom: narrow eval to single conc=1024 point, disable max-model-len
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4_fp4_mi355x_atom.sh: add cudagraph-capture-sizes and max-num-seqs
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4-fp4-mi355x-atom: bump to nightly image, expand search space, enable max-model-len
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] set GPU_MAX_HW_QUEUES=5 in dsv4_fp4_mi355x_atom.sh
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4-fp4-mi355x-atom: disable TBO, add TP4 rows for isl=8192, cap conc ranges
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4_fp4_mi355x_atom.sh: quote SERVER_LOG variable
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4_fp4_mi355x_atom.sh: comment out dense cudagraph sizes
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4_fp4_mi355x_atom.sh: fix --hf-overrides JSON escaping
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4_fp4_mi355x_atom.sh: comment out dense cudagraph sizes
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4-fp4-mi355x-atom: expand search space, restore isl=1024 rows
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] perf-changelog: update dsv4-fp4-mi355x-atom image and search-space description
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] dsv4_fp4_mi355x_atom.sh: restore sparse cudagraph capture sizes
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] perf-changelog: revert dsv4-fp4-mi355x-atom image/search-space, remove stale entries
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [AMD] perf-changelog: add dsv4-fp4-mi355x-sglang entry for PR #1762
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* update dsv4-fp4-mi355x-atom: bump image, enable TBO conditionally, fix mem frac
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* expand dsv4-fp4-mi355x-atom search space: restore ISL1024 scenarios, add TP4/TP8 conc lists for ISL8192
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Update perf-changelog.yaml
* Update perf-changelog.yaml
* Update perf-changelog.yaml
* Update perf-changelog.yaml
* update perf-changelog: move dsv4-fp4-mi355x-atom entry to end
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* narrow dsv4-fp4-mi355x-atom to DPA conc=256-2048 ISL8192, fix TBO branch override
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* restore full dsv4-fp4-mi355x-atom search space: ISL1024 + ISL8192 TP4/TP8/DPA
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: retrigger dsv4 atom benchmark sweep

---------

Co-authored-by: seungrokj <seungrok.jung@amd.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: seungrokj <144636725+seungrokj@users.noreply.github.com>
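
Note: the concurrency thresholds quoted in the first bullet of the commit message can be read as a small lookup rule. The sketch below is illustrative only and is not taken from dsv4_fp4_mi355x_atom.sh: the helper name `tbo_flags` and the table `TBO_MIN_CONC` are hypothetical, and later bullets disable and then conditionally re-enable TBO, so the merged script may use different thresholds.

```python
# Hypothetical sketch of the "--enable-tbo at high concurrency" rule.
# Thresholds come from the first bullet of the commit message; later bullets
# disable and re-enable TBO, so the merged script may behave differently.
TBO_MIN_CONC = {
    (1024, 1024): 1024,  # ISL=1024/OSL=1024: TBO at CONC>=1024
    (8192, 1024): 256,   # ISL=8192/OSL=1024: TBO at CONC>=256
}


def tbo_flags(isl: int, osl: int, conc: int) -> list[str]:
    """Return the extra server flags for one benchmark point."""
    threshold = TBO_MIN_CONC.get((isl, osl))
    if threshold is not None and conc >= threshold:
        return ["--enable-tbo"]
    # Unknown ISL/OSL pairs and low-concurrency points get no extra flag.
    return []
```

For instance, `tbo_flags(8192, 1024, 128)` yields no flag, while the same scenario at conc=256 yields `["--enable-tbo"]`.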
1 parent 60bf726 commit 2cd1d01

1 file changed

Lines changed: 8 additions & 0 deletions

perf-changelog.yaml

@@ -3935,3 +3935,11 @@
     - "Update ISL=8192 search-space: TP8-only from conc=4-64, DPA from conc=128-1024 (previously conc=1-64 and DPA conc=64-512)"
     - "Update Applied TBO on high concurrencies"
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1717
+
+- config-keys:
+    - dsv4-fp4-mi355x-atom
+  description:
+    - "Update image to rocm/atom:rocm7.2.4_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom0.1.4_20260612"
+    - "Update ISL=8192 search-space: TP8-only from conc=4-64, DPA from conc=128-1024 (previously conc=1-64 and DPA conc=64-512)"
+    - "Update Applied TBO on high concurrencies"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1717
