[AMD] retrigger dsv4-fp4-mi355x-atom benchmark sweep #1817

Merged

Oseltamivir merged 34 commits into main from amd/retrigger-dsv4-atom-sweep

Jun 18, 2026

Oseltamivir commented Jun 18, 2026 (edited by cursor Bot)

Collaborator

Summary

Re-appends the existing benchmark trigger entry for PR #1717 ("[AMD] dsv4-fp4-mi355x-atom: enable DPA at high concurrency, update image to atom0.1.4") at the end of perf-changelog.yaml.
Requests a fresh sweep for dsv4-fp4-mi355x-atom with the same image and search-space description.

Validation

Parsed perf-changelog.yaml with PyYAML.
Confirmed the diff is append-only and file mode remains 100644.
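
For reviewers who want to repeat the validation, the two checks above can be sketched as follows. This is a minimal sketch, not the script that was actually run: `is_append_only` and `git_mode` are hypothetical helper names, and the PyYAML parse step (for example `yaml.safe_load`) is left out so the snippet needs only the standard library.

```python
import stat


def is_append_only(old_text: str, new_text: str) -> bool:
    """Return True when new_text is old_text plus extra content at the end."""
    return len(new_text) > len(old_text) and new_text.startswith(old_text)


def git_mode(st_mode: int) -> str:
    """Map an os.stat() st_mode to the mode git records for a regular file.

    Git only tracks whether the owner-executable bit is set, so a regular
    file is stored as either 100755 or 100644.
    """
    perms = stat.S_IMODE(st_mode)
    return "100755" if perms & stat.S_IXUSR else "100644"
```

In practice `old_text` would be the contents of perf-changelog.yaml on main, `new_text` the contents on the PR branch, and `git_mode(os.stat("perf-changelog.yaml").st_mode)` would be expected to return `"100644"`.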

Note

Low Risk
Append-only changelog metadata for CI orchestration; no runtime or config logic changes.

Overview
Retriggers the dsv4-fp4-mi355x-atom benchmark sweep by appending a new block at the end of perf-changelog.yaml. Sweep selection is driven by the changelog diff vs main, so a fresh append is enough to kick off another run.

The new entry is a duplicate of the existing PR #1717 changelog entry (same config-keys, description bullets, and pr-link). It does not change .github/configs, launch scripts, or search-space YAML; it only documents the re-run intent for reviewers.
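
To illustrate why a duplicate append is enough, here is a rough sketch of diff-driven selection. It assumes the changelog parses to a flat list of entries carrying the `config-keys`, `description`, and `pr-link` fields named above; the list shape, the function names, and the placeholder values are assumptions for illustration, not the repo's actual CI code.

```python
from typing import Any

Entry = dict[str, Any]


def appended_entries(base: list[Entry], head: list[Entry]) -> list[Entry]:
    """Return the entries present at the end of head but not in base."""
    if head[: len(base)] != base:
        raise ValueError("changelog is not append-only relative to base")
    return head[len(base):]


def configs_to_sweep(base: list[Entry], head: list[Entry]) -> list[str]:
    """Collect config keys from appended entries, keeping first-seen order."""
    keys: list[str] = []
    for entry in appended_entries(base, head):
        for key in entry.get("config-keys", []):
            if key not in keys:
                keys.append(key)
    return keys


# Hypothetical entry; the description and pr-link values are placeholders.
entry_1717 = {
    "config-keys": ["dsv4-fp4-mi355x-atom"],
    "description": ["placeholder bullet"],
    "pr-link": "placeholder link to PR #1717",
}
main_changelog = [entry_1717]
branch_changelog = main_changelog + [dict(entry_1717)]  # duplicate appended
selected = configs_to_sweep(main_changelog, branch_changelog)
```

Because selection looks only at what was appended, a byte-for-byte copy of an older entry still selects its config keys for a new run.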

The described benchmark context (unchanged by this PR) is DeepSeek-V4 FP4 on MI355X ATOM: image rocm/atom:…atom0.1.4_20260612, ISL=8192 search-space updates (TP8 conc 4–64, DPA conc 128–1024), and TBO at high concurrency.
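
As a reading aid, the ISL=8192 split described above can be written as a small lookup. The ranges come from the overview text; the doubling step between concurrency points and all names here are assumptions, since the actual search-space YAML is not shown in this PR.

```python
from typing import Optional

# (mode, conc_start, conc_end) for ISL=8192, as stated in the overview.
ISL8192_RANGES = [("TP8", 4, 64), ("DPA", 128, 1024)]


def mode_for_conc(conc: int) -> Optional[str]:
    """Return the parallelism mode covering conc, or None if uncovered."""
    for mode, start, end in ISL8192_RANGES:
        if start <= conc <= end:
            return mode
    return None


def doubling_points(start: int, end: int) -> list[int]:
    """Assumption: sweep points double from start up to and including end."""
    points = []
    conc = start
    while conc <= end:
        points.append(conc)
        conc *= 2
    return points
```

Under the doubling assumption, `doubling_points(4, 64)` yields five TP8 points and `doubling_points(128, 1024)` yields four DPA points.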

Reviewed by Cursor Bugbot for commit dbc4d69. Bugbot is set up for automated code reviews on this repo.

seungrokj and others added 30 commits

June 12, 2026 21:38


          [AMD] dsv4-fp4-mi355x-atom: enable DPA TBO at high concurrency, update image to atom0.1.4

31b4fbe

- Enable --enable-tbo for ISL=1024/OSL=1024 at CONC>=1024 and ISL=8192/OSL=1024 at CONC>=256
- Update image to atom0.1.4_20260612
- Update ISL=8192 search-space to start at conc=4 and use DPA from conc=128

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
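
The TBO rule in this commit message amounts to a per-scenario concurrency threshold. The sketch below restates it in Python for clarity; the launch script itself is a shell script, the function names are hypothetical, and later commits in this list disable and re-enable TBO, so the thresholds reflect this commit message only.

```python
# Minimum concurrency at which --enable-tbo is passed, keyed by (ISL, OSL).
TBO_MIN_CONC = {
    (1024, 1024): 1024,
    (8192, 1024): 256,
}


def should_enable_tbo(isl: int, osl: int, conc: int) -> bool:
    """Return True when the scenario meets its TBO concurrency threshold."""
    min_conc = TBO_MIN_CONC.get((isl, osl))
    return min_conc is not None and conc >= min_conc


def tbo_args(isl: int, osl: int, conc: int) -> list[str]:
    """Return the extra server arguments for this scenario (possibly none)."""
    return ["--enable-tbo"] if should_enable_tbo(isl, osl, conc) else []
```

Scenarios not listed in the table never get the flag, which matches the commit message naming only these two ISL/OSL pairs.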


          [AMD] perf-changelog: dsv4-fp4-mi355x-atom DPA TBO + image atom0.1.4

c566e28

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] perf-changelog: add PR link #1717

7e1aa06

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4_fp4_mi355x_atom.sh: disable prefix caching

65e0fa3

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4-fp4-mi355x-atom: add max-model-len, eval context, extend conc range

3f3560b

- Pass --max-model-len to server using SERVE_MAX_MODEL_LEN
- Add EVAL_ONLY path: compute eval context length via compute_eval_context_length
- Extend conc-end to 8192 (isl=1024) and 4096 (isl=8192) in amd-master.yaml

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
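
One way to read the first bullet is that the flag is derived from an environment variable. The sketch below assumes the flag is omitted when `SERVE_MAX_MODEL_LEN` is unset or empty; that behavior and the function name are assumptions, and the real logic lives in the shell launch script, which is not shown here.

```python
from collections.abc import Mapping


def max_model_len_args(env: Mapping[str, str]) -> list[str]:
    """Build the --max-model-len arguments from SERVE_MAX_MODEL_LEN.

    Assumption: an unset or empty variable means the flag is not passed.
    """
    value = env.get("SERVE_MAX_MODEL_LEN", "").strip()
    if not value:
        return []
    if not value.isdigit() or int(value) <= 0:
        raise ValueError(f"SERVE_MAX_MODEL_LEN must be a positive integer, got {value!r}")
    return ["--max-model-len", value]
```

Calling `max_model_len_args(os.environ)` would then give the arguments to append to the server command line.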


          [AMD] dsv4-fp4-mi355x-atom: narrow eval to single conc=1024 point, disable max-model-len

c3b3289

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4_fp4_mi355x_atom.sh: add cudagraph-capture-sizes and max-num-seqs

7ffa976

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4-fp4-mi355x-atom: bump to nightly image, expand search space, enable max-model-len

f2677b2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] set GPU_MAX_HW_QUEUES=5 in dsv4_fp4_mi355x_atom.sh

f5f0d66

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4-fp4-mi355x-atom: disable TBO, add TP4 rows for isl=8192, cap conc ranges

dc5b239

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          Merge branch 'main' into amd/dsv4_atom_0612

1dbf259


          [AMD] dsv4_fp4_mi355x_atom.sh: quote SERVER_LOG variable

9e18052

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4_fp4_mi355x_atom.sh: comment out dense cudagraph sizes

c1812ed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4_fp4_mi355x_atom.sh: fix --hf-overrides JSON escaping

28bdc6a

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4_fp4_mi355x_atom.sh: comment out dense cudagraph sizes

b36218e

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4-fp4-mi355x-atom: expand search space, restore isl=1024 rows

fa47caf

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          Merge branch 'main' into amd/dsv4_atom_0612

1022e0b


          [AMD] perf-changelog: update dsv4-fp4-mi355x-atom image and search-space description

af82c27

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] dsv4_fp4_mi355x_atom.sh: restore sparse cudagraph capture sizes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          [AMD] perf-changelog: revert dsv4-fp4-mi355x-atom image/search-space, remove stale entries

f56f877

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          Merge branch 'main' into amd/dsv4_atom_0612

f7c9de8


          [AMD] perf-changelog: add dsv4-fp4-mi355x-sglang entry for PR #1762

a4828cb

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          update dsv4-fp4-mi355x-atom: bump image, enable TBO conditionally, fix mem frac

19b8757

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          expand dsv4-fp4-mi355x-atom search space: restore ISL1024 scenarios, add TP4/TP8 conc lists for ISL8192

03aaa6b

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          Merge branch 'main' into amd/dsv4_atom_0612

cf3962f


          Update perf-changelog.yaml

421313c


          Update perf-changelog.yaml

ae77233


          Update perf-changelog.yaml

a8f6bd0


          Update perf-changelog.yaml

5fbd068


          update perf-changelog: move dsv4-fp4-mi355x-atom entry to end

d080faa

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

seungrokj and others added 3 commits

June 17, 2026 17:21


          narrow dsv4-fp4-mi355x-atom to DPA conc=256-2048 ISL8192, fix TBO branch override

91f6277

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          restore full dsv4-fp4-mi355x-atom search space: ISL1024 + ISL8192 TP4/TP8/DPA

4364ef9

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          chore: retrigger dsv4 atom benchmark sweep

52f9779

github-project-automation Bot added this to InferenceMAX Board

github-actions Bot commented Jun 18, 2026

Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get approval from their respective company's CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Oseltamivir marked this pull request as ready for review

June 18, 2026 01:06

Oseltamivir requested a review from a team

June 18, 2026 01:06

Oseltamivir added the full-sweep-enabled label

claude Bot commented Jun 18, 2026 (edited)

Contributor

Claude finished @Oseltamivir's task in 35s

Review Complete

Gather context and read PR diff
Validate perf-changelog.yaml changes
Post review summary

LGTM - no blocking issues found.

The new entry is correctly appended at the end of perf-changelog.yaml (lines 3939–3945), preserving chronological ordering. The diff is append-only and re-triggers the dsv4-fp4-mi355x-atom sweep as intended.

Oseltamivir removed the full-sweep-enabled label

github-actions Bot commented Jun 18, 2026

Contributor

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=27729964433
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=27729964433

Oseltamivir added the full-sweep-enabled label

github-actions Bot commented Jun 18, 2026

Contributor

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=27730066598
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=27730066598


          chore: retain PR 1717 sweep ancestry

dbc4d69

Oseltamivir commented Jun 18, 2026

Collaborator Author

/reuse-sweep-run 27676739575

Oseltamivir merged commit 2cd1d01 into main

26 checks passed

Oseltamivir deleted the amd/retrigger-dsv4-atom-sweep branch

June 18, 2026 01:30

github-project-automation Bot moved this to Done in InferenceMAX Board


Labels

full-sweep-enabled