
Commit 8d76685

seungrokj, claude, and functionstackx authored
[AMD/ROCm] qwen3.5-fp8-mi355x-atom, Bump image to rocm/atom:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom20260511 (#1411)
* Update qwen3.5-fp8-mi355x-atom ATOM image to nightly 20260511

  Bump ATOM image to rocm/atom:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom20260511. TP=4 shows +3.2% to +16.3% throughput improvement across 1k1k and 8k1k workloads (concurrency 4-256).

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Update perf-changelog.yaml

* Update perf-changelog.yaml

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
1 parent 292726b commit 8d76685

2 files changed

Lines changed: 9 additions & 1 deletion


.github/configs/amd-master.yaml

Lines changed: 1 addition & 1 deletion
@@ -305,7 +305,7 @@ qwen3.5-fp8-mi355x-sglang-agentic:
     - { tp: 8, ep: 1, offloading: none, conc-list: [1, 2, 4, 8, 16, 32] }
 
 qwen3.5-fp8-mi355x-atom:
-  image: rocm/atom:rocm7.2.2_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom0.1.2.post
+  image: rocm/atom:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom20260511
   model: Qwen/Qwen3.5-397B-A17B-FP8
   model-prefix: qwen3.5
   runner: mi355x
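The two image tags differ in more than one place, which is easy to miss in a long string. The sketch below splits each tag into named components and reports which ones changed. The tag layout in `TAG_PATTERN` is an assumption inferred only from the two tags in this diff, and `parse_atom_image` and `changed_fields` are hypothetical helpers, not part of the repository.

```python
import re

# Assumed tag layout, inferred only from the two tags in this diff:
# rocm<ver>_ubuntu<ver>_py<ver>_pytorch_release_<ver>_atom<build>
TAG_PATTERN = re.compile(
    r"rocm(?P<rocm>[\d.]+)_ubuntu(?P<ubuntu>[\d.]+)_py(?P<python>[\d.]+)"
    r"_pytorch_release_(?P<pytorch>[\d.]+)_atom(?P<atom>.+)"
)


def parse_atom_image(image: str) -> dict:
    """Split 'repo:tag' into named components (hypothetical helper)."""
    repo, _, tag = image.partition(":")
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"unrecognized tag layout: {tag!r}")
    return {"repo": repo, **match.groupdict()}


def changed_fields(old: str, new: str) -> dict:
    """Return {component: (old_value, new_value)} for components that differ."""
    a, b = parse_atom_image(old), parse_atom_image(new)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


# Image strings copied verbatim from the diff above.
OLD = "rocm/atom:rocm7.2.2_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom0.1.2.post"
NEW = "rocm/atom:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom20260511"

print(changed_fields(OLD, NEW))
```

Under this assumed layout, only the `rocm` and `atom` components differ between the old and new images; the Ubuntu, Python, and PyTorch components are unchanged.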

perf-changelog.yaml

Lines changed: 8 additions & 0 deletions
@@ -3028,3 +3028,11 @@
   description:
     - "Add MTP/EAGLE speculative-decoding sibling for dsr1-fp4-b200-sglang (model: nvidia/DeepSeek-R1-0528-FP4-V2) on lmsysorg/sglang:v0.5.12-cu130"
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1522
+
+- config-keys:
+    - qwen3.5-fp8-mi355x-atom
+  description:
+    - "Bump ATOM image to rocm/atom:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom20260511"
+    - "TP=4 shows +3.2% to +16.3% throughput improvement across 1k1k and 8k1k workloads (concurrency 4-256)"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1411
+
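The changelog entry reports throughput gains as a percentage range. As a minimal sketch of how such a figure is typically derived, the snippet below computes the relative change between a baseline and a candidate throughput. The commit does not say how its numbers were computed, so the formula is an assumption, `percent_change` is a hypothetical helper, and the values in `runs` are placeholders, not measurements from this PR.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate over baseline, in percent (hypothetical helper)."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return 100.0 * (candidate - baseline) / baseline


# Placeholder throughputs for illustration only; NOT measurements from this PR.
# Keys are (workload, concurrency); values are (old image, new image).
runs = {
    ("1k1k", 4): (100.0, 110.0),
    ("8k1k", 256): (200.0, 210.0),
}

for (workload, conc), (old, new) in runs.items():
    print(f"{workload} conc={conc}: {percent_change(old, new):+.1f}%")
```

A range such as the one quoted in the changelog would then be the minimum and maximum of these per-configuration values across the tested workloads and concurrency levels.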
