
Commit 7388788

[AMD][MI355X] update model for gpt-oss (#1670)
* Update amd-master.yaml
* Add perf-changelog entry placeholder

  Record the GPT-OSS MI355X vLLM model update (amd/gpt-oss-120b-w-mxfp4-a-fp8 -> openai/gpt-oss-120b).

* Update the changelog

---------

Co-authored-by: ukannika <uma.kannikanti@amd.com>
1 parent 53f61f8 commit 7388788

2 files changed

Lines changed: 8 additions & 1 deletion


.github/configs/amd-master.yaml

Lines changed: 1 addition & 1 deletion
@@ -1134,7 +1134,7 @@ gptoss-fp4-mi325x-vllm:
 
 gptoss-fp4-mi355x-vllm:
   image: vllm/vllm-openai-rocm:v0.22.0
-  model: amd/gpt-oss-120b-w-mxfp4-a-fp8
+  model: openai/gpt-oss-120b
   model-prefix: gptoss
   runner: mi355x
   precision: fp4

perf-changelog.yaml

Lines changed: 7 additions & 0 deletions
@@ -3487,3 +3487,10 @@
     - "Switch attention backend from FLASHINFER to FLASH_ATTN for the 8k/1k cell of MiniMax-M2.5 FP8 H200 vLLM."
     - "1k/1k cell not changed in this PR: at 1k/1k all three measured configs."
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1668
+
+- config-keys:
+    - gptoss-fp4-mi355x-vllm
+  description:
+    - "Update GPT-OSS model for MI355X vLLM from amd/gpt-oss-120b-w-mxfp4-a-fp8 to openai/gpt-oss-120b"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1670
+
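
The changelog entry refers to the benchmark config by its key, so the two files have to stay in agreement. A minimal sketch of such a consistency check follows, assuming both YAML files have already been parsed into plain Python objects. The helper name `unknown_config_keys` and the inline sample data are hypothetical and are not part of this commit or the repository.

```python
# Hypothetical helper for illustration; not part of the repository.
def unknown_config_keys(master_config: dict, changelog: list) -> list:
    """Return changelog config-keys that are missing from the master config."""
    missing = []
    for entry in changelog:
        for key in entry.get("config-keys", []):
            if key not in master_config and key not in missing:
                missing.append(key)
    return missing


# Sample data mirroring the two entries touched by this commit,
# written as already-parsed YAML (assumed structure).
master_config = {
    "gptoss-fp4-mi355x-vllm": {
        "image": "vllm/vllm-openai-rocm:v0.22.0",
        "model": "openai/gpt-oss-120b",
        "model-prefix": "gptoss",
        "runner": "mi355x",
        "precision": "fp4",
    }
}
changelog = [
    {
        "config-keys": ["gptoss-fp4-mi355x-vllm"],
        "description": [
            "Update GPT-OSS model for MI355X vLLM from "
            "amd/gpt-oss-120b-w-mxfp4-a-fp8 to openai/gpt-oss-120b"
        ],
        "pr-link": "https://github.com/SemiAnalysisAI/InferenceX/pull/1670",
    }
]

# An empty list means every changelog key matches a config entry.
print(unknown_config_keys(master_config, changelog))
```

In a real check the two dictionaries would come from loading `.github/configs/amd-master.yaml` and `perf-changelog.yaml` with a YAML parser; that step is left out here to keep the sketch dependency-free.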
