Commit ee8d743

seungrokj and claude committed
[AMD] agentx-v0.4: add MiniMax/Kimi lmcache agentic entries, update Qwen hicache config
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 38c365c commit ee8d743

1 file changed

Lines changed: 28 additions & 7 deletions

.github/configs/amd-master.yaml

@@ -872,6 +872,21 @@ minimaxm2.5-fp4-mi355x-atom:
     - { tp: 4, conc-start: 4, conc-end: 128 }
     - { tp: 8, conc-start: 4, conc-end: 16 }
 
+minimaxm2.5-fp4-mi355x-vllm-agentic-lmcache:
+  image: vllm/vllm-openai-rocm:v0.22.0
+  model: amd/MiniMax-M2.5-MXFP4
+  model-prefix: minimaxm2.5
+  runner: mi355x
+  precision: fp4
+  framework: vllm
+  multinode: false
+  scenarios:
+    agentic-coding:
+    - duration: 1800
+      search-space:
+      - { tp: 1, ep: 1, offloading: none, conc-list: [4, 8, 16, 32, 40, 48] }
+      - { tp: 1, ep: 1, offloading: lmcache, conc-list: [4, 8, 16, 32, 40, 48] }
+
 minimaxm2.5-fp4-mi355x-vllm:
   image: vllm/vllm-openai-rocm:v0.22.0
   model: amd/MiniMax-M2.5-MXFP4
@@ -2518,6 +2533,16 @@ kimik2.5-fp4-mi355x-vllm-agentic:
       - { tp: 4, offloading: none, conc-list: [16, 24, 32, 40] }
       - { tp: 4, offloading: cpu, conc-list: [16, 24, 32, 40] }
 
+kimik2.5-fp4-mi355x-vllm-agentic-lmcache:
+  image: vllm/vllm-openai-rocm:v0.22.0
+  model: amd/Kimi-K2.5-MXFP4
+  model-prefix: kimik2.5
+    agentic-coding:
+    - duration: 1800
+      search-space:
+      - { tp: 4, ep: 1, offloading: none, conc-list: [4, 8, 16, 32, 40, 48, 56, 64, 72] }
+      - { tp: 4, ep: 1, offloading: lmcache, conc-list: [4, 8, 16, 32, 40, 48, 56, 64, 72] }
+
 minimaxm2.5-fp8-mi355x-vllm-agentic:
   image: vllm/vllm-openai-rocm:v0.22.0
   model: MiniMaxAI/MiniMax-M2.5
@@ -2574,19 +2599,15 @@ minimaxm2.5-fp8-mi325x-vllm-agentic:
       - { tp: 4, offloading: cpu, conc-list: [16, 20, 24, 28, 32] }
 
 qwen3.5-fp8-mi355x-sglang-agentic-hicache:
-  image: lmsysorg/sglang-rocm:v0.5.12-rocm720-mi35x-20260521
+  image: lmsysorg/sglang-rocm:v0.5.12.post1-rocm720-mi35x-20260531
   model: Qwen/Qwen3.5-397B-A17B-FP8
   model-prefix: qwen3.5
   runner: mi355x
-  precision: fp8
-  framework: sglang
-  multinode: false
-  scenarios:
     agentic-coding:
     - duration: 1800
       search-space:
-      - { tp: 8, ep: 1, offloading: none, conc-list: [1, 2, 4, 8, 16, 32] }
-      - { tp: 8, ep: 1, offloading: hicache, conc-list: [16, 32, 48, 64] }
+      - { tp: 4, ep: 1, offloading: none, conc-list: [4, 8, 16, 32, 40, 48, 56, 64, 128] }
+      - { tp: 4, ep: 1, offloading: hicache, conc-list: [4, 8, 16, 32, 40, 48, 56, 64, 128] }
 
 dsv4-fp4-mi355x-vllm-agentic:
   image: vllm/vllm-openai-rocm:v0.22.0
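
Note on the search-space rows: each changed entry pairs an `offloading: none` baseline with an offloading row (`lmcache` or `hicache`) over the same `conc-list`, which suggests the sweep is meant for like-for-like comparison at each concurrency. The Python sketch below only illustrates how many benchmark points that implies. It assumes one run per (row, concurrency) pair; the `expand` helper and that expansion rule are hypothetical and are not taken from the benchmark harness, which this commit does not show.

```python
# Search-space rows copied from the new side of the diff.
SEARCH_SPACES = {
    "minimaxm2.5-fp4-mi355x-vllm-agentic-lmcache": [
        {"tp": 1, "ep": 1, "offloading": "none", "conc-list": [4, 8, 16, 32, 40, 48]},
        {"tp": 1, "ep": 1, "offloading": "lmcache", "conc-list": [4, 8, 16, 32, 40, 48]},
    ],
    "kimik2.5-fp4-mi355x-vllm-agentic-lmcache": [
        {"tp": 4, "ep": 1, "offloading": "none", "conc-list": [4, 8, 16, 32, 40, 48, 56, 64, 72]},
        {"tp": 4, "ep": 1, "offloading": "lmcache", "conc-list": [4, 8, 16, 32, 40, 48, 56, 64, 72]},
    ],
    "qwen3.5-fp8-mi355x-sglang-agentic-hicache": [
        {"tp": 4, "ep": 1, "offloading": "none", "conc-list": [4, 8, 16, 32, 40, 48, 56, 64, 128]},
        {"tp": 4, "ep": 1, "offloading": "hicache", "conc-list": [4, 8, 16, 32, 40, 48, 56, 64, 128]},
    ],
}

# Rows removed from the Qwen entry by this commit, kept for comparison.
QWEN_BEFORE = [
    {"tp": 8, "ep": 1, "offloading": "none", "conc-list": [1, 2, 4, 8, 16, 32]},
    {"tp": 8, "ep": 1, "offloading": "hicache", "conc-list": [16, 32, 48, 64]},
]


def expand(rows):
    """Hypothetical expansion: one run per (row, concurrency) pair."""
    return [
        {"tp": r["tp"], "ep": r["ep"], "offloading": r["offloading"], "conc": c}
        for r in rows
        for c in r["conc-list"]
    ]


def paired_concurrencies(rows):
    """Concurrency levels that appear in every row, i.e. comparable points."""
    common = set(rows[0]["conc-list"])
    for r in rows[1:]:
        common &= set(r["conc-list"])
    return sorted(common)


if __name__ == "__main__":
    for name, rows in SEARCH_SPACES.items():
        print(name, len(expand(rows)), paired_concurrencies(rows))
    print("qwen before:", len(expand(QWEN_BEFORE)), paired_concurrencies(QWEN_BEFORE))
```

Under that assumption, the old Qwen rows overlapped only at concurrencies 16 and 32, while the new rows share all nine levels, so every baseline point has a matching hicache point.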
