Commit c772387

arygupt and claude committed
chore(sweep): re-run MiniMax-M2.5 vLLM sweeps to capture power telemetry
Re-runs the MiniMax-M2.5 single-node vLLM configs (H100/H200 FP8, B200/B300/MI355X FP4) with no recipe change, so the new rows carry the per-GPU power telemetry (avg_power_w) added in #1558. The power/energy canvas currently models power because its source rows predate the 2026-05-27 capture merge; this re-run lets it use measured power.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent ea4f575 commit c772387

1 file changed

Lines changed: 11 additions & 0 deletions

perf-changelog.yaml

@@ -3474,3 +3474,14 @@
     - "Use scheduler-recv-interval values 2/60/30/1200/600/1920 for conc 1-4/8/16/32/64/128-256"
     - "Set max-running-requests=256, chunked-prefill-size=16384, mem-fraction-static=0.8, cuda-graph-max-bs=CONC, and enable symm-mem"
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1544
+
+- config-keys:
+    - minimaxm2.5-fp8-h100-vllm
+    - minimaxm2.5-fp8-h200-vllm
+    - minimaxm2.5-fp4-b200-vllm
+    - minimaxm2.5-fp4-b300-vllm
+    - minimaxm2.5-fp4-mi355x-vllm
+  description:
+    - "Re-run MiniMax-M2.5 single-node vLLM sweeps (H100/H200 FP8, B200/B300/MI355X FP4) with no recipe change, to capture per-GPU power telemetry (avg_power_w) added in #1558 for the power/energy canvas"
+    - "Source rows for the canvas predate the 2026-05-27 power-capture merge, so they carry throughput/latency but no measured power; this re-run replaces the modeled power layer with measured power"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX
