chore(sweep): re-run MiniMax-M2.5 vLLM sweeps for motniroing #1666

Open
arygupt wants to merge 3 commits into main from chore/recapture-minimax-power-canvas

Collaborator

@arygupt arygupt commented Jun 4, 2026

Why

The power/energy canvas currently models per-GPU power because its source rows predate the power-capture merge (#1558, merged 2026-05-27). Those MiniMax-M2.5 runs carry throughput / interactivity / latency but no measured power (avg_power_w).

This PR re-runs the exact same configs (no recipe change) on current main, so the new rows land with measured power telemetry. The canvas can then swap its modeled power layer for measured.

What

Adds one perf-changelog.yaml entry arming a full sweep of the five canvas configs:

| config-key | HW | precision |
|---|---|---|
| minimaxm2.5-fp8-h100-vllm | H100 | FP8 |
| minimaxm2.5-fp8-h200-vllm | H200 | FP8 |
| minimaxm2.5-fp4-b200-vllm | B200 | FP4 |
| minimaxm2.5-fp4-b300-vllm | B300 | FP4 |
| minimaxm2.5-fp4-mi355x-vllm | MI355X | FP4 |

No recipe/code changes — changelog-only. Locally validated with utils/process_changelog.py: generates 107 single-node runs (b200:26, b300:23, mi355x:28, h100:12, h200:18) across 1k1k + 8k1k seq-len groups.
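
As a quick sanity check on the per-hardware breakdown above, the counts can be summed to confirm the stated total. This is only arithmetic on the numbers quoted in this description; it does not invoke utils/process_changelog.py, and the variable names are illustrative.

```python
# Per-hardware run counts as quoted in the PR description.
runs_per_hw = {"b200": 26, "b300": 23, "mi355x": 28, "h100": 12, "h200": 18}

# 26 + 23 + 28 + 12 + 18
total = sum(runs_per_hw.values())
print(total)  # 107
```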

Downstream

Once this sweep completes, the rows publish via the weekly DB dump (unblocked by InferenceX-app#418, which fixes the 2 GiB asset cap), and the canvas re-points to the new dump to use measured power.

🤖 Generated with Claude Code


Note

Low Risk
Changelog-only sweep trigger plus small validation/processing flags; no inference recipes or runtime benchmark logic changed beyond skipping eval jobs when flagged.

Overview
Adds a benchmarks-only changelog path so power re-runs can schedule throughput sweeps without lm-eval jobs, and arms a MiniMax-M2.5 re-run across five single-node vLLM configs to backfill measured avg_power_w for the power/energy canvas.

Changelog plumbing: ChangelogEntry gains benchmarks-only (YAML alias), default false, mutually exclusive with existing evals-only. process_changelog.py skips the eval-generation pass when that flag is set; benchmarks still run with --no-evals as today.

Sweep entry: New perf-changelog.yaml block targets minimaxm2.5-fp8-h100-vllm, minimaxm2.5-fp8-h200-vllm, minimaxm2.5-fp4-b200-vllm, minimaxm2.5-fp4-b300-vllm, and minimaxm2.5-fp4-mi355x-vllm with no recipe changes—only re-execution so rows pick up power telemetry from #1558.

Tests: TestChangelogEntry covers defaults, alias mapping, mutual exclusion, and extra=forbid on typos.
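
The plumbing and test coverage summarized above could look roughly like the following. This is a stdlib-only sketch and not the PR's validation.py, which is not shown in this thread and may well be built on a validation library. The field set, the `from_yaml_dict` helper, and the `_ALIASES` table are assumptions made for illustration; only the `benchmarks-only` / `evals-only` keys, the default of false, the mutual exclusion, and the rejection of unknown keys come from the summary.

```python
from dataclasses import dataclass

# Hypothetical mapping from YAML keys (kebab-case) to Python field names.
_ALIASES = {
    "config-keys": "config_keys",
    "description": "description",
    "pr-link": "pr_link",
    "evals-only": "evals_only",
    "benchmarks-only": "benchmarks_only",
}


@dataclass(frozen=True)
class ChangelogEntry:
    """Illustrative stand-in for a changelog entry; the field set is assumed."""

    config_keys: list
    description: list
    pr_link: str
    evals_only: bool = False
    benchmarks_only: bool = False

    def __post_init__(self):
        # The two flags select opposite halves of the pipeline, so both
        # being set is treated as a configuration error.
        if self.evals_only and self.benchmarks_only:
            raise ValueError("evals-only and benchmarks-only are mutually exclusive")

    @classmethod
    def from_yaml_dict(cls, raw: dict) -> "ChangelogEntry":
        # Reject unknown keys so a typo such as "benchmark-only" fails loudly
        # instead of being silently ignored.
        unknown = set(raw) - set(_ALIASES)
        if unknown:
            raise ValueError(f"unknown changelog keys: {sorted(unknown)}")
        return cls(**{_ALIASES[key]: value for key, value in raw.items()})


entry = ChangelogEntry.from_yaml_dict({
    "config-keys": ["minimaxm2.5-fp8-h100-vllm"],
    "description": ["power re-run"],
    "pr-link": "https://github.com/SemiAnalysisAI/InferenceX/pull/1666",
    "benchmarks-only": True,
})
```

Rejecting unknown keys is what makes the flag safe to rely on: a misspelled `benchmarks-only` would otherwise fall back to the default and quietly schedule the eval runs.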

Reviewed by Cursor Bugbot for commit 5fa9848. Bugbot is set up for automated code reviews on this repo.

Re-runs the MiniMax-M2.5 single-node vLLM configs (H100/H200 FP8,
B200/B300/MI355X FP4) with no recipe change, so the new rows carry the
per-GPU power telemetry (avg_power_w) added in #1558. The power/energy
canvas currently models power because its source rows predate the
2026-05-27 capture merge; this re-run lets it use measured power.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@arygupt arygupt requested a review from a team June 4, 2026 21:11
Contributor

github-actions Bot commented Jun 4, 2026

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Comment thread perf-changelog.yaml
description:
- "Re-run MiniMax-M2.5 single-node vLLM sweeps (H100/H200 FP8, B200/B300/MI355X FP4) with no recipe change, to capture per-GPU power telemetry (avg_power_w) added in #1558 for the power/energy canvas"
- "Source rows for the canvas predate the 2026-05-27 power-capture merge, so they carry throughput/latency but no measured power; this re-run replaces the modeled power layer with measured power"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1666
Contributor

🔴 The new changelog entry's pr-link is set to https://github.com/SemiAnalysisAI/InferenceX/pull/XXX — a literal XXX placeholder rather than the actual PR number. The PR description references pull/1666 and every other entry in this file resolves to a real PR number; please replace XXX with 1666 before merge so the canvas re-run rows remain traceable.

Extended reasoning...

What the bug is

perf-changelog.yaml:3487 (the pr-link: line of the entry added by this PR) literally reads:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

The XXX is a leftover template placeholder that was never substituted with the real PR number (1666).

How it manifests

Every other entry in perf-changelog.yaml resolves to a numeric PR — e.g. /pull/1544 at line 3476, and /pull/1648, /pull/1663, /pull/1647 in nearby blocks. This entry is the only one whose link does not resolve. As shipped, anyone clicking the link from a canvas row that originated in this sweep would get a 404, and any tooling that joins changelog rows back to their originating PR (for traceability or audit) will see an unparseable PR id.

Code path that triggers it

This is a pure data/config bug — the row is appended verbatim to perf-changelog.yaml, which is the authoritative changelog for sweep triggers. The placeholder is in the field that downstream tooling (and humans) use to map a sweep back to the PR that armed it. Because the sweep itself is armed by config-keys/description, the bad pr-link will not block execution, so it will silently land on main.

Why existing code doesn't prevent it

There is no schema validator on pr-link requiring a numeric PR id, and utils/process_changelog.py (mentioned in the PR description as the local validator) keys on config-keys, not the link. The author validated processing but not link well-formedness, so the placeholder slipped through.
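
A check of the kind described as missing could be as small as the sketch below. It is an illustration, not code from utils/process_changelog.py or validation.py: the function name `parse_pr_number` is hypothetical, and the URL pattern is assumed from the links quoted in this thread.

```python
import re

# Assumed URL shape, based on the pr-link values quoted in this review.
_PR_LINK_RE = re.compile(r"https://github\.com/SemiAnalysisAI/InferenceX/pull/(\d+)")


def parse_pr_number(pr_link: str) -> int:
    """Return the numeric PR id, or raise ValueError for a placeholder such as /pull/XXX."""
    match = _PR_LINK_RE.fullmatch(pr_link.strip())
    if match is None:
        raise ValueError(f"pr-link is not a numeric PR URL: {pr_link!r}")
    return int(match.group(1))
```

Using `fullmatch` rather than `search` means trailing junk after the number is rejected as well, which keeps the link usable as a join key for tooling that maps rows back to PRs.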

Impact

Traceability is broken for the five canvas re-run rows generated by this entry. A follow-up cleanup PR will be required to replace XXX with 1666 (or any future correct number) — exactly the kind of trivial follow-up that wastes a review cycle when it could be caught here.

Fix

Change line 3487 from:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

to:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1666

Step-by-step proof

  1. Read perf-changelog.yaml at lines 3474–3487 on the PR's HEAD (commit c772387).
  2. Line 3476 (prior entry) ends in /pull/1544 — a valid PR id.
  3. Lines 3478–3487 are the new entry added by this PR.
  4. Line 3487 ends in /pull/XXX — a literal three-character placeholder, not a number.
  5. This PR is itself #1666, confirming the intended value is 1666.
  6. Conclusion: the placeholder was never substituted before commit, and will be merged as-is unless fixed.

…e-run

Adds `benchmarks-only: true` to a changelog entry to skip the eval pass
(symmetric with the existing `evals-only`; the two are mutually exclusive).
Power telemetry comes from the benchmark runs, so the MiniMax power re-run
doesn't need evals — sets the flag, dropping 14 unnecessary eval runs.

- validation.py: new `benchmarks_only` field + mutual-exclusion validator
- process_changelog.py: skip eval generation when benchmarks_only is set
- test_validation.py: ChangelogEntry coverage (aliases, exclusivity, forbid)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
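
The skip described in this commit message can be pictured with the sketch below. It is not the actual process_changelog.py logic, which is not shown in this thread: `plan_jobs` and the config names are hypothetical, and the assumption that `evals-only` skips the benchmark pass is inferred only from the commit message calling the two flags symmetric.

```python
def plan_jobs(entry: dict, benchmark_configs: list, eval_configs: list) -> list:
    """Return (kind, config) pairs to schedule for one changelog entry (illustrative)."""
    benchmarks_only = entry.get("benchmarks-only", False)
    evals_only = entry.get("evals-only", False)
    if benchmarks_only and evals_only:
        raise ValueError("evals-only and benchmarks-only are mutually exclusive")

    jobs = []
    if not evals_only:
        # Benchmark runs are where the power telemetry is captured.
        jobs += [("benchmark", cfg) for cfg in benchmark_configs]
    if not benchmarks_only:
        # The eval-generation pass is skipped entirely when benchmarks-only is set.
        jobs += [("eval", cfg) for cfg in eval_configs]
    return jobs


# Hypothetical config names, used only to exercise the function.
bench = ["h100-bench-a", "h100-bench-b"]
evals = ["h100-eval-a"]
power_rerun = plan_jobs({"benchmarks-only": True}, bench, evals)
default_run = plan_jobs({}, bench, evals)
```

With the flag set, `power_rerun` contains only benchmark jobs, while `default_run` contains both kinds.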
@functionstackx changed the title from "chore(sweep): re-run MiniMax-M2.5 vLLM sweeps to capture power telemetry" to "chore(sweep): re-run MiniMax-M2.5 vLLM sweeps for motniroing" on Jun 4, 2026