[NV] Update B300 DSV4 SGLang Pareto sweep #1575
base: main
Changes from 3 commits
a947d18
c183bc8
deee4cc
999175d
fbeb15a
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -3171,3 +3171,11 @@ |
| 3171 | 3171 | description: |
| 3172 | 3172 | - "Validates measured-power aggregation pipeline (PR #1558) on both NVIDIA (H200) and AMD (MI355X) hardware — different SMI tools (nvidia-smi vs amd-smi), different CSV schemas (power.draw [W] vs socket_power), same aggregator. No config change. Entry intentionally kept past merge so run-sweep produces canonical agg JSONs with avg_power_w + joules_per_output_token on main for both vendors, seeding the dashboard's day-zero data." |
| 3173 | 3173 | pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1558 |
| | 3174 | + |
| | 3175 | + - config-keys: |
| | 3176 | + - dsv4-fp4-b300-sglang |
| | 3177 | + description: |
| | 3178 | + - "Update DeepSeek-V4-Pro FP4 B300 SGLang non-MTP sweep to the 2026-05-19 8k/1k submission frontier: TP8 no-DP-attention c1-c64 and DEP8 DP-attention c512/c768/c1024/c1536/c2048" |
| | 3179 | + - "Use lmsysorg/sglang:nightly-dev-cu13-20260522-7cf193fe to pick up the merged SGLang warmup path" |
| | 3180 | + - "Map dp-attn=false to TP8 flashinfer_mxfp4 with chunked-prefill 8192; map dp-attn=true to DEP8 mixed-chunk MegaMoE throughput settings" |
| | 3181 | + pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1552 |
Check failure on line 3181 in perf-changelog.yaml
Contributor
🔴 The new
Extended reasoning...
What the bug is
The new entry appended at
    pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1552
…but the PR being merged is #1575 (the rebased successor; the PR description itself notes "Rebased copy of #1552 with
Why this is a real issue, not just cosmetic
Every other recent entry in
Step-by-step proof
Fix
Update line 3181 to:
    pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1575
or, equivalently, use:
    pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX
The historical reference to #1552 (which the PR description already provides) can stay in the PR description; the changelog entry should point at the PR that actually lands the change, both for convention and for tooling correctness.
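For illustration only, the convention described in this comment could be checked mechanically along these lines. This is a minimal sketch, not the repository's actual check: `pr_link_ok` and `PR_LINK_RE` are hypothetical names, and treating the `XXX` placeholder as acceptable is an assumption taken from the suggested fix above.

```python
import re

# Hypothetical helper, NOT part of the InferenceX tooling.
# Assumption: a changelog pr-link is valid if it names the landing PR
# or uses the literal "XXX" placeholder mentioned in the suggested fix.
PR_LINK_RE = re.compile(
    r"^https://github\.com/SemiAnalysisAI/InferenceX/pull/(\d+|XXX)$"
)


def pr_link_ok(pr_link: str, landing_pr: int) -> bool:
    """Return True if pr_link points at landing_pr or is the XXX placeholder."""
    match = PR_LINK_RE.match(pr_link.strip())
    if match is None:
        return False  # not a pull-request URL for this repository
    ref = match.group(1)
    return ref == "XXX" or int(ref) == landing_pr


base = "https://github.com/SemiAnalysisAI/InferenceX/pull/"
print(pr_link_ok(base + "1552", 1575))  # False: points at the superseded PR
print(pr_link_ok(base + "1575", 1575))  # True: points at the landing PR
print(pr_link_ok(base + "XXX", 1575))   # True: placeholder form
```

With the entry as submitted (`pull/1552` while landing as #1575), a check of this shape would fail, which matches the mismatch reported on line 3181.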