Skip to content

Commit 8d30341

Browse files
committed
chore(perf-changelog): trigger multinode sweep for measured-power aggregation
Appends entry for dsv4-fp4-gb300-dynamo-sglang so run-sweep.yml fires when the sweep-enabled label is added to PR #1574. The sweep produces the first multinode agg JSONs with avg_power_w + joules_per_*_token, validating the per-source GPU-id namespacing and GPU_METRICS_CSV_GLOB env-var bridge end-to-end on real GB300 hardware (gb300-cw cluster).
1 parent 1b7cd09 commit 8d30341

1 file changed

Lines changed: 6 additions & 0 deletions

File tree

perf-changelog.yaml

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3171,3 +3171,9 @@
31713171
description:
31723172
- "Validates measured-power aggregation pipeline (PR #1558) on both NVIDIA (H200) and AMD (MI355X) hardware — different SMI tools (nvidia-smi vs amd-smi), different CSV schemas (power.draw [W] vs socket_power), same aggregator. No config change. Entry intentionally kept past merge so run-sweep produces canonical agg JSONs with avg_power_w + joules_per_output_token on main for both vendors, seeding the dashboard's day-zero data."
31733173
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1558
3174+
3175+
- config-keys:
3176+
- dsv4-fp4-gb300-dynamo-sglang
3177+
description:
3178+
- "Smoke run validating multinode measured-power aggregation (PR #1574). No config change; entry exists to trigger a sweep that produces the first multinode agg JSON with avg_power_w + joules_per_*_token populated from per-node srt-slurm perfmon CSVs. Validates per-source GPU-id namespacing in aggregate_power.py (without it, 14 nodes × 4 GPUs would report num_gpus=4 instead of 56) and the GPU_METRICS_CSV_GLOB env var bridge in process_result.py. Only the gb300-cw runner has the perfmon launcher changes; any gb300-nv runs in the sweep will succeed normally without power fields, which the dashboard handles gracefully (chart gates on field presence)."
3179+
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1574

0 commit comments

Comments
 (0)