.github/workflows/claude.yml
Lines changed: 2 additions & 23 deletions
@@ -221,27 +221,7 @@ jobs:

 ## Updating perf-changelog.yaml

-When making changes to benchmark scripts or master config files that affect image tags, environment variables, or configuration parameters, you MUST add an entry to `perf-changelog.yaml`.
-
-**When to update perf-changelog.yaml:**
-- Updating image tags in `.github/configs/*-master.yaml` or `benchmarks/*.sh` scripts
-- Adding or modifying environment variables in benchmark configurations
-- Changing configuration parameters that affect performance
-
-**Entry format:**
-```yaml
-- config-keys:
-- dsr1-fp8-*-vllm # Use wildcards to match multiple configs
-- Use wildcards (`*`) in config-keys to match multiple related configurations
-- Each description item should be a concise change summary
-- The pr-link should reference the PR number (use XXX as placeholder until PR is created)
+See `AGENTS.md` → "Updating Docker Images" for entry format and rules. Required whenever you change image tags, env vars, or perf-affecting params in `.github/configs/*-master.yaml` or `benchmarks/*.sh`. Use `XXX` as the PR-link placeholder until the PR exists.

 ## Spawning Additional Workers:
 You CAN spawn additional Claude workers by commenting "@claude" with a specific task.
@@ -272,8 +252,7 @@ jobs:

 ### Additional Knowledge
 - MI355 is gfx950 not gfx1201
-- **STP (Single Token Prediction)**: Standard autoregressive decoding — one token per forward pass. No speculative decoding or MTP. Benchmarks labeled "STP only" use vanilla decoding.
-- **MTP (Multi-Token Prediction)**: Predicts multiple tokens per forward pass using speculative decoding (e.g., EAGLE, NEXTN).
+- STP/MTP terminology: see `AGENTS.md` → "Terminology"

 ### Expert Parallelism in Benchmark Scripts
 vLLM and SGLang handle expert parallelism differently. When writing or reviewing benchmark scripts for MoE models:
.github/workflows/docker-tag-monitor.yml
Lines changed: 17 additions & 20 deletions
@@ -1,6 +1,14 @@
 name: Docker Tag Monitor

-# trigger: re-run after token/allowed_bots fix
+# Downstream merge note (human-only — intentionally NOT in the @claude prompt):
+# Once the per-config-key PRs this workflow asks Claude to open have a
+# green run-sweep.yml run and the `full-sweep-enabled` label, merge them
+# with `utils/merge_with_reuse.sh <pr-number>` instead of the GitHub UI.
+# That script posts /reuse-sweep-run, auto-resolves perf-changelog.yaml
+# conflicts, cancels the merge-triggered sweep, and squash-merges with
+# --admin so the post-merge run-sweep run reuses the PR's prior sweep.
+# Claude doesn't have admin merge rights and shouldn't be told about this
+# path — it's a maintainer-only finalization step.

 on:
   schedule:
@@ -207,29 +215,29 @@ jobs:
 echo "2. Add entries to \`perf-changelog.yaml\` documenting the version changes"
 echo "3. For each eligible config-key, push a branch and actually open a PR — do not stop at the \"Create a pull request for ...\" remote hint that \`git push\` prints. Run \`gh pr create\` (or the equivalent MCP tool) and verify the returned PR URL. Link every PR back to this issue in a comment."
 echo ""
-echo "**PR title / commit message formatting:** Multi-line titles and bodies MUST use a heredoc, not \`\\n\` escapes and not \`\$'...'\` ANSI-C quoting. Last run produced commits literally starting with \`\$\` and containing \`\\n\\n\` as text because of mis-quoted ANSI-C strings. Use this pattern instead:"
+echo "**Required PR label:** Every PR you open from this issue MUST carry the \`full-sweep-enabled\` label. Apply it at creation time via \`gh pr create --label full-sweep-enabled\` (or add it immediately after with \`gh pr edit <num> --add-label full-sweep-enabled\`). Do not skip this — downstream automation keys off the label."
+echo ""
+echo "**PR title / commit message formatting:** Multi-line titles and bodies MUST use a heredoc, not \`\\n\` escapes and not \`\$'...'\` ANSI-C quoting. A prior run produced commits literally starting with \`\$\` and containing \`\\n\\n\` as text because of mis-quoted ANSI-C strings. Use this pattern instead:"
 echo ""
 echo "\`\`\`bash"
 echo "git commit -m \"\$(cat <<'EOF'"
 echo "Update qwen3.5-bf16-b300-sglang-mtp SGLang image to v0.5.11-cu130"
-echo "Updates the SGLang image tag for \\\`qwen3.5-bf16-b300-sglang-mtp\\\` to v0.5.11-cu130."
+echo "Updates the SGLang image tag for \`qwen3.5-bf16-b300-sglang-mtp\` to v0.5.11-cu130."
 echo ""
-echo "Ref #${ISSUE_NUMBER}"
+echo "Ref #<this issue's number>"
 echo "EOF"
 echo ")\""
 echo "\`\`\`"
 echo ""
-echo "PR titles must be a single line (no newlines). Bodies are multi-line and must contain real \\\\n, not the literal characters \`\\\\n\`. Never put \`\$\` in front of a quoted message string."
-echo ""
-echo "**Required PR label:** Every PR you open from this issue MUST carry the \`full-sweep-enabled\` label. Apply it at creation time via \`gh pr create --label full-sweep-enabled\` (or add it immediately after with \`gh pr edit <num> --add-label full-sweep-enabled\`). Do not skip this — downstream automation keys off the label."
+echo "PR titles must be a single line (no newlines). Bodies should contain real newlines (use a heredoc), not the literal characters \`\\n\`. Never put \`\$\` in front of a quoted message string."
 echo ""
 echo "**Runner gating:** Only open PRs for config-keys whose runner SKU is in the allowed list (\`$ALLOWED_SKUS\`). The runner SKU is the hardware segment in the config-key (e.g. \`dsr1-fp4-b200-sglang\` → \`b200\`). For any config-key whose SKU is not in the allowed list, skip it and list the skipped keys plus the reason (not clear / all-offline / no idle capacity) in a single comment on this issue."
 echo ""
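
Reviewer note: the runner-gating rule in the hunk above could be checked mechanically along the following lines. This is only a sketch under stated assumptions: `sku_of` and `sku_allowed` are hypothetical helpers that do not exist in the workflow, the SKU is assumed to be the third hyphen-separated field of the config-key (which holds for the keys quoted in the prompt), and the allowed list is assumed to be a comma- or space-separated string, since the diff does not show the format of `$ALLOWED_SKUS`.

```bash
# Hypothetical helpers, not part of the workflow.

# Assumption: the runner SKU is the third hyphen-separated field,
# e.g. dsr1-fp4-b200-sglang -> b200.
sku_of() {
  printf '%s\n' "$1" | cut -d- -f3
}

# $1 = SKU, $2 = allowed list (assumed comma- or space-separated).
# Returns 0 when the SKU appears in the list, 1 otherwise.
sku_allowed() {
  for item in $(printf '%s' "$2" | tr ',' ' '); do
    if [ "$item" = "$1" ]; then
      return 0
    fi
  done
  return 1
}
```

A caller would run `sku_allowed "$(sku_of "$key")" "$ALLOWED_SKUS"` per config-key and collect the keys that fail into the single skip comment the prompt asks for.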
@@ -243,18 +251,7 @@ jobs:
 echo ""
 echo "**Exception — MTP pairs:** When a config-key and its \`-mtp\` sibling exist for the same model/precision/runner/framework (e.g. \`qwen3.5-fp4-b300-sglang\` and \`qwen3.5-fp4-b300-sglang-mtp\`), bundle both into one PR. Treat the pair as a single unit for the per-SKU cap (counts as 1, not 2) and the sequential e2e queue. If only one side of the pair is present in the updates, open a PR for just that one."
 echo ""
-echo "For each eligible config-key, check if there are multiple CUDA/ROCm versions available and choose appropriately based on current usage patterns in the configs."
-echo ""
-echo "**Slack ping when done:** After all PRs have been opened, all e2e runs have reached a terminal state, and the wrap-up comment with any deferred config-keys has been posted, send a single summary message to Slack channel \`C09PULGMVNG\` via the Bash tool. The \`SLACK_BOT_TOKEN\` env var is available. Use:"
-echo ""
-echo "\`\`\`bash"
-echo "curl -sS -X POST https://slack.com/api/chat.postMessage \\"
-echo "Message text should include: this issue link, count of PRs opened (per SKU), count of e2e runs passed/failed, list of skipped or deferred config-keys, and link to this workflow run. Send exactly one Slack message — do not post per-SKU or per-PR."
+echo "If Docker Hub lists multiple variants for the same base version (e.g. \`cu128\` vs \`cu130\`, \`rocm70\` vs \`rocm72\`), pick the variant whose suffix matches what the config-key's current image entry already uses — don't switch CUDA/ROCm minor versions in this update."
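
Reviewer note: the new variant-selection rule is easy to express as a small check. The sketch below is illustrative only. `pick_variant` is a hypothetical helper, and it assumes tags have the shape `<version>-<variant>` (as in `v0.5.11-cu130` from the prompt), so the variant is whatever follows the last hyphen. Tags without a hyphen would need separate handling.

```bash
# Hypothetical helper, not part of the workflow.
# $1      = tag currently used by the config-key's image entry
# $2...   = candidate tags for the new base version
# Prints the first candidate whose variant suffix matches the current one.
pick_variant() {
  current="$1"
  shift
  suffix="${current##*-}"          # text after the last '-', e.g. cu130
  for tag in "$@"; do
    if [ "${tag##*-}" = "$suffix" ]; then
      printf '%s\n' "$tag"
      return 0
    fi
  done
  return 1                         # no candidate keeps the current variant
}
```

For example, `pick_variant v0.5.10-cu130 v0.5.11-cu128 v0.5.11-cu130` selects the `cu130` candidate (the `v0.5.10` tag here is made up for illustration). A non-zero exit means no candidate matches, in which case the config-key is better left for a human than switched to a different CUDA/ROCm variant.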