[mirror] ci: warn (don't fail) on Bencher main regression#5
Open
yashwant86 wants to merge 4 commits intomm-base-3168from
Open
[mirror] ci: warn (don't fail) on Bencher main regression#5yashwant86 wants to merge 4 commits intomm-base-3168from
yashwant86 wants to merge 4 commits intomm-base-3168from
Conversation
Single-run alerts on GitHub-hosted runners are dominated by environmental noise: the set of (benchmark, measure) pairs flagged varies almost entirely across consecutive pushes on main (Jaccard ~0 between adjacent runs), while every non-docs commit fails the gate. Tracking issue shakacode#3116 has accumulated auto-regression comments with no actionable signal. Retain the Bencher dashboard upload, the auto-opened tracking issue, and the job-summary report (steps 7a/7b). Downgrade step 7c from `exit 1` to a `::warning::` annotation so main stops being permanently red. Re-enable a hard gate later when thresholds/sample-size tolerate the runner noise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Distinguish regression alerts from operational failures so the warn-only policy on main only applies to genuine performance signals: - Set BENCHER_HAS_ALERT=1 when bencher stderr mentions alert/threshold/ boundary indicators; leave it 0 for auth/API/network/CLI failures. - Step 7b (auto-open issue) and step 7c (::warning::) now require BENCHER_HAS_ALERT == 1 so operational errors do not create bogus regression issues. - New step 7d fails the workflow on non-regression Bencher errors on main, preserving the original safety net for a broken benchmark pipeline (missing uploads, bad credentials, Bencher outage). - Step 7c warning no longer hardcodes a specific issue number (shakacode#3116 will churn as issues are closed/reopened) and now includes the workflow run URL for a direct clickable link. Addresses review feedback on shakacode#3168. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Narrows BENCHER_HAS_ALERT from case-insensitive "alert" to word-bounded case-sensitive "Alert" / "Alerts" so that lowercase "alert" appearing in bencher URL paths (e.g., "/v0/.../alerts") or CLI text cannot misclassify an operational failure as a regression and let main go green. "threshold violation" and "boundary violation" remain case-insensitive since those specific phrases are unlikely to appear outside real alert output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses follow-up review feedback on the Bencher main-regression gate:
- Pin bencherdev/bencher to v0.6.2 and document the expected stderr
phrases ("Alert[s]", "threshold violation", "boundary violation") so a
silent CLI upgrade can't flip the alert classification.
- Switch the BENCHER_HAS_ALERT grep from case-sensitive `\bAlerts?\b` to
case-insensitive `\bAlerts?\b` to stay consistent with the 404 guard
and avoid misclassifying a lowercase `alert` variant as operational.
Word boundaries still prevent matches against URL paths like
`/v0/.../alerts`, so there's no new false-positive risk.
- Propagate the original Bencher exit code from step 7d (`exit
"${BENCHER_EXIT_CODE:-1}"` instead of bare `exit 1`) to preserve
diagnostic context in CI logs.
- Drop the redundant `rm -f "$BENCHER_STDERR"` now that the
`trap 'rm -f' EXIT` registered above handles cleanup.
|
No Reviewable Files All files in this PR were filtered out (possibly due to size limits or invalid format). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Mirror of upstream shakacode#3168 for benchmark. Do not merge.