Filter stale Bencher alerts before reporting #3822

Merged
justin808 merged 1 commit into main from codex/b-3795-benchmark-regression-filter on Jun 9, 2026

Conversation

@justin808

justin808 commented Jun 9, 2026

Member

Summary

  • Filter active Bencher alerts against the current report boundaries before treating them as regressions.
  • Keep malformed or unmatchable active alerts fail-safe so report-shape drift still fails visibly.
  • Normalize stale-only Bencher alert exits after the start-point retry and operational checks so cleared alerts do not file regression issues.
  • Add parser and benchmark tracking specs for stale alerts, measure-less alerts, fail-safe cases, and stale-only exit normalization.

Fixes #3795.
Refs #3798, #3799, #3807.

Validation

  • bundle exec rubocop benchmarks/lib/bencher_report.rb benchmarks/track_benchmarks.rb benchmarks/spec/bencher_report_spec.rb benchmarks/spec/track_benchmarks_spec.rb -> 4 files inspected, no offenses
  • bundle exec rspec benchmarks/spec/bencher_report_spec.rb benchmarks/spec/track_benchmarks_spec.rb benchmarks/spec/report_table_integration_spec.rb -> 83 examples, 0 failures
  • bundle exec rspec benchmarks/spec -> 229 examples, 0 failures
  • git diff --check -> passed
  • script/ci-changes-detector origin/main -> Benchmark scripts; recommends Lint (Ruby + JS)
  • (cd react_on_rails && bundle exec rubocop) -> 214 files inspected, no offenses
  • codex review --base origin/main -> no actionable correctness issues found
  • git push pre-push hook -> branch Ruby RuboCop passed on changed Ruby files; Markdown link check had no Markdown files

Labels: benchmark. Benchmark reporting is performance-sensitive; full-ci is not recommended because this is a focused benchmark script/parser/spec change and the CI detector only recommends lint.

Agent Merge Confidence

Mode: development (no open Release gate: tracker found by title search; batch assignment: rc-accelerated-2026-06-08-two-machine)
Score: 8/10
Auto-merge recommendation: no (user requested no auto-merge; no independent finalizer)
Affected areas: benchmark regression reporting, Bencher report parsing, benchmark confirmation handoff
CI detector: script/ci-changes-detector origin/main -> Benchmark scripts; recommends Lint (Ruby + JS)
Validation run:

  • bundle exec rubocop benchmarks/lib/bencher_report.rb benchmarks/track_benchmarks.rb benchmarks/spec/bencher_report_spec.rb benchmarks/spec/track_benchmarks_spec.rb -> 4 files inspected, no offenses
  • bundle exec rspec benchmarks/spec/bencher_report_spec.rb benchmarks/spec/track_benchmarks_spec.rb benchmarks/spec/report_table_integration_spec.rb -> 83 examples, 0 failures
  • bundle exec rspec benchmarks/spec -> 229 examples, 0 failures
  • git diff --check -> passed
  • (cd react_on_rails && bundle exec rubocop) -> 214 files inspected, no offenses
  • codex review --base origin/main -> no actionable correctness issues found

Review/check gate:

  • Codex review: complete for c84a4be3, no actionable correctness issues
  • GitHub checks: pending after PR creation

Known residual risk: No live Bencher service run was available locally; coverage uses pinned JSON fixtures/specs for the Bencher CLI v0.6.2 report shape.
Finalized by: not finalized; authoring agent only

Note

Medium Risk
Changes benchmark regression gating and CI exit codes; incorrect filtering could hide real regressions or clear jobs that should fail, though fail-safe paths and extensive specs mitigate this.

Overview
Bencher regression detection now cross-checks active alerts[] against current report boundaries before counting a regression. Stale alerts (metrics back within limits) are dropped from #alerts / #regression? but tracked via new #filtered_alert?.

Fail-safe behavior is unchanged for ambiguous cases: missing benchmark, unmatchable boundary, unknown limit side, or measure-less alerts with no regression on the alert side still count as regressions so schema drift stays visible.
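
The classification rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in benchmarks/lib/bencher_report.rb: the `Boundary` and `Alert` structs, the `current_regression?` helper, the benchmark name, and all numbers are hypothetical, and measure-less alerts are simply kept here, whereas the PR gives them a dedicated path.

```ruby
# Hypothetical, simplified shapes; the real BencherReport classes may differ.
Boundary = Struct.new(:value, :lower_limit, :upper_limit, keyword_init: true)
Alert = Struct.new(:benchmark, :measure, :limit, keyword_init: true)

# Returns true when the alert should still count as a regression.
# Anything that cannot be matched against the current report is kept (fail-safe).
def current_regression?(alert, boundaries)
  return true if alert.benchmark.nil?

  # A nil measure finds no boundary here, so measure-less alerts are kept in this sketch.
  boundary = boundaries.dig(alert.benchmark, alert.measure)
  return true if boundary.nil?

  case alert.limit
  when "lower" then boundary.lower_limit.nil? || boundary.value < boundary.lower_limit
  when "upper" then boundary.upper_limit.nil? || boundary.value > boundary.upper_limit
  else true # unknown limit side: keep the alert visible
  end
end

# Made-up report data for one benchmark.
boundaries = {
  "/example_route" => {
    "rps" => Boundary.new(value: 32.2, lower_limit: 25.0),
    "p50_latency" => Boundary.new(value: 310.0, upper_limit: 300.0)
  }
}

alerts = [
  Alert.new(benchmark: "/example_route", measure: "rps", limit: "lower"),         # stale: 32.2 is above 25.0
  Alert.new(benchmark: "/example_route", measure: "p50_latency", limit: "upper"), # current: 310.0 exceeds 300.0
  Alert.new(benchmark: "/missing", measure: "rps", limit: "lower"),               # no boundary: fail-safe
  Alert.new(benchmark: "/example_route", measure: "rps", limit: "sideways")       # unknown side: fail-safe
]

current, filtered = alerts.partition { |alert| current_regression?(alert, boundaries) }
puts "current=#{current.size} filtered=#{filtered.size}"
```

Only the first alert lands in `filtered`; the other three stay in `current`, two of them purely because they could not be matched.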

CI exit handling adds normalized_bencher_exit_code: after start-point-hash retry, a non-zero Bencher exit with only filtered (stale) alerts and no real regression is normalized to 0 with a ::notice::, avoiding false regression filing while preserving the raw exit for retry logic first.

Specs cover stale vs current alerts, measure-less alerts, fail-safe paths, and exit normalization.

Reviewed by Cursor Bugbot for commit c84a4be. Bugbot is set up for automated code reviews on this repo.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved alert classification to distinguish actual performance regressions from other active alerts.
    • Enhanced exit code handling to prevent false workflow failures when only non-regression alerts are present.
  • Tests

    • Expanded test coverage for edge cases in alert filtering and regression detection.
    • Added tests for retry behavior and exit code normalization logic.

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 24dc8cb6-f570-40d0-b9de-723c5f394e92

📥 Commits

Reviewing files that changed from the base of the PR and between 25b94cc and c84a4be.

📒 Files selected for processing (4)
  • benchmarks/lib/bencher_report.rb
  • benchmarks/spec/bencher_report_spec.rb
  • benchmarks/spec/track_benchmarks_spec.rb
  • benchmarks/track_benchmarks.rb

Walkthrough

BencherReport now distinguishes current regression alerts from filtered/stale alerts during initialization by partitioning active alerts into separate collections. A new normalized_bencher_exit_code function converts non-zero Bencher exits to success (0) when only stale/filtered alerts remain. Test fixtures are refactored to use consistent helpers and expanded with new cases for exit-code and retry-handling behavior.

Changes

Bencher Alert Classification and Exit Normalization

Layer / File(s) | Summary
Alert Classification Core Implementation (benchmarks/lib/bencher_report.rb) | BencherReport partitions active alerts into @alerts (current regressions via current_regression_alert?) and @filtered_alerts (other active alerts). Adds filtered_alert? predicate and refactors regression? and perf_links_unavailable? as Ruby shorthand. Alert classification considers inferred direction from the alert limit and compares against configured boundaries, treating missing data as potential regressions.
Alert Classification Test Coverage (benchmarks/spec/bencher_report_spec.rb) | Comprehensive test helpers and cases cover regression classification: active alerts ignored when metric is not a regression, malformed/missing alert data handling (missing benchmark name, missing boundary), measure-less alert behavior depending on benchmark regression presence, and failed_pct regressions detected when crossing upper boundary. Tests validate both regression? and filtered_alert? outcomes.
Exit Code Normalization Implementation (benchmarks/track_benchmarks.rb) | New normalized_bencher_exit_code(exit_code, report) converts non-zero Bencher exits to 0 when the report contains only stale/filtered active alerts and no current regression, emitting a GitHub notice. Applied immediately after --start-point-hash retry logic and before confirmation/candidate flows.
Test Fixtures Modernization and Integration Tests (benchmarks/spec/track_benchmarks_spec.rb) | Introduces JSON-building helpers (result, rps_measure, p50_measure, active_alert) to replace hardcoded alert/result hashes. Refactors regressed_benchmark_names, regressed_alert_pairs, and confirmation_outcome tests using consistent helper-built structures. Adds retry-handling tests: one preserves original alert exit code to enable retry_without_start_point_hash?, and another verifies normalized_bencher_exit_code converts stale-alert exits to 0 with notice emission.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

  • shakacode/react_on_rails#3627: Modified the same perf_links_unavailable? method in benchmarks/lib/bencher_report.rb (added there and now re-expressed in shorthand).
  • shakacode/react_on_rails#3586: Introduced earlier regression classification logic in BencherReport that this PR extends with alert filtering and boundary-based regression detection.
  • shakacode/react_on_rails#3810: Refines the confirmation-stage workflow that depends on the BencherReport current-regression classification and exit-code normalization introduced here.

Suggested labels

enhancement, review-needed, full-ci, benchmark, P2

Suggested reviewers

  • alexeyr-ci2
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 26.32%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The PR title 'Filter stale Bencher alerts before reporting' is specific and clearly describes the main change: filtering of stale/non-regression alerts before they are treated as regressions in reporting.
Linked Issues check | ✅ Passed | The PR implements the core objective from issue #3795 by filtering non-regression alerts (stale/unmatchable) so they no longer cause false-positive regression issues, matching the requirement to prevent spurious performance-regression posts.
Out of Scope Changes check | ✅ Passed | All changes are directly related to filtering Bencher alerts and normalizing exit codes to prevent false-positive regression reporting, with no unrelated modifications detected.

@claude

claude Bot commented Jun 9, 2026

Contributor

Code Review — Filter stale Bencher alerts before reporting

Overview

This PR correctly addresses a false-positive regression problem: when Bencher carries an active alert from a previous run, the old code treated it as a current regression even when the current metric is healthy. The fix partitions alerts into current regressions vs. stale ones and normalizes the Bencher exit code after retry handling.

The fail-safe philosophy (missing benchmark name, invalid direction, no boundary on the expected side → treat as regression) is the right default for a reporting system where a silent miss is worse than a noisy false-positive.


What's solid

  • Fail-safe coverage — every uncertain case (malformed alert, unknown limit string, no matching boundary) keeps the alert live rather than silently dropping it. This is the correct default for regression detection.
  • Retry ordering: normalized_bencher_exit_code is called after the start-point-hash retry, which is tested explicitly. A stale alert won't mask a legitimate hash-miss retry.
  • Test breadth — all the interesting cases (stale alert filtered, measure-less alert, opposite-side improvement ignored, fail-safe passthrough, exit-code normalization) have coverage.
  • Boundary mirroring — the Boundary#significance logic correctly uses the symmetric t-test interval so one-sided thresholds still classify both directions.

Issues

1. .values can yield duplicate Boundary objects in the measure-less path

index_measure stores each boundary under both its slug key and its name key (when they normalize differently), so @boundaries.fetch(benchmark, {}).values can return the same Boundary instance twice. For any? this only causes a redundant evaluation, but it is subtle enough to warrant .uniq:

matching_boundaries = @boundaries.fetch(alert.benchmark, {}).values.uniq.select do |boundary|
  threshold_side?(boundary, direction)
end
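
A small sketch of why `.values` can repeat an object when one instance is indexed under two keys. The `IndexedBoundary` struct and the key strings are made up for illustration and are not the PR's index_measure implementation.

```ruby
# Hypothetical stand-in for the Boundary class referenced in the review.
IndexedBoundary = Struct.new(:measure, :lower_limit)

rps = IndexedBoundary.new("rps", 25.0)

# The same object stored under a slug-like key and a name-like key (both invented here).
index = { "rps" => rps, "requests-per-second" => rps }

all_values = index.values          # the same instance appears twice
unique_values = all_values.uniq    # duplicates collapsed

# any? gives the same answer either way; uniq only avoids the repeated evaluation.
with_dupes = all_values.any? { |boundary| !boundary.lower_limit.nil? }
without_dupes = unique_values.any? { |boundary| !boundary.lower_limit.nil? }

puts "values=#{all_values.size} unique=#{unique_values.size} same_answer=#{with_dupes == without_dupes}"
```

Note that Struct instances compare by value, so `uniq` here would also merge two distinct instances with identical fields; for a plain class it deduplicates by identity unless `eql?` and `hash` are overridden.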

2. filtered_alert? naming is ambiguous

filtered_alert? reads naturally as "is this alert filtered?" (instance predicate) rather than "does this report contain filtered alerts?". Since callers use it to decide whether to suppress a non-zero exit, a name that conveys the cardinality is clearer:

def stale_alerts? = !@filtered_alerts.empty?
# or
def filtered_alerts? = !@filtered_alerts.empty?

3. Double-negative condition in normalized_bencher_exit_code

return exit_code unless exit_code != 0 && ... is tricky to parse quickly. Flipping to a positive guard reads more naturally:

def normalized_bencher_exit_code(exit_code, report)
  return exit_code if exit_code.zero? || !report&.filtered_alert? || regression?(report)

  Github.notice("Bencher reported only stale active alert(s); no current boundary-backed regression remains.")
  0
end
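
The two guard forms are equivalent by De Morgan's law. The sketch below checks that exhaustively, assuming the original guard has the shape `unless exit_code != 0 && filtered && !regression` (reconstructed from the review text, not copied from the diff). `filtered` and `regression` are plain stand-ins for `report&.filtered_alert?` and `regression?(report)`.

```ruby
# "unless" form: return early unless all three conditions hold.
def early_return_unless?(exit_code, filtered, regression)
  !(exit_code != 0 && filtered && !regression)
end

# Positive "if" form suggested in the review.
def early_return_if?(exit_code, filtered, regression)
  exit_code.zero? || !filtered || regression
end

# filtered may be nil when the report itself is nil (safe navigation), so include nil.
cases = [0, 1].product([true, false, nil], [true, false])
mismatches = cases.reject { |args| early_return_unless?(*args) == early_return_if?(*args) }

puts "checked #{cases.size} cases, #{mismatches.size} mismatches"
```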

Minor observations

  • The rubocop:disable Metrics/CyclomaticComplexity wrapper is a signal worth noting: current_regression_alert? could be decomposed into two private helpers — one for the measure-less path and one for the measure-specific path — which would satisfy RuboCop without a disable and make the two branches individually testable. Not blocking, but worth a follow-up.
  • threshold_side? returns the limit value (a Float or nil) rather than an explicit boolean. In Ruby this works correctly since 0.0 is truthy, but an explicit !boundary.lower_limit.nil? form would make the intent unmistakable for future readers.
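
To illustrate the truthiness point, here is a minimal sketch. The `Limits` struct and both helper names are hypothetical; only the `lower_limit` attribute name comes from the comment above.

```ruby
# Hypothetical boundary-like object with just the attribute under discussion.
Limits = Struct.new(:lower_limit)

# Returns the limit itself (Float or nil) and relies on Ruby truthiness.
def side_by_value(boundary)
  boundary.lower_limit
end

# Returns an explicit boolean.
def side_by_nil_check(boundary)
  !boundary.lower_limit.nil?
end

zero = Limits.new(0.0)
none = Limits.new(nil)

# 0.0 is truthy in Ruby (only nil and false are falsy), so both forms agree in conditionals.
zero_is_truthy = side_by_value(zero) ? true : false
none_is_truthy = side_by_value(none) ? true : false

puts "value form: #{side_by_value(zero).inspect}, nil-check form: #{side_by_nil_check(zero).inspect}"
```

The two forms differ only in what they return (`0.0` versus `true`), which matters if a caller ever compares the result with `== true` or serializes it.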

Verdict

The logic is correct, the fail-safe decisions are sound, and the test coverage is thorough. The issues above are all minor: .uniq avoids subtle redundant work, renaming filtered_alert? improves readability at every call site, and straightening the guard condition makes the normalizer easier to audit. None block merge, but the renaming in particular will show up in every caller going forward, so it's worth fixing before this lands.

Comment thread benchmarks/lib/bencher_report.rb
Comment thread benchmarks/lib/bencher_report.rb
Comment thread benchmarks/track_benchmarks.rb
@greptile-apps

greptile-apps Bot commented Jun 9, 2026

Greptile Summary

This PR adds stale-alert filtering to the Bencher benchmark regression reporter: active alerts from Bencher are now cross-referenced against the current report's boundaries before being treated as regressions, and a normalized_bencher_exit_code helper converts a stale-only non-zero exit to success after retry and operational checks have run.

  • BencherReport#current_regression_alert? partitions active alerts into current regressions (@alerts) and filtered/stale ones (@filtered_alerts), with fail-safe return true for any alert that can't be matched (no benchmark name, unknown direction, or missing boundary), so report-shape drift still surfaces loudly.
  • normalized_bencher_exit_code in track_benchmarks.rb is placed after the start-point-hash retry, so a stale-only exit does not suppress an otherwise retriable "Head Version not found" condition.
  • Specs cover stale alerts, measure-less alerts, fail-safe cases, opposite-side improvements, and the new stale-only exit normalization path; existing fixtures were updated to include matching boundary results.
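
The ordering concern in the second bullet can be sketched as below. Everything here is a simplified assumption: the `Run` struct, the `needs_hash_retry?` and `final_exit_code` helpers, and checking stderr for the "Head Version not found" text are illustrative stand-ins, not the logic in track_benchmarks.rb.

```ruby
# Hypothetical summary of one Bencher invocation.
Run = Struct.new(:exit_code, :stderr, :stale_only, keyword_init: true)

# Retry decision looks at the raw exit code, so it must run before normalization.
def needs_hash_retry?(run)
  run.exit_code != 0 && run.stderr.include?("Head Version not found")
end

# Stale-only non-zero exits become 0; everything else passes through.
def normalized_exit_code(run)
  return run.exit_code if run.exit_code.zero? || !run.stale_only

  0
end

# Retry first, normalize second.
def final_exit_code(first_run, rerun)
  run = needs_hash_retry?(first_run) ? rerun.call : first_run
  normalized_exit_code(run)
end

retried = false
hash_miss = Run.new(exit_code: 1, stderr: "Head Version not found", stale_only: true)
clean = Run.new(exit_code: 0, stderr: "", stale_only: false)
after_retry = final_exit_code(hash_miss, -> { retried = true; clean })

stale_only = final_exit_code(Run.new(exit_code: 1, stderr: "", stale_only: true), -> { raise "no retry expected" })
regression = final_exit_code(Run.new(exit_code: 1, stderr: "", stale_only: false), -> { raise "no retry expected" })

puts "after_retry=#{after_retry} retried=#{retried} stale_only=#{stale_only} regression=#{regression}"
```

If normalization ran first, the hash-miss run above would already look like exit 0 and the retry would never fire.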

Confidence Score: 4/5

Safe to merge; the filtering logic is well-guarded with fail-safe defaults and is thoroughly covered by specs.

The core current_regression_alert? method uses explicit fail-safes so any unrecognised alert shape keeps the alert active rather than silently dropping a real regression. The placement of normalized_bencher_exit_code after the retry block is correct and directly tested. All findings are style/clarity notes with no impact on runtime behaviour.

No files require special attention; the two style suggestions in bencher_report.rb and track_benchmarks.rb are non-blocking.

Important Files Changed

Filename | Overview
benchmarks/lib/bencher_report.rb | Adds current_regression_alert? to partition active alerts into current vs stale. Logic is sound with good fail-safe defaults; minor style note on threshold_side? returning a numeric value used as boolean.
benchmarks/track_benchmarks.rb | Adds normalized_bencher_exit_code and calls it after retry handling. Guard condition uses return … unless with != 0 && … && !regression?; semantics are correct but slightly dense to read.
benchmarks/spec/bencher_report_spec.rb | Comprehensive new specs for stale-alert filtering, measure-less alerts, fail-safe cases, and opposite-side improvements. Coverage is thorough and fixtures are well-constructed.
benchmarks/spec/track_benchmarks_spec.rb | Updated fixtures now include matching results boundaries so the new filtering logic resolves correctly; adds specs for stale-only exit normalization and the preserved retry ordering.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Bencher CLI exits with JSON report] --> B{Parse JSON\nBencherReport.parse}
    B -->|FormatError| C[exit 1 — fail loud]
    B -->|OK| D[index_boundaries from results]
    D --> E[parse_alerts → partition via current_regression_alert?]
    E --> F["@alerts (current regressions)"]
    E --> G["@filtered_alerts (stale alerts)"]
    F --> H{regression?}
    H -->|true| J[Real regression — keep exit code]
    H -->|false| K{retry_without_start_point_hash?}
    K -->|exit≠0 + Head Version not found| L[Retry without --start-point-hash]
    L --> M[Re-run Bencher → new report]
    M --> N[normalized_bencher_exit_code]
    K -->|no| N
    N -->|exit≠0 + filtered_alert? + !regression?| O[emit notice → return 0]
    N -->|otherwise| P[Return original exit code]
    J --> Q[Downstream regression reporting]
    O --> Q
    P --> Q

Reviews (1): Last reviewed commit: "Filter stale Bencher alerts before repor..."

Comment thread benchmarks/lib/bencher_report.rb
Comment thread benchmarks/track_benchmarks.rb
Comment thread benchmarks/lib/bencher_report.rb
@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Pro Node Renderer Benchmark Summary

Benchmark RPS p50(ms) p90(ms) Status
Pro Node Renderer: simple_eval (non-RSC) 2180.64 ▼1.1% (2205.6) 3.98 ▼3.4% (4.12) 6.14 ▲17.4% (5.23) 200=65433
Pro Node Renderer: react_ssr (non-RSC) 1944.53 ▲0.3% (1939.63) 4.57 ▼2.6% (4.69) 6.03 ▲0.1% (6.02) 200=58346

▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline
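
For readers checking the change column by hand, the percentages appear to be the relative difference from the baseline value in parentheses. The sketch below reproduces two cells from the simple_eval row; the one-decimal rounding and the formatting rules are assumptions inferred from the rendered table, not taken from the benchmark scripts.

```ruby
# Relative change of the current value against the baseline, in percent.
def percent_change(current, baseline)
  (current - baseline) / baseline.to_f * 100.0
end

# Assumed rendering: one decimal place, an arrow for non-zero changes, plain 0.0% otherwise.
def format_change(current, baseline)
  pct = percent_change(current, baseline).round(1)
  return "0.0%" if pct.zero?

  arrow = pct.positive? ? "▲" : "▼"
  "#{arrow}#{format('%.1f', pct.abs)}%"
end

rps_cell = format_change(2180.64, 2205.6) # simple_eval RPS vs baseline
p90_cell = format_change(6.14, 5.23)      # simple_eval p90 vs baseline

puts "RPS #{rps_cell}, p90 #{p90_cell}"
```

Note that the arrow only encodes the sign of the change; whether a rise is good or bad depends on the measure (higher RPS is better, higher latency is worse).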

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Core Benchmark Summary

Benchmark RPS p50(ms) p90(ms) Status
/: Core 3.54 ▲11.7% (3.17) 2329.06 ▼1.6% (2367.87) 2878.36 ▼8.5% (3146.48) 200=113
/client_side_hello_world: Core 780.07 ▲17.8% (662.39) 8.61 ▼5.2% (9.09) 16.68 ▼5.9% (17.72) 200=23568
/client_side_rescript_hello_world: Core 734.49 ▲10.6% (664.26) 8.49 ▼10.0% (9.43) 17.51 ▼1.7% (17.81) 200=22192
/client_side_hello_world_shared_store: Core 682.39 ▲6.9% (638.6) 8.9 ▼8.0% (9.68) 19.04 ▼2.1% (19.45) 200=20618
/client_side_hello_world_shared_store_controller: Core 492.75 ▼23.0% (639.79) 11.62 ▲21.2% (9.59) 16.19 ▼21.1% (20.52) 200=14888
/client_side_hello_world_shared_store_defer: Core 707.05 ▲8.4% (652.39) 4.34 ▼55.2% (9.69) 16.11 ▼20.7% (20.3) 200=21508
/server_side_hello_world_shared_store: Core 12.59 ▼7.9% (13.67) 468.52 ▼17.7% (568.98) 818.07 ▲8.9% (751.32) 200=386
/server_side_hello_world_shared_store_controller: Core 15.1 ▲10.4% (13.67) 598.98 ▲9.1% (548.97) 745.53 ▲0.7% (740.57) 200=461
/server_side_hello_world_shared_store_defer: Core 14.87 ▲7.8% (13.8) 212.54 ▼62.5% (566.64) 691.83 ▼9.3% (762.95) 200=461
/server_side_hello_world: Core 30.43 ▲9.8% (27.72) 277.6 ▲1.8% (272.81) 311.37 ▼10.3% (347.25) 200=928
/server_side_hello_world_hooks: Core 29.52 ▲6.4% (27.74) 305.96 ▲10.0% (278.2) 338.31 ▼3.8% (351.77) 200=896
/server_side_hello_world_props: Core 28.69 ▲3.1% (27.83) 294.91 ▲4.6% (282.01) 359.84 ▲3.9% (346.46) 200=873
/client_side_log_throw: Core 711.51 ▲7.8% (660.18) 8.52 ▼11.8% (9.66) 21.13 ▲13.6% (18.6) 200=21496
/server_side_log_throw: Core 29.69 ▲10.1% (26.98) 280.24 ▼3.3% (289.76) 324.19 ▼8.1% (352.89) 200=905
/server_side_log_throw_plain_js: Core 30.33 ▲10.3% (27.5) 280.48 ▲2.7% (273.02) 334.15 ▼4.7% (350.47) 200=921
/server_side_log_throw_raise: Core 25.33 ▼8.1% (27.55) 232.31 ▼17.0% (279.96) 344.76 ▼0.4% (345.99) 3xx=770
/server_side_log_throw_raise_invoker: Core 902.58 ▲15.1% (784.42) 7.68 ▼8.7% (8.41) 14.07 ▼10.1% (15.66) 200=27267
/server_side_hello_world_es5: Core 30.99 ▲15.9% (26.75) 276.19 ▼0.7% (278.02) 316.22 ▼9.9% (351.02) 200=942
/server_side_redux_app: Core 29.68 ▲10.3% (26.92) 288.46 ▼2.9% (296.98) 332.14 ▼8.1% (361.5) 200=903
/server_side_hello_world_with_options: Core 31.08 ▲12.1% (27.73) 290.64 ▲2.6% (283.15) 316.4 ▼9.3% (348.72) 200=941
/server_side_redux_app_cached: Core 510.12 ▼22.5% (658.3) 11.31 ▲12.0% (10.1) 17.69 ▼3.2% (18.28) 200=15415
/client_side_manual_render: Core 747.13 ▲9.2% (684.28) 8.25 ▼11.2% (9.29) 19.64 ▲6.4% (18.46) 200=22573
/render_js: Core 32.21 ▲9.9% (29.3) 259.75 ▼1.3% (263.06) 298.29 ▼9.1% (328.12) 200=982
/react_router: Core 28.28 ▲8.9% (25.96) 299.72 ▼1.8% (305.27) 342.8 ▼7.3% (369.72) 200=859
/pure_component: Core 23.79 ▼15.2% (28.07) 250.42 ▼11.4% (282.55) 337.34 ▼3.1% (348.1) 200=726
/css_modules_images_fonts_example: Core 21.7 ▼18.9% (26.77) 282.95 ▲4.2% (271.66) 379.16 ▲8.3% (349.99) 200=661
/turbolinks_cache_disabled: Core 773.62 ▲13.0% (684.68) 8.38 ▼9.9% (9.3) 16.95 ▼8.9% (18.61) 200=23373
/rendered_html: Core 30.43 ▲9.8% (27.72) 282.63 0.0% (282.64) 327.33 ▼5.8% (347.6) 200=923
/xhr_refresh: Core 15.28 ▲7.4% (14.22) 558.0 ▲4.4% (534.26) 737.8 ▲1.4% (727.37) 200=470
/react_helmet: Core 30.08 ▲11.1% (27.07) 300.95 ▲5.8% (284.37) 325.61 ▼8.1% (354.33) 200=912
/broken_app: Core 29.86 ▲9.5% (27.28) 303.61 ▲10.7% (274.18) 330.06 ▼5.1% (347.95) 200=905
/image_example: Core 29.99 ▲9.7% (27.35) 278.49 ▼3.0% (286.97) 319.84 ▼8.6% (349.84) 200=915
/turbo_frame_tag_hello_world: Core 822.47 ▲11.0% (740.66) 8.2 ▼5.3% (8.66) 15.77 ▼5.1% (16.62) 200=24849
/manual_render_test: Core 501.92 ▼26.2% (679.81) 11.36 ▲20.6% (9.42) 20.58 ▲10.3% (18.66) 200=15164

▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Pro (shard 1/2) Benchmark Summary

Benchmark RPS p50(ms) p90(ms) Status
/: Pro 174.26 ▼3.0% (179.73) 51.26 ▲19.4% (42.93) 64.71 ▲4.9% (61.66) 200=5235
/error_scenarios_hub: Pro 360.34 ▲2.0% (353.23) 21.1 ▲5.8% (19.94) 31.99 ▲4.5% (30.63) 200=10891
/ssr_async_error: Pro 341.28 ▲1.1% (337.46) 22.8 ▲8.8% (20.95) 34.38 ▼2.7% (35.34) 200=10310
/ssr_async_prop_error: Pro 311.56 ▼4.1% (324.77) 24.34 ▲14.1% (21.33) 37.54 ▲4.5% (35.93) 200=9417
/non_existing_react_component: Pro 347.81 ▼0.3% (348.87) 22.53 ▲10.6% (20.37) 36.37 ▲12.4% (32.36) 200=10509
/non_existing_rsc_payload: Pro 363.23 ▼0.2% (363.93) 21.26 ▲7.0% (19.88) 34.78 ▲3.8% (33.49) 200=10976
/cached_react_helmet: Pro 374.41 ▲1.8% (367.77) 21.04 ▲10.9% (18.97) 32.64 ▲6.3% (30.69) 200=11310
/cached_redux_component: Pro 369.93 ▼3.3% (382.59) 23.6 ▲21.1% (19.49) 31.63 ▲2.7% (30.79) 200=11175
/lazy_apollo_graphql: Pro 145.15 ▼3.0% (149.69) 45.05 ▼7.4% (48.63) 71.02 ▼6.6% (76.01) 200=4389
/redis_receiver: Pro 93.06 ▲8.0% (86.18) 70.7 ▲3.2% (68.53) 134.36 ▼11.6% (152.02) 200=2783,3xx=32
/stream_shell_error_demo: Pro 328.79 ▼0.8% (331.44) 23.71 ▲14.9% (20.64) 35.5 ▲2.5% (34.64) 200=9936
/test_incremental_rendering: Pro 321.42 ▼5.1% (338.64) 23.63 ▲10.9% (21.32) 36.73 ▲6.4% (34.52) 200=9715
/rsc_posts_page_over_redis: Pro 94.56 ▼2.6% (97.03) 78.12 ▲11.0% (70.39) 127.86 ▲11.3% (114.91) 200=2862
/async_on_server_sync_on_client: Pro 253.95 ▼20.9% (320.9) 22.97 ▲1.5% (22.63) 74.25 ▲86.4% (39.84) 200=7674
/server_router: Pro 336.19 ▲1.0% (332.95) 23.01 ▲7.0% (21.5) 34.66 ▼2.1% (35.39) 200=10157
/unwrapped_rsc_route_client_render: Pro 364.4 ▼4.0% (379.45) 15.41 ▼19.1% (19.06) 28.66 ▼4.6% (30.03) 200=11083
/async_render_function_returns_string: Pro 263.16 ▼22.0% (337.34) 21.47 ▲1.8% (21.1) 27.24 ▼18.6% (33.46) 200=8004
/async_components_demo: Pro 210.68 ▲3.7% (203.14) 42.21 ▲14.5% (36.85) 53.4 ▲5.0% (50.88) 200=6337
/stream_native_metadata: Pro 339.6 ▲0.9% (336.42) 22.89 ▲9.4% (20.93) 34.29 ▼5.1% (36.14) 200=10268
/rsc_native_metadata: Pro 333.03 ▲3.1% (323.14) 22.98 ▲3.6% (22.19) 39.19 ▲3.4% (37.91) 200=10064
/react_intl_rsc_demo: Pro 321.1 ▲3.4% (310.5) 17.31 ▼15.9% (20.57) 33.25 ▼31.2% (48.31) 200=9706
/client_side_hello_world_shared_store: Pro 327.79 ▼3.1% (338.11) 23.18 ▲9.5% (21.16) 37.94 ▲15.9% (32.73) 200=9905
/client_side_hello_world_shared_store_defer: Pro 354.38 ▲6.3% (333.38) 24.47 ▲14.5% (21.37) 33.37 ▼3.5% (34.56) 200=10705
/server_side_hello_world_shared_store_controller: Pro 232.71 ▼18.2% (284.51) 24.32 ▼8.6% (26.62) 30.07 ▼27.1% (41.26) 200=7084
/server_side_hello_world: Pro 332.02 ▼0.2% (332.58) 23.52 ▲6.0% (22.19) 34.63 ▼0.3% (34.74) 200=10031
/client_side_log_throw: Pro 345.16 ▼6.8% (370.47) 22.4 ▲15.3% (19.42) 33.26 ▲6.7% (31.19) 200=10430
/server_side_log_throw_plain_js: Pro 293.44 ▼21.9% (375.66) 19.43 ▲0.6% (19.32) 24.91 ▼21.8% (31.84) 200=8870
/server_side_log_throw_raise_invoker: Pro 411.91 ▲0.9% (408.09) 13.44 ▼21.0% (17.0) 25.31 ▼12.1% (28.78) 200=12527
/server_side_redux_app: Pro 273.83 ▼17.5% (331.86) 20.87 ▼6.8% (22.38) 26.24 ▼26.9% (35.89) 200=8277
/server_side_redux_app_cached: Pro 354.87 ▼3.7% (368.48) 15.82 ▼20.4% (19.86) 29.94 ▼4.4% (31.31) 200=10730
/render_js: Pro 309.65 ▼17.7% (376.28) 18.71 ▼4.4% (19.57) 50.38 ▲57.2% (32.04) 200=9357
/pure_component: Pro 349.18 ▲4.7% (333.43) 22.21 ▲4.2% (21.32) 32.8 ▼7.1% (35.31) 200=10551
/turbolinks_cache_disabled: Pro 355.25 ▼2.7% (364.99) 21.78 ▲9.7% (19.86) 32.01 ▼1.7% (32.55) 200=10735
/xhr_refresh: Pro 264.13 ▼8.2% (287.86) 29.76 ▲16.8% (25.48) 42.62 ▲8.7% (39.2) 200=7984
/broken_app: Pro 256.88 ▼24.2% (338.93) 22.67 ▲5.3% (21.53) 52.47 ▲49.5% (35.11) 200=7764
/server_render_with_timeout: Pro 305.7 ▼8.2% (332.88) 28.89 ▲34.9% (21.42) 38.16 ▲9.2% (34.96) 200=9178

▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Pro (shard 2/2) Benchmark Summary

Benchmark RPS p50(ms) p90(ms) Status
/empty: Pro 1284.01 ▲2.9% (1247.45) 4.06 ▼30.9% (5.87) 8.48 ▼5.5% (8.98) 200=39046
/ssr_shell_error: Pro 352.17 ▲3.4% (340.54) 22.02 ▲0.2% (21.98) 35.68 ▲2.2% (34.91) 200=10642
/ssr_sync_error: Pro 356.93 ▲6.2% (336.18) 21.85 ▲5.9% (20.63) 32.3 ▼6.2% (34.42) 200=10788
/rsc_component_error: Pro 342.57 ▲2.5% (334.1) 22.64 ▲6.2% (21.32) 33.65 ▼6.0% (35.81) 200=10351
/non_existing_stream_react_component: Pro 320.38 ▼6.5% (342.59) 17.91 ▼12.5% (20.48) 21.69 ▼37.3% (34.6) 200=9683
/server_side_redux_app_cached: Pro 416.82 ▲13.1% (368.48) 19.09 ▼3.9% (19.86) 27.99 ▼10.6% (31.31) 200=12593
/loadable: Pro 356.51 ▲16.4% (306.28) 15.52 ▼34.0% (23.51) 30.28 ▼15.7% (35.9) 200=10779
/apollo_graphql: Pro 151.66 ▲8.2% (140.2) 46.88 ▼6.2% (49.96) 70.66 ▼15.4% (83.5) 200=4587
/console_logs_in_async_server: Pro 3.56 ▲11.1% (3.2) 2119.59 ▼0.1% (2121.52) 2154.04 ▼9.8% (2386.88) 200=122
/stream_error_demo: Pro 367.34 ▲12.7% (326.01) 14.9 ▼28.6% (20.88) 28.55 ▼19.2% (35.33) 200=11174
/stream_async_components: Pro 368.95 ▲12.1% (329.01) 21.1 ▼3.3% (21.83) 33.43 ▼8.3% (36.45) 200=11150
/rsc_posts_page_over_http: Pro 309.93 ▼5.5% (328.04) 18.65 ▼16.4% (22.3) 31.6 ▼14.0% (36.73) 200=9377
/rsc_echo_props: Pro 255.01 ▲14.2% (223.36) 31.7 ▼2.5% (32.51) 48.1 ▼6.5% (51.47) 200=7707
/async_on_server_sync_on_client_client_render: Pro 398.01 ▲13.6% (350.33) 19.31 ▼7.4% (20.84) 31.65 ▼4.8% (33.26) 200=12027
/server_router_client_render: Pro 396.44 ▲14.2% (347.24) 19.15 ▼6.1% (20.39) 28.56 ▼12.3% (32.57) 200=11981
/unwrapped_rsc_route_stream_render: Pro 383.45 ▲14.1% (336.14) 19.66 ▼5.8% (20.86) 34.07 ▼6.1% (36.29) 200=11590
/async_render_function_returns_component: Pro 386.62 ▲15.3% (335.23) 20.43 ▼0.3% (20.49) 32.0 ▼5.6% (33.92) 200=11684
/native_metadata: Pro 375.0 ▲11.2% (337.23) 21.22 ▼3.6% (22.0) 31.76 ▼5.0% (33.44) 200=11333
/hybrid_metadata_streaming: Pro 317.99 ▼4.6% (333.44) 18.2 ▼16.4% (21.77) 61.99 ▲74.6% (35.5) 200=9608
/cache_demo: Pro 364.82 ▲12.8% (323.4) 20.98 ▼6.2% (22.36) 34.82 ▼6.6% (37.3) 200=11024
/client_side_hello_world: Pro 401.02 ▲9.8% (365.12) 19.22 ▼0.3% (19.28) 28.12 ▼10.7% (31.49) 200=12117
/client_side_hello_world_shared_store_controller: Pro 379.66 ▲11.6% (340.31) 16.68 ▼19.5% (20.72) 27.38 ▼17.9% (33.36) 200=11473
/server_side_hello_world_shared_store: Pro 297.09 ▲3.1% (288.11) 26.43 ▲1.7% (25.99) 37.53 ▼4.1% (39.14) 200=8984
/server_side_hello_world_shared_store_defer: Pro 290.82 ▲1.5% (286.4) 27.55 ▲8.6% (25.37) 41.95 ▲7.5% (39.04) 200=8789
/server_side_hello_world_hooks: Pro 376.81 ▲7.3% (351.19) 20.24 ▼3.8% (21.04) 34.12 ▼4.4% (35.68) 200=11386
/server_side_log_throw: Pro 370.7 ▲8.9% (340.55) 21.18 ▼1.4% (21.49) 30.23 ▼14.4% (35.33) 200=11203
/server_side_log_throw_raise: Pro 727.62 ▲12.0% (649.73) 3.78 ▼65.3% (10.89) 16.96 ▼9.9% (18.83) 3xx=22131
/server_side_hello_world_es5: Pro 369.7 ▲9.0% (339.28) 15.22 ▼28.1% (21.17) 28.14 ▼22.6% (36.35) 200=11246
/server_side_hello_world_with_options: Pro 380.94 ▲13.2% (336.47) 20.58 ▼5.2% (21.7) 30.6 ▼12.1% (34.83) 200=11513
/client_side_manual_render: Pro 402.06 ▲8.2% (371.65) 15.66 ▼19.6% (19.48) 26.05 ▼18.4% (31.91) 200=12151
/react_router: Pro 301.48 ▼23.7% (395.12) 19.44 ▲10.7% (17.56) 22.47 ▼22.9% (29.14) 200=9113
/css_modules_images_fonts_example: Pro 372.76 ▲8.7% (342.92) 21.11 ▼1.9% (21.51) 33.36 ▲0.3% (33.24) 200=11263
/rendered_html: Pro 381.14 ▲10.2% (345.91) 20.64 ▼0.3% (20.69) 32.9 ▼2.8% (33.86) 200=11516
/react_helmet: Pro 371.41 ▲11.3% (333.74) 21.33 ▼0.9% (21.51) 33.03 ▼6.5% (35.32) 200=11222
/image_example: Pro 376.45 ▲10.5% (340.64) 21.09 ▼5.7% (22.36) 31.64 ▼7.2% (34.08) 200=11373
/posts_page: Pro 257.98 ▲5.2% (245.22) 30.34 ▼50.6% (61.4) 42.16 ▼51.4% (86.72) 200=7798

▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline

@justin808

Member Author

Worker B review-thread triage for c84a4be36a94906ada15115dfabde33df51145ad.

Unresolved review threads triaged:

Thread | Reviewer | Triage | Decision
r3377147372 | Claude | Optional | filtered_alert? naming is readability-only; no behavior or merge risk found. Waived to avoid churn.
r3377147524 | Claude | Optional | Duplicate boundary values can repeat the same Boundary in the measure-less path, but any? is idempotent and side-effect-free. No correctness blocker. Waived.
r3377147839 | Claude | Optional | normalized_bencher_exit_code guard can be written more positively, but current condition is correct and covered. Waived.
r3377151207 | Greptile | Optional | threshold_side? returns numeric-or-nil for truthiness. Current use is correct, including 0.0 truthiness. Waived.
r3377151231 | Greptile | Noise duplicate | Same positive-guard readability point as Claude r3377147839. No code change.
r3377151283 | Greptile | Noise duplicate | Same duplicate-boundary point as Claude r3377147524. No code change.

Validation run from the PR head:

  • script/ci-changes-detector origin/main -> Benchmark scripts; recommends Lint (Ruby + JS).
  • bundle exec rubocop benchmarks/lib/bencher_report.rb benchmarks/track_benchmarks.rb benchmarks/spec/bencher_report_spec.rb benchmarks/spec/track_benchmarks_spec.rb -> 4 files inspected, no offenses. RuboCop printed existing rubocop-ast deprecation warnings only.
  • bundle exec rspec benchmarks/spec/bencher_report_spec.rb benchmarks/spec/track_benchmarks_spec.rb -> 79 examples, 0 failures.
  • git diff --check -> passed.

Live CI/review state observed:

  • Release mode: strict-rc, per release gate "Release gate: react_on_rails 17.0.0" (#3823). No auto-merge from this lane.
  • PR state: open, not draft, mergeStateStatus: CLEAN, head c84a4be36a94906ada15115dfabde33df51145ad.
  • Required checks: none reported by gh pr checks --required.
  • Visible checks: complete; claude-review, Greptile Review, Cursor Bugbot, CodeRabbit, CodeQL/build/analyze checks, and Core/Pro benchmark suites passed; path-skipped jobs are consistent with the change detector and benchmark selection.

Full-CI decision: not requested. This PR is benchmark-related, the benchmark label is present, benchmark suites passed, and the live change detector only recommends lint. Full CI would be extra churn unless the coordinator or maintainer wants a stricter final gate.

Worker B lane result: merge-qualified from this lane after waiving optional/noise advisory comments. Coordinator owns final merge sequencing.

justin808 added a commit that referenced this pull request Jun 9, 2026
…sitives) (#3829)

## Problem

~99% of recent benchmark "performance-regression" alerts on `main` are
false positives from a single **orphaned server-side Bencher
threshold**.

`p90_latency` (and `p99_latency`) were intentionally dropped from
`track_benchmarks.rb` `THRESHOLDS` as too noisy ("tail noise can't meet
the 1/20 target"). But Bencher thresholds are persistent server-side
objects keyed on (branch, testbed, measure) — removing a measure from
the CLI `--threshold-*` args stops *updating* the threshold but never
*deletes* it. The p90 metric is still submitted (for the dashboard), so
the orphaned p90 threshold keeps evaluating p90 tail noise and firing
alerts on nearly every run.

These alerts are doubly bad:
- They file a `performance-regression` issue (exit≠0), and
- They are **invisible in the summary table** — the p90 column has no
`:direction`, so it is never flagged 🔴. The filed issue names nothing
actionable (this is the #3782 / #3795 "🔴 only appears in the legend"
symptom).

### Evidence (public Bencher API, project `react-on-rails-t8a9ncxo`)

- 255 most-recent active alerts: **248 p90-latency, 4 rps, 3
p50-latency**.
- `main` branch: **164 active alerts → 162 p90-latency, 2 rps** (98.8%
p90).
- #3782 (`dec4b8c`) filed on exactly two p90 alerts —
`/client_side_hello_world: Core` p90 `24.20 > upper 23.61` and
`/rendered_html: Core` p90 — with no rps/p50/failed_pct alert.
- `main`/`github-actions` carries thresholds for `rps, p50-latency,
p90-latency, p99-latency, failed-pct` — two more than the code manages.
(p99 is dormant: no p99 metric is submitted.)
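
The shares quoted above follow directly from the counts. A quick arithmetic check, using only the numbers listed in this section (the `share` helper is just for illustration):

```ruby
# Alert counts copied from the evidence bullets above.
recent = { "p90-latency" => 248, "rps" => 4, "p50-latency" => 3 }
main   = { "p90-latency" => 162, "rps" => 2 }

# Percentage of alerts in `counts` that belong to `measure`, to one decimal.
def share(counts, measure)
  (counts.fetch(measure) * 100.0 / counts.values.sum).round(1)
end

puts "recent total: #{recent.values.sum}"
puts "main total: #{main.values.sum}"
puts "recent p90 share: #{share(recent, 'p90-latency')}%"
puts "main p90 share: #{share(main, 'p90-latency')}%"
```

The `main` figure rounds to the 98.8% quoted above; the 255-alert sample comes out slightly lower, at roughly 97%.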

## Fix

Thread the measures we actually track (`THRESHOLDS` names) into
`BencherReport`. An active alert on any **other** measure is classified
as *filtered* rather than a regression. This reuses #3822's existing
`filtered_alert?` exit-normalization, so a p90-only run no longer writes
a candidate or files an issue. The fix is fully in-repo and makes the
orphaned threshold harmless regardless of its server-side state.

- Tracked measures (`rps`, `p50_latency`, `failed_pct`) are unaffected —
including the "hidden" `failed_pct` regressions #3822 added coverage
for.
- Measure-less and benchmark-less alerts keep their existing fail-safe
(still counted).
- `tracked_measures` defaults to `nil` (track every measure) so
non-production callers (`BenchmarkTable`, specs) are unchanged.
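
A minimal sketch of the classification rules listed above, not the actual `BencherReport` code. The `THRESHOLDS` entries, the alert hashes, the two `/example_*` benchmark names, and the helper names `normalize_measure` and `partition_alerts` are assumptions made for illustration:

```ruby
# Hypothetical stand-ins; the real THRESHOLDS entries and alert objects
# in benchmarks/ may carry different or additional fields.
THRESHOLDS = [
  ["rps", :lower],
  ["p50_latency", :upper],
  ["failed_pct", :upper]
].freeze

# Treat "p90-latency" (slug) and "p90_latency" (name) as the same measure.
def normalize_measure(measure)
  measure.to_s.downcase.tr("-", "_")
end

# Splits active alerts into [regressions, filtered].
# tracked_measures == nil means "track every measure" (the default).
def partition_alerts(alerts, tracked_measures: nil)
  return [alerts, []] if tracked_measures.nil?

  tracked = tracked_measures.map { |m| normalize_measure(m) }
  alerts.partition do |alert|
    measure = alert[:measure]
    # Fail-safe: an alert with no measure still counts as a regression.
    measure.nil? || tracked.include?(normalize_measure(measure))
  end
end

alerts = [
  { benchmark: "/rendered_html: Core", measure: "p90-latency" },
  { benchmark: "/example_tracked", measure: "rps" },
  { benchmark: "/example_measureless", measure: nil }
]

regressions, filtered =
  partition_alerts(alerts, tracked_measures: THRESHOLDS.map(&:first))
puts "regressions: #{regressions.map { |a| a[:benchmark] }.inspect}"
puts "filtered: #{filtered.map { |a| a[:benchmark] }.inspect}"

# Simplified view of exit normalization: a run whose only active alerts
# were filtered should not fail the job.
exit_code = regressions.empty? ? 0 : 1
puts "exit code: #{exit_code}"
```

Keeping `nil` as the "track everything" default is what lets callers that never pass `tracked_measures` behave exactly as before.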

## Validation

- `bundle exec rspec benchmarks/spec` — **234 examples, 0 failures** (5
new specs for tracked-measure filtering).
- `bundle exec rubocop benchmarks/...` — no offenses.
- **End-to-end against the real #3782 reports** (fetched from the
Bencher API): `regression?` flips `true → false` (`filtered_alert? =
true`) for both the Core and Pro p90 alerts, while rps/p50 alerts are
untouched.
- `script/ci-changes-detector origin/main` → Benchmark scripts → Lint
(Ruby + JS).

## Companion cleanup (separate, manual — not in this PR)

This neutralizes the orphaned threshold in code. To also stop it
polluting the Bencher dashboard (162 cosmetic active alerts) and being
cloned to every new branch, delete the server-side thresholds (needs
`BENCHER_API_TOKEN`):

- `main` p90 `51fb6a47-0083-4e84-a745-60ee42e3bba4`, p99
`6faa7a68-1835-4cd7-96dc-959220737172`
- `master` p90 `d4ad2066-74cb-41f7-93bb-f0885358c56c`, p99
`61bf5ae6-77fa-474f-b6d8-ded630bd0c20`

## Relationship to other work

Completes the noise fix that #3810 (fresh-runner confirmation) and #3822
(stale-alert filtering) started: #3822 only filters alerts whose metric
*recovered*; a **live** p90 crossing (the dominant case) still passed
through. Substantially removes the p90 tail noise behind #3169 and the
issue explosion researched in #3755.
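
The difference between a recovered (stale) alert and a live crossing can be sketched as below. The `live_crossing?` name and its keyword arguments are assumptions for illustration; only the `24.20 > upper 23.61` figures come from the evidence section.

```ruby
# Hypothetical check: does the current report's value still sit outside
# the boundary limits? A nil limit means that side has no threshold.
def live_crossing?(value:, lower_limit: nil, upper_limit: nil)
  below = !lower_limit.nil? && value < lower_limit
  above = !upper_limit.nil? && value > upper_limit
  below || above
end

# The p90 alert cited for #3782: 24.20 is above the 23.61 upper limit,
# so stale-alert filtering alone would not remove it.
puts live_crossing?(value: 24.20, upper_limit: 23.61)

# An illustrative recovered metric (made-up value): back under the limit,
# so an alert still marked active for it would be stale.
puts live_crossing?(value: 22.0, upper_limit: 23.61)
```

Under this sketch, the first case is the one that needs tracked-measure filtering, because the value really is outside the limit of a threshold the code no longer manages.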

Refs #3755, #3169, #3795

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Changes which Bencher alerts count as regressions and can affect
main-push candidate filing and CI exit behavior, though scoped to
benchmark reporting with explicit specs and a nil default for other
callers.
> 
> **Overview**
> Adds optional **`tracked_measures`** to `BencherReport.parse` /
`#initialize` so active Bencher alerts on measures the repo no longer
tracks (e.g. orphaned server-side **p90_latency** thresholds) are moved
to **`filtered_alert?`** instead of **`regression?`**, reusing the
existing exit-code normalization for filtered-only runs.
> 
> **`track_benchmarks.rb`** now passes `THRESHOLDS.map(&:first)` when
parsing the JSON report, so p90-only false positives no longer write
regression candidates or file issues the summary table cannot flag.
Tracked measures, measure-less fail-safe alerts, and callers that omit
`tracked_measures` stay backward compatible.
> 
> Specs cover orphaned p90 filtering, slug/name normalization, and a
small fix for `BencherReport.new` with a Hash root in a perf-links test.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
7693f64. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **Bug Fixes**
  * Fixed false regression alerts for measures no longer tracked in
    performance benchmarks. The system now filters regression reports to
    only flag alerts for actively monitored metrics, preventing issue
    creation based on orphaned or retired threshold measurements that are no
    longer part of the current tracking configuration.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
justin808 added a commit that referenced this pull request Jun 9, 2026
* origin/main:
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  Add issue evaluation skill (#3816)
  Confirm benchmark regressions on a fresh runner before filing the main issue (#3810)
  Define agent scope and accelerated RC auto-merge policy (#3808)
  Replace custom MockClient with async-http Mock::Endpoint (#3703)
  Docs: per-request data sharing in RSC with React.cache() (#3769)
  Pro RSC: share unstable_cache across renderer workers via Redis (#3705)
  [codex] Add PR batch planning skill (#3792)
  Docs: document PR batch operational lessons (#3789)
  Document dummy Redux state indexing rationale (#3781)
  Pro RSC: avoid caching failed Flight renders (#3775)

# Conflicts:
#	packages/react-on-rails-pro/tests/getReactServerComponent.client.test.ts
justin808 added a commit that referenced this pull request Jun 9, 2026
…o-rsc-rspack-ci

* origin/main:
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  Add issue evaluation skill (#3816)

# Conflicts:
#	react_on_rails_pro/spec/dummy/config/webpack/clientWebpackConfig.js
justin808 added a commit that referenced this pull request Jun 9, 2026
* origin/main: (23 commits)
  Enforce Pro license headers in CI and pre-commit (#3821)
  Add RSC payload route-data helper (#3783)
  [Pro] Fix React.cache request dedupe in generated RSC configs (#3813)
  Docs: clarify RuboCop autofix ownership (#3827)
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  Add issue evaluation skill (#3816)
  Confirm benchmark regressions on a fresh runner before filing the main issue (#3810)
  Define agent scope and accelerated RC auto-merge policy (#3808)
  Replace custom MockClient with async-http Mock::Endpoint (#3703)
  Docs: per-request data sharing in RSC with React.cache() (#3769)
  Pro RSC: share unstable_cache across renderer workers via Redis (#3705)
  [codex] Add PR batch planning skill (#3792)
  ...
justin808 added a commit that referenced this pull request Jun 9, 2026
…-floor-fix

* origin/main: (29 commits)
  Docs: align pr-batch closeout confidence handoff (#3835)
  Align adversarial review CI polling guidance (#3794)
  CI: add Pro RSC rspack runtime gate (#3817)
  Make RSCRoute refetch failures recoverable in production (#3786)
  Fix Pro node renderer license headers (#3834)
  Docs: fix anti-patterns in RSC tutorials (#3801)
  fix(pro): add RSC peer compatibility gate (#3831)
  Enforce Pro license headers in CI and pre-commit (#3821)
  Add RSC payload route-data helper (#3783)
  [Pro] Fix React.cache request dedupe in generated RSC configs (#3813)
  Docs: clarify RuboCop autofix ownership (#3827)
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  ...

# Conflicts:
#	.github/workflows/benchmark.yml

Development

Successfully merging this pull request may close these issues.

performance-regression issues are being posted when no significant regressions are detected
