Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) #3829

Merged
justin808 merged 1 commit into main from jg/benchmark-ignore-untracked-measure-alerts
Jun 9, 2026

Conversation

@justin808

@justin808 justin808 commented Jun 9, 2026

Member

Problem

~99% of recent benchmark "performance-regression" alerts on main are false positives from a single orphaned server-side Bencher threshold.

p90_latency (and p99_latency) were intentionally dropped from track_benchmarks.rb THRESHOLDS as too noisy ("tail noise can't meet the 1/20 target"). But Bencher thresholds are persistent server-side objects keyed on (branch, testbed, measure): removing a measure from the CLI --threshold-* args only stops updating the threshold; it never deletes it. The p90 metric is still submitted (for the dashboard), so the orphaned p90 threshold keeps evaluating p90 tail noise and firing alerts on nearly every run.

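These alerts are doubly bad: they file a "performance regression" issue on main, yet they are invisible in the summary table (the p90 column has no :direction), so the issue names nothing actionable.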

Evidence (public Bencher API, project react-on-rails-t8a9ncxo)

  • 255 most-recent active alerts: 248 p90-latency, 4 rps, 3 p50-latency.
  • main branch: 164 active alerts → 162 p90-latency, 2 rps (98.8% p90).
  • #3782 (Performance Regression Detected on main, dec4b8c) was filed on exactly two p90 alerts, /client_side_hello_world: Core p90 24.20 > upper 23.61 and /rendered_html: Core p90, with no rps/p50/failed_pct alert.
  • main/github-actions carries thresholds for rps, p50-latency, p90-latency, p99-latency, failed-pct — two more than the code manages. (p99 is dormant: no p99 metric is submitted.)
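
The shares quoted in the evidence follow directly from the counts. A minimal Ruby check, using only the figures listed above (the `share` helper is a made-up name for illustration, not part of the benchmark scripts):

```ruby
# Percentage of alerts attributable to one measure, rounded to one decimal.
# `share` is a hypothetical helper used only for this check.
def share(count, total)
  (count * 100.0 / total).round(1)
end

recent = { "p90-latency" => 248, "rps" => 4, "p50-latency" => 3 }
main   = { "p90-latency" => 162, "rps" => 2 }

recent_p90 = share(recent["p90-latency"], recent.values.sum) # 248 of 255
main_p90   = share(main["p90-latency"], main.values.sum)     # 162 of 164

puts "recent: #{recent_p90}% p90, main: #{main_p90}% p90"
```

On main this gives the 98.8% figure cited above; across the 255 most recent alerts the p90 share is slightly lower.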

Fix

Thread the measures we actually track (THRESHOLDS names) into BencherReport. An active alert on any other measure is classified as filtered rather than a regression. This reuses #3822's existing filtered_alert? exit-normalization, so a p90-only run no longer writes a candidate or files an issue. The fix is fully in-repo and makes the orphaned threshold harmless regardless of its server-side state.

  • Tracked measures (rps, p50_latency, failed_pct) are unaffected, including the "hidden" failed_pct regressions that #3822 (Filter stale Bencher alerts before reporting) added coverage for.
  • Measure-less and benchmark-less alerts keep their existing fail-safe (still counted).
  • tracked_measures defaults to nil (track every measure) so non-production callers (BenchmarkTable, specs) are unchanged.
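
The classification rules above can be sketched roughly as follows. This is a simplified illustration, not the code in bencher_report.rb: the `Alert` struct and `TrackedMeasureFilter` class are hypothetical names, and the boundary math that #3822 applies to tracked measures is omitted.

```ruby
# Hypothetical stand-in for a parsed Bencher alert; the real class may differ.
Alert = Struct.new(:benchmark, :measure, keyword_init: true)

# Minimal sketch of tracked-measure filtering. It leaves out the boundary
# math that the stale-alert filter applies to tracked measures.
class TrackedMeasureFilter
  attr_reader :alerts, :filtered_alerts

  def initialize(active_alerts, tracked_measures: nil)
    # nil means "track every measure", so existing callers are unchanged.
    @tracked = tracked_measures&.map { |m| normalize(m) }
    @filtered_alerts, @alerts = active_alerts.partition { |a| untracked?(a) }
  end

  def regression?
    @alerts.any?
  end

  def filtered_alert?
    @filtered_alerts.any?
  end

  private

  # Slug ("p50-latency") and name ("p50_latency") forms compare equal.
  def normalize(measure)
    measure.to_s.downcase.tr("-", "_")
  end

  # Benchmark-less and measure-less alerts are never filtered (fail-safe).
  def untracked?(alert)
    return false if @tracked.nil? || alert.benchmark.nil? || alert.measure.nil?

    !@tracked.include?(normalize(alert.measure))
  end
end

tracked = %w[rps p50_latency failed_pct]
p90 = Alert.new(benchmark: "/client_side_hello_world: Core", measure: "p90-latency")
rps = Alert.new(benchmark: "/rendered_html: Core", measure: "rps")

p90_only = TrackedMeasureFilter.new([p90], tracked_measures: tracked)
mixed    = TrackedMeasureFilter.new([p90, rps], tracked_measures: tracked)

puts p90_only.regression? # a p90-only run is no longer a regression
puts mixed.regression?    # the tracked rps alert still counts
```

In this sketch a p90-only run ends up with `regression?` false and `filtered_alert?` true, which is the state the existing exit normalization keys on.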

Validation

  • bundle exec rspec benchmarks/spec: 234 examples, 0 failures (5 new specs for tracked-measure filtering).
  • bundle exec rubocop benchmarks/... — no offenses.
  • End-to-end against the real #3782 reports (Performance Regression Detected on main, dec4b8c; fetched from the Bencher API): regression? flips true → false (filtered_alert? = true) for both the Core and Pro p90 alerts, while rps/p50 alerts are untouched.
  • script/ci-changes-detector origin/main → Benchmark scripts → Lint (Ruby + JS).

Companion cleanup (separate, manual — not in this PR)

This neutralizes the orphaned threshold in code. To also stop it polluting the Bencher dashboard (162 cosmetic active alerts) and being cloned to every new branch, delete the server-side thresholds (needs BENCHER_API_TOKEN):

  • main p90 51fb6a47-0083-4e84-a745-60ee42e3bba4, p99 6faa7a68-1835-4cd7-96dc-959220737172
  • master p90 d4ad2066-74cb-41f7-93bb-f0885358c56c, p99 61bf5ae6-77fa-474f-b6d8-ded630bd0c20

Relationship to other work

Completes the noise fix that #3810 (fresh-runner confirmation) and #3822 (stale-alert filtering) started: #3822 only filters alerts whose metric recovered; a live p90 crossing (the dominant case) still passed through. Substantially removes the p90 tail noise behind #3169 and the issue explosion researched in #3755.

Refs #3755, #3169, #3795

🤖 Generated with Claude Code


Note

Medium Risk
Changes which Bencher alerts count as regressions and can affect main-push candidate filing and CI exit behavior, though scoped to benchmark reporting with explicit specs and a nil default for other callers.

Overview
Adds optional tracked_measures to BencherReport.parse / #initialize so active Bencher alerts on measures the repo no longer tracks (e.g. orphaned server-side p90_latency thresholds) are moved to filtered_alert? instead of regression?, reusing the existing exit-code normalization for filtered-only runs.

track_benchmarks.rb now passes THRESHOLDS.map(&:first) when parsing the JSON report, so p90-only false positives no longer write regression candidates or file issues the summary table cannot flag. Tracked measures, measure-less fail-safe alerts, and callers that omit tracked_measures stay backward compatible.
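
As a rough illustration of how `THRESHOLDS.map(&:first)` yields the tracked-measure list: the sketch below assumes each THRESHOLDS entry is an array whose first element is the measure name. The real constant's shape is not shown in this PR, so `EXAMPLE_THRESHOLDS` and its second column are placeholders.

```ruby
# Assumed shape: each entry starts with the measure name. The remaining
# field is a placeholder, not the real threshold settings.
EXAMPLE_THRESHOLDS = [
  ["rps", :lower],
  ["p50_latency", :upper],
  ["failed_pct", :upper]
].freeze

# Taking the first column gives the names to pass as tracked_measures.
tracked_measures = EXAMPLE_THRESHOLDS.map(&:first)
puts tracked_measures.inspect
```

Because the list is derived from the constant, dropping a measure from the constant also drops it from the filter without a second edit.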

Specs cover orphaned p90 filtering, slug/name normalization, and a small fix for BencherReport.new with a Hash root in a perf-links test.

Reviewed by Cursor Bugbot for commit 7693f64.

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Fixed false regression alerts for measures no longer tracked in performance benchmarks. The system now filters regression reports to only flag alerts for actively monitored metrics, preventing issue creation based on orphaned or retired threshold measurements that are no longer part of the current tracking configuration.

…sitives)

An orphaned server-side p90_latency threshold in Bencher (dropped from
track_benchmarks.rb THRESHOLDS but never deleted) keeps firing alerts on p90
tail noise. Those alerts file a "performance regression" issue on main yet are
invisible in the summary table (the p90 column has no :direction), so the issue
names nothing actionable. They account for ~99% of recent main alerts (162/164).

Thread the tracked-measure set (THRESHOLDS names) into BencherReport so an active
alert on any other measure is classified as filtered, not a regression. This
reuses the #3822 filtered-alert exit-normalization, so a p90-only run no longer
writes a candidate or files an issue. Tracked measures (rps, p50, failed_pct) are
unaffected; measure-less and benchmark-less alerts keep their existing fail-safe.

Verified against the real #3782 reports: regression? flips true -> false
(filtered) for both the Core and Pro p90 alerts.

Refs #3755, #3169, #3795

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 34a23436-a03d-4e65-9446-d20673d3a1dd

📥 Commits

Reviewing files that changed from the base of the PR and between 38835e9 and 7693f64.

📒 Files selected for processing (3)
  • benchmarks/lib/bencher_report.rb
  • benchmarks/spec/bencher_report_spec.rb
  • benchmarks/track_benchmarks.rb

Walkthrough

BencherReport now accepts an optional tracked_measures parameter to classify alerts on measures outside that list as non-regressions. The implementation normalizes measure slugs, adds filtering logic to regression detection, and the benchmark script passes measures from the current THRESHOLDS configuration to prevent issue filing on orphaned measures.
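
The slug normalization mentioned here could look something like the sketch below. `normalize_measure` is a hypothetical name, and the exact rules in bencher_report.rb may differ; this version only assumes that case and the `-` / `_` distinction are ignored.

```ruby
# Hypothetical normalizer: lower-case and treat "-" and "_" as the same.
def normalize_measure(measure)
  measure.to_s.strip.downcase.tr("-", "_")
end

tracked = %w[rps p50_latency failed_pct].map { |m| normalize_measure(m) }

puts tracked.include?(normalize_measure("p50-latency")) # slug form from Bencher
puts tracked.include?(normalize_measure("P50_Latency")) # mixed case
puts tracked.include?(normalize_measure("p90-latency")) # untracked measure
```

Normalizing both the tracked list and the alert's measure means the comparison does not depend on which form either side happens to use.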

Changes

Tracked Measures Filtering Feature

| Layer / File(s) | Summary |
| --- | --- |
| Tracked measures API and filtering logic (benchmarks/lib/bencher_report.rb) | BencherReport.parse and #initialize now accept tracked_measures: nil, normalizing and storing the list. During regression classification in current_regression_alert?, the new untracked_measure_alert? helper filters out active alerts referencing measures not in the tracked list. |
| Spec validation for tracked measures filtering (benchmarks/spec/bencher_report_spec.rb) | New tracked-measure filtering describe block with helper and examples validates that untracked-measure alerts are filtered, tracked-measure regressions still surface, slug matching is normalized, backward compatibility holds without tracked_measures, and measure-less fail-safe alerts remain regressions. One existing test adjusted to use explicit hash literal. |
| Integration with benchmark tracking script (benchmarks/track_benchmarks.rb) | run_bencher now passes the first column of THRESHOLDS as tracked_measures to BencherReport.parse, filtering alerts for orphaned or removed measures. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • shakacode/react_on_rails#3586: Introduces the initial BencherReport JSON parser and regression detection; this PR extends it with tracked_measures-based filtering.
  • shakacode/react_on_rails#3822: Also modifies current_regression_alert? in the same file to filter out certain active alerts, using a complementary approach.

Suggested labels

enhancement, review-needed, benchmark, P2

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: filtering Bencher alerts to tracked measures and eliminating orphaned p90 false positives, which directly aligns with the primary objective of the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@greptile-apps

greptile-apps Bot commented Jun 9, 2026


Greptile Summary

This PR fixes a persistent false-positive alert problem where orphaned server-side Bencher thresholds for p90_latency (removed from THRESHOLDS but never deleted on the server) continued firing on nearly every CI run. The fix threads the set of actually-tracked measures into BencherReport, so any active alert on a non-tracked measure is silently moved to filtered_alerts rather than being treated as a regression.

  • bencher_report.rb: Adds an optional tracked_measures parameter to parse/initialize; a new untracked_measure_alert? guard is inserted at the top of current_regression_alert? so orphaned-measure alerts short-circuit to filtered before any boundary math runs.
  • track_benchmarks.rb: Passes THRESHOLDS.map(&:first) as tracked_measures, tying the filter list directly to the source-of-truth constant so it stays in sync automatically.
  • bencher_report_spec.rb: Adds 5 targeted specs covering the new filter path, tracked-measure passthrough, slug/name normalization parity, backward compatibility with no tracked_measures, and the measure-less fail-safe.

Confidence Score: 5/5

Safe to merge — the change is additive and backward-compatible; callers that don't pass tracked_measures see identical behaviour.

The filter is additive: tracked_measures defaults to nil, which preserves the existing code path for every caller except track_benchmarks.rb. The new untracked_measure_alert? guard fires only when an alert's measure is present and absent from the tracked list, so the benchmark-less and measure-less fail-safes are unaffected. The filter list is derived directly from THRESHOLDS.map(&:first), so it can't drift. Five targeted specs cover the new path, normalization parity, backward compatibility, and the measure-less edge case.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| benchmarks/lib/bencher_report.rb | Adds optional tracked_measures parameter; untracked_measure_alert? correctly short-circuits before boundary math, preserves measure-less fail-safe, and normalizes slugs consistently. |
| benchmarks/track_benchmarks.rb | Passes THRESHOLDS.map(&:first) as tracked_measures; keeps filter list in sync with the constant automatically; change is minimal and only affects the parse call. |
| benchmarks/spec/bencher_report_spec.rb | Adds 5 new specs covering the tracked-measure filter path and edge cases; minor fix wraps bare hash in braces to avoid keyword-argument ambiguity with the new tracked_measures: keyword parameter. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Bencher CLI emits JSON report] --> B[BencherReport.parse\ntracked_measures: THRESHOLDS.map first]
    B --> C[parse_alerts: collect active alerts]
    C --> D{For each alert:\ncurrent_regression_alert?}
    D -->|no benchmark name| E[→ @alerts\nfail-safe regression]
    D -->|has benchmark name| F{untracked_measure_alert?}
    F -->|tracked_measures nil| G{Existing boundary math}
    F -->|measure is nil| G
    F -->|measure NOT in tracked_measures| H[→ @filtered_alerts\norphaned threshold]
    F -->|measure in tracked_measures| G
    G -->|no direction / no boundary| E
    G -->|boundary confirms regression| E
    G -->|boundary says recovered| H
    E --> I[regression? = true]
    H --> J[regression? = false\nfiltered_alert? = true]
    I --> K[Write candidate / file issue]
    J --> L[normalized_bencher_exit_code → 0]
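
The final step of the flowchart, where a filtered-only run is normalized to exit code 0, might be sketched as below. The name `normalized_exit_code` and its keyword arguments are assumptions for illustration; the real normalized_bencher_exit_code may take different inputs.

```ruby
# Hypothetical sketch of exit-code normalization for a Bencher run.
# A run whose only active alerts were filtered is treated as a success;
# every other case keeps the exit code the CLI returned.
def normalized_exit_code(raw_exit_code, regression:, filtered_alert:)
  return 0 if filtered_alert && !regression

  raw_exit_code
end

puts normalized_exit_code(1, regression: false, filtered_alert: true)  # filtered-only run
puts normalized_exit_code(1, regression: true,  filtered_alert: true)  # real regression present
puts normalized_exit_code(1, regression: false, filtered_alert: false) # unrelated failure
```

Keeping the raw code whenever nothing was filtered means an unrelated CLI failure is not masked by the normalization.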

Reviews (1): Last reviewed commit: "Filter Bencher alerts to tracked measure..."

@claude

claude Bot commented Jun 9, 2026

Contributor

Code Review

Summary: This PR fixes a real, well-evidenced problem — 98%+ of false-positive benchmark alerts were coming from an orphaned server-side Bencher threshold for p90_latency that the codebase stopped managing but never deleted. The fix is minimal, in-repo, and makes the orphaned threshold harmless without requiring any Bencher API cleanup.

Code quality ✅

The implementation is clean and well-scoped:

  • tracked_measures is optional with a nil default, preserving full backward compatibility for BenchmarkTable and any other callers that only use parsing/significance and don't need the filter.
  • The normalization (normalize) is applied both at construction time (building @tracked_measures) and at comparison time (normalizing alert.measure), which correctly handles the slug-vs-name form mismatch ("p50-latency" from Bencher vs "p50_latency" from THRESHOLDS). The "matches regardless of - / _ / case" spec confirms this.
  • The ordering of guards in current_regression_alert? is correct: benchmark-less alerts hit the conservative fail-safe (return true) before the measure filter, so they still flow through the existing logic.
  • untracked_measure_alert? correctly skips filtering when alert.measure is nil, letting measure-less alerts fall through to the pre-existing benchmark-level fail-safe. The new spec for this case is solid.

Tests ✅

Five new specs cover the key scenarios well:

  1. Orphaned p90 alert is filtered → regression? flips to false
  2. Tracked measure regression still fires
  3. Slug (p50-latency) vs. name (p50_latency) normalization
  4. Backward-compatible nil default
  5. Measure-less alert fail-safe preserved with tracked_measures set

The :unset sentinel in p90_report(tracked_measures: :unset) is the right idiom for distinguishing "argument not passed" from nil in a spec helper.
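
For readers unfamiliar with the idiom, here is a minimal sketch of a sentinel default in a spec helper. `RecordingReport` and `build_report` are hypothetical stand-ins (the real helper is p90_report); the recorder only exists to show which keywords actually reach the constructor.

```ruby
# Hypothetical recorder that exposes exactly which keywords it received.
class RecordingReport
  attr_reader :options

  def initialize(_root, **options)
    @options = options
  end
end

# :unset means "omit the keyword entirely", which is distinct from passing nil.
def build_report(tracked_measures: :unset)
  root = { "results" => [], "alerts" => [] }
  options = tracked_measures == :unset ? {} : { tracked_measures: tracked_measures }
  RecordingReport.new(root, **options)
end

puts build_report.options.inspect                        # keyword omitted
puts build_report(tracked_measures: nil).options.inspect # nil passed explicitly
```

With a plain `nil` default the helper could not tell these two calls apart, so it could never exercise the constructor's own default.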

The fix at bencher_report_spec.rb:493 (adding {} around the hash) is a valid Ruby 3.0+ cleanup — described_class.new("results" => [], "alerts" => []) was ambiguous; the explicit braces make it clearly a positional hash.

One edge case to watch ⚠️

If tracked_measures: is ever passed as an empty array ([]), @tracked_measures becomes [] (truthy), and untracked_measure_alert? returns true for every alert that has a measure — silently filtering all regressions. The current caller (THRESHOLDS.map(&:first), always 3 entries) can't hit this, but it's a non-obvious footgun worth a short guard or comment. See inline note.

Bottom line

Solid fix with good test coverage. The root-cause analysis in the PR description is thorough and the evidence (Bencher API, 248/255 p90 alerts) makes the problem unambiguous. Approving with one minor suggestion.

# An active alert on a measure the caller does not track (an orphaned server-side
# threshold). Only applies when tracked_measures was given; a measure-less alert can't be
# classified here, so it falls through to the existing benchmark-level fail-safe.
def untracked_measure_alert?(alert)

Minor footgun: if tracked_measures: is ever passed as [], @tracked_measures is [] (truthy), so !@tracked_measures.include?(...) is true for every measure. Every alert that has a measure is then silently filtered, and regression? stays false unless a measure-less or benchmark-less fail-safe alert is present. The current caller can't hit this (THRESHOLDS has 3 entries), but a future caller could pass an empty list accidentally.

A small guard in untracked_measure_alert? would make the invariant explicit:

Suggested change
def untracked_measure_alert?(alert)
  return false if @tracked_measures.nil? || @tracked_measures.empty?
  return false unless alert.measure

  !@tracked_measures.include?(normalize(alert.measure))
end

Or equivalently, coerce an empty list to nil at construction time:

@tracked_measures = tracked_measures&.then { |m| m.empty? ? nil : m.map { |k| normalize(k) } }

Either makes it impossible to silently disable regression detection by passing [].
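
A small standalone sketch of the footgun and the construction-time coercion, using hypothetical free functions (`normalize`, `build_tracked`, `untracked?`) rather than the real BencherReport methods:

```ruby
# Hypothetical helpers mirroring the variants discussed above.
def normalize(measure)
  measure.to_s.downcase.tr("-", "_")
end

# Coerce an empty list to nil so "no filter" has a single representation.
def build_tracked(tracked_measures)
  tracked_measures&.then { |m| m.empty? ? nil : m.map { |k| normalize(k) } }
end

def untracked?(tracked, measure)
  return false if tracked.nil? || measure.nil?

  !tracked.include?(normalize(measure))
end

unguarded = [] # what an empty list looks like without coercion
puts untracked?(unguarded, "rps")                      # true: rps is silently filtered
puts untracked?(build_tracked([]), "rps")              # false: empty list means no filter
puts untracked?(build_tracked(%w[rps]), "p90-latency") # true: genuinely untracked
```

Coercing at construction time keeps the predicate itself unchanged, at the cost of treating "empty list" and "not given" as the same thing.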

@justin808 justin808 merged commit 6bf27ed into main Jun 9, 2026
43 checks passed
@justin808 justin808 deleted the jg/benchmark-ignore-untracked-measure-alerts branch June 9, 2026 03:16
justin808 added a commit that referenced this pull request Jun 9, 2026
## Problem

A confirmed `performance-regression` issue currently only shows the
per-suite summary tables. Those tables flag 🔴 on **rps** and **p50** but
never on **failed_pct** (it has no column — it lives in the "Status"
cell) and never on any untracked measure. So a reader has to open the
tables to see what regressed, and a `failed_pct`-only regression renders
with no visible 🔴 at all — the same "the issue names nothing actionable"
problem behind #3782/#3795.

## Fix

Add a **"What regressed"** section to the issue body, built from the
confirmed `ALERTS` payload (#3810's structured `{benchmark, measure}`
pairs), aggregated across all suites and computed once in
`report_regressions`:

```
### What regressed

The fresh-runner confirmation re-alerted on these benchmark + measure pairs:

- `/client_side_log_throw: Pro` — **rps**
- `/server_side_redux_app: Core` — **failed_pct**
```

- Threaded through `report_suite` → `RegressionIssueReporter`; only the
reporter that *creates* the issue renders the body, so the cross-suite
list is complete.
- `regressed_overview` defaults to `""` → the section is omitted and the
body is byte-for-byte unchanged, so it composes with the existing flow
and with older payloads that lack `ALERTS`.
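
A rough sketch of how such an overview could be built, assuming each confirmed payload carries `ALERTS` as a list of `{ "benchmark", "measure" }` hashes. The function name `regressed_overview_sketch` is hypothetical and the real `regressed_overview_markdown` may differ in details:

```ruby
# Hypothetical sketch; not the implementation in the benchmark scripts.
# Collects benchmark + measure pairs across payloads, drops duplicates,
# and returns "" when there is nothing to list so the section is omitted.
def regressed_overview_sketch(payloads)
  pairs = payloads
          .flat_map { |payload| payload.fetch("ALERTS", []) }
          .map { |alert| [alert["benchmark"], alert["measure"]] }
          .uniq
  return "" if pairs.empty?

  # "\u2014" is the separator used in the example issue body above.
  bullets = pairs.map { |benchmark, measure| "- `#{benchmark}` \u2014 **#{measure}**" }
  [
    "### What regressed",
    "",
    "The fresh-runner confirmation re-alerted on these benchmark + measure pairs:",
    "",
    *bullets
  ].join("\n")
end

core = { "ALERTS" => [{ "benchmark" => "/client_side_log_throw: Pro", "measure" => "rps" }] }
pro = { "ALERTS" => [
  { "benchmark" => "/client_side_log_throw: Pro", "measure" => "rps" }, # duplicate pair
  { "benchmark" => "/server_side_redux_app: Core", "measure" => "failed_pct" }
] }
legacy = {} # older payload without ALERTS

puts regressed_overview_sketch([core, pro, legacy])
```

Returning an empty string for payloads without `ALERTS` is what lets the caller skip the section and leave the previous body shape untouched.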

## Validation

- `bundle exec rspec benchmarks/spec` — **233 examples, 0 failures** (4
new: helper rendering/dedup + issue-body inclusion/omission).
- `bundle exec rubocop benchmarks/...` — no offenses.
- Rendered the body locally with and without pairs to confirm clean
Markdown spacing and that the no-pairs body is unchanged.

## Relationship to other work

Complements #3829 (filter alerts to tracked measures): #3829 stops the
p90 false positives from filing at all; this makes the *real* confirmed
regressions self-explanatory in the issue body. Refs #3755.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> CI reporting and GitHub issue markdown only; no runtime app or auth
changes.
> 
> **Overview**
> Adds a **"What regressed"** section to auto-filed
`performance-regression` issue bodies, listing confirmed benchmark +
measure pairs as markdown bullets (including measures like
**failed_pct** that the per-suite tables do not surface clearly).
> 
> `report_regressions` builds the list once from each confirmed
payload's structured `ALERTS` via new `regressed_overview_markdown`
(deduped across suites) and passes `regressed_overview` into
`RegressionIssueReporter`. The section is omitted when the overview is
empty, so older payloads without `ALERTS` keep the previous body shape.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
3068c93. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Performance regression reports now include a "What regressed" overview
section, displaying all confirmed benchmark and measure pairs that
regressed across test suites in a single consolidated view.

* **Tests**
* Added comprehensive test coverage for the new regression overview
markdown rendering logic and deduplication.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
justin808 added a commit that referenced this pull request Jun 9, 2026
* origin/main:
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  Add issue evaluation skill (#3816)
  Confirm benchmark regressions on a fresh runner before filing the main issue (#3810)
  Define agent scope and accelerated RC auto-merge policy (#3808)
  Replace custom MockClient with async-http Mock::Endpoint (#3703)
  Docs: per-request data sharing in RSC with React.cache() (#3769)
  Pro RSC: share unstable_cache across renderer workers via Redis (#3705)
  [codex] Add PR batch planning skill (#3792)
  Docs: document PR batch operational lessons (#3789)
  Document dummy Redux state indexing rationale (#3781)
  Pro RSC: avoid caching failed Flight renders (#3775)

# Conflicts:
#	packages/react-on-rails-pro/tests/getReactServerComponent.client.test.ts
justin808 added a commit that referenced this pull request Jun 9, 2026
…o-rsc-rspack-ci

* origin/main:
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  Add issue evaluation skill (#3816)

# Conflicts:
#	react_on_rails_pro/spec/dummy/config/webpack/clientWebpackConfig.js
justin808 added a commit that referenced this pull request Jun 9, 2026
* origin/main: (23 commits)
  Enforce Pro license headers in CI and pre-commit (#3821)
  Add RSC payload route-data helper (#3783)
  [Pro] Fix React.cache request dedupe in generated RSC configs (#3813)
  Docs: clarify RuboCop autofix ownership (#3827)
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  Add issue evaluation skill (#3816)
  Confirm benchmark regressions on a fresh runner before filing the main issue (#3810)
  Define agent scope and accelerated RC auto-merge policy (#3808)
  Replace custom MockClient with async-http Mock::Endpoint (#3703)
  Docs: per-request data sharing in RSC with React.cache() (#3769)
  Pro RSC: share unstable_cache across renderer workers via Redis (#3705)
  [codex] Add PR batch planning skill (#3792)
  ...
justin808 added a commit that referenced this pull request Jun 9, 2026
…-floor-fix

* origin/main: (29 commits)
  Docs: align pr-batch closeout confidence handoff (#3835)
  Align adversarial review CI polling guidance (#3794)
  CI: add Pro RSC rspack runtime gate (#3817)
  Make RSCRoute refetch failures recoverable in production (#3786)
  Fix Pro node renderer license headers (#3834)
  Docs: fix anti-patterns in RSC tutorials (#3801)
  fix(pro): add RSC peer compatibility gate (#3831)
  Enforce Pro license headers in CI and pre-commit (#3821)
  Add RSC payload route-data helper (#3783)
  [Pro] Fix React.cache request dedupe in generated RSC configs (#3813)
  Docs: clarify RuboCop autofix ownership (#3827)
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  ...

# Conflicts:
#	.github/workflows/benchmark.yml