Name the regressed benchmark+measure pairs in the issue body by justin808 · Pull Request #3830 · shakacode/react_on_rails

justin808 · 2026-06-09T02:55:19Z

Problem

A confirmed performance-regression issue currently only shows the per-suite summary tables. Those tables flag 🔴 on rps and p50 but never on failed_pct (it has no column — it lives in the "Status" cell) and never on any untracked measure. So a reader has to open the tables to see what regressed, and a failed_pct-only regression renders with no visible 🔴 at all — the same "the issue names nothing actionable" problem behind #3782/#3795.

Fix

Add a "What regressed" section to the issue body, built from the confirmed ALERTS payload (#3810's structured {benchmark, measure} pairs), aggregated across all suites and computed once in report_regressions:

### What regressed

The fresh-runner confirmation re-alerted on these benchmark + measure pairs:

- `/client_side_log_throw: Pro` — **rps**
- `/server_side_redux_app: Core` — **failed_pct**

Threaded through report_suite → RegressionIssueReporter; only the reporter that creates the issue renders the body, so the cross-suite list is complete.
regressed_overview defaults to "" → the section is omitted and the body is byte-for-byte unchanged, so it composes with the existing flow and with older payloads that lack ALERTS.

Validation

bundle exec rspec benchmarks/spec — 233 examples, 0 failures (4 new: helper rendering/dedup + issue-body inclusion/omission).
bundle exec rubocop benchmarks/... — no offenses.
Rendered the body locally with and without pairs to confirm clean Markdown spacing and that the no-pairs body is unchanged.

Relationship to other work

Complements #3829 (filter alerts to tracked measures): #3829 stops the p90 false positives from filing at all; this makes the real confirmed regressions self-explanatory in the issue body. Refs #3755.

🤖 Generated with Claude Code

Note

Low Risk
CI reporting and GitHub issue markdown only; no runtime app or auth changes.

Overview
Adds a "What regressed" section to auto-filed performance-regression issue bodies, listing confirmed benchmark + measure pairs as markdown bullets (including measures like failed_pct that the per-suite tables do not surface clearly).

report_regressions builds the list once from each confirmed payload's structured ALERTS via new regressed_overview_markdown (deduped across suites) and passes regressed_overview into RegressionIssueReporter. The section is omitted when the overview is empty, so older payloads without ALERTS keep the previous body shape.

^{Reviewed by Cursor Bugbot for commit 3068c93. Bugbot is set up for automated code reviews on this repo. Configure here.}

Summary by CodeRabbit

New Features
- Performance regression reports now include a "What regressed" overview section, displaying all confirmed benchmark and measure pairs that regressed across test suites in a single consolidated view.
Tests
- Added comprehensive test coverage for the new regression overview markdown rendering logic and deduplication.

Confirmed regression issues only showed the per-suite summary tables, whose 🔴 flag covers rps/p50 but never the column-less failed_pct — so a reader had to open the tables, and a failed_pct-only regression looked blank. Add a "What regressed" section to the issue body built from the confirmed ALERTS payload (benchmark + measure pairs), aggregated across all suites and computed once. Defaults to empty (body unchanged) when no structured pairs were handed off, so it composes with the existing flow and older payloads. Refs #3755 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-09T02:55:34Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e055a0cc-190f-4e2a-871e-2231da2f7969

📥 Commits

Reviewing files that changed from the base of the PR and between 20e1f3d and 3068c93.

📒 Files selected for processing (2)

benchmarks/report_regressions.rb
benchmarks/spec/report_regressions_spec.rb

Walkthrough

This PR extends performance regression reporting in benchmarks/report_regressions.rb and its test suite. It introduces a "What regressed" overview section that lists unique benchmark/measure pairs from regression payloads, extracts and deduplicates this overview once at the suite level, threads it through the reporter initialization, and conditionally renders it into created/updated GitHub issues.

Changes

Regression overview reporting

Layer / File(s)	Summary
Overview extraction helper `benchmarks/report_regressions.rb`	`regressed_overview_markdown(payloads)` builds a deduplicated newline-joined markdown list of confirmed benchmark/measure pairs extracted from each payload's ALERTS, returning empty string when no pairs exist.
Reporter initialization and rendering `benchmarks/report_regressions.rb`	`RegressionIssueReporter#initialize` accepts a `regressed_overview` keyword parameter and stores it with `attr_reader`; new `regressed_section` method conditionally renders the "What regressed" block based on whether the overview is non-empty; the issue template embeds this rendered section.
Integration in report_suite `benchmarks/report_regressions.rb`	The main script precomputes a single `regressed_overview` from all readable payloads; `report_suite` function signature now accepts this overview as a keyword argument; each call to `RegressionIssueReporter.report_suite` includes the precomputed overview.
Test coverage for overview and rendering `benchmarks/spec/report_regressions_spec.rb`	New RSpec examples verify that created issue bodies include the "What regressed" section when benchmark/measure pairs exist and omit it when empty; new `regressed_overview_markdown` unit test group confirms deduplication of bullet items from ALERTS and empty-string return when no alerts are present.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

shakacode/react_on_rails#3767: Both PRs modify benchmarks/report_regressions.rb by updating RegressionIssueReporter#initialize and report_suite signatures to pass additional data through the same reporter pipeline (this PR adds regressed_overview section rendering; related PR adds shared per-run issue/comment caching).
shakacode/react_on_rails#3540: Both PRs modify the benchmarks/report_regressions.rb reporting flow; the related PR introduced the initial reporter/script structure that this PR extends to render the "What regressed" overview section.

Suggested labels

enhancement, review-needed, benchmark, P3

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main change: adding a 'What regressed' section that names the regressed benchmark+measure pairs in the issue body.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch jg/benchmark-issue-body-names-pairs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-06-09T03:00:50Z

Greptile Summary

This PR adds a "What regressed" section to the performance-regression GitHub issue body, listing the confirmed benchmark + measure pairs sourced from the structured ALERTS payload. The section is assembled once across all suites in report_regressions and threaded through report_suite → RegressionIssueReporter as regressed_overview, making the issue self-explanatory without requiring readers to open per-suite tables.

regressed_overview_markdown flat-maps, filters, deduplicates, and formats {benchmark, measure} pairs from all confirmed payloads; returns \"\" when none are present so the no-pairs body is byte-equivalent to the original.
regressed_section in RegressionIssueReporter wraps the overview in a ### What regressed block and returns \"\" when empty, keeping the existing issue body unchanged for callers using older payloads.
Four new RSpec examples cover dedup/empty output for regressed_overview_markdown and inclusion/omission of the section in the created issue body.

Confidence Score: 5/5

Safe to merge; the change is purely additive and backward-compatible, with the no-pairs path producing output byte-identical to the original.

The implementation is narrowly scoped: a new helper function, a new private method, and two wired-up keyword arguments. The regressed_overview defaults to "" everywhere it propagates, so every existing call site is unaffected. The regressed_section guard on empty/blank strings prevents any body change when no structured alerts are present. The only observations are cosmetic (extra trailing blank line in raw Markdown that collapses in GitHub rendering) and a small test coverage gap for multi-line overviews in the issue body.

No files require special attention.

Important Files Changed

Filename	Overview
benchmarks/report_regressions.rb	Adds `regressed_overview_markdown` helper and threads `regressed_overview` through `report_suite` → `RegressionIssueReporter`; backward-compatible (defaults to `""`), `regressed_section` guards on empty string so the no-pairs body is byte-equivalent to the original.
benchmarks/spec/report_regressions_spec.rb	Adds four new examples: `regressed_overview_markdown` dedup/empty, and `RegressionIssueReporter` issue-body inclusion/omission; existing suite is unchanged.

Sequence Diagram

sequenceDiagram
    participant S as report_regressions
    participant R as regressed_overview_markdown
    participant RS as report_suite
    participant I as RegressionIssueReporter

    S->>S: load confirmed payloads (readable)
    S->>R: regressed_overview_markdown(readable)
    R->>R: flat_map ALERTS across all payloads
    R->>R: select valid pairs, .uniq, .map to bullets
    R-->>S: "- `/x: Core` — **rps**\n..."
    loop each suite (sorted)
        S->>RS: report_suite(suite_name, payloads, regressed_overview:)
        RS->>I: RegressionIssueReporter.report(..., regressed_overview:)
        I->>I: regressed_section → "" when empty
        I->>I: issue_body interpolates regressed_section
        I-->>RS: issue_number
    end

_{Reviews (1): Last reviewed commit: "Name the regressed benchmark+measure pai..." | Re-trigger Greptile}

greptile-apps · 2026-06-09T03:00:53Z

+      it "names the confirmed benchmark+measure pairs in the created issue body" do
+        described_class.report(
+          summary: "core regressed",
+          suite_name: "core",
+          github_run_url: "https://github.com/run/1",
+          bencher_url: "https://bencher.dev/dash",
+          regressed_overview: "- `/x: Core` — **rps**"
+        )
+        body = posted_bodies.fetch("issue create")
+        expect(body).to include("### What regressed")
+        expect(body).to include("- `/x: Core` — **rps**")
+      end


Missing multi-line overview coverage in issue-body test

The regressed_overview_markdown unit test confirms that the helper produces a multi-line string for multiple pairs (e.g. "- \/a: Core` — rps\n- `/b: Pro`"), but the RegressionIssueReporterbody test only exercises a single-lineregressed_overview. Because the overview is interpolated inside a <<~MARKDOWNheredoc inside another<<~MARKDOWN` heredoc, the indentation-stripping of the outer heredoc applies to each logical line of the interpolated value; lines beyond the first have zero leading spaces, so the strip is effectively a no-op on them, but this behaviour is untested end-to-end. A test passing a two-pair overview and asserting both bullets appear in the created issue body would close that gap.

greptile-apps · 2026-06-09T03:00:54Z

+      #{regressed_overview}
+
+    MARKDOWN


The trailing blank line inside regressed_section's heredoc means the method returns a string ending with `

. The outer issue_bodyheredoc's own line-ending
then appends a third newline, producing two blank lines between the last bullet and### What to do`. Dropping the trailing blank line inside the inner heredoc leaves exactly one blank line, matching standard Markdown section spacing.

Suggested change

#{regressed_overview}

MARKDOWN

#{regressed_overview}

MARKDOWN

claude · 2026-06-09T03:03:49Z

Code Review

Clean, focused PR that solves a real gap: failed_pct regressions and any measure not surfaced by the per-suite tables are now self-explanatory in the filed issue. The implementation is backward-compatible (empty default, no-op body when no ALERTS), the threading through report_suite → RegressionIssueReporter is straightforward, and the new tests cover the essential paths.

Three nits below (two as inline comments, one here):

E2E script test gap — write_payload in describe "report_regressions.rb (script)" doesn't include RegressionReport::ALERTS in any test fixture, so the full script path (report_regressions → regressed_overview_markdown → issue body) is untested at the subprocess level. The unit tests for regressed_overview_markdown and the RegressionIssueReporter body inclusion tests are solid, so this is low-risk — but it would be good to add RegressionReport::ALERTS to at least the "renders side by side" fixture and assert that "### What regressed" appears in the captured issue body.

Overall: ship-ready as-is; the E2E gap is a nice-to-have, not a blocker.

claude · 2026-06-09T03:03:53Z

+  payloads
+    .flat_map { |payload| payload[RegressionReport::ALERTS] || [] }
+    .select { |pair| pair.is_a?(Hash) && pair[RegressionReport::ALERT_BENCHMARK] }
+    .uniq


.uniq deduplicates by full Hash equality, which works today because RegressionReport.alert_hash always creates exactly {benchmark, measure} — no extra keys. If the ALERTS format ever gains a third key (e.g. a timestamp or suite tag), identical benchmark+measure pairs from different shards would no longer be deduped.

Suggested change

.uniq

.uniq { |pair| [pair[RegressionReport::ALERT_BENCHMARK], pair[RegressionReport::ALERT_MEASURE]] }

This locks the dedup key to the two semantically meaningful fields regardless of future payload growth.

claude · 2026-06-09T03:03:55Z

+  # column-less failed_pct). Empty when no structured pairs were handed off, leaving the
+  # body unchanged.
+  def regressed_section
+    return "" if regressed_overview.to_s.strip.empty?


regressed_overview is always a String — initialized to "" and only accepted as a keyword arg with that default — so .to_s is a no-op. Minor nit:

Suggested change

return "" if regressed_overview.to_s.strip.empty?

return "" if regressed_overview.strip.empty?

claude · 2026-06-09T03:04:03Z

      | **Workflow run** | [Run ##{github_run_number}](#{github_run_url}) |
      | **Bencher dashboard** | [View history](#{bencher_url}) |
-
+      #{regressed_section}


When regressed_section is non-empty it ends with \n\n (from the <<~MARKDOWN trailing blank line), and this heredoc line contributes another \n, producing two blank lines before ### What to do instead of one. GitHub Markdown collapses them on render so there's no visual difference, but it's worth noting if the raw markdown ever needs to be exact.

If you want a single blank line consistently, you can strip the trailing newline from regressed_section:

#{regressed_section.chomp}

or trim the trailing blank line inside the regressed_section heredoc. Not a blocker.

* origin/main: Add Pro license header checker RSC: stop serializing props into embedded payload cache key (#3800) Make PR batches skip customer-feedback issues (#3826) Name the regressed benchmark+measure pairs in the issue body (#3830) Clarify agent batch policy handoffs (#3824) Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829) Fix auto-bundled component pack normalization (#3818) Filter stale Bencher alerts before reporting (#3822) Tighten benchmark confirmation workflow permissions (#3819) Add issue evaluation skill (#3816) Confirm benchmark regressions on a fresh runner before filing the main issue (#3810) Define agent scope and accelerated RC auto-merge policy (#3808) Replace custom MockClient with async-http Mock::Endpoint (#3703) Docs: per-request data sharing in RSC with React.cache() (#3769) Pro RSC: share unstable_cache across renderer workers via Redis (#3705) [codex] Add PR batch planning skill (#3792) Docs: document PR batch operational lessons (#3789) Document dummy Redux state indexing rationale (#3781) Pro RSC: avoid caching failed Flight renders (#3775) # Conflicts: # packages/react-on-rails-pro/tests/getReactServerComponent.client.test.ts

…o-rsc-rspack-ci * origin/main: Add Pro license header checker RSC: stop serializing props into embedded payload cache key (#3800) Make PR batches skip customer-feedback issues (#3826) Name the regressed benchmark+measure pairs in the issue body (#3830) Clarify agent batch policy handoffs (#3824) Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829) Fix auto-bundled component pack normalization (#3818) Filter stale Bencher alerts before reporting (#3822) Tighten benchmark confirmation workflow permissions (#3819) Add issue evaluation skill (#3816) # Conflicts: # react_on_rails_pro/spec/dummy/config/webpack/clientWebpackConfig.js

* origin/main: (23 commits) Enforce Pro license headers in CI and pre-commit (#3821) Add RSC payload route-data helper (#3783) [Pro] Fix React.cache request dedupe in generated RSC configs (#3813) Docs: clarify RuboCop autofix ownership (#3827) Add Pro license header checker RSC: stop serializing props into embedded payload cache key (#3800) Make PR batches skip customer-feedback issues (#3826) Name the regressed benchmark+measure pairs in the issue body (#3830) Clarify agent batch policy handoffs (#3824) Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829) Fix auto-bundled component pack normalization (#3818) Filter stale Bencher alerts before reporting (#3822) Tighten benchmark confirmation workflow permissions (#3819) Add issue evaluation skill (#3816) Confirm benchmark regressions on a fresh runner before filing the main issue (#3810) Define agent scope and accelerated RC auto-merge policy (#3808) Replace custom MockClient with async-http Mock::Endpoint (#3703) Docs: per-request data sharing in RSC with React.cache() (#3769) Pro RSC: share unstable_cache across renderer workers via Redis (#3705) [codex] Add PR batch planning skill (#3792) ...

…-floor-fix * origin/main: (29 commits) Docs: align pr-batch closeout confidence handoff (#3835) Align adversarial review CI polling guidance (#3794) CI: add Pro RSC rspack runtime gate (#3817) Make RSCRoute refetch failures recoverable in production (#3786) Fix Pro node renderer license headers (#3834) Docs: fix anti-patterns in RSC tutorials (#3801) fix(pro): add RSC peer compatibility gate (#3831) Enforce Pro license headers in CI and pre-commit (#3821) Add RSC payload route-data helper (#3783) [Pro] Fix React.cache request dedupe in generated RSC configs (#3813) Docs: clarify RuboCop autofix ownership (#3827) Add Pro license header checker RSC: stop serializing props into embedded payload cache key (#3800) Make PR batches skip customer-feedback issues (#3826) Name the regressed benchmark+measure pairs in the issue body (#3830) Clarify agent batch policy handoffs (#3824) Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829) Fix auto-bundled component pack normalization (#3818) Filter stale Bencher alerts before reporting (#3822) Tighten benchmark confirmation workflow permissions (#3819) ... # Conflicts: # .github/workflows/benchmark.yml

coderabbitai Bot approved these changes Jun 9, 2026

View reviewed changes

greptile-apps Bot reviewed Jun 9, 2026

View reviewed changes

claude Bot reviewed Jun 9, 2026

View reviewed changes

justin808 merged commit 6104dd9 into main Jun 9, 2026
50 checks passed

justin808 deleted the jg/benchmark-issue-body-names-pairs branch June 9, 2026 03:52

This was referenced Jun 9, 2026

Performance Regression Detected on main (8027078) #3799

Closed

Research performance-regression issue posting explosion #3755

Closed

coderabbitai Bot mentioned this pull request Jun 22, 2026

Split benchmark tracking script #4176

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Name the regressed benchmark+measure pairs in the issue body#3830

Name the regressed benchmark+measure pairs in the issue body#3830
justin808 merged 1 commit into
mainfrom
jg/benchmark-issue-body-names-pairs

justin808 commented Jun 9, 2026 •

edited by cursor Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

greptile-apps Bot commented Jun 9, 2026

Uh oh!

greptile-apps Bot Jun 9, 2026

Uh oh!

greptile-apps Bot Jun 9, 2026

Uh oh!

claude Bot commented Jun 9, 2026

Uh oh!

claude Bot Jun 9, 2026

Uh oh!

claude Bot Jun 9, 2026

Uh oh!

claude Bot Jun 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	.uniq
	.uniq { \|pair\| [pair[RegressionReport::ALERT_BENCHMARK], pair[RegressionReport::ALERT_MEASURE]] }

	return "" if regressed_overview.to_s.strip.empty?
	return "" if regressed_overview.strip.empty?

Uh oh!

Uh oh!

Conversation

justin808 commented Jun 9, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Validation

Relationship to other work

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

greptile-apps Bot commented Jun 9, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented Jun 9, 2026

Code Review

Uh oh!

claude Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

justin808 commented Jun 9, 2026 •

edited by cursor Bot

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading