Name the regressed benchmark+measure pairs in the issue body#3830
Conversation
Confirmed regression issues only showed the per-suite summary tables, whose 🔴 flag covers rps/p50 but never the column-less failed_pct — so a reader had to open the tables, and a failed_pct-only regression looked blank. Add a "What regressed" section to the issue body built from the confirmed ALERTS payload (benchmark + measure pairs), aggregated across all suites and computed once. Defaults to empty (body unchanged) when no structured pairs were handed off, so it composes with the existing flow and older payloads. Refs #3755 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughThis PR extends performance regression reporting in ChangesRegression overview reporting
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR adds a "What regressed" section to the performance-regression GitHub issue body, listing the confirmed benchmark + measure pairs sourced from the structured
Confidence Score: 5/5Safe to merge; the change is purely additive and backward-compatible, with the no-pairs path producing output byte-identical to the original. The implementation is narrowly scoped: a new helper function, a new private method, and two wired-up keyword arguments. The No files require special attention. Important Files Changed
Sequence DiagramsequenceDiagram
participant S as report_regressions
participant R as regressed_overview_markdown
participant RS as report_suite
participant I as RegressionIssueReporter
S->>S: load confirmed payloads (readable)
S->>R: regressed_overview_markdown(readable)
R->>R: flat_map ALERTS across all payloads
R->>R: select valid pairs, .uniq, .map to bullets
R-->>S: "- `/x: Core` — **rps**\n..."
loop each suite (sorted)
S->>RS: report_suite(suite_name, payloads, regressed_overview:)
RS->>I: RegressionIssueReporter.report(..., regressed_overview:)
I->>I: regressed_section → "" when empty
I->>I: issue_body interpolates regressed_section
I-->>RS: issue_number
end
Reviews (1): Last reviewed commit: "Name the regressed benchmark+measure pai..." | Re-trigger Greptile |
| it "names the confirmed benchmark+measure pairs in the created issue body" do | ||
| described_class.report( | ||
| summary: "core regressed", | ||
| suite_name: "core", | ||
| github_run_url: "https://github.com/run/1", | ||
| bencher_url: "https://bencher.dev/dash", | ||
| regressed_overview: "- `/x: Core` — **rps**" | ||
| ) | ||
| body = posted_bodies.fetch("issue create") | ||
| expect(body).to include("### What regressed") | ||
| expect(body).to include("- `/x: Core` — **rps**") | ||
| end |
There was a problem hiding this comment.
Missing multi-line overview coverage in issue-body test
The regressed_overview_markdown unit test confirms that the helper produces a multi-line string for multiple pairs (e.g. "- \/a: Core` — rps\n- `/b: Pro`"), but the RegressionIssueReporterbody test only exercises a single-lineregressed_overview. Because the overview is interpolated inside a <<~MARKDOWNheredoc inside another<<~MARKDOWN` heredoc, the indentation-stripping of the outer heredoc applies to each logical line of the interpolated value; lines beyond the first have zero leading spaces, so the strip is effectively a no-op on them, but this behaviour is untested end-to-end. A test passing a two-pair overview and asserting both bullets appear in the created issue body would close that gap.
| #{regressed_overview} | ||
|
|
||
| MARKDOWN |
There was a problem hiding this comment.
The trailing blank line inside
regressed_section's heredoc means the method returns a string ending with `
. The outer issue_bodyheredoc's own line-ending
then appends a third newline, producing two blank lines between the last bullet and### What to do`. Dropping the trailing blank line inside the inner heredoc leaves exactly one blank line, matching standard Markdown section spacing.
| #{regressed_overview} | |
| MARKDOWN | |
| #{regressed_overview} | |
| MARKDOWN |
Code ReviewClean, focused PR that solves a real gap: Three nits below (two as inline comments, one here): E2E script test gap — Overall: ship-ready as-is; the E2E gap is a nice-to-have, not a blocker. |
| payloads | ||
| .flat_map { |payload| payload[RegressionReport::ALERTS] || [] } | ||
| .select { |pair| pair.is_a?(Hash) && pair[RegressionReport::ALERT_BENCHMARK] } | ||
| .uniq |
There was a problem hiding this comment.
.uniq deduplicates by full Hash equality, which works today because RegressionReport.alert_hash always creates exactly {benchmark, measure} — no extra keys. If the ALERTS format ever gains a third key (e.g. a timestamp or suite tag), identical benchmark+measure pairs from different shards would no longer be deduped.
| .uniq | |
| .uniq { |pair| [pair[RegressionReport::ALERT_BENCHMARK], pair[RegressionReport::ALERT_MEASURE]] } |
This locks the dedup key to the two semantically meaningful fields regardless of future payload growth.
| # column-less failed_pct). Empty when no structured pairs were handed off, leaving the | ||
| # body unchanged. | ||
| def regressed_section | ||
| return "" if regressed_overview.to_s.strip.empty? |
There was a problem hiding this comment.
regressed_overview is always a String — initialized to "" and only accepted as a keyword arg with that default — so .to_s is a no-op. Minor nit:
| return "" if regressed_overview.to_s.strip.empty? | |
| return "" if regressed_overview.strip.empty? |
| | **Workflow run** | [Run ##{github_run_number}](#{github_run_url}) | | ||
| | **Bencher dashboard** | [View history](#{bencher_url}) | | ||
|
|
||
| #{regressed_section} |
There was a problem hiding this comment.
When regressed_section is non-empty it ends with \n\n (from the <<~MARKDOWN trailing blank line), and this heredoc line contributes another \n, producing two blank lines before ### What to do instead of one. GitHub Markdown collapses them on render so there's no visual difference, but it's worth noting if the raw markdown ever needs to be exact.
If you want a single blank line consistently, you can strip the trailing newline from regressed_section:
#{regressed_section.chomp}or trim the trailing blank line inside the regressed_section heredoc. Not a blocker.
* origin/main: Add Pro license header checker RSC: stop serializing props into embedded payload cache key (#3800) Make PR batches skip customer-feedback issues (#3826) Name the regressed benchmark+measure pairs in the issue body (#3830) Clarify agent batch policy handoffs (#3824) Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829) Fix auto-bundled component pack normalization (#3818) Filter stale Bencher alerts before reporting (#3822) Tighten benchmark confirmation workflow permissions (#3819) Add issue evaluation skill (#3816) Confirm benchmark regressions on a fresh runner before filing the main issue (#3810) Define agent scope and accelerated RC auto-merge policy (#3808) Replace custom MockClient with async-http Mock::Endpoint (#3703) Docs: per-request data sharing in RSC with React.cache() (#3769) Pro RSC: share unstable_cache across renderer workers via Redis (#3705) [codex] Add PR batch planning skill (#3792) Docs: document PR batch operational lessons (#3789) Document dummy Redux state indexing rationale (#3781) Pro RSC: avoid caching failed Flight renders (#3775) # Conflicts: # packages/react-on-rails-pro/tests/getReactServerComponent.client.test.ts
…o-rsc-rspack-ci * origin/main: Add Pro license header checker RSC: stop serializing props into embedded payload cache key (#3800) Make PR batches skip customer-feedback issues (#3826) Name the regressed benchmark+measure pairs in the issue body (#3830) Clarify agent batch policy handoffs (#3824) Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829) Fix auto-bundled component pack normalization (#3818) Filter stale Bencher alerts before reporting (#3822) Tighten benchmark confirmation workflow permissions (#3819) Add issue evaluation skill (#3816) # Conflicts: # react_on_rails_pro/spec/dummy/config/webpack/clientWebpackConfig.js
* origin/main: (23 commits) Enforce Pro license headers in CI and pre-commit (#3821) Add RSC payload route-data helper (#3783) [Pro] Fix React.cache request dedupe in generated RSC configs (#3813) Docs: clarify RuboCop autofix ownership (#3827) Add Pro license header checker RSC: stop serializing props into embedded payload cache key (#3800) Make PR batches skip customer-feedback issues (#3826) Name the regressed benchmark+measure pairs in the issue body (#3830) Clarify agent batch policy handoffs (#3824) Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829) Fix auto-bundled component pack normalization (#3818) Filter stale Bencher alerts before reporting (#3822) Tighten benchmark confirmation workflow permissions (#3819) Add issue evaluation skill (#3816) Confirm benchmark regressions on a fresh runner before filing the main issue (#3810) Define agent scope and accelerated RC auto-merge policy (#3808) Replace custom MockClient with async-http Mock::Endpoint (#3703) Docs: per-request data sharing in RSC with React.cache() (#3769) Pro RSC: share unstable_cache across renderer workers via Redis (#3705) [codex] Add PR batch planning skill (#3792) ...
…-floor-fix * origin/main: (29 commits) Docs: align pr-batch closeout confidence handoff (#3835) Align adversarial review CI polling guidance (#3794) CI: add Pro RSC rspack runtime gate (#3817) Make RSCRoute refetch failures recoverable in production (#3786) Fix Pro node renderer license headers (#3834) Docs: fix anti-patterns in RSC tutorials (#3801) fix(pro): add RSC peer compatibility gate (#3831) Enforce Pro license headers in CI and pre-commit (#3821) Add RSC payload route-data helper (#3783) [Pro] Fix React.cache request dedupe in generated RSC configs (#3813) Docs: clarify RuboCop autofix ownership (#3827) Add Pro license header checker RSC: stop serializing props into embedded payload cache key (#3800) Make PR batches skip customer-feedback issues (#3826) Name the regressed benchmark+measure pairs in the issue body (#3830) Clarify agent batch policy handoffs (#3824) Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829) Fix auto-bundled component pack normalization (#3818) Filter stale Bencher alerts before reporting (#3822) Tighten benchmark confirmation workflow permissions (#3819) ... # Conflicts: # .github/workflows/benchmark.yml
Problem
A confirmed
performance-regressionissue currently only shows the per-suite summary tables. Those tables flag 🔴 on rps and p50 but never on failed_pct (it has no column — it lives in the "Status" cell) and never on any untracked measure. So a reader has to open the tables to see what regressed, and afailed_pct-only regression renders with no visible 🔴 at all — the same "the issue names nothing actionable" problem behind #3782/#3795.Fix
Add a "What regressed" section to the issue body, built from the confirmed
ALERTSpayload (#3810's structured{benchmark, measure}pairs), aggregated across all suites and computed once inreport_regressions:report_suite→RegressionIssueReporter; only the reporter that creates the issue renders the body, so the cross-suite list is complete.regressed_overviewdefaults to""→ the section is omitted and the body is byte-for-byte unchanged, so it composes with the existing flow and with older payloads that lackALERTS.Validation
bundle exec rspec benchmarks/spec— 233 examples, 0 failures (4 new: helper rendering/dedup + issue-body inclusion/omission).bundle exec rubocop benchmarks/...— no offenses.Relationship to other work
Complements #3829 (filter alerts to tracked measures): #3829 stops the p90 false positives from filing at all; this makes the real confirmed regressions self-explanatory in the issue body. Refs #3755.
🤖 Generated with Claude Code
Note
Low Risk
CI reporting and GitHub issue markdown only; no runtime app or auth changes.
Overview
Adds a "What regressed" section to auto-filed
performance-regressionissue bodies, listing confirmed benchmark + measure pairs as markdown bullets (including measures like failed_pct that the per-suite tables do not surface clearly).report_regressionsbuilds the list once from each confirmed payload's structuredALERTSvia newregressed_overview_markdown(deduped across suites) and passesregressed_overviewintoRegressionIssueReporter. The section is omitted when the overview is empty, so older payloads withoutALERTSkeep the previous body shape.Reviewed by Cursor Bugbot for commit 3068c93. Bugbot is set up for automated code reviews on this repo. Configure here.
Summary by CodeRabbit
New Features
Tests