Skip to content

Name the regressed benchmark+measure pairs in the issue body#3830

Merged
justin808 merged 1 commit into
mainfrom
jg/benchmark-issue-body-names-pairs
Jun 9, 2026
Merged

Name the regressed benchmark+measure pairs in the issue body#3830
justin808 merged 1 commit into
mainfrom
jg/benchmark-issue-body-names-pairs

Conversation

@justin808

@justin808 justin808 commented Jun 9, 2026

Copy link
Copy Markdown
Member

Problem

A confirmed performance-regression issue currently only shows the per-suite summary tables. Those tables flag 🔴 on rps and p50 but never on failed_pct (it has no column — it lives in the "Status" cell) and never on any untracked measure. So a reader has to open the tables to see what regressed, and a failed_pct-only regression renders with no visible 🔴 at all — the same "the issue names nothing actionable" problem behind #3782/#3795.

Fix

Add a "What regressed" section to the issue body, built from the confirmed ALERTS payload (#3810's structured {benchmark, measure} pairs), aggregated across all suites and computed once in report_regressions:

### What regressed

The fresh-runner confirmation re-alerted on these benchmark + measure pairs:

- `/client_side_log_throw: Pro` — **rps**
- `/server_side_redux_app: Core` — **failed_pct**
  • Threaded through report_suiteRegressionIssueReporter; only the reporter that creates the issue renders the body, so the cross-suite list is complete.
  • regressed_overview defaults to "" → the section is omitted and the body is byte-for-byte unchanged, so it composes with the existing flow and with older payloads that lack ALERTS.

Validation

  • bundle exec rspec benchmarks/spec233 examples, 0 failures (4 new: helper rendering/dedup + issue-body inclusion/omission).
  • bundle exec rubocop benchmarks/... — no offenses.
  • Rendered the body locally with and without pairs to confirm clean Markdown spacing and that the no-pairs body is unchanged.

Relationship to other work

Complements #3829 (filter alerts to tracked measures): #3829 stops the p90 false positives from filing at all; this makes the real confirmed regressions self-explanatory in the issue body. Refs #3755.

🤖 Generated with Claude Code


Note

Low Risk
CI reporting and GitHub issue markdown only; no runtime app or auth changes.

Overview
Adds a "What regressed" section to auto-filed performance-regression issue bodies, listing confirmed benchmark + measure pairs as markdown bullets (including measures like failed_pct that the per-suite tables do not surface clearly).

report_regressions builds the list once from each confirmed payload's structured ALERTS via new regressed_overview_markdown (deduped across suites) and passes regressed_overview into RegressionIssueReporter. The section is omitted when the overview is empty, so older payloads without ALERTS keep the previous body shape.

Reviewed by Cursor Bugbot for commit 3068c93. Bugbot is set up for automated code reviews on this repo. Configure here.

Summary by CodeRabbit

  • New Features

    • Performance regression reports now include a "What regressed" overview section, displaying all confirmed benchmark and measure pairs that regressed across test suites in a single consolidated view.
  • Tests

    • Added comprehensive test coverage for the new regression overview markdown rendering logic and deduplication.

Confirmed regression issues only showed the per-suite summary tables, whose 🔴
flag covers rps/p50 but never the column-less failed_pct — so a reader had to
open the tables, and a failed_pct-only regression looked blank.

Add a "What regressed" section to the issue body built from the confirmed ALERTS
payload (benchmark + measure pairs), aggregated across all suites and computed
once. Defaults to empty (body unchanged) when no structured pairs were handed off,
so it composes with the existing flow and older payloads.

Refs #3755

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e055a0cc-190f-4e2a-871e-2231da2f7969

📥 Commits

Reviewing files that changed from the base of the PR and between 20e1f3d and 3068c93.

📒 Files selected for processing (2)
  • benchmarks/report_regressions.rb
  • benchmarks/spec/report_regressions_spec.rb

Walkthrough

This PR extends performance regression reporting in benchmarks/report_regressions.rb and its test suite. It introduces a "What regressed" overview section that lists unique benchmark/measure pairs from regression payloads, extracts and deduplicates this overview once at the suite level, threads it through the reporter initialization, and conditionally renders it into created/updated GitHub issues.

Changes

Regression overview reporting

Layer / File(s) Summary
Overview extraction helper
benchmarks/report_regressions.rb
regressed_overview_markdown(payloads) builds a deduplicated newline-joined markdown list of confirmed benchmark/measure pairs extracted from each payload's ALERTS, returning empty string when no pairs exist.
Reporter initialization and rendering
benchmarks/report_regressions.rb
RegressionIssueReporter#initialize accepts a regressed_overview keyword parameter and stores it with attr_reader; new regressed_section method conditionally renders the "What regressed" block based on whether the overview is non-empty; the issue template embeds this rendered section.
Integration in report_suite
benchmarks/report_regressions.rb
The main script precomputes a single regressed_overview from all readable payloads; report_suite function signature now accepts this overview as a keyword argument; each call to RegressionIssueReporter.report_suite includes the precomputed overview.
Test coverage for overview and rendering
benchmarks/spec/report_regressions_spec.rb
New RSpec examples verify that created issue bodies include the "What regressed" section when benchmark/measure pairs exist and omit it when empty; new regressed_overview_markdown unit test group confirms deduplication of bullet items from ALERTS and empty-string return when no alerts are present.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • shakacode/react_on_rails#3767: Both PRs modify benchmarks/report_regressions.rb by updating RegressionIssueReporter#initialize and report_suite signatures to pass additional data through the same reporter pipeline (this PR adds regressed_overview section rendering; related PR adds shared per-run issue/comment caching).
  • shakacode/react_on_rails#3540: Both PRs modify the benchmarks/report_regressions.rb reporting flow; the related PR introduced the initial reporter/script structure that this PR extends to render the "What regressed" overview section.

Suggested labels

enhancement, review-needed, benchmark, P3

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: adding a 'What regressed' section that names the regressed benchmark+measure pairs in the issue body.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch jg/benchmark-issue-body-names-pairs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps

greptile-apps Bot commented Jun 9, 2026

Copy link
Copy Markdown

Greptile Summary

This PR adds a "What regressed" section to the performance-regression GitHub issue body, listing the confirmed benchmark + measure pairs sourced from the structured ALERTS payload. The section is assembled once across all suites in report_regressions and threaded through report_suiteRegressionIssueReporter as regressed_overview, making the issue self-explanatory without requiring readers to open per-suite tables.

  • regressed_overview_markdown flat-maps, filters, deduplicates, and formats {benchmark, measure} pairs from all confirmed payloads; returns \"\" when none are present so the no-pairs body is byte-equivalent to the original.
  • regressed_section in RegressionIssueReporter wraps the overview in a ### What regressed block and returns \"\" when empty, keeping the existing issue body unchanged for callers using older payloads.
  • Four new RSpec examples cover dedup/empty output for regressed_overview_markdown and inclusion/omission of the section in the created issue body.

Confidence Score: 5/5

Safe to merge; the change is purely additive and backward-compatible, with the no-pairs path producing output byte-identical to the original.

The implementation is narrowly scoped: a new helper function, a new private method, and two wired-up keyword arguments. The regressed_overview defaults to "" everywhere it propagates, so every existing call site is unaffected. The regressed_section guard on empty/blank strings prevents any body change when no structured alerts are present. The only observations are cosmetic (extra trailing blank line in raw Markdown that collapses in GitHub rendering) and a small test coverage gap for multi-line overviews in the issue body.

No files require special attention.

Important Files Changed

Filename Overview
benchmarks/report_regressions.rb Adds regressed_overview_markdown helper and threads regressed_overview through report_suiteRegressionIssueReporter; backward-compatible (defaults to ""), regressed_section guards on empty string so the no-pairs body is byte-equivalent to the original.
benchmarks/spec/report_regressions_spec.rb Adds four new examples: regressed_overview_markdown dedup/empty, and RegressionIssueReporter issue-body inclusion/omission; existing suite is unchanged.

Sequence Diagram

sequenceDiagram
    participant S as report_regressions
    participant R as regressed_overview_markdown
    participant RS as report_suite
    participant I as RegressionIssueReporter

    S->>S: load confirmed payloads (readable)
    S->>R: regressed_overview_markdown(readable)
    R->>R: flat_map ALERTS across all payloads
    R->>R: select valid pairs, .uniq, .map to bullets
    R-->>S: "- `/x: Core` — **rps**\n..."
    loop each suite (sorted)
        S->>RS: report_suite(suite_name, payloads, regressed_overview:)
        RS->>I: RegressionIssueReporter.report(..., regressed_overview:)
        I->>I: regressed_section → "" when empty
        I->>I: issue_body interpolates regressed_section
        I-->>RS: issue_number
    end
Loading

Reviews (1): Last reviewed commit: "Name the regressed benchmark+measure pai..." | Re-trigger Greptile

Comment on lines +120 to +131
it "names the confirmed benchmark+measure pairs in the created issue body" do
described_class.report(
summary: "core regressed",
suite_name: "core",
github_run_url: "https://github.com/run/1",
bencher_url: "https://bencher.dev/dash",
regressed_overview: "- `/x: Core` — **rps**"
)
body = posted_bodies.fetch("issue create")
expect(body).to include("### What regressed")
expect(body).to include("- `/x: Core` — **rps**")
end

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing multi-line overview coverage in issue-body test

The regressed_overview_markdown unit test confirms that the helper produces a multi-line string for multiple pairs (e.g. "- \/a: Core` — rps\n- `/b: Pro`"), but the RegressionIssueReporterbody test only exercises a single-lineregressed_overview. Because the overview is interpolated inside a <<~MARKDOWNheredoc inside another<<~MARKDOWN` heredoc, the indentation-stripping of the outer heredoc applies to each logical line of the interpolated value; lines beyond the first have zero leading spaces, so the strip is effectively a no-op on them, but this behaviour is untested end-to-end. A test passing a two-pair overview and asserting both bullets appear in the created issue body would close that gap.

Comment on lines +354 to +356
#{regressed_overview}

MARKDOWN

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The trailing blank line inside regressed_section's heredoc means the method returns a string ending with `

. The outer issue_bodyheredoc's own line-ending
then appends a third newline, producing two blank lines between the last bullet and### What to do`. Dropping the trailing blank line inside the inner heredoc leaves exactly one blank line, matching standard Markdown section spacing.

Suggested change
#{regressed_overview}
MARKDOWN
#{regressed_overview}
MARKDOWN

@claude

claude Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Code Review

Clean, focused PR that solves a real gap: failed_pct regressions and any measure not surfaced by the per-suite tables are now self-explanatory in the filed issue. The implementation is backward-compatible (empty default, no-op body when no ALERTS), the threading through report_suiteRegressionIssueReporter is straightforward, and the new tests cover the essential paths.

Three nits below (two as inline comments, one here):

E2E script test gapwrite_payload in describe "report_regressions.rb (script)" doesn't include RegressionReport::ALERTS in any test fixture, so the full script path (report_regressionsregressed_overview_markdown → issue body) is untested at the subprocess level. The unit tests for regressed_overview_markdown and the RegressionIssueReporter body inclusion tests are solid, so this is low-risk — but it would be good to add RegressionReport::ALERTS to at least the "renders side by side" fixture and assert that "### What regressed" appears in the captured issue body.

Overall: ship-ready as-is; the E2E gap is a nice-to-have, not a blocker.

payloads
.flat_map { |payload| payload[RegressionReport::ALERTS] || [] }
.select { |pair| pair.is_a?(Hash) && pair[RegressionReport::ALERT_BENCHMARK] }
.uniq

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.uniq deduplicates by full Hash equality, which works today because RegressionReport.alert_hash always creates exactly {benchmark, measure} — no extra keys. If the ALERTS format ever gains a third key (e.g. a timestamp or suite tag), identical benchmark+measure pairs from different shards would no longer be deduped.

Suggested change
.uniq
.uniq { |pair| [pair[RegressionReport::ALERT_BENCHMARK], pair[RegressionReport::ALERT_MEASURE]] }

This locks the dedup key to the two semantically meaningful fields regardless of future payload growth.

# column-less failed_pct). Empty when no structured pairs were handed off, leaving the
# body unchanged.
def regressed_section
return "" if regressed_overview.to_s.strip.empty?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regressed_overview is always a String — initialized to "" and only accepted as a keyword arg with that default — so .to_s is a no-op. Minor nit:

Suggested change
return "" if regressed_overview.to_s.strip.empty?
return "" if regressed_overview.strip.empty?

| **Workflow run** | [Run ##{github_run_number}](#{github_run_url}) |
| **Bencher dashboard** | [View history](#{bencher_url}) |

#{regressed_section}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When regressed_section is non-empty it ends with \n\n (from the <<~MARKDOWN trailing blank line), and this heredoc line contributes another \n, producing two blank lines before ### What to do instead of one. GitHub Markdown collapses them on render so there's no visual difference, but it's worth noting if the raw markdown ever needs to be exact.

If you want a single blank line consistently, you can strip the trailing newline from regressed_section:

      #{regressed_section.chomp}

or trim the trailing blank line inside the regressed_section heredoc. Not a blocker.

@justin808 justin808 merged commit 6104dd9 into main Jun 9, 2026
50 checks passed
@justin808 justin808 deleted the jg/benchmark-issue-body-names-pairs branch June 9, 2026 03:52
justin808 added a commit that referenced this pull request Jun 9, 2026
* origin/main:
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  Add issue evaluation skill (#3816)
  Confirm benchmark regressions on a fresh runner before filing the main issue (#3810)
  Define agent scope and accelerated RC auto-merge policy (#3808)
  Replace custom MockClient with async-http Mock::Endpoint (#3703)
  Docs: per-request data sharing in RSC with React.cache() (#3769)
  Pro RSC: share unstable_cache across renderer workers via Redis (#3705)
  [codex] Add PR batch planning skill (#3792)
  Docs: document PR batch operational lessons (#3789)
  Document dummy Redux state indexing rationale (#3781)
  Pro RSC: avoid caching failed Flight renders (#3775)

# Conflicts:
#	packages/react-on-rails-pro/tests/getReactServerComponent.client.test.ts
justin808 added a commit that referenced this pull request Jun 9, 2026
…o-rsc-rspack-ci

* origin/main:
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  Add issue evaluation skill (#3816)

# Conflicts:
#	react_on_rails_pro/spec/dummy/config/webpack/clientWebpackConfig.js
justin808 added a commit that referenced this pull request Jun 9, 2026
* origin/main: (23 commits)
  Enforce Pro license headers in CI and pre-commit (#3821)
  Add RSC payload route-data helper (#3783)
  [Pro] Fix React.cache request dedupe in generated RSC configs (#3813)
  Docs: clarify RuboCop autofix ownership (#3827)
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  Add issue evaluation skill (#3816)
  Confirm benchmark regressions on a fresh runner before filing the main issue (#3810)
  Define agent scope and accelerated RC auto-merge policy (#3808)
  Replace custom MockClient with async-http Mock::Endpoint (#3703)
  Docs: per-request data sharing in RSC with React.cache() (#3769)
  Pro RSC: share unstable_cache across renderer workers via Redis (#3705)
  [codex] Add PR batch planning skill (#3792)
  ...
justin808 added a commit that referenced this pull request Jun 9, 2026
…-floor-fix

* origin/main: (29 commits)
  Docs: align pr-batch closeout confidence handoff (#3835)
  Align adversarial review CI polling guidance (#3794)
  CI: add Pro RSC rspack runtime gate (#3817)
  Make RSCRoute refetch failures recoverable in production (#3786)
  Fix Pro node renderer license headers (#3834)
  Docs: fix anti-patterns in RSC tutorials (#3801)
  fix(pro): add RSC peer compatibility gate (#3831)
  Enforce Pro license headers in CI and pre-commit (#3821)
  Add RSC payload route-data helper (#3783)
  [Pro] Fix React.cache request dedupe in generated RSC configs (#3813)
  Docs: clarify RuboCop autofix ownership (#3827)
  Add Pro license header checker
  RSC: stop serializing props into embedded payload cache key (#3800)
  Make PR batches skip customer-feedback issues (#3826)
  Name the regressed benchmark+measure pairs in the issue body (#3830)
  Clarify agent batch policy handoffs (#3824)
  Filter Bencher alerts to tracked measures (drop orphaned p90 false positives) (#3829)
  Fix auto-bundled component pack normalization (#3818)
  Filter stale Bencher alerts before reporting (#3822)
  Tighten benchmark confirmation workflow permissions (#3819)
  ...

# Conflicts:
#	.github/workflows/benchmark.yml
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant