Commit 5c8cdcb
[codex] Split benchmark tracking helpers (#3791)
## Summary
- Extract Bencher CLI command/report handling from
`benchmarks/track_benchmarks.rb` into `BencherRunner`.
- Extract PR benchmark report comment posting and stale-comment cleanup
into `PrReportPoster`.
- Add focused specs for the extracted helpers so the benchmark tracking
workflow can be changed with less script-level risk.
- Harden stale benchmark comment cleanup by filtering jq output to
numeric GitHub comment IDs and warning on unexpected tokens before
deleting anything.
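
The numeric-ID hardening in the last bullet can be pictured roughly as follows. This is a minimal sketch, not the code from this commit: the helper name `numeric_comment_ids`, its `warn_io` parameter, and the exact warning text are assumptions made for illustration.

```ruby
# Hypothetical helper (name and signature are illustrative, not from the PR).
# Splits raw jq output on whitespace, keeps tokens made only of digits,
# and emits a warning line for every other token.
def numeric_comment_ids(raw_output, warn_io: $stderr)
  raw_output.split.each_with_object([]) do |token, ids|
    if token.match?(/\A\d+\z/)
      ids << token
    else
      # Assumed warning format, modeled on GitHub Actions "::warning::" lines.
      warn_io.puts("::warning::Ignoring unexpected comment ID token: #{token.inspect}")
    end
  end
end
```

Only the IDs that survive the filter would be interpolated into a delete path, so a stray `null` or a path-like token never reaches the API call.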
## Scope
Part of #3459. This intentionally does not close the issue because the
workflow-matrix and k6/server-monitoring follow-ups remain.
No `.github/workflows/*` files were changed, so this avoids the active
#3785 / #3149 workflow lane.
## Validation
- `bundle exec rspec benchmarks/spec`
- `bundle exec rubocop benchmarks`
- `(cd react_on_rails && bundle exec rubocop)`
- `codex review --base origin/main`
- `git diff --check`
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Touches CI benchmark reporting and GitHub PR comment deletion;
behavior is largely preserved but report I/O and gh API paths changed,
with regression risk mitigated by extensive specs and no workflow YAML
edits.
>
> **Overview**
> Refactors benchmark tracking by moving Bencher CLI execution and
report handling out of `track_benchmarks.rb` into **`BencherRunner`**,
and PR Markdown comment posting plus stale-comment cleanup into
**`PrReportPoster`**.
>
> **`BencherRunner`** owns threshold/CLI args, runs `bencher run`,
persists JSON via **tmp → atomic move**, and returns a structured
result. Parse or persistence failures raise typed errors; malformed
output is logged with **`Github.debug`** and bad report files are
removed. **`run_bencher!`** in the script maps those failures to
`::error::` and exits.
>
> **`PrReportPoster`** posts suite reports over **stdin** (avoids arg
limits), sweeps older marker-matched comments using a pre-post cutoff,
and **validates `owner/repo` and numeric PR numbers** before calling
`gh`. Stale-ID listing now **keeps only numeric comment IDs** and warns
on unexpected jq tokens so cleanup cannot delete with bogus paths.
>
> Non-PR runs avoid requiring `PR_NUMBER` via lazy poster init and
`GITHUB_EVENT_NAME` defaulting to nil. Specs for the extracted helpers
replace inlined script tests; **`BencherRunner::THRESHOLDS`** is the
single source for threshold/table alignment checks.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
7867e98. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
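
The "tmp → atomic move" persistence described in the overview above might look something like this. It is a sketch under stated assumptions: `persist_report_atomically` and the temp-file naming are hypothetical, and the real `BencherRunner` raises its own typed errors rather than letting `JSON::ParserError` propagate.

```ruby
require "fileutils"
require "json"

# Hypothetical helper (not the PR's implementation).
# Validates the JSON first, writes it to a temp file beside the target,
# then renames it into place so readers never see a half-written report.
def persist_report_atomically(json_text, final_path)
  JSON.parse(json_text) # raises JSON::ParserError on malformed output

  tmp_path = "#{final_path}.tmp-#{Process.pid}" # assumed naming scheme
  begin
    File.write(tmp_path, json_text)
    # rename is atomic when source and target are on the same filesystem
    File.rename(tmp_path, final_path)
  ensure
    # No-op after a successful rename; cleans up if write or rename failed.
    FileUtils.rm_f(tmp_path)
  end
  final_path
end
```

Keeping the temp file in the same directory as the final report is what makes the rename a single-step replacement rather than a copy.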
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * Automated benchmark runs now produce and persist JSON reports, post
    per-suite Markdown to pull requests, and remove stale benchmark
    comments.
* **Bug Fixes**
  * CI now surfaces clear errors for malformed benchmark output, warns
    when perf-link context is missing, and reports failures when comment
    deletions fail.
* **Tests**
  * New specs cover benchmark run/report parsing, CLI behaviors, retry
    paths, and PR comment lifecycle.
* **Refactor**
  * Benchmark execution, report parsing, and PR-commenting
    responsibilities have been separated for clearer flow.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
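
The Cursor overview earlier notes that `PrReportPoster` validates `owner/repo` and numeric PR numbers before calling `gh`. A rough sketch of that kind of guard is below; the method name `validate_pr_target!`, the `REPO_PATTERN` constant, and the exact patterns are assumptions, not the checks used in the PR.

```ruby
# Hypothetical pattern: one "owner/repo" pair with no extra slashes or
# whitespace. The real validation rules in the PR may differ.
REPO_PATTERN = %r{\A[A-Za-z0-9][A-Za-z0-9-]*/[A-Za-z0-9._-]+\z}

# Hypothetical guard: returns [repo, pr_number_as_integer] or raises
# ArgumentError before any API path is built from the inputs.
def validate_pr_target!(repo, pr_number)
  unless repo.to_s.match?(REPO_PATTERN)
    raise ArgumentError, "invalid repository: #{repo.inspect}"
  end
  unless pr_number.to_s.match?(/\A[1-9]\d*\z/)
    raise ArgumentError, "invalid PR number: #{pr_number.inspect}"
  end

  [repo.to_s, Integer(pr_number.to_s, 10)]
end
```

Rejecting bad input up front means a malformed environment variable fails loudly instead of producing a request against an unintended path.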
## Agent Merge Confidence
Mode: accelerated-rc
Score: 9/10
Auto-merge recommendation: yes
Affected areas: benchmarks, CI benchmark reporting
CI detector: `script/ci-changes-detector origin/main` -> Benchmark
scripts; recommended lint.
Validation run:
- `bundle exec rspec benchmarks/spec/bencher_runner_spec.rb
benchmarks/spec/pr_report_poster_spec.rb
benchmarks/spec/track_benchmarks_spec.rb` -> 71 examples, 0 failures
- `bundle exec rubocop benchmarks/lib/bencher_runner.rb
benchmarks/lib/pr_report_poster.rb benchmarks/track_benchmarks.rb
benchmarks/spec/bencher_runner_spec.rb
benchmarks/spec/pr_report_poster_spec.rb
benchmarks/spec/track_benchmarks_spec.rb` -> no offenses
- `git diff --check` -> passed
- `script/ci-changes-detector origin/main` -> benchmark scripts;
recommended lint
Review/check gate:
- Claude review: complete for
`7867e984dbe919d1d3faee4f6ac69120bdc6925b`, no unresolved current
threads
- CodeRabbit: approved/no actionable current blockers; docstring
coverage warning treated as non-blocking for benchmark helper scripts
- GitHub checks: complete for
`7867e984dbe919d1d3faee4f6ac69120bdc6925b`; expected skips are
non-selected/confirmation-only jobs
Known residual risk: low benchmark-comment lifecycle risk, mitigated by
focused helper specs and full current-head CI.
Finalized by: Claude Code Review check `claude-review` for head
`7867e984dbe919d1d3faee4f6ac69120bdc6925b`

1 parent e73221d, commit 5c8cdcb
8 files changed
Lines changed: 934 additions & 243 deletions