Skip to content

Fix Pro dummy integration test gating#3697

Merged
justin808 merged 1 commit into
mainfrom
fix/3633-pro-dummy-integration-gating
Jun 6, 2026
Merged

Fix Pro dummy integration test gating#3697
justin808 merged 1 commit into
mainfrom
fix/3633-pro-dummy-integration-gating

Conversation

@frullah

@frullah frullah commented Jun 6, 2026

Copy link
Copy Markdown
Member

Summary

  • expose the existing run_pro_dummy_tests detector output from pro-integration-tests.yml
  • run the Pro dummy app bundle build, RSpec integration job, and Playwright job when Pro-dummy-only changes are detected
  • add a detector harness regression for Pro dummy-only changes so run_pro_dummy_tests stays true while Pro unit tests remain false

Closes #3633.

Validation

  • rg -q 'run_pro_dummy_tests' .github/workflows/pro-integration-tests.yml (failed before the workflow change, passes after)
  • bash script/ci-changes-detector-test.bash (38 run, 0 failed)
  • actionlint .github/workflows/pro-integration-tests.yml
  • script/ci-changes-detector origin/main (workflow change recommends broad CI)
  • git diff --check
  • bash -n script/ci-changes-detector-test.bash
  • pnpm exec prettier --check .github/workflows/pro-integration-tests.yml
  • /Users/justin/.agents/skills/autoreview/scripts/autoreview --mode local (clean)
  • bundle exec rubocop was attempted; it fails on pre-existing react_on_rails/spike/3313_prism_gemfile_rewriter/* offenses unrelated to this branch.

Summary by CodeRabbit

  • Chores
    • Updated continuous integration infrastructure and testing configurations to improve build reliability and test coverage detection.

@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Need the big picture first? Review this PR in Change Stack to see what changed before going file by file.

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f21d92fa-a0f8-42c2-9237-c1a54d32b7e6

📥 Commits

Reviewing files that changed from the base of the PR and between f8c4678 and 8ae34b4.

📒 Files selected for processing (2)
  • .github/workflows/pro-integration-tests.yml
  • script/ci-changes-detector-test.bash

Walkthrough

The PR wires the run_pro_dummy_tests change-detector flag through the CI workflow. The detect-changes job now outputs this flag, and three dummy-app integration test jobs update their gating conditions to also run when this output is true. Test harness updates verify the detector correctly identifies Pro-dummy-only changes and triggers the appropriate flag without triggering Pro unit tests.

Changes

Pro dummy tests detection and gating

Layer / File(s) Summary
Workflow output declaration and downstream job gating
.github/workflows/pro-integration-tests.yml
The detect-changes job declares run_pro_dummy_tests output forwarded from the detect step. The build-dummy-app-webpack-test-bundles, rspec-dummy-app-node-renderer, and dummy-app-node-renderer-e2e-tests jobs update their if conditions to also run when needs.detect-changes.outputs.run_pro_dummy_tests == 'true'.
Detector test fixture and test case
script/ci-changes-detector-test.bash
Repository setup creates the Pro dummy requests directory and provisions a posts_page_spec.rb fixture. A new test case (test_pro_dummy_only_change_runs_pro_dummy_tests_without_pro_unit_tests) modifies the fixture, commits the change, and asserts the detector sets run_pro_dummy_tests=true while keeping run_pro_tests false. The test is registered in the test runner sequence.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • shakacode/react_on_rails#3615: Both PRs center on Pro dummy /posts_page seeding behavior in react_on_rails_pro/spec/dummy/spec/requests/posts_page_spec.rb—the main PR adds a detector test fixture that edits this file to ensure run_pro_dummy_tests is triggered, while the retrieved PR updates the same spec to seed and assert the seeded "sentinel" posts.
  • shakacode/react_on_rails#3447: Both PRs modify the CI changes-detector/test harness and workflow outputs to introduce new downstream gating flags (run_pro_dummy_tests vs run_e2e_tests) so the detector-driven jobs can run conditionally.

Suggested labels

enhancement, review-needed, full-ci, P3

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main change: exposing and gating Pro dummy integration tests on the correct detector flag.
Linked Issues check ✅ Passed All three coding objectives from issue #3633 are met: run_pro_dummy_tests is exposed in pro-integration-tests.yml, dummy-app jobs are gated on it, and a regression test is added.
Out of Scope Changes check ✅ Passed All changes directly address the linked issue requirements: workflow output exposure, job gating updates, and detector regression test.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/3633-pro-dummy-integration-gating

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude

claude Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Code Review: Fix Pro dummy integration test gating

Overview:

This PR fixes a real CI gating gap. The ci-changes-detector script was already computing and emitting run_pro_dummy_tests to GITHUB_OUTPUT, but the detect-changes job in pro-integration-tests.yml never declared it as a job-level output, so the three downstream integration jobs (webpack build, RSpec, Playwright) could never see it. Changes scoped to react_on_rails_pro/spec/dummy/** silently skipped all Pro integration tests.

Correctness:

The fix is correct and minimal. It exposes the missing output (run_pro_dummy_tests) in the detect-changes job outputs block, sets it to true in the force-run path (consistent with all other flags), and ORs it into the job conditions for all three Pro integration jobs. The pattern exactly mirrors how run_pro_tests is used. The detector script itself needed no changes.

Test Coverage:

The new test test_pro_dummy_only_change_runs_pro_dummy_tests_without_pro_unit_tests covers the key regression: a Pro dummy-only change sets run_pro_dummy_tests=true while keeping run_pro_tests=false.

Minor observation: Comparable tests (e.g. test_pro_app_comment_only_change_runs_pro_lint_only) assert several outputs to prevent false negatives. The new test only asserts two. Since PRO_DUMMY_CHANGED=true also sets RUN_PRO_LINT=true in the detector, asserting 'run_pro_lint: true' in the new test would complete the coverage picture. This is a polish nit, not a bug.

No other issues found:

  • No security implications (pure CI routing logic, no secrets or permission changes)
  • No performance risk (additional condition only triggers expensive jobs when actually needed)
  • The paths-ignore in the workflow does not exclude react_on_rails_pro/spec/dummy/**, so the detect-changes job correctly fires on Pro dummy-only PRs

Verdict: Approved. Clean, focused fix with a targeted regression test.

@greptile-apps

greptile-apps Bot commented Jun 6, 2026

Copy link
Copy Markdown

Greptile Summary

This PR wires up an already-computed run_pro_dummy_tests flag that was never declared as a detect-changes job output, so downstream workflow jobs (bundle build, RSpec, Playwright) were never gated on it. A companion test regression is added to keep that flag from regressing silently.

  • Workflow fix: declares run_pro_dummy_tests as a job output in detect-changes and adds it to the if: condition of all three Pro dummy app jobs; the force-run / full-CI-label fast-path now also emits the flag.
  • Test harness: adds the react_on_rails_pro/spec/dummy/spec/requests/posts_page_spec.rb fixture to setup_repo and a new test case that asserts Pro-dummy-only changes set run_pro_dummy_tests=true while keeping run_pro_tests=false.

Confidence Score: 5/5

Safe to merge. The change is a one-line output declaration and symmetric condition additions in the workflow; the underlying run_pro_dummy_tests logic in ci-changes-detector was already correct.

The workflow fix is mechanical: the output was already computed but never forwarded to downstream jobs. All three job conditions receive the same treatment in the same pattern as the existing run_pro_tests guard. The test harness regression validates the exact scenario described in the issue.

No files require special attention.

Important Files Changed

Filename Overview
.github/workflows/pro-integration-tests.yml Adds run_pro_dummy_tests as a declared job output and appends it to the if: guard of all three downstream jobs; symmetric and consistent with the existing run_pro_tests pattern. Force-run path also updated.
script/ci-changes-detector-test.bash Extends setup_repo with the Pro dummy spec fixture and adds test_pro_dummy_only_change_runs_pro_dummy_tests_without_pro_unit_tests; the test covers the core regression but omits a few scope-narrowing assertions that peer tests include.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[detect-changes job] -->|outputs| B{run_pro_tests?}
    A -->|outputs NEW| C{run_pro_dummy_tests?}
    A -->|outputs| D{non_runtime_only?}

    B -->|true| E[build-dummy-app-webpack-test-bundles]
    C -->|true| E
    D -->|true on main push| F[skip all downstream jobs]

    E --> G[rspec-dummy-app-node-renderer]
    E --> H[dummy-app-node-renderer-e2e-tests]

    B -->|true| G
    C -->|true| G
    B -->|true| H
    C -->|true| H

    style C fill:#90EE90,stroke:#228B22
Loading

Reviews (1): Last reviewed commit: "fix pro dummy integration test gating" | Re-trigger Greptile

Comment on lines +465 to +474
test_pro_dummy_only_change_runs_pro_dummy_tests_without_pro_unit_tests() {
setup_repo
perl -0pi -e 's/renders/renders seeded data/' react_on_rails_pro/spec/dummy/spec/requests/posts_page_spec.rb
commit_change "pro dummy spec"

local out
out="$(detector_output)"
assert_contains "$out" '"run_pro_dummy_tests": true' "pro dummy output"
assert_contains "$out" '"run_pro_tests": false' "pro dummy output"
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing scope-narrowing assertions

The test confirms the two key flags but doesn't assert that a Pro-dummy-only change doesn't bleed into the core gem suite. The analogous test test_pro_app_source_change_runs_pro_tests_only explicitly checks run_ruby_tests: false and run_js_tests: false with the comment "Scoped to Pro: a Pro app change must not drag in the core gem suite." Adding the same guards here would catch a future regression where the Pro-dummy path accidentally triggers core-gem jobs.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@github-actions

github-actions Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Pro Node Renderer Benchmark Summary

Benchmark RPS p50(ms) p90(ms) Status
Pro Node Renderer: simple_eval (non-RSC) 2124.03 ▼3.1% (2191.63) 4.25 ▲2.3% (4.16) 5.27 ▼30.5% (7.58) 200=63722
Pro Node Renderer: react_ssr (non-RSC) 1912.04 ▼0.6% (1923.72) 4.72 ▼0.1% (4.72) 6.05 ▼23.6% (7.92) 200=57364

▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline

@github-actions

github-actions Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Core Benchmark Summary

Benchmark RPS p50(ms) p90(ms) Status
/: Core 3.84 ▲37.3% (2.8) 2114.04 ▼21.8% (2703.24) 2692.09 ▼24.8% (3581.82) 200=123
/client_side_hello_world: Core 759.3 ▲15.1% (659.55) 8.32 ▼9.3% (9.17) 17.17 ▼22.1% (22.04) 200=22943
/client_side_rescript_hello_world: Core 805.48 ▲18.4% (680.09) 7.72 ▼15.6% (9.15) 12.34 ▼42.0% (21.28) 200=24336
/client_side_hello_world_shared_store: Core 731.17 ▲15.2% (634.78) 4.18 ▼57.2% (9.76) 15.59 ▼30.6% (22.48) 200=22239
/client_side_hello_world_shared_store_controller: Core 760.02 ▲21.5% (625.78) 8.25 ▼16.2% (9.85) 13.34 ▼42.4% (23.16) 200=22963
/client_side_hello_world_shared_store_defer: Core 753.58 ▲17.9% (639.05) 8.82 ▼10.7% (9.88) 17.0 ▼22.7% (21.98) 200=22764
/server_side_hello_world_shared_store: Core 16.59 ▲38.1% (12.01) 505.28 ▼23.1% (656.92) 688.07 ▼17.5% (834.29) 200=505
/server_side_hello_world_shared_store_controller: Core 16.45 ▲37.0% (12.0) 334.02 ▼47.1% (631.18) 631.94 ▼24.2% (834.04) 200=507
/server_side_hello_world_shared_store_defer: Core 16.21 ▲35.9% (11.93) 501.18 ▼20.9% (633.28) 648.76 ▼24.2% (855.65) 200=500
/server_side_hello_world: Core 33.15 ▲36.2% (24.33) 250.63 ▼19.5% (311.28) 305.15 ▼21.8% (390.02) 200=1007
/server_side_hello_world_hooks: Core 33.36 ▲37.5% (24.26) 246.44 ▼21.4% (313.45) 295.78 ▼24.7% (392.71) 200=1016
/server_side_hello_world_props: Core 33.17 ▲38.0% (24.04) 250.47 ▼22.3% (322.32) 305.37 ▼21.6% (389.65) 200=1009
/client_side_log_throw: Core 796.88 ▲20.6% (661.0) 7.8 ▼18.1% (9.53) 12.54 ▼35.8% (19.52) 200=24079
/server_side_log_throw: Core 28.34 ▲19.2% (23.78) 207.98 ▼36.8% (329.11) 272.22 ▼31.7% (398.45) 200=863
/server_side_log_throw_plain_js: Core 28.62 ▲18.7% (24.11) 207.37 ▼33.2% (310.57) 270.27 ▼31.5% (394.81) 200=872
/server_side_log_throw_raise: Core 33.08 ▲36.9% (24.17) 254.05 ▼18.6% (312.03) 302.68 ▼23.0% (392.94) 3xx=1006
/server_side_log_throw_raise_invoker: Core 878.65 ▲13.0% (777.78) 7.31 ▼14.4% (8.54) 16.46 ▲0.9% (16.32) 200=26544
/server_side_hello_world_es5: Core 33.6 🟢 41.9% (23.69) 195.17 ▼38.2% (315.56) 288.43 ▼26.9% (394.31) 200=1024
/server_side_redux_app: Core 27.96 ▲18.6% (23.57) 211.05 ▼35.7% (328.34) 284.52 ▼29.0% (400.81) 200=852
/server_side_hello_world_with_options: Core 33.68 ▲39.0% (24.23) 241.02 ▼23.9% (316.68) 294.06 ▼24.9% (391.41) 200=1025
/server_side_redux_app_cached: Core 552.25 ▼14.3% (644.56) 10.56 ▲1.9% (10.36) 16.77 ▼8.7% (18.37) 200=16685
/client_side_manual_render: Core 770.39 ▲12.1% (686.97) 8.05 ▼13.7% (9.33) 19.62 ▲2.2% (19.2) 200=23279
/render_js: Core 35.83 ▲42.4% (25.15) 230.69 ▼23.8% (302.56) 279.21 ▼25.5% (374.83) 200=1088
/react_router: Core 31.4 ▲37.5% (22.83) 263.06 ▼23.7% (344.91) 322.25 ▼21.6% (411.04) 200=954
/pure_component: Core 33.85 ▲36.4% (24.81) 242.82 ▼23.3% (316.44) 294.14 ▼25.4% (394.53) 200=1030
/css_modules_images_fonts_example: Core 33.33 ▲39.1% (23.97) 248.34 ▼20.9% (313.77) 301.43 ▼23.5% (394.04) 200=1014
/turbolinks_cache_disabled: Core 794.03 ▲17.9% (673.22) 10.21 ▲7.7% (9.48) 17.27 ▼10.3% (19.25) 200=23987
/rendered_html: Core 33.97 ▲42.2% (23.88) 247.25 ▼23.1% (321.7) 299.9 ▼23.7% (393.22) 200=1031
/xhr_refresh: Core 14.21 ▲13.5% (12.52) 420.3 ▼31.2% (610.52) 483.13 ▼41.9% (831.17) 200=437
/react_helmet: Core 32.72 ▲36.4% (23.99) 249.22 ▼22.4% (321.02) 304.37 ▼23.9% (399.94) 200=996
/broken_app: Core 32.7 ▲37.1% (23.85) 248.19 ▼22.3% (319.41) 311.41 ▼20.6% (392.4) 200=994
/image_example: Core 33.29 ▲41.6% (23.51) 250.38 ▼22.2% (321.88) 301.47 ▼23.6% (394.43) 200=1011
/turbo_frame_tag_hello_world: Core 882.78 ▲21.5% (726.79) 7.49 ▼14.6% (8.77) 17.61 ▼1.5% (17.88) 200=26668
/manual_render_test: Core 553.51 ▼17.6% (671.54) 10.47 ▲11.9% (9.36) 14.57 ▼25.5% (19.55) 200=16724

▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline

@github-actions

github-actions Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Pro (shard 2/2) Benchmark Summary

Benchmark RPS p50(ms) p90(ms) Status
/empty: Pro 1232.51 ▲0.5% (1226.59) 7.08 ▲19.4% (5.93) 9.01 ▲0.9% (8.93) 200=36982
/ssr_shell_error: Pro 335.8 ▲2.3% (328.33) 23.68 ▲7.0% (22.13) 36.97 ▲4.9% (35.24) 200=10144
/ssr_sync_error: Pro 287.76 ▼11.5% (325.28) 20.22 ▼8.1% (22.0) 46.49 ▲25.4% (37.07) 200=8699
/rsc_component_error: Pro 342.09 ▲5.0% (325.76) 23.14 ▲3.1% (22.44) 36.17 ▲1.0% (35.82) 200=10335
/non_existing_stream_react_component: Pro 295.16 ▼11.9% (335.13) 19.72 ▼6.9% (21.18) 55.41 ▲61.6% (34.29) 200=8920
/server_side_redux_app_cached: Pro 366.33 ▲0.8% (363.48) 24.0 ▲15.1% (20.85) 31.66 ▼2.3% (32.42) 200=11069
/loadable: Pro 312.52 ▲4.6% (298.67) 27.96 ▲13.4% (24.66) 37.01 ▲3.4% (35.8) 200=9445
/apollo_graphql: Pro 142.72 ▲2.4% (139.31) 60.93 ▲18.4% (51.47) 86.18 ▲0.8% (85.52) 200=4314
/console_logs_in_async_server: Pro 2.58 ▼20.3% (3.24) 2121.19 0.0% (2121.98) 2141.23 ▼2.0% (2184.37) 200=94
/stream_error_demo: Pro 340.82 ▲5.9% (321.71) 23.03 ▲4.3% (22.09) 34.18 ▼5.0% (35.99) 200=10300
/stream_async_components: Pro 345.91 ▲6.6% (324.5) 22.79 ▲1.2% (22.53) 36.81 ▼0.8% (37.1) 200=10452
/rsc_posts_page_over_http: Pro 341.78 ▲6.3% (321.56) 22.81 ▲0.7% (22.65) 36.79 ▲2.8% (35.79) 200=10333
/rsc_echo_props: Pro 236.09 ▲4.5% (226.01) 34.51 ▲4.0% (33.18) 52.37 ▲5.8% (49.51) 200=7136
/async_on_server_sync_on_client_client_render: Pro 360.48 ▲3.7% (347.69) 21.82 ▲4.5% (20.87) 34.19 ▲0.2% (34.11) 200=10894
/server_router_client_render: Pro 365.59 ▲5.6% (346.28) 20.96 0.0% (20.97) 31.47 ▼2.2% (32.19) 200=11049
/unwrapped_rsc_route_stream_render: Pro 356.71 ▲6.8% (334.15) 20.91 ▼1.3% (21.19) 33.6 ▼9.1% (36.97) 200=10778
/async_render_function_returns_component: Pro 351.64 ▲7.9% (326.01) 22.45 ▲5.3% (21.33) 34.38 ▼1.0% (34.74) 200=10623
/native_metadata: Pro 280.58 ▼14.7% (328.89) 20.79 ▼6.9% (22.34) 49.61 ▲41.5% (35.06) 200=8479
/hybrid_metadata_streaming: Pro 293.79 ▼10.7% (329.16) 19.76 ▼10.8% (22.16) 41.28 ▲10.7% (37.3) 200=8879
/cache_demo: Pro 335.48 ▲4.8% (320.03) 22.99 ▲0.6% (22.84) 35.51 ▼6.6% (38.03) 200=10140
/client_side_hello_world_shared_store: Pro 356.87 ▲9.2% (326.91) 21.9 ▼0.3% (21.98) 34.06 ▲0.3% (33.97) 200=10782
/client_side_hello_world_shared_store_defer: Pro 352.01 ▲7.4% (327.82) 24.8 ▲11.0% (22.34) 33.18 ▼3.2% (34.27) 200=10635
/server_side_hello_world_shared_store_controller: Pro 233.06 ▼14.8% (273.53) 24.74 ▼6.7% (26.52) 44.51 ▲9.7% (40.57) 200=7044
/server_side_hello_world: Pro 344.94 ▲8.0% (319.32) 22.89 ▲2.0% (22.45) 33.81 ▼4.0% (35.23) 200=10426
/client_side_log_throw: Pro 384.91 ▲5.7% (364.2) 22.79 ▲12.2% (20.32) 30.56 ▼0.6% (30.75) 200=11555
/server_side_log_throw_plain_js: Pro 377.28 ▲3.0% (366.37) 23.17 ▲16.7% (19.85) 30.82 ▼1.9% (31.41) 200=11398
/server_side_log_throw_raise_invoker: Pro 417.98 ▲4.9% (398.6) 18.48 ▲6.4% (17.36) 27.11 ▼3.3% (28.04) 200=12629
/server_side_redux_app: Pro 269.79 ▼16.4% (322.65) 21.55 ▼3.7% (22.37) 56.65 ▲60.9% (35.21) 200=8152
/server_side_redux_app_cached: Pro 374.2 ▲2.9% (363.48) 16.81 ▼19.4% (20.85) 27.78 ▼14.3% (32.42) 200=11308
/render_js: Pro 378.24 ▲3.3% (366.24) 20.74 ▲4.0% (19.95) 32.77 ▲1.4% (32.31) 200=11429
/pure_component: Pro 351.47 ▲7.2% (327.78) 22.63 ▲2.5% (22.09) 34.65 ▼8.1% (37.69) 200=10619
/turbolinks_cache_disabled: Pro 376.33 ▲4.7% (359.29) 16.78 ▼16.2% (20.03) 27.74 ▼11.8% (31.44) 200=11376
/xhr_refresh: Pro 287.91 ▲4.8% (274.83) 30.35 ▲15.8% (26.22) 40.57 ▲5.2% (38.56) 200=8646
/broken_app: Pro 348.71 ▲2.7% (339.55) 25.22 ▲15.6% (21.82) 33.36 ▼6.2% (35.56) 200=10535
/server_render_with_timeout: Pro 350.11 ▲3.9% (337.1) 25.26 ▲14.9% (21.98) 33.48 ▲0.2% (33.42) 200=10511

▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline

@github-actions

github-actions Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Pro (shard 1/2) Benchmark Summary

Benchmark RPS p50(ms) p90(ms) Status
/: Pro 178.74 ▲1.9% (175.43) 45.3 ▲7.9% (42.0) 61.14 ▼2.6% (62.74) 200=5404
/error_scenarios_hub: Pro 307.46 ▼12.2% (350.04) 18.71 ▼7.2% (20.16) 22.3 ▼29.0% (31.41) 200=9294
/ssr_async_error: Pro 347.44 ▲5.1% (330.6) 22.39 ▲7.8% (20.78) 36.57 ▲7.5% (34.03) 200=10499
/ssr_async_prop_error: Pro 338.51 ▲5.1% (322.01) 25.89 ▲16.9% (22.14) 35.97 ▼2.1% (36.75) 200=10226
/non_existing_react_component: Pro 291.37 ▼14.7% (341.73) 19.99 ▼2.1% (20.42) 45.26 ▲28.4% (35.25) 200=8805
/non_existing_rsc_payload: Pro 376.12 ▲5.5% (356.63) 20.79 ▲2.4% (20.31) 30.84 ▼13.7% (35.74) 200=11365
/cached_react_helmet: Pro 377.29 ▲2.8% (366.99) 21.06 ▲7.3% (19.62) 32.16 ▲5.3% (30.53) 200=11398
/cached_redux_component: Pro 315.91 ▼17.6% (383.49) 18.42 ▼4.1% (19.22) 45.84 ▲45.4% (31.52) 200=9545
/lazy_apollo_graphql: Pro 155.38 ▲3.8% (149.69) 55.9 ▲15.3% (48.5) 78.82 ▼2.7% (81.05) 200=4695
/redis_receiver: Pro 60.91 ▼30.0% (86.98) 58.69 ▼14.0% (68.22) 125.93 ▼13.9% (146.23) 200=2660,3xx=5
/stream_shell_error_demo: Pro 345.76 ▲4.9% (329.75) 25.34 ▲22.4% (20.7) 34.7 ▼0.1% (34.75) 200=10447
/test_incremental_rendering: Pro 347.49 ▲3.4% (335.94) 16.29 ▼23.7% (21.35) 29.86 ▼17.6% (36.24) 200=10569
/rsc_posts_page_over_redis: Pro 66.47 ▼36.7% (104.93) 87.55 ▲38.6% (63.15) 128.02 ▲17.2% (109.24) 200=2837,3xx=20
/async_on_server_sync_on_client: Pro 326.44 ▲4.4% (312.81) 23.93 ▲6.8% (22.4) 35.6 ▼11.9% (40.42) 200=9865
/server_router: Pro 340.19 ▲3.8% (327.81) 18.79 ▼12.2% (21.41) 30.99 ▼11.5% (35.03) 200=10283
/unwrapped_rsc_route_client_render: Pro 371.35 ▼0.8% (374.4) 16.85 ▼14.8% (19.78) 28.48 ▼7.1% (30.66) 200=11223
/async_render_function_returns_string: Pro 357.76 ▲5.8% (338.01) 17.63 ▼15.4% (20.83) 29.22 ▼13.9% (33.94) 200=10812
/async_components_demo: Pro 175.05 ▼13.3% (201.96) 33.29 ▼9.2% (36.67) 52.22 ▲0.8% (51.78) 200=5291
/stream_native_metadata: Pro 348.38 ▲4.9% (331.98) 25.19 ▲19.3% (21.12) 35.01 ▲0.2% (34.93) 200=10461
/rsc_native_metadata: Pro 335.48 ▲3.7% (323.4) 22.76 ▲1.2% (22.5) 35.55 ▼0.5% (35.72) 200=10139
/client_side_hello_world: Pro 366.0 ▲2.7% (356.39) 21.33 ▲7.3% (19.88) 33.36 ▲8.5% (30.73) 200=11057
/client_side_hello_world_shared_store_controller: Pro 350.59 ▲3.9% (337.47) 21.69 ▲2.5% (21.15) 32.6 ▲0.6% (32.42) 200=10596
/server_side_hello_world_shared_store: Pro 236.37 ▼17.1% (285.27) 24.37 ▼5.9% (25.89) 40.32 ▲4.2% (38.69) 200=7144
/server_side_hello_world_shared_store_defer: Pro 302.22 ▲5.4% (286.77) 26.78 ▲2.5% (26.12) 38.24 ▲1.9% (37.53) 200=9134
/server_side_hello_world_hooks: Pro 357.61 ▲3.6% (345.02) 24.29 ▲16.2% (20.9) 35.01 ▼1.2% (35.44) 200=10804
/server_side_log_throw: Pro 339.65 ▼0.3% (340.65) 25.5 ▲17.8% (21.64) 34.56 ▼0.1% (34.6) 200=10261
/server_side_log_throw_raise: Pro 705.09 ▲9.3% (644.85) 11.18 ▼1.7% (11.37) 17.86 ▼2.7% (18.36) 3xx=21303
/server_side_hello_world_es5: Pro 345.18 ▲2.2% (337.88) 16.49 ▼23.1% (21.43) 30.41 ▼11.7% (34.44) 200=10501
/server_side_hello_world_with_options: Pro 347.39 ▲4.9% (331.01) 22.85 ▲5.0% (21.77) 35.19 ▲3.7% (33.94) 200=10497
/client_side_manual_render: Pro 379.56 ▲4.5% (363.33) 19.79 ▲1.1% (19.58) 30.26 ▼1.1% (30.59) 200=11470
/react_router: Pro 419.47 ▲7.5% (390.24) 17.41 ▼2.1% (17.77) 30.35 ▲2.7% (29.56) 200=12592
/css_modules_images_fonts_example: Pro 352.44 ▲4.8% (336.42) 25.01 ▲15.6% (21.63) 33.27 ▼0.9% (33.56) 200=10582
/rendered_html: Pro 286.34 ▼15.4% (338.45) 20.1 ▼5.6% (21.29) 62.61 ▲89.4% (33.06) 200=8653
/react_helmet: Pro 337.62 ▲2.0% (331.06) 23.33 ▲4.3% (22.38) 33.55 ▲0.8% (33.3) 200=10201
/image_example: Pro 353.88 ▲9.0% (324.6) 16.08 ▼28.3% (22.44) 29.87 ▼14.3% (34.85) 200=10696
/posts_page: Pro 64.91 ▼89.4% (615.2) 87.63 ▲177.0% (31.63) 154.17 ▲246.6% (44.48) 200=1965

▲/▼ non-zero change vs baseline · 0.0% exact/near-zero match · 🔴 significant regression · 🟢 significant improvement (tracked measures) · (n) = baseline

@justin808 justin808 merged commit 15ecd1e into main Jun 6, 2026
47 checks passed
@justin808 justin808 deleted the fix/3633-pro-dummy-integration-gating branch June 6, 2026 10:06
justin808 added a commit that referenced this pull request Jun 6, 2026
Bencher posted benchmark results on PR #3697, a CI-config-only change
(.github/workflows/pro-integration-tests.yml + the detector's own test
harness). Root cause: benchmark.yml derived which suites to run from the
test run_* flags, and those flags are deliberately inflated by CI
plumbing — the CI-infrastructure catch-all (script/, bin/, actions) ran
the full suite "for safety," and the suite-specific workflow YAML files
(pro-integration-tests.yml → PRO_DUMMY, node-renderer-tests.yml →
PRO_NODE_RENDERER, …) map to test categories. None of that can move
runtime performance, so the benchmark runs were pure noise.

Decouple benchmark selection from test selection:

- The detector now tracks BENCH_CORE / BENCH_PRO / BENCH_PRO_NODE_RENDERER,
  set ONLY by genuine product-source arms (gem/app/package/dummy code) and
  the uncategorized catch-all. CI plumbing (new CI_INFRA_CHANGED flag) and
  the suite-workflow YAML files (is_benchmark_irrelevant_path guard) set no
  BENCH flag, so they still run their TESTS but never benchmarks. The
  targeted per-workflow test mapping (ci-optimization.md) is preserved.
- The detector emits run_core_benchmarks / run_pro_benchmarks /
  run_pro_node_renderer_benchmarks; benchmark.yml reads those directly
  instead of re-deriving from run_* test flags. The mapping mirrors the
  old derivation exactly for every genuine-source category, so legit
  benchmark runs are unchanged — only CI-plumbing-triggered runs are
  dropped. push-to-main, workflow_dispatch, and the 'benchmark' PR label
  still run suites regardless.

Adds regression tests for the #3697 file set, a workflow-only change
(tests yes, benchmarks no), CI-infra + real source (benchmarks yes), and
node-renderer source (node-renderer benchmark yes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
justin808 added a commit that referenced this pull request Jun 7, 2026
## Why

Bencher kept posting benchmark results on PRs that can't possibly affect
runtime performance. Two reported cases:

-
**[#3717](#3717 (comment)
— a markdown/agent-tooling PR whose only non-`.md` file was a new
`.claude/skills` symlink.
-
**[#3697](#3697 (comment)
— a CI-config-only PR (`.github/workflows/pro-integration-tests.yml` +
the detector's own test harness).

Both slipped through because the change detector
(`script/ci-changes-detector`) is the single gate feeding every test
workflow **and** the benchmark matrix.

### Root cause 1 — non-`.md` repo metadata hits the catch-all (#3717)
The extensionless `.claude/skills` symlink matched no detector category
→ fell into the `*` catch-all ("run everything for safety") →
`non_runtime_only=false` → full suite + benchmarks. Same class as #3597.

### Root cause 2 — benchmarks piggyback on the test run-flags (#3697)
`benchmark.yml` derived which suites to run from the test `run_*` flags.
Those flags are intentionally inflated by CI plumbing: the
CI-infrastructure catch-all (`script/`, `bin/`, actions) runs the full
suite, and the suite-specific workflow YAMLs map to test categories
(`pro-integration-tests.yml` → `PRO_DUMMY`, `node-renderer-tests.yml` →
`PRO_NODE_RENDERER`, …). None of that moves runtime performance, but it
all dragged in benchmarks.

## Fix

**Commit 1 — agent/editor tooling is non-runtime.** Categorize
`.claude/**`, `.agents/**`, `.cursor/**` alongside
docs/`internal/**`/issue-templates, so agent-tooling PRs are
`non_runtime_only` (no tests, no benchmarks).

**Commit 2 — decouple benchmark selection from test selection.**
- The detector tracks `BENCH_CORE` / `BENCH_PRO` /
`BENCH_PRO_NODE_RENDERER`, set **only** by genuine product-source arms
(gem/app/package/dummy code) and the uncategorized catch-all. CI
plumbing (new `CI_INFRA_CHANGED` flag) and the suite-workflow YAMLs
(`is_benchmark_irrelevant_path` guard) set no `BENCH` flag — they still
run their **tests** but never benchmarks. The targeted per-workflow test
mapping (`ci-optimization.md`) is preserved.
- The detector emits `run_core_benchmarks` / `run_pro_benchmarks` /
`run_pro_node_renderer_benchmarks`; `benchmark.yml` reads them directly
instead of re-deriving from `run_*`. The mapping mirrors the old
derivation **exactly** for every genuine-source category, so legitimate
benchmark runs are unchanged — only CI-plumbing-triggered runs are
dropped.
- `push`-to-main, `workflow_dispatch`, and the `benchmark` PR label
still run suites regardless (the label is the escape hatch for
validating `benchmark.yml`/workflow edits).

## Verification

- Replayed both PRs' exact file sets against the patched detector: #3717
→ `non_runtime_only: true`; #3697 → all three `run_*_benchmarks: false`
while tests still run.
- Confirmed legit benchmark behavior is unchanged: core source → all 3
suites; node-renderer source → pro + node-renderer; pro-ruby → pro only;
a suite-workflow-only edit → its tests but no benchmark.
- `script/ci-changes-detector-test.bash`: **43 run, 0 failed** (added
regressions for both root causes). `actionlint` clean on
`benchmark.yml`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* CI now separates benchmark gating from test-selection, treats
additional agent/editor/tooling paths as docs/non-runtime-only, treats
CI-infrastructure edits as infra-only (forcing tests but suppressing
benchmarks), and exposes per-suite benchmark run flags in CI outputs
while preventing docs-only when CI infra changes.

* **Tests**
* Added regression tests covering docs/tooling paths, CI-infra-only
edits, suite-specific gating, combined infra+runtime edits, and
node-renderer benchmark behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Changes central CI gating logic for all workflows and benchmarks;
behavior is heavily regression-tested but misclassification could skip
needed benchmarks or still run Bencher on non-runtime PRs.
> 
> **Overview**
> Stops Bencher from running on PRs that only touch docs, agent tooling,
or CI config by tightening **what counts as non-runtime** and **when
benchmarks are selected**.
> 
> **`script/ci-changes-detector`** now treats `.claude/**`,
`.agents/**`, and `.cursor/**` like docs (fixes extensionless paths such
as `.claude/skills` hitting the catch-all). CI plumbing uses a dedicated
**`CI_INFRA_CHANGED`** flag: it still forces the full **test** matrix
but does **not** set benchmark flags. Benchmark gating is decoupled from
test `run_*` flags via **`BENCH_*`** flags and new outputs
**`run_core_benchmarks`**, **`run_pro_benchmarks`**, and
**`run_pro_node_renderer_benchmarks`**; workflow YAML under
`.github/workflows/*` is excluded from benchmark relevance via
**`is_benchmark_irrelevant_path`**.
> 
> **`.github/workflows/benchmark.yml`** drops inline benchmark
derivation and reads those detector outputs directly.
**`ci-commands.yml`** aligns the docs-only heuristic with the same
agent/editor paths.
> 
> Regression coverage in **`script/ci-changes-detector-test.bash`**
covers agent tooling, CI-only vs workflow-only edits, mixed CI + source,
and node-renderer benchmarks.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
92ba2bd. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

## Codex Decision Log

- Full-matrix validation: accepted CodeRabbit's current-head nit and
added the existing `full-ci` label so detector/benchmark-gating changes
receive full CI coverage before merge.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CI: Pro dummy integration tests skip on Pro-dummy-only changes (gated on run_pro_tests, not run_pro_dummy_tests)

2 participants