Skip to content

Spike: Prism-based Gemfile rewriter for Pro migration (#3313)#3349

Merged
justin808 merged 7 commits into
mainfrom
jg-conductor/3313-fix
Jun 2, 2026
Merged

Spike: Prism-based Gemfile rewriter for Pro migration (#3313)#3349
justin808 merged 7 commits into
mainfrom
jg-conductor/3313-fix

Conversation

@justin808

@justin808 justin808 commented May 21, 2026

Copy link
Copy Markdown
Member

Summary

Spike for #3313: a working Prism-based Gemfile rewriter prototype, a 32-case behavior matrix, a parse+rewrite benchmark vs. the current scanner, and a decision record.

All artifacts live under react_on_rails/spike/3313_prism_gemfile_rewriter/ and the directory is excluded from RuboCop — this is exploratory code, not production, and lib/ is untouched (per the issue's "Do not rewrite #3232" non-goal).

What the spike found

  • A Prism prototype is viable and produces cleaner output for the conditional case the issue calls out: if ENV["PRO"]; gem_pro; else; gem_base; end collapses to a single gem "react_on_rails_pro" declaration instead of leaving an ugly empty else.
  • The Prism implementation is also smaller (~210 lines) than the current scanner+rewrite path (~500 lines combined).
  • Performance is sub-millisecond either way (0.012ms vs. 0.015ms small Gemfile, 0.285ms vs. 0.304ms on 300 lines). Not a decision factor.
  • The one real cost is adding prism as a runtime dependency for the ~30% of users still on Ruby 3.0–3.2 (Prism is bundled with CRuby 3.3+).

Recommendation

Keep the current scanner. Revisit when react_on_rails drops Ruby 3.0–3.2 support — at that point Prism becomes a free runtime dep and the cutover is a 4–8 hour job using the prototype as the starting point.

Full reasoning, behavior matrix, performance numbers, conditional/empty-branch policy, and parse-failure policy in DECISION.md.

Not closing #3313

The issue is a discussion/enhancement asking for a decision record. This PR delivers that decision record. The maintainer can close #3313 once the recommendation is accepted (or push back / pick a different path).

Test plan

  • bundle exec rspec react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb — 32 examples, 0 failures
  • bundle exec ruby react_on_rails/spike/3313_prism_gemfile_rewriter/benchmark.rb — clean output, ratios within expected range
  • bundle exec rubocop react_on_rails/lib/react_on_rails/pro_migration.rb — RuboCop config change does not break linting of the production code path

🤖 Generated with Claude Code


Note

Low Risk
No production migration or lib changes; only spike artifacts plus gem/RuboCop exclusions for non-shipped code.

Overview
Adds an exploratory spike under react_on_rails/spike/3313_prism_gemfile_rewriter/ for issue #3313: a Prism-based Gemfile rewriter prototype, a large behavior-matrix spec, a benchmark vs. the existing text scanner, and DECISION.md recommending keep the current scanner until the gem’s Ruby floor is 3.3+ (avoid a runtime prism dep on 3.0–3.2).

Packaging / tooling: spike/ is excluded from the published gem (react_on_rails.gemspec) and from RuboCop (.rubocop.yml). lib/ and the shipped Pro migration rewrite are unchanged.

Reviewed by Cursor Bugbot for commit adef418. Bugbot is set up for automated code reviews on this repo. Configure here.

Summary by CodeRabbit

  • New Features
    • Added an experimental Gemfile-rewriter prototype and benchmark to evaluate a Prism-based rewrite approach.
  • Chores
    • Excluded spike files from linting and from gem packaging.
  • Documentation
    • Added a DECISION record and README describing prototype behavior, compatibility notes, benchmarks, and next steps.
  • Tests
    • Added an extensive spike test suite covering many Gemfile rewrite scenarios, edge cases, and parse-failure handling.

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 21, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a Prism-based Gemfile rewriter prototype, decision record, README, benchmark, RSpec spike tests covering the behavior matrix and parse-failure policy, and excludes spike files from linting and gem packaging.

Changes

Prism Gemfile Rewriter Spike Evaluation

Layer / File(s) Summary
Spike context, README, decision, and packaging
react_on_rails/.rubocop.yml, react_on_rails/spike/3313_prism_gemfile_rewriter/README.md, react_on_rails/spike/3313_prism_gemfile_rewriter/DECISION.md, react_on_rails/react_on_rails.gemspec
Adds RuboCop exclusion for spike/**/*, README describing the non-production prototype and run instructions, a comprehensive decision record (approach, behavior matrix, parse-failure policy, Ruby/Prism compatibility, performance notes, follow-up checklist), and excludes spike/ from gem packaging.
Core Prism-Based Rewriter Implementation
react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb
PrismGemfileRewriter parses Gemfile with Prism, locates gem calls targeting react_on_rails, generates byte-offset edits to replace or remove base declarations (inserting default Pro version when appropriate), applies edits safely, and performs iterative collapse/removal of dead if/unless branches while preserving indentation and comments.
Test Coverage Matrix
react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb
RSpec spike suite exercising single-line and multiline declarations, version pins and pessimistic pins, quote/paren preservation, options/kwargs, continuations and comments, parenthesized forms with postfix guards and trailing comments, nested parens in values, duplicates and cross-group cases, conditional collapse behavior, trailing-comma edge case, UTF-8/byte-offset correctness, and parse-failure policy.
Performance Benchmark and Comparison
react_on_rails/spike/3313_prism_gemfile_rewriter/benchmark.rb
Benchmark script comparing PrismGemfileRewriter to a simplified ScannerRewriter on small/medium/large Gemfile fixtures with warmup, total timings, ratios, and per-rewrite averages.
sequenceDiagram
  participant Rewriter as PrismGemfileRewriter
  participant Parser as Prism Parser
  participant Collector as Gem Call Collector
  participant Editor as Edit Generator
  participant Applier as Edit Applier
  participant Collapser as Conditional Collapser
  Rewriter->>Parser: parse(source)
  Parser-->>Rewriter: AST
  Rewriter->>Collector: collect gem nodes for react_on_rails / react_on_rails_pro
  Collector-->>Rewriter: node locations + literals
  Rewriter->>Editor: build replacement/removal edits
  Editor-->>Rewriter: edits (byte ranges + replacements)
  Rewriter->>Applier: apply_edits(source, edits)
  Applier-->>Rewriter: spliced source
  Rewriter->>Collapser: collapse_dead_conditionals(updated_source)
  Collapser-->>Rewriter: final source
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

enhancement

Poem

🐰 I hopped through Gemfiles, bytes in my paw,
Parsed with Prism, kept comments and law,
Benchmarks whispered, tests lined the track,
Collapsed empty branches, left nothing to hack.
A little spike hop — then back to the stack.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 2.13% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: a spike prototype for a Prism-based Gemfile rewriter for Pro migration, with clear reference to the issue number (#3313).
Linked Issues check ✅ Passed The PR comprehensively addresses all key objectives: working prototype with location-based source edits, behavior matrix with 32+ test cases covering all acceptance criteria, performance benchmarks vs. existing scanner, decision record with findings and recommendations, error-tolerance policy, and conditional-collapse policy.
Out of Scope Changes check ✅ Passed All changes are in-scope spike artifacts: prototype code in spike/ directory, RuboCop exclusion for spike/, gemspec exclusion of spike/, and supporting tests/benchmarks. No modifications to production code or #3232 implementation.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch jg-conductor/3313-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c4851ca25c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

end

def statement_byte_range(source, node)
start_offset = line_start_offset(source, node.location.start_offset)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Limit stale-gem deletion to statement, not full line

When has_active_pro is true, statement_byte_range starts deletion at the beginning of the line and ends at newline, so a base gem on a shared line (e.g. semicolon-separated declarations or one-line conditionals) removes unrelated statements too, including react_on_rails_pro. For example, gem "react_on_rails_pro"; gem "react_on_rails" is rewritten to an empty line, which can silently drop required dependencies.

Useful? React with 👍 / 👎.


def has_user_version_pin?(call)
positional_args = call.arguments.arguments.drop(1).reject { |a| a.is_a?(Prism::KeywordHashNode) }
positional_args.any? { |a| a.is_a?(Prism::StringNode) }

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Treat any positional arg as an existing version requirement

has_user_version_pin? only recognizes Prism::StringNode pins, so non-literal requirements (for example gem "react_on_rails", SOME_VERSION) are treated as unpinned and get an extra default version injected. That changes dependency semantics by adding an additional constraint and can produce conflicting or invalid requirements after rewrite.

Useful? React with 👍 / 👎.

@claude

claude Bot commented May 21, 2026

Copy link
Copy Markdown
Contributor

Code Review

Summary: Well-executed research spike that directly answers #3313. The decision record is thorough, the recommendation is clearly reasoned, and the prototype does real work. Net change to the production surface is exactly one RuboCop exclusion line — appropriately scoped for exploratory work.


What works well

  • Architecture of the rewriter: Parse → collect CallNodes → reverse-sorted byte-offset splices is the standard correct approach for source rewriting without an AST formatter. apply_edits correctly sorts in descending start_offset order to avoid offset invalidation after each splice.
  • collapse_dead_conditionals loop: Re-parsing after each edit is heavier than a single-pass approach but entirely appropriate for a once-per-migration tool. The loop exit on parse_result.failure? is a reasonable safety net.
  • Behavior matrix: 32 cases including the exact edge cases that stress the current scanner (multiline continuations, inline-comment continuations, postfix guards, nested parens, duplicate declarations across groups). The expect_parseable_ruby helper validating output is a nice touch.
  • Parse-failure policy: Returning original content untouched with structured errors is the right call and is tested.
  • Decision record: The DECISION.md is genuinely useful — it covers approach, behavior differences, conditional-collapse policy, compatibility table, performance numbers, and an explicit migration checklist. Future maintainers can pick this up without re-running the spike.

Issues worth noting before promotion to lib/

A few things that are fine in spike context but should be addressed if/when this moves to production (per DECISION.md step 1–7):

1. enclosing_statement_node is a documented no-op

enclosing_statement_node returns call unconditionally. For the removal path this currently works because statement_byte_range walks forward from node.location.end_offset to the line's newline — which captures if ENV["X"] in postfix-modifier cases. But Prism wraps gem "foo" if cond in an IfNode, meaning the CallNode's end_offset stops at the closing quote of the gem name, not at the end of the condition. The forward-scan compensates, but only for single-line modifiers. A contrived multiline postfix (unusual but syntactically valid in some forms) could leave a dangling condition. See inline comment.

2. collapse_dead_conditionals silently returns partially-modified source on internal error

If an edit produces invalid Ruby mid-loop (shouldn't happen in practice, but), break source if parse_result.failure? returns the partially-modified content with parse_failed: false from the outer rewrite call. The outer result struct doesn't surface this. For production, this path should either set parse_failed: true on the outer result or raise.

3. Benchmark double-measures without warm-up on the second pass

The second measurement block (per-rewrite cost table, lines ~462–470) re-runs without the 3-iteration warm-up. On a cold JIT the first case's result will be inflated vs. the first table. Minor, but worth noting if the numbers are ever cited.

4. unless conditional path is untested

find_collapse_edit explicitly handles Prism::UnlessNode, but no spec exercises it. Easy to add before promotion.


Decision record feedback

The recommendation to wait is sound. One addition worth making explicit in the record before closing #3313: the Ruby floor milestone is also a signal to review the parser gem (whitequark) transitive dependency, since some projects bring it in through other tooling and a future Prism::Translation::Parser migration could land at the same time. Not a blocker, just a colocated decision.


Verdict: Approve as a spike/decision record. No production code is affected. The inline comments below flag specific lines for the future implementation PR, not for this one.

@greptile-apps

greptile-apps Bot commented May 21, 2026

Copy link
Copy Markdown

Greptile Summary

This PR delivers the decision record requested by #3313: a working Prism-based Gemfile rewriter prototype, a 32-case behavior matrix, a parse-and-rewrite benchmark, and the recommendation to keep the current scanner until react_on_rails drops Ruby 3.0–3.2 support (at which point Prism is a free stdlib dep). All new files live under spike/3313_prism_gemfile_rewriter/ and are excluded from RuboCop; lib/ is untouched.

  • prism_gemfile_rewriter.rb implements parse → locate gem CallNodes → byte-offset splices, plus a second-pass collapse_dead_conditionals loop that resolves the empty-else ugliness the current scanner leaves behind.
  • prism_gemfile_rewriter_spec.rb provides a 32-case matrix (single-line, multiline, parenthesized, kwargs, conditional collapse, parse-failure policy) that all pass.
  • benchmark.rb confirms both approaches are sub-millisecond and the performance difference is not a decision factor.

Confidence Score: 4/5

Safe to merge — all changes are in a clearly-isolated spike directory and the only production file touched is a one-line .rubocop.yml exclusion.

The prototype code is well-structured and the 32-case spec gives good confidence in its logic. Two minor gaps worth noting before a future production port: enclosing_statement_node is an unfulfilled stub whose misleading comment could confuse a future implementer, and has_user_version_pin? injects a default version alongside git:/path: source options. Neither affects the spike conclusions or the production lib/ code.

prism_gemfile_rewriter.rb lines 603-631 for the two clarity/logic notes; all other files are straightforward.

Important Files Changed

Filename Overview
react_on_rails/.rubocop.yml Adds spike/**/* glob to the RuboCop AllCops Exclude list so the exploratory prototype is not linted. Change is minimal and correct.
react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb Prism AST-based Gemfile rewriter prototype. Core logic is sound; two code clarity concerns: enclosing_statement_node is an unfulfilled stub whose comment describes unimplemented parent-pointer logic, and has_user_version_pin? does not treat source options (git:, path:, github:) as implicit pins, causing the default version to be injected alongside a git/path source.
react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb 32-case behavior matrix covering all shapes from issue #3313. Comprehensive coverage including postfix guards, multiline continuations, parenthesized calls, conditional collapse, and parse-failure policy.
react_on_rails/spike/3313_prism_gemfile_rewriter/benchmark.rb Compares scanner vs Prism parse+rewrite timing across three Gemfile sizes. The per-rewrite cost summary block re-runs measurements without its own warm-up iterations.
react_on_rails/spike/3313_prism_gemfile_rewriter/DECISION.md Well-structured decision record covering behavior matrix, conditional-collapse policy, parse-failure policy, compatibility notes, performance numbers, and a clear recommendation.
react_on_rails/spike/3313_prism_gemfile_rewriter/README.md Minimal orientation document explaining spike scope, listing files, and providing run commands. No issues.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[rewrite source] --> B{Prism.parse}
    B -- failure? --> C[Return original + parse_failed: true]
    B -- success --> D[collect_gem_calls\nbase_calls / pro_calls]
    D --> E{base_calls empty?}
    E -- yes --> F[Return source unchanged]
    E -- no --> G{pro_calls any?}
    G -- yes: has_active_pro --> H[removal_edit per base call\nline_start to next newline]
    G -- no --> I[replacement_edit per base call\nreplace first StringNode only]
    H --> J[apply_edits reverse order]
    I --> J
    J --> K{has_active_pro?}
    K -- yes --> L[collapse_dead_conditionals\nloop + re-parse]
    K -- no --> M[Return Result]
    L --> N{find_collapse_edit}
    N -- found --> O{single Pro gem\nin surviving branch?}
    O -- yes --> P[collapse entire conditional\nto gem declaration]
    O -- no --> Q[remove empty branch only]
    P --> L
    Q --> L
    N -- none --> M
Loading

Comments Outside Diff (3)

  1. react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb, line 625-631 (link)

    P2 enclosing_statement_node is a documented stub whose comment describes logic that was never implemented. The comment says "we need to detect a postfix modifier by looking at the source slice" because "Prism's location for the CallNode does NOT include the modifier" — but then the method just returns call unconditionally. Removal correctness is rescued only because statement_byte_range independently scans forward to the newline, picking up any trailing modifier text. For a future implementer reading this code, the gap between the comment and the body is misleading: it reads as incomplete work rather than an intentional design. Consider either completing the parent-pointer walk or replacing the comment with a clear note that statement_byte_range already handles postfix modifiers by scanning to EOL.

  2. react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb, line 603-606 (link)

    P2 has_user_version_pin? treats source options (git:, path:, github:) as "no pin", injecting the default version alongside them. For gem "react_on_rails", git: "https://...", the rewriter produces gem "react_on_rails_pro", "~> 16.7", git: "https://...". The version constraint is meaningless next to a git: source and is semantically odd next to a path: source. This matches the current scanner's behaviour, so it is not a regression, and the spec explicitly documents it. However, the production implementation listed in DECISION.md should handle this case: treat the presence of any of git:, path:, or github: as an implicit pin to avoid inserting a redundant version argument.

  3. react_on_rails/spike/3313_prism_gemfile_rewriter/benchmark.rb, line 463-470 (link)

    P2 "Per-rewrite cost" section re-runs measurements without its own warm-up. The three warm-up iterations in the first block do not carry over here, so the first entry in the second block could include first-call overhead. In practice the first block already warms up both rewriters for every fixture, so the effect is negligible — but if someone reads the two printed tables in isolation they may not realise the per-rewrite numbers come from an unwarmed second pass.

Reviews (1): Last reviewed commit: "docs(spike): evaluate Prism-based Gemfil..." | Re-trigger Greptile

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@react_on_rails/spike/3313_prism_gemfile_rewriter/benchmark.rb`:
- Line 95: The long `%w[...]` iterator lines (the array literal followed by
`.each do |g|`) exceed the 120-char limit; split the array across multiple lines
or assign it to a local constant/variable (e.g., gems = %w[ ... ]) and then call
`.each do |g|` on the next line so the array elements are wrapped, and do the
same fix for the second occurrence around line 103 to keep each line under 120
characters.

In
`@react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb`:
- Line 21: The failing spec has overly long expectation strings (the line
building "expected parseable Ruby, got errors:
#{parse_result.errors.map(&:message).join(', ')}" and a similar long string at
line 300); break these into multiple concatenated string literals or use
multi-line string interpolation so each line stays under 120 chars. Locate the
expectation that references parse_result.errors.map(&:message).join(', ') in
prism_gemfile_rewriter_spec.rb and split the message into two or more quoted
pieces (or interpolate the joined errors into a shorter variable) so the
combined output is unchanged but no source line exceeds 120 characters; apply
the same wrap strategy to the other long expectation at line 300.

In `@react_on_rails/spike/3313_prism_gemfile_rewriter/README.md`:
- Around line 20-24: The fenced code block in README.md containing the shell
commands is missing a language identifier; update the opening triple-backticks
to include a language tag (e.g., change "```" to "```bash") for the block that
contains the three commands (cd react_on_rails, bundle exec rspec ..., bundle
exec ruby ...) so markdownlint MD040 is satisfied.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7a300e3c-bf9f-4384-bd9f-454676ab6a70

📥 Commits

Reviewing files that changed from the base of the PR and between 5a94d40 and c4851ca.

📒 Files selected for processing (6)
  • react_on_rails/.rubocop.yml
  • react_on_rails/spike/3313_prism_gemfile_rewriter/DECISION.md
  • react_on_rails/spike/3313_prism_gemfile_rewriter/README.md
  • react_on_rails/spike/3313_prism_gemfile_rewriter/benchmark.rb
  • react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb
  • react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb

Comment thread react_on_rails/spike/3313_prism_gemfile_rewriter/benchmark.rb Outdated
Comment thread react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb Outdated
Comment thread react_on_rails/spike/3313_prism_gemfile_rewriter/README.md Outdated
Comment on lines +149 to +155
def enclosing_statement_node(call)
# We do not have an explicit parent pointer; the caller passes the CallNode, so
# we need to detect a postfix modifier by looking at the source slice. Prism's
# location for the CallNode does NOT include the modifier, so we re-locate by
# inspecting the source after the call's end_offset.
call
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stub works today because statement_byte_range does a forward-scan from node.location.end_offset to the next \n, which incidentally captures any postfix modifier text ( if ENV["X"]). The forward-scan compensates correctly as long as postfix modifiers are single-line, which is always true in practice.

For the production promotion, the right fix is to build a parent_map while collecting gem calls so you can walk upward to the actual IfNode / UnlessNode wrapper:

def collect_gem_calls(node, base_calls, pro_calls, parent_map = {})
  return unless node
  node.compact_child_nodes.each do |child|
    parent_map[child] = node
    if gem_call?(child)
      # ...
    end
    collect_gem_calls(child, base_calls, pro_calls, parent_map)
  end
end

Not a concern for the spike, but noting it for the DECISION.md step 3 implementation.

Comment on lines +195 to +204
def collapse_dead_conditionals(source)
loop do
parse_result = Prism.parse(source)
break source if parse_result.failure?

edit = find_collapse_edit(source, parse_result.value)
break source unless edit

source = source[0...edit[:start_offset]] + edit[:replacement] + source[edit[:end_offset]..]
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If any of the splice edits produce invalid Ruby (shouldn't happen with well-formed input, but defensive to consider), this loop exits on parse_result.failure? and returns the partially-modified source. The outer rewrite call then returns that content with parse_failed: false, which could mislead the caller.

For production, consider threading a failure flag back out:

def collapse_dead_conditionals(source)
  loop do
    parse_result = Prism.parse(source)
    return [source, true] if parse_result.failure?   # signal internal failure
    edit = find_collapse_edit(source, parse_result.value)
    return [source, false] unless edit
    source = source[0...edit[:start_offset]] + edit[:replacement] + source[edit[:end_offset]..]
  end
end

Then the outer rewrite can set parse_failed: true on the result. Fine to leave as-is for the spike.

Comment on lines +147 to +154
puts "Per-rewrite cost (scanner / prism):"
GEMFILES.each do |label, src|
scanner_seconds = measure(ITERATIONS) { scanner.rewrite(src) }
prism_seconds = measure(ITERATIONS) { prism.rewrite(src) }
printf(" %-50s %8.3fms / %8.3fms\n",
"#{label} (#{src.lines.size}l)",
(scanner_seconds / ITERATIONS) * 1000,
(prism_seconds / ITERATIONS) * 1000)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This second measurement loop re-measures without the 3-iteration warm-up that the first loop uses (line 130), so the first Gemfile in this block will see first-call overhead that the first table avoided. The numbers in DECISION.md come from this block ("Per-rewrite cost"), so they may be slightly pessimistic for both implementations on the first entry.

Easy fix for when the numbers are formally cited:

puts "Per-rewrite cost (scanner / prism):"
GEMFILES.each do |label, src|
  3.times { scanner.rewrite(src); prism.rewrite(src) }  # warm up
  scanner_seconds = measure(ITERATIONS) { scanner.rewrite(src) }
  prism_seconds   = measure(ITERATIONS) { prism.rewrite(src) }
  printf("  %-50s %8.3fms / %8.3fms\n",
         "#{label} (#{src.lines.size}l)",
         (scanner_seconds / ITERATIONS) * 1000,
         (prism_seconds / ITERATIONS) * 1000)
end

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for all 3 issues found in the latest run.

  • ✅ Fixed: UnlessNode lacks subsequent method, crashes at runtime
    • Added an IfNode/UnlessNode else-clause helper so conditional collapse no longer calls subsequent on UnlessNode.
  • ✅ Fixed: Spike files included in published gem package
    • Updated the gemspec file filter to exclude the spike/ directory from packaged gem files.
  • ✅ Fixed: Byte offsets used as character indices corrupts non-ASCII files
    • Replaced character-index slicing and newline scans with byte-based helpers using byteslice, bytesize, and getbyte.

Create PR

Or push these changes by commenting:

@cursor push e8672e795d
Preview (e8672e795d)
diff --git a/react_on_rails/react_on_rails.gemspec b/react_on_rails/react_on_rails.gemspec
--- a/react_on_rails/react_on_rails.gemspec
+++ b/react_on_rails/react_on_rails.gemspec
@@ -26,7 +26,7 @@
 
   s.files = Dir.chdir(__dir__) do
     `git ls-files -z`.split("\x0").reject do |f|
-      f.match(%r{^(spec|tmp)/})
+      f.match(%r{^(spec|tmp|spike)/})
     end
   end
   s.bindir        = "exe"

diff --git a/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb b/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb
--- a/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb
+++ b/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb
@@ -13,6 +13,7 @@
     class PrismGemfileRewriter
       BASE_GEM_NAME = "react_on_rails"
       PRO_GEM_NAME = "react_on_rails_pro"
+      NEWLINE_BYTE = "\n".ord
 
       Result = Struct.new(
         :content,
@@ -130,7 +131,7 @@
       end
 
       def quote_char(source, string_node)
-        source[string_node.location.start_offset]
+        source.byteslice(string_node.location.start_offset, 1)
       end
 
       def removal_edit(source, call)
@@ -160,18 +161,28 @@
         # Walk forward from the node's end_offset to the end of the source line so that
         # we also delete trailing modifier text (e.g. `if ENV["X"]`) and the newline.
         scan = node.location.end_offset
-        scan += 1 while scan < source.length && source[scan] != "\n"
-        scan += 1 if scan < source.length && source[scan] == "\n"
+        scan += 1 while scan < source.bytesize && source.getbyte(scan) != NEWLINE_BYTE
+        scan += 1 if scan < source.bytesize && source.getbyte(scan) == NEWLINE_BYTE
         [start_offset, scan]
       end
 
       def apply_edits(source, edits)
         sorted = edits.sort_by { |e| -e[:start_offset] }
         sorted.reduce(source) do |acc, edit|
-          acc[0...edit[:start_offset]] + edit[:replacement] + acc[edit[:end_offset]..]
+          splice_source(acc, edit[:start_offset], edit[:end_offset], edit[:replacement])
         end
       end
 
+      def byte_slice(source, start_offset, end_offset)
+        source.byteslice(start_offset, end_offset - start_offset) || ""
+      end
+
+      def splice_source(source, start_offset, end_offset, replacement)
+        prefix = source.byteslice(0, start_offset) || ""
+        suffix = source.byteslice(end_offset, source.bytesize - end_offset) || ""
+        prefix + replacement + suffix
+      end
+
       # After removing base gem statements, conditionals like
       #   if ENV["PRO"]
       #     gem "react_on_rails_pro", "16.0.0"
@@ -200,7 +211,7 @@
           edit = find_collapse_edit(source, parse_result.value)
           break source unless edit
 
-          source = source[0...edit[:start_offset]] + edit[:replacement] + source[edit[:end_offset]..]
+          source = splice_source(source, edit[:start_offset], edit[:end_offset], edit[:replacement])
         end
       end
 
@@ -226,8 +237,9 @@
         return nil unless node.respond_to?(:end_keyword_loc) && node.end_keyword_loc
 
         then_branch = node.statements
-        has_else_node = node.subsequent.is_a?(Prism::ElseNode)
-        else_statements = has_else_node ? node.subsequent.statements : nil
+        else_node = conditional_else_node(node)
+        has_else_node = !else_node.nil?
+        else_statements = has_else_node ? else_node.statements : nil
 
         then_empty = branch_empty?(then_branch)
         else_empty = has_else_node && branch_empty?(else_statements)
@@ -250,7 +262,7 @@
         return nil unless single_pro_gem_call?(branch_statements)
 
         gem_call = branch_statements.body.first
-        gem_text = source[gem_call.location.start_offset...gem_call.location.end_offset]
+        gem_text = byte_slice(source, gem_call.location.start_offset, gem_call.location.end_offset)
         # Preserve indentation of the original conditional.
         indent = leading_indentation(source, conditional.location.start_offset)
         {
@@ -270,9 +282,10 @@
 
       def remove_empty_else_branch(source, node)
         # Find the `else` keyword location and remove from there to just before `end`.
-        return nil unless node.subsequent.is_a?(Prism::ElseNode)
+        else_node = conditional_else_node(node)
+        return nil unless else_node
 
-        else_loc = node.subsequent.else_keyword_loc
+        else_loc = else_node.else_keyword_loc
         end_loc = node.end_keyword_loc
         return nil unless else_loc && end_loc
 
@@ -294,21 +307,29 @@
 
       def leading_indentation(source, offset)
         line_start = line_start_offset(source, offset)
-        slice = source[line_start...offset]
+        slice = byte_slice(source, line_start, offset)
         slice[/\A\s*/].to_s
       end
 
       def line_start_offset(source, offset)
         return 0 if offset.zero?
 
-        idx = source.rindex("\n", offset - 1)
+        idx = source.b.rindex("\n".b, offset - 1)
         idx ? idx + 1 : 0
       end
 
       def line_end_offset(source, offset)
-        idx = source.index("\n", offset)
-        idx ? idx + 1 : source.length
+        idx = source.b.index("\n".b, offset)
+        idx ? idx + 1 : source.bytesize
       end
+
+      def conditional_else_node(node)
+        if node.is_a?(Prism::IfNode)
+          node.subsequent if node.subsequent.is_a?(Prism::ElseNode)
+        elsif node.is_a?(Prism::UnlessNode)
+          node.else_clause
+        end
+      end
     end
   end
 end

diff --git a/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb b/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb
--- a/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb
+++ b/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb
@@ -54,6 +54,18 @@
         result = rewriter.rewrite(src)
         expect(result.content).to eq("gem \"react_on_rails_pro\", \"#{default_pro_version}\"\n")
       end
+
+      it "rewrites declarations after UTF-8 content using byte offsets" do
+        src = <<~RUBY
+          # Dépendances
+          gem "react_on_rails", "~> 16.0"
+        RUBY
+        result = rewriter.rewrite(src)
+        expect(result.content).to eq(<<~RUBY)
+          # Dépendances
+          gem "react_on_rails_pro", "~> 16.0"
+        RUBY
+      end
     end
 
     context "kwargs and options" do
@@ -286,6 +298,24 @@
         expect_parseable_ruby(result.content)
       end
 
+      it "does not crash on unrelated unless blocks when removing stale base" do
+        src = <<~RUBY
+          unless ENV["CI"]
+            gem "pry"
+          end
+          gem "react_on_rails", "~> 16.0"
+          gem "react_on_rails_pro", "~> 16.0"
+        RUBY
+        result = rewriter.rewrite(src)
+        expect(result.content).to eq(<<~RUBY)
+          unless ENV["CI"]
+            gem "pry"
+          end
+          gem "react_on_rails_pro", "~> 16.0"
+        RUBY
+        expect_parseable_ruby(result.content)
+      end
+
       it "rewrites base declarations in both conditional branches" do
         src = <<~RUBY
           if ENV["ROR_EDGE"]
@@ -318,6 +348,23 @@
         expect_parseable_ruby(result.content)
       end
 
+      it "collapses stale base conditionals after UTF-8 content using byte offsets" do
+        src = <<~RUBY
+          # Dépendances
+          if ENV["PRO"]
+            gem "react_on_rails_pro", "16.0.0"
+          else
+            gem "react_on_rails", "16.0.0"
+          end
+        RUBY
+        result = rewriter.rewrite(src)
+        expect(result.content).to eq(<<~RUBY)
+          # Dépendances
+          gem "react_on_rails_pro", "16.0.0"
+        RUBY
+        expect_parseable_ruby(result.content)
+      end
+
       it "preserves indentation when collapsing an indented conditional" do
         src = <<~RUBY
           group :development do

You can send follow-ups to the cloud agent here.

Comment thread react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb Outdated

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Semicolon scan corrupts gem removal
    • Updated statement-boundary detection to ignore semicolons inside quoted strings and comments, with regression specs covering stale base removal.

Create PR

Or push these changes by commenting:

@cursor push 9dc374d521
Preview (9dc374d521)
diff --git a/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb b/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb
--- a/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb
+++ b/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb
@@ -15,6 +15,10 @@
       PRO_GEM_NAME = "react_on_rails_pro"
       NEWLINE_BYTE = "\n".ord
       SEMICOLON_BYTE = ";".ord
+      COMMENT_BYTE = "#".ord
+      BACKSLASH_BYTE = "\\".ord
+      DOUBLE_QUOTE_BYTE = '"'.ord
+      SINGLE_QUOTE_BYTE = "'".ord
       SOURCE_OPTION_KEYS = %w[git github path].freeze
 
       Result = Struct.new(
@@ -156,20 +160,19 @@
       def statement_byte_range(source, node)
         line_start = line_start_offset(source, node.location.start_offset)
         line_end = line_end_offset(source, node.location.end_offset)
-        statement_end = node.location.end_offset
+        next_semicolon = next_semicolon_offset(
+          source,
+          line_start_offset(source, node.location.end_offset),
+          line_end,
+          node.location.end_offset
+        )
+        statement_end = next_semicolon || line_terminator_offset(source, line_end)
 
-        while statement_end < line_end
-          byte = source.getbyte(statement_end)
-          break if byte == NEWLINE_BYTE || byte == SEMICOLON_BYTE
-
-          statement_end += 1
-        end
-
         previous_semicolon = previous_semicolon_offset(source, line_start, node.location.start_offset)
         if previous_semicolon
           [previous_semicolon, statement_end]
-        elsif statement_end < line_end && source.getbyte(statement_end) == SEMICOLON_BYTE
-          end_offset = statement_end + 1
+        elsif next_semicolon
+          end_offset = next_semicolon + 1
           end_offset += 1 while end_offset < line_end && horizontal_space?(source.getbyte(end_offset))
           [line_start, end_offset]
         else
@@ -178,16 +181,45 @@
       end
 
       def previous_semicolon_offset(source, line_start, statement_start)
-        scan = statement_start - 1
-        while scan >= line_start
-          return scan if source.getbyte(scan) == SEMICOLON_BYTE
+        statement_separator_offsets(source, line_start, statement_start).last
+      end
 
-          scan -= 1
+      def next_semicolon_offset(source, line_start, line_end, statement_end)
+        statement_separator_offsets(source, line_start, line_end).find { |offset| offset >= statement_end }
+      end
+
+      def statement_separator_offsets(source, line_start, line_end)
+        offsets = []
+        quote_byte = nil
+        scan = line_start
+        while scan < line_end
+          byte = source.getbyte(scan)
+          break if byte == NEWLINE_BYTE || (!quote_byte && byte == COMMENT_BYTE)
+
+          if quote_byte
+            if byte == BACKSLASH_BYTE
+              scan += 2
+              next
+            end
+
+            quote_byte = nil if byte == quote_byte
+          elsif byte == DOUBLE_QUOTE_BYTE || byte == SINGLE_QUOTE_BYTE
+            quote_byte = byte
+          elsif byte == SEMICOLON_BYTE
+            offsets << scan
+          end
+
+          scan += 1
         end
 
-        nil
+        offsets
       end
 
+      def line_terminator_offset(source, line_end)
+        newline_offset = line_end - 1
+        newline_offset >= 0 && source.getbyte(newline_offset) == NEWLINE_BYTE ? newline_offset : line_end
+      end
+
       def horizontal_space?(byte)
         byte == " ".ord || byte == "\t".ord
       end

diff --git a/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb b/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb
--- a/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb
+++ b/react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb
@@ -309,6 +309,23 @@
         expect_parseable_ruby(result.content)
       end
 
+      it "ignores semicolons in trailing comments when removing a stale base declaration" do
+        src = <<~RUBY
+          gem "react_on_rails", "~> 16.0" # note; keep removing the whole line
+          gem "react_on_rails_pro", "~> 16.0"
+        RUBY
+        result = rewriter.rewrite(src)
+        expect(result.content).to eq("gem \"react_on_rails_pro\", \"~> 16.0\"\n")
+        expect_parseable_ruby(result.content)
+      end
+
+      it "ignores semicolons in quoted trailing guards when removing a stale base declaration" do
+        src = "gem \"react_on_rails\" if ENV[\"ROR;ENABLED\"]; gem \"react_on_rails_pro\", \"~> 16.0\"\n"
+        result = rewriter.rewrite(src)
+        expect(result.content).to eq("gem \"react_on_rails_pro\", \"~> 16.0\"\n")
+        expect_parseable_ruby(result.content)
+      end
+
       it "removes stale base when Pro is parenthesized with comments" do
         src = <<~RUBY
           source "https://rubygems.org"

You can send follow-ups to the cloud agent here.

Reviewed by Cursor Bugbot for commit 19d9605. Configure here.

Comment thread react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb Outdated
@coderabbitai coderabbitai Bot added the P3 Parked priority label May 24, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 19d9605674

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +161 to +163
while statement_end < line_end
byte = source.getbyte(statement_end)
break if byte == NEWLINE_BYTE || byte == SEMICOLON_BYTE

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Ignore semicolons in comments/strings during stale-gem removal

When has_active_pro is true, statement_byte_range stops at the first ; byte after the base gem call, but that byte is not guaranteed to be a statement separator. If the stale line has a trailing comment or postfix expression containing ; (e.g. gem "react_on_rails", "~> 16.0" # old; remove), the splice removes only part of the line and leaves tail text behind (like remove), changing Gemfile semantics or producing invalid Ruby. This should use AST locations/token context (or at least comment/string-aware scanning) instead of raw semicolon scanning.

Useful? React with 👍 / 👎.

@claude

claude Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

Code Review — Spike: Prism-based Gemfile rewriter (#3313)

Overall

This is a well-executed spike. The non-goal (no changes to lib/) is respected, the decision record is thorough, and the 32-case spec matrix is the right artifact for validating a rewriter prototype. The recommendation to wait until the Ruby floor reaches 3.3 is sound — adding a native-extension runtime dep for a once-per-project generator is a poor trade, and the decision record documents this clearly enough that a future maintainer can reverse the decision with confidence.

The configuration changes (.rubocop.yml, gemspec) are minimal and correct.


What's done well

  • Byte-offset splicing is the right foundation. Mixing character-level and byte-level ops via .b is deliberate and correct; UTF-8 test cases confirm it.
  • Right-to-left edit application in apply_edits avoids offset invalidation for multiple simultaneous edits — this is easy to get wrong and the prototype gets it right.
  • Parse-failure policy (return original untouched, don't fall back to the scanner) is the correct call and is tested with a malformed fixture.
  • collapse_dead_conditionals handles the key case (Spike: evaluate Prism-based Gemfile rewriting for Pro migration generator #3313's stated ugliness) cleanly and the edge cases that are intentionally left alone are documented.
  • Behavior matrix in DECISION.md is honest about the two observable differences vs. the current scanner (paren preservation, empty-else collapse), neither of which is a regression.
  • Production migration checklist in DECISION.md (7 steps) is specific enough to be actionable when the Ruby floor rises.

Issues

Minor bugs / inaccuracies:

  1. Benchmark fixture label mismatch. The "medium (80 lines, 1 ror entry, groups)" fixture actually generates ~45 lines (1 source + 10 base gems + 1 ror + 8 additional + 3 group headers + 3 ends + 3 blank lines + 16 grouped gems ≈ 45). The DECISION.md table correctly says "~45 lines". The benchmark key is the discrepancy. (See inline comment.)

  2. Benchmark runs two full measurement passes. The second GEMFILES.each block re-warms and re-measures every fixture just to print per-rewrite costs, which the first pass already has all the data for. Totals and per-rewrite costs could be computed from one measurement. (See inline comment.)

Style:

  1. has_user_version_pin? violates Ruby predicate naming convention. Ruby convention for boolean query methods drops the has_ prefix: user_version_pin?. Matters if this moves to production. (See inline comment.)

Design notes (non-blocking):

  1. collapse_dead_conditionals is O(N²) in the number of collapsible conditionals. Each iteration re-parses the entire source. For a Gemfile with k collapsible conditionals that's k full parse cycles. In practice k ≤ 2 and Gemfiles are small, so this is irrelevant. Worth a comment if it moves to lib/.

  2. remove_empty_then_branch is a documented stub. The nil return is intentional and the comment explains why. Fine for a spike; the production migration checklist should include implementing this case if it's decided to be in scope.


Verdict

Approve as a spike artifact. The decision record and prototype deliver exactly what #3313 asked for. No production code is touched. The two minor benchmark inaccuracies are worth a fix before merge so the numbers in the repo match the actual fixture sizes.

gem "jbuilder"
gem "bcrypt"
RUBY
"medium (80 lines, 1 ror entry, groups)" => begin

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The label says 80 lines but this fixture generates ~45 lines (1 source + 10 base gems + 1 react_on_rails + 8 additional gems + 3 group headers + 3 ends + 3 blank lines + ~16 grouped gems). DECISION.md correctly says "~45 lines" in its performance table — the two should agree.

Suggested change
"medium (80 lines, 1 ror entry, groups)" => begin
"medium (~45 lines, 1 ror entry, groups)" => begin

end
end

def has_user_version_pin?(call)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ruby convention for predicate methods drops the has_ prefix — user_version_pin? reads more naturally and matches the style of the rest of the class (e.g. gem_call?, branch_empty?, source_option_present?). Worth fixing if this moves to lib/.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0bc28000cd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +364 to +366
start_offset: line_start_offset(source, conditional.location.start_offset),
end_offset: line_end_offset(source, conditional.location.end_offset),
replacement: "#{indent}#{gem_text}\n"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve adjacent statements when collapsing a conditional

collapse_to_branch replaces from line_start_offset of the if to line_end_offset of the end, which assumes the conditional occupies the whole line; if there are sibling statements on the same line (for example end; gem "rails"), those statements are deleted during collapse. I verified this by running the rewriter on an if/else pro/base block followed by ; gem "rails", and the output dropped the rails dependency entirely.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb`:
- Around line 58-59: The current logic uses
preexisting_empty_conditional_fingerprints (from
empty_conditional_fingerprints(source, program)) and checks via include? which
only matches fingerprint values, causing duplicate identical conditionals to be
treated as already-collapsed; change this to track occurrences/counts instead of
simple membership: build a frequency map (or list of consumed indices) of
fingerprints from empty_conditional_fingerprints and then, when scanning
conditionals (the same places that call
preexisting_empty_conditional_fingerprints and use include?), consume/decrement
the count (or mark the specific occurrence) for each matched fingerprint so
identical fingerprints are only considered preexisting as many times as they
actually appear; apply the same occurrence-aware logic wherever the code
currently relies on include? (the variables/methods
preexisting_empty_conditional_fingerprints, empty_conditional_fingerprints, and
the surrounding conditional-collapsing checks).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a4b378e5-b8ff-49f3-a400-1acdeb5acf1e

📥 Commits

Reviewing files that changed from the base of the PR and between 19d9605 and 0bc2800.

📒 Files selected for processing (2)
  • react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb
  • react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter_spec.rb

Comment thread react_on_rails/spike/3313_prism_gemfile_rewriter/prism_gemfile_rewriter.rb Outdated
@claude

claude Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

Code Review: Spike — Prism-based Gemfile rewriter (#3313)

Overall verdict: ship it. The spike is well-scoped, the decision record is thorough, and the recommendation (wait for the Ruby floor to rise to 3.3) is well-reasoned. Anyone who revisits this later won't have to redo the analysis.

What works well

  • Scope discipline is clean. lib/ is untouched, the spike directory is correctly excluded from RuboCop (spike/**/*) and gem packaging (f.match(%r{^(spec|tmp|spike)/})).
  • Decision record covers the right axes. Behavior differences, performance data, the prism dependency cost for Ruby 3.0–3.2, and a concrete migration checklist are all present. The recommendation includes the exact trigger condition (Ruby floor moves to 3.3) and a 4–8 hour effort estimate — that's everything a future implementer needs.
  • Test coverage is comprehensive. 32 cases cover the full matrix from issue Spike: evaluate Prism-based Gemfile rewriting for Pro migration generator #3313, including the empty-else collapse, percent-string-literal fallback (%q[...]), UTF-8 byte-offset correctness, semicolon-separated same-line statements, and parse-failure policy.
  • Byte-offset splice approach is the right call. Avoiding an AST-to-source formatter preserves comments, spacing, and heredocs by construction.

Issues worth noting before merging

1. elsif branches produce silently ugly output — not mentioned in DECISION.md

If a Gemfile uses an elsif guard:

if ENV["PRO"]
  gem "react_on_rails_pro"
elsif ENV["EDGE"]
  gem "react_on_rails", path: "../ror"
end

collect_gem_calls traverses the full AST and will correctly find and remove the base gem from the elsif branch. But collapse_dead_conditionals won't clean up the resulting empty elsif because conditional_else_node only matches Prism::ElseNode — an elsif is represented as a nested Prism::IfNode in the subsequent slot, not an ElseNode. The output is:

if ENV["PRO"]
  gem "react_on_rails_pro"
elsif ENV["EDGE"]
end

Valid Ruby, but an empty elsif is arguably worse than the empty else the issue was trying to fix. Acceptable scope for a spike, but DECISION.md's "Edge cases the prototype does not collapse" section should mention it so the future implementer knows to handle it.

2. Minor spec and benchmark issues — see inline comments.

Nit

collapse_dead_conditionals uses loop do and relies on find_collapse_edit returning nil to terminate. It's correct as written — each iteration removes one collapsible conditional, so the count is monotonically decreasing — but a max-iteration guard (100.times or a counter) would be a cheap safeguard against a future edit accidentally introducing a cycle.

gem "jbuilder"
gem "bcrypt"
RUBY
"medium (80 lines, 1 ror entry, groups)" => begin

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "80 lines" in this label is inaccurate — the generated body is ~45 lines (1 source line + 10 base gems + 1 ROR gem + 8 more gems + blank + group header + 6 dev/test gems + end/blank/group + 6 dev gems + end/blank/group + 4 test gems + end). The per-run printout uses #{src.lines.size}l so the output is correct, but the hash key itself misleads anyone reading the source.

Suggested change
"medium (80 lines, 1 ror entry, groups)" => begin
"medium (~45 lines, 1 ror entry, groups)" => begin

Comment on lines +242 to +251
def collapse_dead_conditionals(source, preexisting_empty_conditional_fingerprints)
loop do
parse_result = Prism.parse(source)
break source if parse_result.failure?

edit = find_collapse_edit(source, parse_result.value, preexisting_empty_conditional_fingerprints)
break source unless edit

source = splice_source(source, edit[:start_offset], edit[:end_offset], edit[:replacement])
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The termination argument is correct — each iteration either finds no edit (breaks) or splices out one collapsible conditional, so the editable-conditional count is monotonically decreasing. But for future-proofing, a hard iteration cap makes the safety guarantee explicit and catches any accidental cycle introduced later:

Suggested change
def collapse_dead_conditionals(source, preexisting_empty_conditional_fingerprints)
loop do
parse_result = Prism.parse(source)
break source if parse_result.failure?
edit = find_collapse_edit(source, parse_result.value, preexisting_empty_conditional_fingerprints)
break source unless edit
source = splice_source(source, edit[:start_offset], edit[:end_offset], edit[:replacement])
end
def collapse_dead_conditionals(source, preexisting_empty_conditional_fingerprints)
max_iterations = 100 # one per collapsible conditional; a Gemfile can't realistically have more
max_iterations.times do
parse_result = Prism.parse(source)
break if parse_result.failure?
edit = find_collapse_edit(source, parse_result.value, preexisting_empty_conditional_fingerprints)
break unless edit
source = splice_source(source, edit[:start_offset], edit[:end_offset], edit[:replacement])
end
source
end

Comment on lines +470 to +481
# Prism currently treats `gem("react_on_rails",)` as a syntax error in some Ruby
# versions but accepts it as valid in 3.3+. We document the behavior either way.
it "either rewrites cleanly or reports a parse failure" do
src = "gem(\"react_on_rails\",)\n"
result = rewriter.rewrite(src)
if result.parse_failed
expect(result.content).to eq(src)
else
expect(result.content).to include('"react_on_rails_pro"')
expect_parseable_ruby(result.content)
end
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "either/or" form is the right documentation choice for a spike running across Ruby versions, but if this test suite is ever run on Ruby 3.3+ exclusively (CI, or when promoting the spike to production), the ambiguity will mask a regression. Consider tagging the version boundary explicitly so the assertion tightens on modern Ruby:

it "either rewrites cleanly or reports a parse failure" do
  src = "gem(\"react_on_rails\",)\n"
  result = rewriter.rewrite(src)
  if result.parse_failed
    # Ruby < 3.3: trailing comma in parenthesised gem call is a syntax error
    expect(result.content).to eq(src)
  else
    # Ruby 3.3+: trailing comma is accepted
    expect(result.content).to include('"react_on_rails_pro"')
    expect_parseable_ruby(result.content)
  end
end

Or, if the suite always runs on a fixed Ruby version, pin the expectation directly:

if RUBY_VERSION >= "3.3"
  it "rewrites trailing-comma declarations on Ruby 3.3+" do
    src = "gem(\"react_on_rails\",)\n"
    result = rewriter.rewrite(src)
    expect(result.parse_failed).to be(false)
    expect(result.content).to include('"react_on_rails_pro"')
    expect_parseable_ruby(result.content)
  end
end

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 488b2ba310

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +344 to +346
seen_empty_conditional_fingerprints[fingerprint] += 1
return nil if seen_empty_conditional_fingerprints[fingerprint] <=
preexisting_empty_counts[fingerprint].to_i

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Match preexisting empty branches by node identity

The preexisting-empty-branch guard is keyed only by fingerprint counts, so it can skip a newly emptied conditional and instead collapse a different preexisting conditional with the same fingerprint when their order is reversed. In practice, if conditional A becomes empty after stale-base removal and conditional B was already empty before rewrite, this counter logic can leave A unchanged and collapse B, which incorrectly removes B's runtime guard and makes gem "react_on_rails_pro" unconditional.

Useful? React with 👍 / 👎.

Comment on lines +217 to +220
elsif byte == DOUBLE_QUOTE_BYTE || byte == SINGLE_QUOTE_BYTE
quote_byte = byte
elsif byte == SEMICOLON_BYTE
offsets << scan

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Treat percent-string literals as quoted when scanning semicolons

Statement-boundary scanning only treats single/double quotes as string contexts, so semicolons inside percent-string literals are misread as separators. With gem "react_on_rails" if ENV[%q[a;b]]; gem "react_on_rails_pro", "~> 16.0", stale-base removal starts splicing at the semicolon inside %q[...], producing corrupted output (b]]; ...) instead of removing the base gem statement.

Useful? React with 👍 / 👎.

replacement: "#{new_name_literal}, #{quote}#{default_pro_version}#{quote}"
}
end
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ruby convention discourages the has_ prefix for predicate methods. When this gets promoted to production, prefer user_version_pin?:

Suggested change
end
def user_version_pin?(call)

(Also update the call site at line 122: if user_version_pin?(call))


def next_semicolon_offset(source, line_start, line_end, statement_end)
statement_separator_offsets(source, line_start, line_end).find { |offset| offset >= statement_end }
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The quote-tracking scanner handles "..." and '...' but not heredocs (<<~RUBY ... RUBY). A gem option value expressed as a heredoc (contrived but syntactically valid) would cause the scanner to scan past the heredoc delimiter into subsequent lines and potentially misidentify semicolons.

In practice this is extremely unlikely in a Gemfile, but worth documenting as a known gap before promoting this to production:

# NOTE: does not handle heredoc string delimiters; a gem kwarg value
# expressed as <<~HEREDOC would cause semicolons on subsequent lines to
# be misidentified as statement separators.
def statement_separator_offsets(source, line_start, line_end)

end

puts
puts "Per-rewrite cost (scanner / prism):"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warm-up has already run for these inputs before the "total" measurement block above. Re-warming here is harmless but redundant — the JIT/cache state carries over between the two GEMFILES.each loops. You can drop this second warm-up if you pull the script forward to production.

replacement: ""
}
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stub is documented and acceptable for the spike. When promoting to production, the unless ENV["PRO"]; (empty); else gem_pro; endif ENV["PRO"]; gem_pro; end transformation would be a natural follow-up — it's semantically equivalent (the empty unless branch is just dead code) and eliminates another category of post-migration ugliness the way collapse_to_branch does for the if/else case.

Comment thread react_on_rails/spike/3313_prism_gemfile_rewriter/benchmark.rb
justin808 added a commit that referenced this pull request May 26, 2026
…sh (#3349)

Addresses 11 review comments on the Prism Gemfile rewriter spike.

Correctness fixes (verified bugs reproduced before the change):
- statement_byte_range: when a stale base gem sits inside a same-line block
  conditional with a sibling branch (e.g. `if X then gem ror_pro else gem ror end`),
  remove only the call's own byte range so the surrounding conditional and
  the active Pro gem survive. The previous line-fallback dropped the whole
  conditional, silently corrupting the Gemfile.
- remove_empty_else_branch + collapse_dead_conditionals: return nil from the
  removal helper when the computed splice would be a no-op (inline conditional
  where `else` and `end` share a line). The collapse loop additionally bounds
  itself with MAX_COLLAPSE_PASSES and guards against zero-length empty edits.
  Previously `rewrite` hung indefinitely on inline conditionals whose then
  branch contained more than one statement.

Production-readiness polish (spike scope, for the future migration):
- empty_conditional_ranges: drop the mutable default-argument pattern.
- statement_separator_offsets: NOTE comment that parenthesis depth is not
  tracked (Bundler DSL does not produce ambiguous shapes in practice).
- collapse_edit_for_conditional: caller-side note explaining the
  remove_empty_then_branch stub asymmetry.
- benchmark.rb: single measurement pass; the totals/ratio table and the
  per-rewrite cost table now derive from the same timings.
- README: document the transitive prism dev dependency on Ruby < 3.3.

Spec additions:
- Regression specs for both correctness fixes (inline if/then/else with
  sibling Pro gem; inline conditional whose else becomes empty post-removal,
  with Timeout to fail loud if the hang regresses).
- unless/else collapse spec exercising the UnlessNode#else_clause path.
- Defensive spec for bare /regex/ literals in line preceding a stale base.
- Trailing-comma test split into RUBY_VERSION-gated tests instead of an
  either-or assertion that passes regardless of branch.

55 examples, 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@justin808

Copy link
Copy Markdown
Member Author

Re: the two new items in your latest review summary (#4531915486 — prism dev-dep prerequisite and the mutable default argument in empty_conditional_ranges), both addressed in 3bb18b8fd:

  • prism dev dependency: documented in spike/3313_prism_gemfile_rewriter/README.md — added a "prism dependency" section explaining that the spike currently relies on prism being present transitively through RuboCop-related dev deps, and what to do if a LoadError ever appears. I intentionally avoided promoting prism to a runtime dep in react_on_rails.gemspec, since the recommendation in DECISION.md is to defer that until the Ruby floor moves to 3.3.

  • Mutable default arg in empty_conditional_ranges: refactored to a thin wrapper that initializes ranges = [] and delegates to a private collect_empty_conditional_ranges(node, ranges) recursive helper. No behavior change; the recursion no longer shares default state across top-level calls.

The other items in the summary are tracked separately via the inline review comments and replied to in each thread.

@justin808

Copy link
Copy Markdown
Member Author

All 11 active review comments from the previous round (and the 2 duplicates) addressed in 3bb18b8fd.

Mattered (verified bugs reproduced before fixing)

Production-readiness polish (spike scope, applied now to keep the prototype clean)

Skipped

55 examples, 0 failures.

Scan window: comments posted after the previous address-review-summary checkpoint (2026-05-24T08:46:50Z). Future scans should start after this comment unless check all reviews is requested.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3bb18b8fd3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +589 to +590
start_offset: line_start,
end_offset: line_end,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve surrounding expression when collapsing conditionals

When collapse_replacement_edit takes the no-semicolon branch, it replaces from line_start to line_end, which assumes the if/unless ... end occupies the entire line. For valid Ruby like chosen = if ENV["PRO"] ... else ... end, conditional.location.start_offset begins at if, but line_start_offset rewinds to column 0, so collapse drops the chosen = prefix and rewrites the line to just gem "react_on_rails_pro".... This silently changes semantics for expression-form conditionals instead of only simplifying the conditional node.

Useful? React with 👍 / 👎.

justin808 and others added 7 commits May 25, 2026 17:18
Spike for #3313. Standalone Prism prototype + 32-case behavior matrix +
benchmark + decision record. Lives under react_on_rails/spike/ so it does
not touch the production scanner from #3232 (per the issue's non-goals).

Recommendation: keep the current text scanner. Revisit when the Ruby
floor rises to 3.3 and Prism becomes a free runtime dependency. The
empty-else case the issue calls out is solvable cleanly by the Prism
prototype (conditional collapse), but the cost of adding prism as a
runtime gem for Ruby 3.0-3.2 users outweighs the cleanup for a generator
that runs once per migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sh (#3349)

Addresses 11 review comments on the Prism Gemfile rewriter spike.

Correctness fixes (verified bugs reproduced before the change):
- statement_byte_range: when a stale base gem sits inside a same-line block
  conditional with a sibling branch (e.g. `if X then gem ror_pro else gem ror end`),
  remove only the call's own byte range so the surrounding conditional and
  the active Pro gem survive. The previous line-fallback dropped the whole
  conditional, silently corrupting the Gemfile.
- remove_empty_else_branch + collapse_dead_conditionals: return nil from the
  removal helper when the computed splice would be a no-op (inline conditional
  where `else` and `end` share a line). The collapse loop additionally bounds
  itself with MAX_COLLAPSE_PASSES and guards against zero-length empty edits.
  Previously `rewrite` hung indefinitely on inline conditionals whose then
  branch contained more than one statement.

Production-readiness polish (spike scope, for the future migration):
- empty_conditional_ranges: drop the mutable default-argument pattern.
- statement_separator_offsets: NOTE comment that parenthesis depth is not
  tracked (Bundler DSL does not produce ambiguous shapes in practice).
- collapse_edit_for_conditional: caller-side note explaining the
  remove_empty_then_branch stub asymmetry.
- benchmark.rb: single measurement pass; the totals/ratio table and the
  per-rewrite cost table now derive from the same timings.
- README: document the transitive prism dev dependency on Ruby < 3.3.

Spec additions:
- Regression specs for both correctness fixes (inline if/then/else with
  sibling Pro gem; inline conditional whose else becomes empty post-removal,
  with Timeout to fail loud if the hang regresses).
- unless/else collapse spec exercising the UnlessNode#else_clause path.
- Defensive spec for bare /regex/ literals in line preceding a stale base.
- Trailing-comma test split into RUBY_VERSION-gated tests instead of an
  either-or assertion that passes regardless of branch.

55 examples, 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@justin808 justin808 force-pushed the jg-conductor/3313-fix branch from 3bb18b8 to adef418 Compare May 26, 2026 03:19
@claude

claude Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Code Review: Spike — Prism-based Gemfile rewriter (#3313)

Overview

Well-executed spike. The prototype is viable, the 32-case test matrix is comprehensive, the documentation is thorough, and the recommendation to wait is well-reasoned. Infrastructure changes are minimal and correct.

Infrastructure Changes ✅

  • .rubocop.yml exclusion for spike/**/* is clean.
  • gemspec exclusion regex is correct (^(spec|tmp|spike)/).

Core Algorithm ✅

The byte-offset splice approach is sound — untouched bytes are preserved by construction, sidestepping an entire class of formatter-induced bugs. The two-phase design (direct edits — collapse pass) is a clean separation of concerns, and the preexisting_empty_ranges mechanism that prevents the collapse pass from touching conditionals that were already empty before migration is particularly well thought out.


Issues Found

1. statement_separator_offsets has an undocumented regex literal blind spot

The NOTE comment mentions parenthesis depth but not regex literals — a //re;gex/ on a semicolon-separated line would have its interior ; misidentified as a statement separator. The spec test "preserves a preceding sibling statement when a regex-guarded stale base follows it" passes only because removable_statement_node returns the full IfNode (putting the regex's ; inside the node's byte range, below the statement_end threshold). That safety coupling is not obvious from statement_separator_offsets alone -- the NOTE should mention it.

2. apply_edits silently tolerates overlapping edit ranges

apply_edits sorts edits by descending offset and applies them without checking for overlap. Two edits covering the same byte range -- possible if two react_on_rails gems appear on a single semicolon-separated line -- would produce garbled output silently. statement_byte_range prevents this in practice, but there is no assertion to catch a future regression.

3. prism transitive dependency is fragile on Ruby < 3.3

The README notes that prism is currently available via RuboCop's transitive dev deps, and that a future RuboCop upgrade could drop it. The fix (gem "prism" in Gemfile.development_dependencies) is documented but not applied. Contributors running Ruby < 3.3 will hit a confusing LoadError.

4. remove_empty_then_branch is a silent no-op

Returning nil means if X; (empty); else BODY end is left unchanged after migration -- valid but with a dead branch intact. collapse_dead_conditionals will silently re-evaluate and skip this conditional on every subsequent run. Production code should surface a "please update your Gemfile manually" message for this case rather than silently leaving debris.

5. Minor: has_user_version_pin? naming

Ruby predicate convention is verb? not has_verb?. Consider user_version_pin? or pinned_version?. Not blocking for a spike.


Test Quality 9❅

  • UTF-8 byte-offset tests are present (critical given byteslice vs [] distinction).
  • Timeout guard on the infinite-loop regression is a good defensive touch.
  • Ruby-version-gated test for trailing-comma syntax is exactly right.
  • Pre-existing empty conditional preservation tests (the three "matches pre-existing..." cases) are thorough.

DECISION.md ❅

Documented, well-reasoned, and the step-by-step cutover guide (items 1-7) makes a future implementation straightforward. The behavior matrix is a useful reference for anyone picking this up later.

Benchmark

One-pass design (same timing data feeds both summary tables) is correct and avoids JIT/cache drift between the two views. ScannerRewriter#strip_close_paren is a simplification, but it is noted inline and paren-heavy Gemfiles are not the common case.


Summary

This spike answers the feasibility question clearly, documents the trade-offs, and leaves a clean decision record. The main gaps to address before any production cutover are the regex literal context tracking in statement_separator_offsets and the remove_empty_then_branch stub.

Verdict: approve with minor notes. The recommendation to keep the current scanner until the Ruby floor rises to 3.3 is sound and well-supported by the evidence.

@claude

claude Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Code Review — Spike: Prism-based Gemfile rewriter (#3313)

Overview

This PR adds a well-scoped exploratory spike evaluating a Prism AST-based replacement for the text scanner in the Pro migration generator. All spike artifacts live under spike/3313_prism_gemfile_rewriter/, production code is untouched, the directory is excluded from the gem package and from RuboCop, and the decision record recommends keeping the current scanner until the gem's Ruby floor rises to 3.3.

The recommendation is sound and well-supported. The packaging changes (gemspec + RuboCop exclusion) are correct and minimal.


Strengths

  • Decision record is thorough. The behavior matrix, performance numbers, conditional-collapse policy, parse-failure policy, and migration path are all documented clearly. The "what to do if this flips" section is especially useful.
  • Byte-offset approach is the right call. Using Prism locations for source splicing rather than an AST→text formatter preserves comments, whitespace, and heredocs by construction.
  • Test coverage is comprehensive. 32 cases including UTF-8 content, semicolon-separated statements, percent literals, postfix guards, pre-existing empty conditionals, and regression guards for two previously discovered bugs (line-fallback and infinite-loop). Each regression test has a comment explaining the original failure mode — that's a good habit.
  • Infinite-loop guard is belt-and-suspenders. MAX_COLLAPSE_PASSES = 100 plus the zero-length splice bailout in collapse_dead_conditionals is well-reasoned.
  • Performance benchmark is reproducible. Single measurement pass shared across both output tables avoids cross-table JIT/cache drift.

Issues

Minor — worth fixing before production promotion

1. has_user_version_pin? naming (prism_gemfile_rewriter.rb:678)
Ruby convention (and RuboCop Naming/PredicateName) requires predicate methods not to start with has_. Should be user_version_pin?. Harmless in the spike since the file is RuboCop-excluded, but will need correcting before promotion to lib/.

2. remove_empty_then_branch is a silent no-op stub (prism_gemfile_rewriter.rb:1157–1160)
The method is named as if it does something, but it always returns nil. The caller in collapse_edit_for_conditional falls through silently when then_empty && has_else_node && !else_empty. The existing comment inside the stub is good, but there's no matching call-site comment, so a reader tracing collapse_edit_for_conditional will not understand why this branch produces no edit without digging into the stub body. See inline comment.

3. Top-level if/else in RSpec creates variable example counts (prism_gemfile_rewriter_spec.rb:1840–1855)
The if Gem::Version.new(RUBY_VERSION) >= Gem::Version.new("3.3.0") block at parse time creates two different test suites depending on Ruby version. This means 32 examples in the README/PR description is only true on Ruby 3.3+; on 3.0–3.2 CI would report 32 but one test is a different test with a different assertion. Use skip inside the test instead — see inline comment.

Informational — not blocking

4. Benchmark's strip_close_paren is simplified in a way that affects paren inputs
ScannerRewriter#strip_close_paren carries a comment saying it's "not equivalent to the production code, but adequate for benchmarking on inputs that do not stress this path." The medium and large fixtures don't include parenthesized gem("…") calls, so in practice the benchmark is fair. But if someone later adds a parenthesized fixture to compare the two approaches, the scanner timings would be artificially low. Worth a sentence noting which fixture types are affected.

5. collapse_dead_conditionals re-parses up to 100× (prism_gemfile_rewriter.rb:945–963)
Sub-millisecond per parse on a one-shot migration tool — not a performance problem now. But if promoted to lib/, the re-parse loop should be replaced with a single-pass approach (walk the post-edit AST once and collect all collapse candidates, then apply them right-to-left). The spike comment already flags this as a known simplification.


Packaging changes

Both packaging changes are correct:

  • .rubocop.yml: spike/**/* added to AllCops.Exclude — appropriate.
  • react_on_rails.gemspec: regex updated to ^(spec|tmp|spike)/ — correct; spike files will not be shipped.

Bottom line

The spike meets its stated goals. The decision record is the deliverable, and it's solid. The three issues above (#1#3) should be addressed before the prototype code is promoted to lib/, but none of them block merging the spike as exploratory reference material.

Comment on lines +374 to +379
def apply_edits(source, edits)
sorted = edits.sort_by { |e| -e[:start_offset] }
sorted.reduce(source) do |acc, edit|
splice_source(acc, edit[:start_offset], edit[:end_offset], edit[:replacement])
end
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apply_edits silently tolerates overlapping ranges. Applying two edits that cover overlapping byte ranges (e.g., two react_on_rails gems on a single semicolon-separated line sharing a statement_byte_range) would produce garbled output without any warning.

statement_byte_range prevents this in practice, but a defensive guard here would catch any future regression immediately:

Suggested change
def apply_edits(source, edits)
sorted = edits.sort_by { |e| -e[:start_offset] }
sorted.reduce(source) do |acc, edit|
splice_source(acc, edit[:start_offset], edit[:end_offset], edit[:replacement])
end
end
def apply_edits(source, edits)
sorted = edits.sort_by { |e| -e[:start_offset] }
sorted.each_cons(2) do |later, earlier|
raise ArgumentError, "overlapping edits at #{earlier[:start_offset]}..#{later[:end_offset]}" if earlier[:start_offset] < later[:end_offset]
end
sorted.reduce(source) do |acc, edit|
splice_source(acc, edit[:start_offset], edit[:end_offset], edit[:replacement])
end
end

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: adef418894

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +583 to +585
start_offset: conditional.location.start_offset,
end_offset: conditional.location.end_offset,
replacement: inline_replacement

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve expression context in semicolon collapse path

In collapse_replacement_edit, the semicolon path replaces if/unless ... end with the raw gem call text (inline_replacement) whenever a trailing separator exists, but this can break valid expression-form conditionals. For example, arr << if ENV['PRO'] then gem 'react_on_rails_pro' else gem 'react_on_rails' end; gem "rails" rewrites to arr << gem 'react_on_rails_pro'; gem "rails", which is no longer parseable Ruby. This means parseable Gemfiles can become syntactically invalid when a collapsible conditional is used as an expression and followed by a semicolon-separated statement.

Useful? React with 👍 / 👎.

end
end

def has_user_version_pin?(call)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ruby predicate convention (and RuboCop Naming/PredicateName) says methods should not start with has_. This is safe while the file is RuboCop-excluded in the spike, but needs renaming before promotion to lib/.

Suggested change
def has_user_version_pin?(call)
def user_version_pin?(call)

(The call site at line 133 would need to be updated to match.)

# empty-`then` conditional is left in the output. See the stub itself
# for what production should do.
if then_empty && has_else_node && !else_empty
collapse_to_branch(source, node, else_statements) || remove_empty_then_branch(source, node)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remove_empty_then_branch stub always returns nil (see line 629), so this whole branch produces no edit and the conditional is silently left unchanged. That's the correct spike policy, but without reading the stub body it looks like it might actually do something. A short note at the call site would make the intent explicit:

Suggested change
collapse_to_branch(source, node, else_statements) || remove_empty_then_branch(source, node)
collapse_to_branch(source, node, else_statements) ||
remove_empty_then_branch(source, node) # spike stub: always nil — conditional left as-is

}
end

def remove_empty_then_branch(_source, _node)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stub returning nil is intentional for the spike, but worth flagging for the production cutover: collapse_dead_conditionals will silently stop trying on if X; (empty); else BODY end each run with no user-visible feedback. Production code should return a structured error here rather than nil, so the generator can surface a "please update your Gemfile manually" message when this case is encountered.

# it as a syntax error and Prism reports a parse failure. Split into two gated
# tests so each Ruby tightens to a single expected outcome instead of a
# "does-not-crash" passthrough that hides a real regression.
if Gem::Version.new(RUBY_VERSION) >= Gem::Version.new("3.3.0")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A top-level if/else inside the describe block creates two structurally different test suites — a different total example count and a different assertion — depending on Ruby version. This makes the "32 examples, 0 failures" claim in the README version-dependent and can make CI output confusing across Ruby matrix runs.

Prefer a version guard inside the test using skip:

it "handles trailing comma in parenthesized call" do
  src = "gem(\"react_on_rails\",)\n"
  if Gem::Version.new(RUBY_VERSION) < Gem::Version.new("3.3.0")
    result = rewriter.rewrite(src)
    expect(result.parse_failed).to be(true)
    expect(result.content).to eq(src)
  else
    result = rewriter.rewrite(src)
    expect(result.parse_failed).to be(false)
    expect(result.content).to include('"react_on_rails_pro"')
    expect_parseable_ruby(result.content)
  end
end

This keeps the example count constant across Ruby versions and makes both branches visible in a single test run.

Comment on lines +282 to +286
# NOTE: tracks string and percent-literal contexts but does not track
# parenthesis depth. A semicolon inside an unquoted call argument
# (e.g. `gem "ror" if func(a; b)`) would be misidentified as a
# statement separator. Bundler's Gemfile DSL does not produce such
# patterns in practice; revisit if the rewriter moves outside Gemfiles.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NOTE covers parenthesis depth but not regex literal context. A bare /re;gex/ on a semicolon-separated line would have the interior ; misidentified as a statement separator.

This is safe in practice because removable_statement_node returns the full IfNode for postfix-guarded calls, keeping the regex's ; inside the node's byte range and below the statement_end threshold — but that coupling is non-obvious from this method alone. Suggest extending the NOTE:

Suggested change
# NOTE: tracks string and percent-literal contexts but does not track
# parenthesis depth. A semicolon inside an unquoted call argument
# (e.g. `gem "ror" if func(a; b)`) would be misidentified as a
# statement separator. Bundler's Gemfile DSL does not produce such
# patterns in practice; revisit if the rewriter moves outside Gemfiles.
# NOTE: tracks string and percent-literal contexts but does not track
# parenthesis or regex-literal depth. A semicolon inside a call argument
# (e.g. `gem "ror" if func(a; b)`) would be misidentified as a statement
# separator. Bare regex postfix guards (`gem "ror" if /a;b/.match?(x)`)
# are safe only because `removable_statement_node` returns the enclosing
# IfNode, placing the regex's `;` inside the node's byte range and below
# the `statement_end` threshold. Bundler Gemfile DSL does not produce
# bare-paren patterns; revisit if the rewriter moves outside Gemfiles.

Comment on lines +30 to +34
on `prism` being present transitively through RuboCop-related dev dependencies
in this monorepo's bundle — `bundle exec` works today, but a future RuboCop
upgrade could drop that transitive dep. If a `LoadError` for `prism` appears,
add `gem "prism"` to `react_on_rails/Gemfile.development_dependencies` (the
spike is intentionally not promoting `prism` to a production runtime dep until

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix for the fragile transitive dep is already documented here, but it's left for contributors to discover at LoadError time. Consider applying it proactively — add gem "prism" to Gemfile.development_dependencies in the same commit so Ruby < 3.3 contributors can run the spike without detective work.

@justin808 justin808 merged commit d8f7c36 into main Jun 2, 2026
53 checks passed
@justin808 justin808 deleted the jg-conductor/3313-fix branch June 2, 2026 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Spike: evaluate Prism-based Gemfile rewriting for Pro migration generator

1 participant