Skip to content

chore(fsharp): align npm grammar with cargo at v0.3.0#1165

Merged
carlos-alm merged 2 commits into
fix/1114-fsharp-signature-grammarfrom
fix/1161-fsharp-grammar-skew
May 19, 2026
Merged

chore(fsharp): align npm grammar with cargo at v0.3.0#1165
carlos-alm merged 2 commits into
fix/1114-fsharp-signature-grammarfrom
fix/1161-fsharp-grammar-skew

Conversation

@carlos-alm

Copy link
Copy Markdown
Contributor

Summary

  • Bumps tree-sitter-fsharp to v0.3.0 (installed from the ionide GitHub tarball — npm has no 0.3.0 release) so the WASM engine matches the cargo crate the native engine already uses.
  • Removes the dual-shape detection in handleValueDefinition / handle_value_definition (both engines) — only the v0.3.0 shape (curried_specarguments_spec) is checked. The legacy v0.1.0 function_type branch was dead under both engines once they shared a grammar.
  • No behavioral change for callers; user-visible language list and supported extensions unchanged.

Why this lives on top of #1162

The dual-shape detection was introduced in #1162 as a parity workaround. This PR is stacked on fix/1114-fsharp-signature-grammar so the cleanup lands as soon as that PR merges. Base will be reparented to main if #1162 merges first.

Test plan

  • npx vitest run tests/parsers/fsharp.test.ts tests/parsers/fsharp-signature.test.ts — 9 passed
  • cargo test --lib extractors::fsharp (codegraph-core) — 3 passed
  • npx vitest run tests/parsers tests/integration — 587 passed
  • npx vitest run tests/benchmarks/resolution — 205 passed (F# resolution fixture included)
  • Probed both tree-sitter-fsharp.wasm and tree-sitter-fsharp_signature.wasm (v0.3.0): all node kinds consumed by the extractor are present, no ERROR nodes for the standard fixtures.

Closes #1161

The WASM engine pulled tree-sitter-fsharp 0.1.0 from npm while the native
engine used 0.3.0 from crates.io. The two versions diverged in how they
parse type signatures in .fsi files: 0.1.0 emits `function_type` nodes
for `a -> b` types, while 0.3.0 wraps every signature in `curried_spec`
with `arguments_spec` children for function shapes.

The F# extractor was forced to detect both shapes simultaneously, which
is fragile — future grammar churn could silently desync further.

* package.json now installs tree-sitter-fsharp from the ionide v0.3.0
  GitHub tarball (npm has no 0.3.0 release; ionide is the upstream the
  cargo crate also tracks). Lockfile pins via SRI hash.
* Both extractors now check only `curried_spec` → `arguments_spec`,
  removing the dead `function_type` branch from each.

docs check acknowledged: README's F# row already covers .fs/.fsx/.fsi and
the user-facing language count is unchanged; the grammar version is an
internal implementation detail.

Closes #1161
@claude

claude Bot commented May 19, 2026

Copy link
Copy Markdown

Claude finished @carlos-alm's task —— View job


PR Review: chore(fsharp): align npm grammar with cargo at v0.3.0

Todo List:

  • Phase 1: Root Cause Analysis
  • Phase 2: Configuration Change Detection
  • Phase 3: Test Integrity Check
  • Phase 4: Code Quality Review
  • Phase 5: Backlog Compliance Assessment
  • Final Assessment and Rating

Root Cause Analysis ✅

Problem Being Solved: Grammar version skew between npm and cargo versions of tree-sitter-fsharp. The WASM engine was using an older grammar version while the native Rust engine was already on v0.3.0, causing potential parsing inconsistencies.

Why This Approach: Legitimate fix. Updates npm dependency to match the cargo version (v0.3.0) and removes compatibility workaround code that was bridging the grammar differences. This eliminates dual-shape detection that was only needed when engines used different grammar versions.

Risk Assessment: Low risk. This is a proper alignment fix, not symptom hiding. The removal of legacy function_type detection is justified since both engines now share the same grammar.

Configuration Change Detection ✅

Dependency Changes:

  • Updates tree-sitter-fsharp from npm registry to GitHub tarball (0.3.0)
  • This is NOT a relaxation or rule weakening - it's upgrading to a newer grammar version

No High-Risk Changes Detected: No linting rules disabled, test timeouts increased, TypeScript strict mode changes, or build configuration modifications.

Test Integrity Check ✅

Test Files: Both F# test files remain unchanged with no assertions weakened.
PR Test Results: All tests passing according to PR description (9 F# tests, 587 parser tests, 205 resolution benchmark tests).

Code Quality Review ✅

TypeScript Changes: Clean removal of legacy dual-shape detection in src/extractors/fsharp.ts:280-304. Code is more readable with single grammar version.

Rust Changes: Matching cleanup in crates/codegraph-core/src/extractors/fsharp.rs:357-370. Both engines now use identical logic.

Architecture: This properly addresses the CLAUDE.md principle: "Both engines must produce identical results. If they diverge, the less-accurate engine has a bug — fix it, don't document the gap."

Potential Issues: None. The changes are straightforward grammar version alignment with proper test coverage.

Backlog Compliance Assessment ✅

Zero-dep: ✓ — Changes existing tree-sitter dependency to newer version. No new runtime deps added.

Foundation-aligned: ✓ — Fully compliant. Addresses P1 (keeps graph current without breaking changes), maintains dual-engine architecture (P2), and eliminates engine divergence that violated the "identical results" requirement.

Problem-fit: 3/5 — This is infrastructure maintenance that prevents engine divergence issues. Doesn't directly address the core AI agent orientation problem but maintains system reliability.

Breaking: No — Grammar update is internal implementation detail. User-facing language support and API remain unchanged.

Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓)

Critical Concerns

None. This is a well-executed maintenance fix.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: Solid infrastructure fix that eliminates dual-engine parsing inconsistencies. Proper test coverage, clean code changes, zero breaking changes. The only reason it's not 5/5 is that it's maintenance rather than feature work that directly benefits users.

@greptile-apps

greptile-apps Bot commented May 19, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR aligns the WASM engine's F# grammar version with the native (cargo) engine by pinning tree-sitter-fsharp to v0.3.0 via a GitHub tarball, and removes the now-dead dual-shape detection code that was introduced as a parity workaround in #1162.

  • Grammar pin: tree-sitter-fsharp is bumped from ^0.1.0 to the 0.3.0 tarball; both the npm (WASM) and cargo (native) engines now share the same grammar version, eliminating the v0.1.0 vs v0.3.0 split.
  • Dead-code removal: The function_type branch in handleValueDefinition (TS) and has_function_type (Rust) that targeted the old WASM grammar shape is deleted; only the v0.3.0 curried_spec/arguments_spec path remains, and findChild semantics are preserved (direct-child-only search in both languages).
  • Documentation: A docblock was added to src/extractors/fsharp.ts explaining why the tarball form is used and that npm update will not bump the entry, addressing the prior review thread.

Confidence Score: 5/5

Safe to merge — the change removes dead code that was unreachable once both engines shared the v0.3.0 grammar, and the package bump is mechanically straightforward.

The simplification in both the TypeScript and Rust extractors correctly preserves the existing findChild semantics (direct-child-only search), the removed function_type branch was genuinely dead under both engines at v0.3.0, and the tarball pin is the expected workaround for the missing npm release. No behavioural change for callers.

No files require special attention.

Important Files Changed

Filename Overview
src/extractors/fsharp.ts Removes the v0.1.0 function_type detection branch from handleValueDefinition; now uses findChild(node, 'curried_spec') + arguments_spec loop only. Adds a module docblock explaining the tarball pin. Logic is correct — findChild is shallow (direct children), matching the old manual loop.
crates/codegraph-core/src/extractors/fsharp.rs Simplifies has_function_type to use find_child(node, "curried_spec") + arguments_spec scan only, matching the TypeScript change. Rust find_child is also direct-child-only, so behaviour is identical to the old manual loop without the now-dead function_type arm.
package.json Replaces ^0.1.0 semver range with a pinned GitHub tarball URL for v0.3.0, consistent with the cargo crate version.
package-lock.json Lock entry updated to reflect v0.3.0 tarball; peer dependency bumped from ^0.21.0 to ^0.22.4 for the optional native tree-sitter binding, and transitive deps node-addon-api and node-gyp-build bumped accordingly.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["value_definition node"] --> B["first child == 'val'?"]
    B -- No --> Z["return / ignore"]
    B -- Yes --> C["findChild(node, 'value_declaration_left')"]
    C -- not found --> Z
    C -- found --> D["findChild(declLeft, 'identifier_pattern')"]
    D -- not found --> Z
    D -- found --> E["extract identifier name"]
    E --> F["findChild(node, 'curried_spec')"]
    F -- not found --> G["hasFunctionType = false"]
    F -- found --> H["scan curried children for 'arguments_spec'"]
    H -- found --> I["hasFunctionType = true"]
    H -- not found --> G
    G --> J["emit kind: 'variable'"]
    I --> K["emit kind: 'function'"]
Loading

Reviews (3): Last reviewed commit: "docs(fsharp): explain tree-sitter-fsharp..." | Re-trigger Greptile

Comment thread package.json
@@ -162,7 +162,7 @@
"tree-sitter-dart": "^1.0.0",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 GitHub tarball pin has no semver upgrade path

Because the npm registry has no 0.3.0 release, the tarball URL is the right choice here. Worth noting: unlike ^0.x.y semver ranges, npm update will never touch this entry — upgrading to a future release (e.g. 0.4.0) requires a manual URL edit in both package.json and package-lock.json. A short comment in package.json would help future maintainers know why the tarball form was chosen instead of a registry reference.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added a docblock to src/extractors/fsharp.ts explaining (a) why the tarball form is used (no v0.3.0 on npm), (b) that the cargo crate uses the same version, and (c) that upgrading requires a manual URL edit in both package.json and package-lock.json because npm update won't touch this entry. Put it in the extractor docblock rather than package.json since pure JSON doesn't support comments and a _comment field would be non-standard for this repo.

@github-actions

github-actions Bot commented May 19, 2026

Copy link
Copy Markdown
Contributor

Codegraph Impact Analysis

2 functions changed4 callers affected across 2 files

  • has_function_type in crates/codegraph-core/src/extractors/fsharp.rs:357 (2 transitive callers)
  • handleValueDefinition in src/extractors/fsharp.ts:271 (2 transitive callers)

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit adcaf40 into fix/1114-fsharp-signature-grammar May 19, 2026
29 checks passed
@carlos-alm carlos-alm deleted the fix/1161-fsharp-grammar-skew branch May 19, 2026 23:11
@github-actions github-actions Bot locked and limited conversation to collaborators May 19, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant