fix(wasm): emit receiver edges for declaration-typed locals (cpp, cuda) #1497

Merged
carlos-alm merged 23 commits into main from fix/wasm-receiver-decl-typed-1466
Jun 14, 2026
Conversation

@carlos-alm

Contributor

WASM engine was missing `receiver` edges for declaration-typed locals in C++ and CUDA (e.g. `UserService svc; svc.method()` → receiver edge `caller → UserService`). Native correctly emitted these at conf=0.9.

Root cause

The JS C++ and CUDA extractors had no handler for `declaration` AST nodes, so `typeMap` was never seeded for statically-typed locals. Without a typeMap entry for `svc`, `resolveReceiverEdge` found nothing to look up and silently skipped the receiver edge.

The native Rust path runs a second `walk_tree` pass with `match_c_family_type_map`, which handles the `declaration` node kind: it reads the `type` field and each `identifier` / `init_declarator` child to call `push_type_map_entry(symbols, varName, typeName)` at confidence 0.9.

Fix

Added `handleCppDeclaration` to `src/extractors/cpp.ts` and `handleCudaDeclaration` to `src/extractors/cuda.ts`. Both mirror the `match_c_family_type_map` "declaration" branch exactly: extract the type node text and seed `typeMap[varName] = { type, confidence: 0.9 }` for each identifier or `init_declarator` child. Primitive types (`int`, `char`, `bool`, …) are skipped to avoid spurious edges.
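
The handler logic described above can be sketched roughly as follows. This is a minimal illustration, not the code from `src/extractors/cpp.ts`: `SketchNode`, `SketchTypeEntry`, `SKETCH_PRIMITIVES`, `seedDeclarationTypes`, and the `node` builder are all hypothetical stand-ins, and the tree-sitter node API is reduced to plain objects so the example runs on its own.

```typescript
// Minimal stand-in for a tree-sitter node; only what this sketch needs.
interface SketchNode {
  type: string;
  text: string;
  children: SketchNode[];
  fields: { [name: string]: SketchNode | undefined };
}

interface SketchTypeEntry {
  type: string;
  confidence: number;
}

// Illustrative subset; the extractors define their own primitive sets.
const SKETCH_PRIMITIVES = new Set<string>(['int', 'char', 'bool', 'float', 'double']);

function seedDeclarationTypes(
  decl: SketchNode,
  typeMap: Map<string, SketchTypeEntry>,
): void {
  const typeNode = decl.fields['type'];
  if (!typeNode) return;
  const typeName = typeNode.text;
  if (SKETCH_PRIMITIVES.has(typeName)) return; // skip primitives

  for (const child of decl.children) {
    let nameNode: SketchNode | undefined;
    if (child.type === 'init_declarator') {
      nameNode = child.fields['declarator'];
    } else if (child.type === 'identifier') {
      nameNode = child;
    }
    // The real handlers unwrap nested declarators via existing helpers;
    // this sketch only accepts a bare identifier.
    if (nameNode !== undefined && nameNode.type === 'identifier') {
      typeMap.set(nameNode.text, { type: typeName, confidence: 0.9 });
    }
  }
}

// Tiny node builder used only for the usage example below.
function node(
  type: string,
  text: string,
  children: SketchNode[] = [],
  fields: { [name: string]: SketchNode | undefined } = {},
): SketchNode {
  return { type, text, children, fields };
}

// Hand-built tree for `UserService svc;`
const userServiceType = node('type_identifier', 'UserService');
const svcName = node('identifier', 'svc');
const svcDecl = node('declaration', 'UserService svc;', [userServiceType, svcName], {
  type: userServiceType,
  declarator: svcName,
});

const sketchTypeMap = new Map<string, SketchTypeEntry>();
seedDeclarationTypes(svcDecl, sketchTypeMap);
```

With the entry for `svc` in place, a later `svc.method()` call has something to look up, which is the precondition `resolveReceiverEdge` was missing.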

Verification

```
node scripts/parity-compare.mjs --langs cpp,cuda --hybrid

PARITY OK — 2 fixture(s), all engines identical

```

3044 tests pass, 0 failures.

Closes #1466

… diff

Adds snapshot-pre-bash.sh (PreToolUse Bash) + track-bash-writes.sh
(PostToolUse Bash): the pre-hook captures git status --porcelain to a
per-worktree temp file before each Bash call; the post-hook diffs the
before/after state and appends newly modified or created files to
.claude/session-edits.log.

This closes the gap where files written by sed -i, printf redirects,
tee, heredocs, or build tools (Cargo.lock, lockfiles) were never
recorded, causing guard-git.sh to emit false-positive BLOCKED errors.

Closes #1457
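
The before/after comparison those hooks perform can be sketched as a pure function over two `git status --porcelain` snapshots. The hooks themselves are shell scripts; `parsePorcelain` and `newlyTouched` are hypothetical names used only for illustration, and the sketch assumes the v1 porcelain format with unquoted paths.

```typescript
// Parse `git status --porcelain` (v1) output into a path -> status-code map.
function parsePorcelain(output: string): Map<string, string> {
  const entries = new Map<string, string>();
  for (const line of output.split('\n')) {
    if (line.length < 4) continue; // "XY path" needs at least 4 characters
    const status = line.slice(0, 2);
    let path = line.slice(3);
    // Renames are reported as "old -> new"; keep the new path.
    const arrow = path.indexOf(' -> ');
    if (arrow !== -1) path = path.slice(arrow + 4);
    entries.set(path, status);
  }
  return entries;
}

// Paths that are new in `after`, or whose status code changed since `before`.
function newlyTouched(before: string, after: string): string[] {
  const prev = parsePorcelain(before);
  const next = parsePorcelain(after);
  const touched: string[] = [];
  next.forEach((status, path) => {
    if (prev.get(path) !== status) touched.push(path);
  });
  return touched;
}
```

One limitation of this sketch: a file that was already modified before the Bash call and is modified again keeps the same status code, so a status-only comparison does not see the second write.
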
- clojure.rs: annotate lifetime-anchor assignment to silence false-positive
- cfg.rs: remove never-called start_line_of method
- complexity.rs: remove never-constructed NotHandled variant; convert
  irrefutable if-let patterns to plain let destructures
- dataflow.rs: remove never-read callee fields from CallReturn/Destructured
- incremental.rs: remove never-read lang field from CacheEntry

cargo check and cargo clippy both clean after these changes.
Adds .github/workflows/perf-canary.yml — a path-filtered workflow that
fires on PRs touching src/extractors/, src/domain/graph/, or crates/**
and runs only the incremental-benchmark suite (full build + no-op +
1-file rebuild, both engines). Catches the class of regressions that
accumulated invisibly across the Phase 8.x PRs and were only detected
at v3.12.0 publish time.

The regression guard gains BENCH_CANARY=1 mode: raises thresholds to
50%/100%/150% (standard/noisy/WASM) and skips the build, query, and
resolution suites — only incremental checks run. This absorbs shared-
runner timing variance while still blocking catastrophic regressions
(+98% full build, +1827% 1-file rebuild from v3.12.0).

Closes #1433
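
As a rough illustration of the canary thresholds, the check below compares a measurement against its baseline using the 50%/100%/150% limits quoted above. `MetricClass`, `CANARY_THRESHOLDS`, and `isCanaryRegression` are hypothetical names, and which benchmark falls into which class is not stated in the commit message.

```typescript
type MetricClass = 'standard' | 'noisy' | 'wasm';

// Allowed fractional slowdown per class in canary mode (from the commit message).
const CANARY_THRESHOLDS: Record<MetricClass, number> = {
  standard: 0.5,
  noisy: 1.0,
  wasm: 1.5,
};

// True when the slowdown relative to the baseline exceeds the class threshold.
function isCanaryRegression(
  baselineMs: number,
  currentMs: number,
  metricClass: MetricClass,
): boolean {
  if (baselineMs <= 0) return false; // nothing meaningful to compare against
  const increase = (currentMs - baselineMs) / baselineMs;
  return increase > CANARY_THRESHOLDS[metricClass];
}
```

Under these limits a +1827% slowdown is blocked in every class, while a +98% slowdown is blocked only where the 50% limit applies.
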
On incremental builds, runPostNativeCha previously scanned all
call→qualified-method edges in the DB (~12ms flat, O(graph size)),
even for 1-file changes where no hierarchy or RTA evidence changed.

Add two cheap indexed gate queries. Gate A checks whether any changed
file introduced a class/interface/trait/struct/record node (hierarchy
may have new implementors reachable from unchanged call sites). Gate B
checks whether any changed file added a call edge to a class-kind target
(RTA set may have grown, enabling previously filtered expansions in
unchanged callers). If neither gate fires, restrict the candidate query
to src.file IN changedFiles — safe because the hierarchy and instantiated
set are unchanged for all other files.

Full builds (isFullBuild=true) and cases where either gate fires retain
the existing full-scan behaviour. Mirrors the changed-files scoping
pattern of runPostNativeThisDispatch.

Closes #1441
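
The gate decision described above reduces to a small function. The sketch below assumes the two gate queries have already been run and their results are passed in as booleans; `ChaGateInput`, `ChaScope`, and `chooseChaScope` are hypothetical names, not identifiers from the codebase.

```typescript
interface ChaGateInput {
  isFullBuild: boolean;
  // Gate A: a changed file introduced a class/interface/trait/struct/record node.
  hierarchyChanged: boolean;
  // Gate B: a changed file added a call edge to a class-kind target.
  rtaSetGrew: boolean;
  changedFiles: string[];
}

type ChaScope =
  | { kind: 'full-scan' }
  | { kind: 'changed-files'; files: string[] };

function chooseChaScope(input: ChaGateInput): ChaScope {
  if (input.isFullBuild || input.hierarchyChanged || input.rtaSetGrew) {
    // Unchanged callers may gain new expansions, so every candidate is scanned.
    return { kind: 'full-scan' };
  }
  // Hierarchy and instantiated set are unchanged for all other files.
  return { kind: 'changed-files', files: input.changedFiles };
}
```
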
Times each JS post-pass in tryNativeOrchestrator and exposes the
measurements in BuildResult.phases:

- gapDetectMs  — dropped-language gap detection + backfill
- chaMs        — CHA expansion (interface dispatch)
- thisDispatchMs — this/super dispatch WASM re-parse (was already
                   tracked but now properly named alongside the rest)
- reclassifyMs — scoped role re-classification after edge insertion
- techniqueBackfillMs — technique-column UPDATE on native-written edges

Previously only thisDispatchMs was reported, causing wall-clock vs
phaseSum to diverge by 1.1s+ on 1-file rebuilds and making benchmark
regressions undiagnosable from committed history.

Updates update-incremental-report.ts to render the new phases in a
collapsible details block under each engine's 1-file rebuild section.

Closes #1434
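
A minimal sketch of per-phase timing follows, assuming synchronous post-passes and an injected clock so the example is deterministic. `PhaseTimings`, `timePhase`, `phaseSum`, and `unaccountedMs` are hypothetical names; only the phase keys come from the commit message.

```typescript
type PhaseTimings = { [phase: string]: number };

// Run one post-pass and add its elapsed time to the named phase.
function timePhase<T>(
  phases: PhaseTimings,
  name: string,
  now: () => number,
  run: () => T,
): T {
  const start = now();
  try {
    return run();
  } finally {
    const previous = phases[name];
    phases[name] = (previous === undefined ? 0 : previous) + (now() - start);
  }
}

// Total time attributed to named phases.
function phaseSum(phases: PhaseTimings): number {
  let total = 0;
  for (const key of Object.keys(phases)) total += phases[key];
  return total;
}

// Wall-clock time that no phase accounts for.
function unaccountedMs(wallClockMs: number, phases: PhaseTimings): number {
  return wallClockMs - phaseSum(phases);
}
```

When only one post-pass is timed, `unaccountedMs` is large; timing every post-pass drives it toward zero, which is what makes a regression attributable to a specific phase.
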
…ld for required-tier grammars

The docstring claimed pool cost was "amortised over enough parse work" —
measurements show IPC overhead scales linearly (~55–64ms/file pool vs
~8–10ms/file inline). The real motivation is crash safety for exotic WASM
grammars (#965); JS/TS/TSX (required-tier, used in all this-dispatch
backfill calls) have never triggered the V8 fatal crash class and are safe
to run inline.

Raise threshold 16 → 32 to keep typical this-dispatch batches (≤ 18 files
on the codegraph corpus) on the inline fast path. Exotic-language drops are
almost always well under 32 files and also benefit from the inline path
without meaningful crash risk increase.

Closes #1435
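
The per-file figures above make the trade-off easy to work through. The sketch below uses hypothetical names (`usesWorkerPool`, `estimateMs`) and assumes the pool is chosen once a batch reaches the threshold; the exact comparison used in the code is not shown in the commit message.

```typescript
// Per-file cost ranges quoted in the commit message, in milliseconds.
const POOL_MS_PER_FILE: [number, number] = [55, 64];
const INLINE_MS_PER_FILE: [number, number] = [8, 10];

// Assumption: batches at or above the threshold go to the worker pool.
function usesWorkerPool(fileCount: number, threshold: number = 32): boolean {
  return fileCount >= threshold;
}

// Low/high estimate for a batch, given a per-file cost range.
function estimateMs(
  fileCount: number,
  perFile: [number, number],
): [number, number] {
  return [fileCount * perFile[0], fileCount * perFile[1]];
}

const typicalBatch = 18;
const poolEstimate = estimateMs(typicalBatch, POOL_MS_PER_FILE);
const inlineEstimate = estimateMs(typicalBatch, INLINE_MS_PER_FILE);
```

For an 18-file batch this works out to 990 to 1152 ms through the pool against 144 to 180 ms inline. With the old threshold of 16 such a batch would have gone to the pool; with 32 it stays inline.
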
…e incremental rebuilds

On 1-file native incremental builds, two JS post-passes ran unconditionally
even when they had no work to do:

- `backfillNativeDroppedFiles`: called whenever changedCount > 0, even when
  detectDroppedLanguageGap returned an empty gap. Gate now checks
  gap.missingAbs.length > 0 || gap.staleRel.length > 0 directly, matching
  backfillNativeDroppedFiles's own internal early-exit guard.

- Node/edge COUNT(*) re-count: ran unconditionally after all post-passes even
  when none of them wrote any edges. COUNT(*) over 50K+ edge tables is
  non-trivial, especially via the NativeDbProxy napi-rs round-trip. Now gated
  on postPassWroteData (backfill | CHA edges | this-dispatch edges).

Closes #1454
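
Both gates amount to simple predicates. In the sketch below, `missingAbs` and `staleRel` are the field names given above, while `DroppedLanguageGap`, `PostPassCounts`, `shouldBackfill`, and `shouldRecount` are hypothetical names chosen for illustration.

```typescript
interface DroppedLanguageGap {
  missingAbs: string[];
  staleRel: string[];
}

// Backfill only when the gap actually contains files.
function shouldBackfill(gap: DroppedLanguageGap): boolean {
  return gap.missingAbs.length > 0 || gap.staleRel.length > 0;
}

// Hypothetical summary of what each post-pass wrote.
interface PostPassCounts {
  backfilledFiles: number;
  chaEdges: number;
  thisDispatchEdges: number;
}

// Re-run the node/edge COUNT(*) queries only when a post-pass wrote data.
function shouldRecount(counts: PostPassCounts): boolean {
  return (
    counts.backfilledFiles > 0 ||
    counts.chaEdges > 0 ||
    counts.thisDispatchEdges > 0
  );
}
```
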
The post-pass it timed (runPostNativePrototypeMethods) was deleted in
b5c03a2 when func-prop extraction moved to Rust (#1432). The optional
field was never set by any code path that survived the deletion.

Also remove the stale reference to "prototype-methods post-pass" from
the parseFilesWasmForBackfill docstring — only the this-dispatch
post-pass uses symbolsOnly now.

Closes #1432
… collision

Field type annotations (`private repo: OrderRepository`) were seeded as bare
file-wide typeMap keys, causing `this.repo` inside `UserService` to resolve to
`OrderRepository` when both classes had a `repo` field (issue #1458).

Both extractors (TS `handleFieldDefTypeMap` and Rust `field_definition` branch)
now seed `ClassName.field` keys at confidence 0.9, matching the `CallerClass.X`
resolver fallback added in PR #1382. Bare keys are kept at confidence 0.6 as
fallbacks for single-class files or class expressions where no enclosing class
name is available.

Both engines change identically — parity preserved.
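
The collision and its fix can be illustrated with a small sketch. `FieldTypeEntry`, `seedFieldType`, and `resolveThisField` are hypothetical names, and the policy of keeping the first bare key seen is an assumption made for this example only.

```typescript
interface FieldTypeEntry {
  type: string;
  confidence: number;
}

// Seed a class-qualified key at 0.9 plus a bare fallback key at 0.6.
function seedFieldType(
  typeMap: Map<string, FieldTypeEntry>,
  className: string | null,
  field: string,
  type: string,
): void {
  if (className !== null) {
    typeMap.set(`${className}.${field}`, { type, confidence: 0.9 });
  }
  // Assumption for this sketch: the first bare key seen is kept.
  if (!typeMap.has(field)) {
    typeMap.set(field, { type, confidence: 0.6 });
  }
}

// Resolve `this.<field>` by trying the caller's class first, then the bare key.
function resolveThisField(
  typeMap: Map<string, FieldTypeEntry>,
  callerClass: string | null,
  field: string,
): FieldTypeEntry | undefined {
  if (callerClass !== null) {
    const qualified = typeMap.get(`${callerClass}.${field}`);
    if (qualified !== undefined) return qualified;
  }
  return typeMap.get(field);
}

const fieldTypes = new Map<string, FieldTypeEntry>();
seedFieldType(fieldTypes, 'UserService', 'repo', 'UserRepository');
seedFieldType(fieldTypes, 'OrderService', 'repo', 'OrderRepository');
```

With qualified keys, `this.repo` inside `UserService` and inside `OrderService` resolve to different types even though both classes declare a `repo` field in the same file.
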
…ed names

The resolution benchmark uses WASM-built graphs where the Elixir, Julia,
and Objective-C extractors emit module-qualified symbol names (Main.run,
App.main, UserService.create_user, etc.). The expected-edges manifests
were written with bare unqualified names (run, main, create_user), so
every correctly-resolved edge appeared as a false positive and every
expected edge appeared as a false negative — causing all three languages
to show 0% precision even though resolution was working correctly.

Root cause: starting in v3.12.0, cross-module call resolution began working
for these languages (via the improved receiver-dispatch and same-class
fallback in resolveByMethodOrGlobal / build-edges.ts). With 0 edges
previously resolved, the name mismatch was invisible; once edges started
resolving, the manifests showed 17 FP (elixir), 11 FP (julia), 6 FP
(objc) — all correctly resolved edges misidentified as false positives.

Fix:
- Update all three expected-edges.json manifests to use the
  module-qualified names matching actual extractor output:
  elixir: Main.run, UserService.create_user, Validators.validate_user, etc.
  julia:  App.main, Service.create_user, Repository.new_repo, etc.
  objc:   full ObjC selectors (createUserWithId:name:email:, isValidEmail:, etc.)
          plus add main -> run (plain C call correctly resolved)
- Ratchet THRESHOLDS for all three:
  elixir: precision 0.0 -> 1.0, recall 0.0 -> 0.8  (17/21 resolved)
  julia:  precision 0.0 -> 1.0, recall 0.0 -> 0.7  (11/15 resolved)
  objc:   precision 0.0 -> 1.0, recall 0.0 -> 0.4   (6/13 resolved)

Remaining FNs are genuine unresolved edges (same-file bare calls in
elixir/julia, receiver-typed message sends in objc) — not regressions.

Closes #1447
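
To see why a naming mismatch produces 0% precision, consider how a benchmark of this kind might score edges. The sketch below is an assumption about the scoring, not the benchmark's actual code: `scoreEdges` is a hypothetical name, edges are compared as plain strings, and precision is defined as 0 when nothing is resolved.

```typescript
interface EdgeScore {
  tp: number;
  fp: number;
  fn: number;
  precision: number;
  recall: number;
}

// Compare resolved edges against a manifest by exact string match.
function scoreEdges(expected: string[], actual: string[]): EdgeScore {
  const expectedSet = new Set<string>(expected);
  const actualSet = new Set<string>(actual);
  let tp = 0;
  let fp = 0;
  actualSet.forEach((edge) => {
    if (expectedSet.has(edge)) tp++;
    else fp++;
  });
  const fn = expectedSet.size - tp;
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  return { tp, fp, fn, precision, recall };
}

// A bare manifest name never matches a module-qualified extractor name.
const mismatched = scoreEdges(['main -> run'], ['App.main -> App.run']);
```

In `mismatched`, the one correct edge counts as both a false positive and a false negative. Once the manifests use qualified names, the recall figures follow directly: 17/21 ≈ 0.81, 11/15 ≈ 0.73, and 6/13 ≈ 0.46, each just above its ratcheted threshold.
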
The JS C++ and CUDA extractors had no handler for 'declaration' AST nodes,
so typeMap was never seeded for statically-typed locals (e.g. 'UserService svc;').
Without a typeMap entry for 'svc', resolveReceiverEdge had nothing to look up and
silently skipped the receiver edge.

Add handleCppDeclaration / handleCudaDeclaration to both extractors. They mirror
match_c_family_type_map ('declaration' branch) from the native Rust path: extract
the type node text and seed typeMap[varName] = { type, confidence: 0.9 } for each
identifier or init_declarator child. Primitive types (int, char, bool, …) are
skipped to avoid spurious edges.

parity-compare.mjs --langs cpp,cuda --hybrid: PARITY OK (wasm = native = hybrid)
All 3044 tests pass.
@claude

claude Bot commented Jun 13, 2026


Claude finished @carlos-alm's task


PR Review: fix(wasm): emit receiver edges for declaration-typed locals (cpp, cuda)

  • Gather context and read changed files
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity check
  • Phase 4: Code quality review
  • Phase 5: Backlog compliance
  • Final assessment

Root Cause Analysis

Problem Being Solved: WASM C++/CUDA extractors never handled declaration AST nodes, so typeMap was never seeded for statically-typed locals. When resolveReceiverEdge looked up the receiver in typeMap it found nothing and silently dropped the edge. The native Rust engine ran a separate walk_tree pass with match_c_family_type_map which did handle declaration, so native correctly emitted these edges at conf=0.9.

Why This Approach: The fix mirrors the exact native Rust logic. The declaration case is wired into the existing walkCppNode/walkCudaNode switch, so no architectural change is needed — just a missing case. The unwrapCppDeclaratorName/extractCudaFieldName helpers are already correct for this scenario, reused here.

Risk Assessment: Low. The change is purely additive (adds new typeMap entries that previously didn't exist), can only produce more edges, and is bounded by the primitive filter and the downstream RECEIVER_KINDS filter in resolveReceiverEdge. Parity verified via the parity-compare.mjs script.


Phase 2: Configuration Changes

None. No linter rules, tsconfig flags, or test thresholds were touched.


Phase 3: Test Integrity

No tests were modified or removed. The resolution-benchmark fixtures already contain the relevant UserService svc; declarations and expected receiver-typed edges — the fix causes those pre-existing expected edges to pass. No assertions were weakened.


Phase 4: Code Quality

Minor — TypeMapEntry import added but unused by name

Both files now import TypeMapEntry from ../types.js (cpp.ts:7, cuda.ts:7), but neither handler references it explicitly — the object literal { type: typeName, confidence: 0.9 } is structurally typed by ctx.typeMap's existing Map<string, TypeMapEntry> signature. Before this PR, TypeMapEntry was not imported in either file.

Whether this counts as unused depends on the TypeScript/Biome configuration. If the linter runs with noUnusedVariables/noUnusedImports on type-only imports, this will warn. The import itself is harmless (it's a type import — zero runtime cost), but it should either be removed (since it's not explicitly annotating anything in the new code) or kept only if it's intentionally documenting the shape of what's being set. Given the existing patterns in the other extractors that do use TypeMapEntry (e.g. helpers.ts uses it as an explicit parameter type), this is fine to keep for clarity — but worth checking that npm run lint passes cleanly.

handleCudaDeclaration uses extractCudaFieldName while handleCppDeclaration uses unwrapCppDeclaratorName — this is correct and intentional. CUDA has a richer declarator model (function-pointer fields, parenthesized_declarator fallback) that extractCudaFieldName handles, while C++ uses the CPP_DECLARATOR_WRAPPERS-based unwrapCppDeclaratorName. Mirrors the native divergence between unwrap_cpp_declarator and unwrap_cuda_declarator in Rust.

Primitive filter divergence from native Rust: The WASM engines filter primitive types at extraction time (isPrimitiveCppType / isCudaPrimitiveType); native Rust has no such filter but achieves the same result because resolveReceiverEdge filters by RECEIVER_KINDS and a primitive like int will never appear as a class/struct node. The WASM approach is a valid early-out optimization and is consistent with the // Skip primitive types comment already in the new code. No actual behavioral difference — not a bug.

CUDA_PRIMITIVE_TYPES and CPP_PRIMITIVE_TYPES are identical sets (lines 365–370 in cpp.ts, 415–420 in cuda.ts). This is acceptable duplication given that the two extractors are deliberately kept separate (different WASM grammars, different declarator unwrap helpers), but a shared C_FAMILY_PRIMITIVE_TYPES constant in helpers.ts would be DRY-er. Not a blocking concern.
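
A shared constant along the lines suggested here might look like the sketch below. The set contents are illustrative only (the PR mentions `int`, `char`, `bool`, and others without listing them all), and `isCFamilyPrimitive` is a hypothetical helper name.

```typescript
// Illustrative contents; the real sets in cpp.ts and cuda.ts may differ.
const C_FAMILY_PRIMITIVE_TYPES: ReadonlySet<string> = new Set<string>([
  'int',
  'char',
  'bool',
  'float',
  'double',
  'void',
  'short',
  'long',
]);

// Single check both extractors could share.
function isCFamilyPrimitive(typeName: string): boolean {
  return C_FAMILY_PRIMITIVE_TYPES.has(typeName.trim());
}

// Each extractor could keep its existing function name as a thin alias.
const isPrimitiveCppType = isCFamilyPrimitive;
const isCudaPrimitiveType = isCFamilyPrimitive;
```

Keeping the per-extractor aliases would leave call sites untouched while removing the risk of the two sets drifting apart.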


Phase 5: Backlog Compliance

| Axis | Result |
| --- | --- |
| Zero-dep | ✓ (no new runtime dependencies) |
| Foundation-aligned | ✓ (purely additive WASM extraction fix, no cloud, no code gen) |
| Problem-fit | 5: directly fixes an engine parity bug that causes hallucinated (missing) dependency edges for C++/CUDA codebases; an AI agent using codegraph on a C++ repo would miss receiver relationships entirely without this |
| Breaking | No (purely additive, typeMap gains entries it previously lacked) |
| Tier | 1 |

Critical Concerns

None blocking.

Non-blocking observation: The TypeMapEntry import is new in both files but not explicitly referenced by name in the added code. It will compile fine (structurally compatible), but should be verified against npm run lint to confirm Biome doesn't flag it as an unused import.


Final Recommendation

  • Rating: ⭐⭐⭐⭐ (4/5)
  • Action: APPROVE
  • Reasoning: Correct fix for a real parity bug. The declaration handler mirrors the native Rust match_c_family_type_map "declaration" branch faithfully, reuses existing declarator-unwrap helpers correctly, includes a sound primitive filter, and is covered by the existing resolution-benchmark fixtures. Parity-compare verification is clean. The only minor issue is the TypeMapEntry import that is structurally needed (the map's value type) but not explicitly annotated in the new code — verify npm run lint passes before merging.

@greptile-apps

greptile-apps Bot commented Jun 13, 2026

Contributor

Greptile Summary

Fixes missing receiver edges for declaration-typed locals in the WASM C++ and CUDA extractors by adding handleCppDeclaration / handleCudaDeclaration, which seed typeMap at confidence 0.9 — exactly mirroring the Rust match_c_family_type_map "declaration" branch. Also removes a duplicate PostPassTimings interface declaration in native-orchestrator.ts.

  • src/extractors/cpp.ts and src/extractors/cuda.ts: new declaration handlers iterate identifier and init_declarator children of each declaration node, skipping primitive types via identical CPP_PRIMITIVE_TYPES / CUDA_PRIMITIVE_TYPES sets; both sets are currently duplicated rather than shared.
  • src/domain/graph/builder/stages/native-orchestrator.ts: duplicate interface definition removed with no semantic change.

Confidence Score: 5/5

Safe to merge — the new declaration handlers are a faithful, well-scoped port of the existing Rust logic with no impact on other extraction paths.

The change is additive and narrowly scoped: two new functions seed typeMap for declaration nodes, a known-missing path confirmed by parity tests. The logic mirrors the Rust extractor exactly, and the only issues found are the duplicated primitive-type set between the two files and a dead OR branch in the primitive-check helpers — neither affects correctness.

No files require special attention. The duplicated CPP_PRIMITIVE_TYPES / CUDA_PRIMITIVE_TYPES sets across cpp.ts and cuda.ts are worth consolidating to prevent future drift, but are not a blocker.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/extractors/cpp.ts | Adds handleCppDeclaration to seed typeMap for declaration-typed locals; introduces CPP_PRIMITIVE_TYPES set and isPrimitiveCppType guard. Logic correctly mirrors the Rust match_c_family_type_map handler. Minor: identical primitive-type set is duplicated in cuda.ts, and the second OR branch in isPrimitiveCppType is dead code. |
| src/extractors/cuda.ts | Adds handleCudaDeclaration, an exact structural mirror of the C++ handler using extractCudaFieldName for declarator name resolution. CUDA_PRIMITIVE_TYPES is identical to CPP_PRIMITIVE_TYPES; same redundant OR branch in isCudaPrimitiveType. |
| src/domain/graph/builder/stages/native-orchestrator.ts | Removes a duplicate PostPassTimings interface declaration, a straightforward cleanup with no semantic impact. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[walkCppNode / walkCudaNode] -->|declaration node| B[handleDeclaration]
    B --> C{type field found?}
    C -->|No| Z[return]
    C -->|Yes - typeName| D{isPrimitive?}
    D -->|true| Z
    D -->|false| E[iterate child nodes]
    E -->|identifier| F[nameNode = child]
    E -->|init_declarator| G[nameNode = declarator field]
    E -->|other| H[skip]
    F --> I[unwrap to varName]
    G --> I
    I -->|varName found| J[typeMap.set varName to typeName at conf 0.9]
    J --> K[resolveReceiverEdge emits receiver edge]

Reviews (6): Last reviewed commit: "fix: remove duplicate PostPassTimings in..."

Comment thread src/extractors/cpp.ts
Comment on lines +228 to +232
    if (kind === 'init_declarator') {
      nameNode = child.childForFieldName('declarator') ?? null;
    } else if (kind === 'identifier') {
      nameNode = child;
    }

Contributor

P2 Pointer/reference declarators silently skipped

The handler only seeds typeMap for direct identifier children and init_declarator children. For UserService *svc; or UserService &svc = ...;, the declarator field of the declaration node is a pointer_declarator / reference_declarator, not a bare identifier — so those variables never get a typeMap entry and svc->method() calls won't generate receiver edges. This may intentionally mirror the Rust match_c_family_type_map limitation; if so, a comment noting the gap would be helpful. The same applies in src/extractors/cuda.ts.

Contributor Author

Fixed — added a clarifying comment in both handleCppDeclaration and handleCudaDeclaration (commit d3d83ef) documenting that this is intentional: the native Rust match_c_family_type_map helper in crates/codegraph-core/src/extractors/helpers.rs also only iterates init_declarator and identifier children of a declaration node — pointer_declarator/reference_declarator children are skipped on both sides. Both engines have identical scope here, so this is not a parity gap.

@github-actions

github-actions Bot commented Jun 13, 2026

Contributor

Codegraph Impact Analysis

6 functions changed, 6 callers affected across 2 files

  • walkCppNode in src/extractors/cpp.ts:27 (1 transitive callers)
  • handleCppDeclaration in src/extractors/cpp.ts:216 (2 transitive callers)
  • isPrimitiveCppType in src/extractors/cpp.ts:395 (3 transitive callers)
  • walkCudaNode in src/extractors/cuda.ts:40 (1 transitive callers)
  • handleCudaDeclaration in src/extractors/cuda.ts:216 (2 transitive callers)
  • isCudaPrimitiveType in src/extractors/cuda.ts:445 (3 transitive callers)

…sets to one-per-line

Biome flagged TypeMapEntry as an unused import in both cpp.ts and cuda.ts
(ctx.typeMap.set is typed via ExtractorOutput, not via a direct TypeMapEntry
annotation). Also reformatted CPP_PRIMITIVE_TYPES / CUDA_PRIMITIVE_TYPES to
one element per line, which is what the Biome formatter requires.
…Declaration

Add an inline comment in handleCppDeclaration and handleCudaDeclaration
explaining that pointer_declarator / reference_declarator children are
intentionally skipped. Native Rust match_c_family_type_map has the same
scope: it only iterates 'init_declarator' and 'identifier' children, so
both engines behave identically for pointer/reference-typed locals.
@carlos-alm

Contributor Author

Addressed reviewer feedback:

  1. Unused TypeMapEntry imports (Greptile P2 + Claude) — removed the import from both src/extractors/cpp.ts and src/extractors/cuda.ts (commit 2986c81). Biome and TypeScript type-check both pass cleanly.

  2. Formatting (Biome) — expanded CPP_PRIMITIVE_TYPES and CUDA_PRIMITIVE_TYPES to one-element-per-line format that Biome requires (same commit 2986c81).

  3. Pointer/reference declarator scope (Greptile P2 inline) — added explanatory comments to both handlers clarifying this is intentional scope that matches the native Rust match_c_family_type_map helper, which also only iterates init_declarator and identifier children (commit d3d83ef). Both engines have identical behaviour; no parity gap.

@carlos-alm

Contributor Author

@greptileai

@carlos-alm

Contributor Author

Lint fix (commit ba7f506): expanded inline toEqual object literals to multi-line format to pass Biome format check. No logic change.

@carlos-alm

Contributor Author

@greptileai

Base automatically changed from fix/v3120-regressions-1446-1447 to main June 14, 2026 08:17
@carlos-alm carlos-alm merged commit 1c76abc into main Jun 14, 2026
23 checks passed
@carlos-alm carlos-alm deleted the fix/wasm-receiver-decl-typed-1466 branch June 14, 2026 09:01
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 14, 2026
Development

Successfully merging this pull request may close these issues.

Engine parity: wasm misses receiver edges for declaration-typed locals (cpp, cuda)

1 participant