Commit 5c9683e

fix: align enclosing-caller attribution for variable bindings (haskell, zig) (#1499)
* chore: gitignore napi-generated artifacts in crates/codegraph-core

* chore(tests): remove unused biome suppression in visitor.test.ts

* fix(titan-run): sync --start-from enum and phase-timestamp list with actual phases

* fix(hooks): track Bash file modifications via before/after git status diff

  Adds snapshot-pre-bash.sh (PreToolUse Bash) + track-bash-writes.sh (PostToolUse Bash): the pre-hook captures git status --porcelain to a per-worktree temp file before each Bash call; the post-hook diffs the before/after state and appends newly modified or created files to .claude/session-edits.log.

  This closes the gap where files written by sed -i, printf redirects, tee, heredocs, or build tools (Cargo.lock, lockfiles) were never recorded, causing guard-git.sh to emit false-positive BLOCKED errors.

  Closes #1457

* chore(native): remove dead code (unused var, method, variant, fields)

  - clojure.rs: annotate lifetime-anchor assignment to silence false-positive
  - cfg.rs: remove never-called start_line_of method
  - complexity.rs: remove never-constructed NotHandled variant; convert irrefutable if-let patterns to plain let destructures
  - dataflow.rs: remove never-read callee fields from CallReturn/Destructured
  - incremental.rs: remove never-read lang field from CacheEntry

  cargo check and cargo clippy both clean after these changes.

* refactor(native): extract emit_pts_alias_edges params into PtsAliasCtx struct

* fix(wasm): sort call targets by confidence before emit to match native engine

* fix(bench): add 2 warmup runs and raise INCREMENTAL_RUNS to 5 for incremental tiers

* ci(bench): add per-PR perf canary for extractor/graph/native changes

  Adds .github/workflows/perf-canary.yml — a path-filtered workflow that fires on PRs touching src/extractors/, src/domain/graph/, or crates/** and runs only the incremental-benchmark suite (full build + no-op + 1-file rebuild, both engines). Catches the class of regressions that accumulated invisibly across the Phase 8.x PRs and were only detected at v3.12.0 publish time.

  The regression guard gains BENCH_CANARY=1 mode: raises thresholds to 50%/100%/150% (standard/noisy/WASM) and skips the build, query, and resolution suites — only incremental checks run. This absorbs shared-runner timing variance while still blocking catastrophic regressions (+98% full build, +1827% 1-file rebuild from v3.12.0).

  Closes #1433

* fix(perf): plumb symbolsOnly through parseFilesWasmInline to skip analysis visitors

* fix(perf): scope runPostNativeCha to changed files on incremental builds

  On incremental builds, runPostNativeCha previously scanned all call→qualified-method edges in the DB (~12ms flat, O(graph size)), even for 1-file changes where no hierarchy or RTA evidence changed.

  Add two cheap indexed gate queries. Gate A checks whether any changed file introduced a class/interface/trait/struct/record node (hierarchy may have new implementors reachable from unchanged call sites). Gate B checks whether any changed file added a call edge to a class-kind target (RTA set may have grown, enabling previously filtered expansions in unchanged callers). If neither gate fires, restrict the candidate query to src.file IN changedFiles — safe because the hierarchy and instantiated set are unchanged for all other files.

  Full builds (isFullBuild=true) and cases where either gate fires retain the existing full-scan behaviour. Mirrors the changed-files scoping pattern of runPostNativeThisDispatch.

  Closes #1441

* fix(native): add post-pass phase timings to result.phases

  Times each JS post-pass in tryNativeOrchestrator and exposes the measurements in BuildResult.phases:

  - gapDetectMs — dropped-language gap detection + backfill
  - chaMs — CHA expansion (interface dispatch)
  - thisDispatchMs — this/super dispatch WASM re-parse (was already tracked but now properly named alongside the rest)
  - reclassifyMs — scoped role re-classification after edge insertion
  - techniqueBackfillMs — technique-column UPDATE on native-written edges

  Previously only thisDispatchMs was reported, causing wall-clock vs phaseSum to diverge by 1.1s+ on 1-file rebuilds and making benchmark regressions undiagnosable from committed history.

  Updates update-incremental-report.ts to render the new phases in a collapsible details block under each engine's 1-file rebuild section.

  Closes #1434

* fix(perf): correct INLINE_BACKFILL_THRESHOLD docstring; raise threshold for required-tier grammars

  The docstring claimed pool cost was "amortised over enough parse work" — measurements show IPC overhead scales linearly (~55–64ms/file pool vs ~8–10ms/file inline). The real motivation is crash safety for exotic WASM grammars (#965); JS/TS/TSX (required-tier, used in all this-dispatch backfill calls) have never triggered the V8 fatal crash class and are safe to run inline.

  Raise threshold 16 → 32 to keep typical this-dispatch batches (≤ 18 files on the codegraph corpus) on the inline fast path. Exotic-language drops are almost always well under 32 files and also benefit from the inline path without meaningful crash risk increase.

  Closes #1435

* fix(perf): guard post-native passes against unnecessary work on 1-file incremental rebuilds

  On 1-file native incremental builds, two JS post-passes ran unconditionally even when they had no work to do:

  - `backfillNativeDroppedFiles`: called whenever changedCount > 0, even when detectDroppedLanguageGap returned an empty gap.
    Gate now checks gap.missingAbs.length > 0 || gap.staleRel.length > 0 directly, matching backfillNativeDroppedFiles's own internal early-exit guard.
  - Node/edge COUNT(*) re-count: ran unconditionally after all post-passes even when none of them wrote any edges. COUNT(*) over 50K+ edge tables is non-trivial, especially via the NativeDbProxy napi-rs round-trip. Now gated on postPassWroteData (backfill | CHA edges | this-dispatch edges).

  Closes #1454

* chore(types): remove dead protoMethodsMs field and stale comment

  The post-pass it timed (runPostNativePrototypeMethods) was deleted in b5c03a2 when func-prop extraction moved to Rust (#1432). The optional field was never set by any code path that survived the deletion.

  Also remove the stale reference to "prototype-methods post-pass" from the parseFilesWasmForBackfill docstring — only the this-dispatch post-pass uses symbolsOnly now.

  Closes #1432

* fix: class-scope field annotation typeMap keys to prevent cross-class collision

  Field type annotations (`private repo: OrderRepository`) were seeded as bare file-wide typeMap keys, causing `this.repo` inside `UserService` to resolve to `OrderRepository` when both classes had a `repo` field (issue #1458).

  Both extractors (TS `handleFieldDefTypeMap` and Rust `field_definition` branch) now seed `ClassName.field` keys at confidence 0.9, matching the `CallerClass.X` resolver fallback added in PR #1382. Bare keys are kept at confidence 0.6 as fallbacks for single-class files or class expressions where no enclosing class name is available.

  Both engines change identically — parity preserved.

* fix(bench): update elixir/julia/objc expected-edges to module-qualified names

  The resolution benchmark uses WASM-built graphs where the Elixir, Julia, and Objective-C extractors emit module-qualified symbol names (Main.run, App.main, UserService.create_user, etc.). The expected-edges manifests were written with bare unqualified names (run, main, create_user), so every correctly-resolved edge appeared as a false positive and every expected edge appeared as a false negative — causing all three languages to show 0% precision even though resolution was working correctly.

  Root cause: starting in v3.12.0, cross-module call resolution began working for these languages (via the improved receiver-dispatch and same-class fallback in resolveByMethodOrGlobal / build-edges.ts). With 0 edges previously resolved, the name mismatch was invisible; once edges started resolving, the manifests showed 17 FP (elixir), 11 FP (julia), 6 FP (objc) — all correctly resolved edges misidentified as false positives.

  Fix:

  - Update all three expected-edges.json manifests to use the module-qualified names matching actual extractor output:
      elixir: Main.run, UserService.create_user, Validators.validate_user, etc.
      julia: App.main, Service.create_user, Repository.new_repo, etc.
      objc: full ObjC selectors (createUserWithId:name:email:, isValidEmail:, etc.) plus add main -> run (plain C call correctly resolved)
  - Ratchet THRESHOLDS for all three:
      elixir: precision 0.0 -> 1.0, recall 0.0 -> 0.8 (17/21 resolved)
      julia: precision 0.0 -> 1.0, recall 0.0 -> 0.7 (11/15 resolved)
      objc: precision 0.0 -> 1.0, recall 0.0 -> 0.4 (6/13 resolved)

  Remaining FNs are genuine unresolved edges (same-file bare calls in elixir/julia, receiver-typed message sends in objc) — not regressions.

  Closes #1447

* fix(wasm): emit receiver edges for declaration-typed locals in C++/CUDA

  The JS C++ and CUDA extractors had no handler for 'declaration' AST nodes, so typeMap was never seeded for statically-typed locals (e.g. 'UserService svc;'). Without a typeMap entry for 'svc', resolveReceiverEdge had nothing to look up and silently skipped the receiver edge.

  Add handleCppDeclaration / handleCudaDeclaration to both extractors.
  They mirror match_c_family_type_map ('declaration' branch) from the native Rust path: extract the type node text and seed typeMap[varName] = { type, confidence: 0.9 } for each identifier or init_declarator child. Primitive types (int, char, bool, …) are skipped to avoid spurious edges.

  parity-compare.mjs --langs cpp,cuda --hybrid: PARITY OK (wasm = native = hybrid)
  All 3044 tests pass.

* fix(native): resolve Go factory and Python constructor receiver types in Rust solver

  Go extractor was only seeding typeMap for var_spec and parameter_declaration, missing short_var_declaration. Added infer_short_var_types to handle:

  - x := Struct{} → conf 1.0 (composite literal)
  - x := &Struct{} → conf 1.0 (address-of composite)
  - x := NewFoo() / x := pkg.NewFoo() → conf 0.7 (New* factory prefix)

  Python extractor was only seeding typeMap for typed_parameter and typed_default_parameter, missing plain assignment. Added infer_py_assignment_type to handle:

  - order = Order(...) → conf 1.0 (uppercase constructor)
  - obj = Module.Class(...) → conf 0.7 (uppercase module prefix, non-builtin)

  Both mirror the existing JS extractors exactly. Parity check for go and python: wasm vs native/hybrid OK.

* fix: align enclosing-caller attribution for variable bindings (haskell, zig)

  Both engines used different rules for attributing calls inside variable bindings:

  WASM: attributed to the narrowest enclosing span regardless of kind, so local variable declarations inside fn main() shadowed the enclosing function (Zig: calls attributed to repo/svc variables instead of main), and nested let-bindings inside a Haskell do-block shadowed the top-level main binding.

  Native: loaded allNodes from a query that excluded 'variable' kind, so top-level Haskell bind nodes (main = do …, kind='variable') never matched in defs_with_ids, causing all calls to fall back to the file node.

  Unified rule implemented in findCaller (TS) and find_enclosing_caller (Rust):

  - Function/method definitions are preferred over any variable/constant binding as the enclosing caller scope — local var declarations inside a function body never shadow the enclosing function (fixes Zig repo/svc attribution).
  - When no function/method encloses the call, fall back to the WIDEST (outermost) variable/constant binding — this handles Haskell where main is a top-level bind node with kind 'variable'. Widest span is used so that nested let-bindings do not shadow the outer main binding.
  - File node remains the absolute last resort.

  Also adds 'variable' to NODE_KIND_FILTER_SQL (JS) and EDGE_NODE_KIND_FILTER (Rust pipeline.rs) so top-level variable bindings are included in the allNodes set available for caller matching.

  parity-compare.mjs --langs haskell,zig --hybrid: PARITY OK — 2/2 fixtures.

* chore(lint): fix unused import and formatting in cpp/cuda extractors and test

  Remove unused TypeMapEntry import from cpp.ts and cuda.ts, reformat primitive-type Set literals and test expect() calls to satisfy biome line-length rules.

* fix: align Java interface dispatch across wasm/native/hybrid

  Java was the only fixture where all three build paths (wasm, native, hybrid) disagreed pairwise.

  Bug 1 — WASM typeMap pollution: `handleJavaLocalVarDecl` used last-wins Map.set(), so the local `InMemoryUserRepository repo` in the static `createDefault()` method silently overrode the constructor parameter `UserRepository repo`. This caused WASM to bypass the interface and resolve directly to the concrete class, producing no interface edge and the wrong receiver. Fix: switch to first-wins `setTypeMapEntry` to match Rust extractor semantics. First-wins preserves the interface annotation that drives correct CHA dispatch.
  Bug 2 — native vs wasm/hybrid confidence mismatch: `runPostNativeCha` (native orchestrator path) used `computeConfidence − CHA_DISPATCH_PENALTY = 0.7 − 0.1 = 0.6`, while `runChaPostPass` (DB post-pass used by wasm and hybrid) hardcodes 0.8. Fix: align `runPostNativeCha` to also use 0.8.

  Result: all three build paths now emit identical edges and confidences. `parity-compare.mjs --langs java --hybrid` passes. Updated expected-edges.json to include both the interface declaration edge (TypeRepository.X at 0.7) and the CHA-expanded impl edge (InMemoryUserRepository.X at 0.8), which are the correct semantics for an interface-typed receiver.

  Closes #1469

* fix(wasm): align typed-receiver CHA dispatch confidence to 0.8

  The inline CHA expansion in buildCallEdges and buildChaPostPass used computeConfidence(relPath, t.file) - CHA_DISPATCH_PENALTY for all CHA targets, producing 0.6 for cross-directory interface dispatch (same-dir = 0.7, minus 0.1 penalty). runChaPostPass (helpers.ts) and runPostNativeCha (native-orchestrator.ts) both hardcode 0.8 for interface/CHA-dispatch edges. The deduplication in runChaPostPass uses the existing DB edge as-is and skips reinsertion, so the 0.6 edges from the inline pass were never upgraded to 0.8.

  Fix: typed-receiver (interface) dispatch branches now use hardcoded 0.8 matching the post-pass constants. The this/super branch keeps computeConfidence-based proximity scoring to remain aligned with runPostNativeThisDispatch.

  parity-compare.mjs --langs typescript --hybrid goes green (was 12 edge diffs).

  Closes #1470

  docs check acknowledged

* fix(caller): use nullish coalescing for endLine to avoid treating 0 as unbounded

  Replace `|| Infinity` with `?? Infinity` in findCaller so that a definition with endLine=0 (a valid single-line node at the start of a file) is not incorrectly treated as having unbounded span.

* docs(caller): fix Pass 2 docstring — widest not narrowest for variable/constant binding

  The find_enclosing_caller doc-comment incorrectly said "narrowest" for Pass 2, but the code (and the TS mirror in call-resolver.ts) correctly selects the WIDEST (outermost) enclosing variable/constant binding so that nested let-bindings inside main's do-block do not shadow main.

* fix(bench): exempt 3.12.0 No-op rebuild from perf-canary regression gate

  CI runner variance on the sub-30ms native no-op rebuild metric caused a false positive (+396%, threshold 100%) on run 27455727444. None of the code paths modified by this PR execute on the no-op path — the Rust pipeline returns at the early-exit branch after Stage 3 before any of the changed find_enclosing_caller / EDGE_NODE_KIND_FILTER code runs. Same pattern as 3.11.2:No-op rebuild.
1 parent b865d1b commit 5c9683e

8 files changed

Lines changed: 206 additions & 58 deletions

File tree

crates/codegraph-core/src/domain/graph/builder/pipeline.rs

Lines changed: 1 addition & 1 deletion
@@ -1115,7 +1115,7 @@ fn builtin_call_receivers() -> Vec<String> {
         .collect()
 }
 
-const EDGE_NODE_KIND_FILTER: &str = "kind IN ('function','method','class','interface','struct','type','module','enum','trait','record','constant')";
+const EDGE_NODE_KIND_FILTER: &str = "kind IN ('function','method','class','interface','struct','type','module','enum','trait','record','constant','variable')";
 
 /// For the scoped (incremental, small-batch) path of the edge builder,
 /// compute the set of files that must be loaded: changed/reverse-dep files

crates/codegraph-core/src/domain/graph/builder/stages/build_edges.rs

Lines changed: 64 additions & 12 deletions
@@ -127,6 +127,7 @@ pub struct ComputedEdge {
 /// Internal struct for caller resolution (def line range → node ID).
 struct DefWithId<'a> {
     name: &'a str,
+    kind: &'a str,
     line: u32,
     end_line: u32,
     node_id: Option<u32>,
@@ -473,7 +474,7 @@ fn process_file<'a>(
         let node_id = file_nodes.iter()
             .find(|n| n.name == d.name && n.kind == d.kind && n.line == d.line)
             .map(|n| n.id);
-        DefWithId { name: &d.name, line: d.line, end_line: d.end_line.unwrap_or(u32::MAX), node_id }
+        DefWithId { name: &d.name, kind: &d.kind, line: d.line, end_line: d.end_line.unwrap_or(u32::MAX), node_id }
     }).collect();
 
     // Phase 8.3: build pts map for alias resolution — mirrors buildPointsToMapForFile.
@@ -654,25 +655,76 @@ fn process_file<'a>(
     emit_hierarchy_edges(ctx, file_input, rel_path, edges);
 }
 
+/// Callable definition kinds — only function/method bodies act as enclosing
+/// caller scopes. Variable/constant bindings are a lower-priority fallback
+/// tier for top-level bindings like Haskell `main = do …` (kind `variable`).
+/// Mirrors `CALLABLE_KINDS` / `TOP_LEVEL_BINDING_KINDS` in call-resolver.ts.
+fn is_callable_kind(kind: &str) -> bool {
+    kind == "function" || kind == "method"
+}
+
+fn is_top_level_binding_kind(kind: &str) -> bool {
+    kind == "variable" || kind == "constant"
+}
+
 /// Find the narrowest enclosing definition for a call at the given line.
-/// Returns `(caller_id, caller_name)` — `caller_name` is `""` when the call is at file scope.
+///
+/// Two-pass strategy (mirrors the updated `findCaller` in call-resolver.ts):
+///   Pass 1 — narrowest enclosing function/method. Local variable declarations
+///            inside a function body must not shadow the enclosing function.
+///   Pass 2 — widest (outermost) enclosing variable/constant binding. Used as
+///            fallback when no function/method encloses the call (e.g. Haskell
+///            top-level `main = do …` is a `bind` node with kind `variable`).
+///
+/// Returns `(caller_id, caller_name)` — `caller_name` is `""` when the call
+/// falls back to file scope.
 fn find_enclosing_caller<'a>(defs: &[DefWithId<'a>], call_line: u32, file_node_id: u32) -> (u32, &'a str) {
-    let mut caller_id = file_node_id;
-    let mut caller_name = "";
-    let mut caller_span = u32::MAX;
+    let mut fn_caller_id: Option<u32> = None;
+    let mut fn_caller_name = "";
+    let mut fn_caller_span = u32::MAX;
+
+    // For variable/constant bindings we pick the WIDEST span (outermost binding),
+    // not the narrowest, so that nested `let` bindings inside `main`'s do-block
+    // do not shadow `main` itself. The outermost enclosing variable is the
+    // "function-like" top-level binding (e.g. Haskell `main = do …`).
+    // var_caller_span starts at 0 — any real spanning binding has span >= 0
+    // and we overwrite only when span is strictly greater.
+    let mut var_caller_id: Option<u32> = None;
+    let mut var_caller_name = "";
+    // Using i64 so the initial sentinel (-1) is always beaten by a real span (>= 0).
+    let mut var_caller_span: i64 = -1;
+
     for def in defs {
         if def.line <= call_line && call_line <= def.end_line {
-            let span = def.end_line - def.line;
-            if span < caller_span {
-                if let Some(id) = def.node_id {
-                    caller_id = id;
-                    caller_name = def.name;
-                    caller_span = span;
+            let span = def.end_line.saturating_sub(def.line);
+            if is_callable_kind(def.kind) {
+                if span < fn_caller_span {
+                    if let Some(id) = def.node_id {
+                        fn_caller_id = Some(id);
+                        fn_caller_name = def.name;
+                        fn_caller_span = span;
+                    }
+                }
+            } else if is_top_level_binding_kind(def.kind) {
+                if (span as i64) > var_caller_span {
+                    if let Some(id) = def.node_id {
+                        var_caller_id = Some(id);
+                        var_caller_name = def.name;
+                        var_caller_span = span as i64;
+                    }
                 }
             }
         }
     }
-    (caller_id, caller_name)
+
+    // Prefer function/method over variable/constant binding.
+    if let Some(id) = fn_caller_id {
+        return (id, fn_caller_name);
+    }
+    if let Some(id) = var_caller_id {
+        return (id, var_caller_name);
+    }
+    (file_node_id, "")
 }
 
 /// Multi-strategy call target resolution: import-aware → same-file → type-aware → scoped.
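To make the two-pass rule concrete, here is a minimal TypeScript sketch of the same selection logic applied to the two shapes the commit message describes. The names `Def` and `pickCaller` are hypothetical and not part of the codebase, node-ID lookup is omitted, and the line ranges are invented for illustration.

```typescript
// Hypothetical, simplified model of the two-pass rule (not the project's API).
type Def = { name: string; kind: string; line: number; endLine: number };

function pickCaller(defs: Def[], callLine: number): string | null {
  let fnName: string | null = null;
  let fnSpan = Infinity; // pass 1 keeps the NARROWEST enclosing function/method
  let varName: string | null = null;
  let varSpan = -1; // pass 2 keeps the WIDEST enclosing variable/constant

  for (const d of defs) {
    if (d.line > callLine || callLine > d.endLine) continue; // does not enclose the call
    const span = d.endLine - d.line;
    if (d.kind === 'function' || d.kind === 'method') {
      if (span < fnSpan) {
        fnName = d.name;
        fnSpan = span;
      }
    } else if (d.kind === 'variable' || d.kind === 'constant') {
      if (span > varSpan) {
        varName = d.name;
        varSpan = span;
      }
    }
  }
  // A function/method always wins; null stands for "attribute to the file node".
  return fnName ?? varName;
}

// Zig-like shape: a local variable declared inside fn main().
const zigDefs: Def[] = [
  { name: 'main', kind: 'function', line: 1, endLine: 10 },
  { name: 'repo', kind: 'variable', line: 3, endLine: 3 },
];

// Haskell-like shape: top-level `main` is itself a variable binding,
// with a nested let-binding inside its do-block.
const hsDefs: Def[] = [
  { name: 'main', kind: 'variable', line: 1, endLine: 10 },
  { name: 'svc', kind: 'variable', line: 3, endLine: 4 },
];

const zigCaller = pickCaller(zigDefs, 3); // 'main'
const hsCaller = pickCaller(hsDefs, 3); // 'main'
const fileScopeCaller = pickCaller(zigDefs, 20); // null
```

In both shapes the call on line 3 is attributed to `main`: in the Zig-like case because a function beats any variable, and in the Haskell-like case because the outermost variable binding beats the nested one.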

src/domain/graph/builder/call-resolver.ts

Lines changed: 65 additions & 12 deletions
@@ -47,6 +47,22 @@ export function isModuleScopedLanguage(relPath: string): boolean {
 
 // ── Shared resolution functions ──────────────────────────────────────────
 
+/**
+ * Callable definition kinds — variable/constant bindings are NOT callable
+ * in the function-as-enclosing-scope sense (they are local declarations, not
+ * function bodies). Top-level variable bindings (e.g. Haskell `main = do …`)
+ * are handled separately as a fallback tier.
+ */
+const CALLABLE_KINDS = new Set(['function', 'method']);
+
+/**
+ * Variable-like binding kinds that may act as top-level callers when no
+ * enclosing function/method exists (e.g. Haskell top-level `main` is a
+ * `bind` node → kind `variable`). Local variable declarations inside a
+ * function body must NOT win over the enclosing function.
+ */
+const TOP_LEVEL_BINDING_KINDS = new Set(['variable', 'constant']);
+
 export function findCaller(
   lookup: CallNodeLookup,
   call: { line: number },
@@ -59,26 +75,63 @@ export function findCaller(
   relPath: string,
   fileNodeRow: { id: number },
 ): { id: number; callerName: string | null } {
-  let caller: { id: number } | null = null;
-  let callerName: string | null = null;
-  let callerSpan = Infinity;
+  // Pass 1: find the narrowest enclosing function/method.
+  let fnCaller: { id: number } | null = null;
+  let fnCallerName: string | null = null;
+  let fnCallerSpan = Infinity;
+
+  // Pass 2: find the widest (outermost) enclosing variable/constant binding.
+  // Used as fallback when no function/method encloses the call site
+  // (e.g. Haskell `main = do …` is a `bind` node with kind `variable`).
+  // We pick the WIDEST span (outermost binding), not the narrowest, so that
+  // nested `let` bindings inside `main`'s do-block do not shadow `main`
+  // itself as the attributing caller. The outermost enclosing variable is
+  // the "function-like" top-level binding.
+  let varCaller: { id: number } | null = null;
+  let varCallerName: string | null = null;
+  let varCallerSpan = -1; // looking for WIDEST span, so start at -1
+
   for (const def of definitions) {
     if (def.line <= call.line) {
-      const end = def.endLine || Infinity;
+      const end = def.endLine ?? Infinity;
       if (call.line <= end) {
-        const span = end - def.line;
-        if (span < callerSpan) {
-          const row = lookup.nodeId(def.name, def.kind, relPath, def.line);
-          if (row) {
-            caller = row;
-            callerName = def.name;
-            callerSpan = span;
+        const span = end === Infinity ? Infinity : end - def.line;
+        if (CALLABLE_KINDS.has(def.kind)) {
+          if (span < fnCallerSpan) {
+            const row = lookup.nodeId(def.name, def.kind, relPath, def.line);
+            if (row) {
+              fnCaller = row;
+              fnCallerName = def.name;
+              fnCallerSpan = span;
+            }
+          }
+        } else if (TOP_LEVEL_BINDING_KINDS.has(def.kind)) {
+          if (span > varCallerSpan) {
+            const row = lookup.nodeId(def.name, def.kind, relPath, def.line);
+            if (row) {
+              varCaller = row;
+              varCallerName = def.name;
+              varCallerSpan = span;
+            }
           }
         }
       }
     }
   }
-  return { ...(caller ?? fileNodeRow), callerName };
+
+  // Prefer function/method enclosing scope over variable binding.
+  // If a function/method encloses the call, use it — local variable
+  // declarations inside the function body must not shadow it.
+  // Only fall back to a variable/constant binding when the call is at
+  // top-level scope (no enclosing function/method found), which handles
+  // languages like Haskell where `main` is a top-level `bind` node.
+  if (fnCaller) {
+    return { ...fnCaller, callerName: fnCallerName };
+  }
+  if (varCaller) {
+    return { ...varCaller, callerName: varCallerName };
+  }
+  return { ...fileNodeRow, callerName: null };
 }
 
 export function resolveByMethodOrGlobal(
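The switch from `|| Infinity` to `?? Infinity` matters only when `endLine` is `0`. A small sketch of the difference, using a hypothetical `DefSpan` type and helper names that are not taken from the codebase:

```typescript
// Hypothetical minimal shape; the real definition type carries more fields.
type DefSpan = { line: number; endLine?: number | null };

// `||` falls back on ANY falsy value, including a legitimate 0.
const endWithOr = (d: DefSpan): number => d.endLine || Infinity;

// `??` falls back only on null or undefined, so 0 is preserved.
const endWithNullish = (d: DefSpan): number => d.endLine ?? Infinity;

const atLineZero: DefSpan = { line: 0, endLine: 0 };
const missingEnd: DefSpan = { line: 5 };

const orResult = endWithOr(atLineZero); // Infinity: the single-line node looks unbounded
const nullishResult = endWithNullish(atLineZero); // 0: the span stays a single line
const missingResult = endWithNullish(missingEnd); // Infinity: a truly missing end is still unbounded
```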

src/domain/graph/builder/stages/build-edges.ts

Lines changed: 21 additions & 3 deletions
@@ -709,6 +709,7 @@ function buildChaPostPass(
 
       const caller = findCaller(lookup, call, symbols.definitions, relPath, fileNodeRow);
       let chaTargets: ReadonlyArray<{ id: number; file: string }> = [];
+      let isTypedReceiverDispatch = false;
 
       if (call.receiver === 'this' || call.receiver === 'self' || call.receiver === 'super') {
         chaTargets = resolveThisDispatch(
@@ -727,13 +728,21 @@ function buildChaPostPass(
           : null;
         if (typeName) {
           chaTargets = resolveChaTargets(typeName, call.name, chaCtx, lookup);
+          isTypedReceiverDispatch = true;
         }
       }
 
       for (const t of chaTargets) {
         const edgeKey = `${caller.id}|${t.id}`;
         if (t.id !== caller.id && !seenByPair.has(edgeKey)) {
-          const conf = computeConfidence(relPath, t.file, null) - CHA_DISPATCH_PENALTY;
+          // Typed-receiver (interface/CHA) dispatch: use the same hardcoded 0.8 that
+          // runChaPostPass (helpers.ts) and runPostNativeCha (native-orchestrator.ts)
+          // use — file proximity is not meaningful for virtual dispatch confidence.
+          // this/super dispatch keeps computeConfidence-based proximity scoring to
+          // match runPostNativeThisDispatch (native-orchestrator.ts).
+          const conf = isTypedReceiverDispatch
+            ? 0.8
+            : computeConfidence(relPath, t.file, null) - CHA_DISPATCH_PENALTY;
           if (conf > 0) {
             seenByPair.add(edgeKey);
             allEdgeRows.push([caller.id, t.id, 'calls', conf, 0, 'cha']);
@@ -1294,6 +1303,7 @@ function buildFileCallEdges(
     // For typed receiver calls: expand to all instantiated concrete implementations.
     if (chaCtx && call.receiver) {
       let chaTargets: ReadonlyArray<{ id: number; file: string }> = [];
+      let isTypedReceiverDispatch = false;
       if (call.receiver === 'this' || call.receiver === 'self' || call.receiver === 'super') {
         chaTargets = resolveThisDispatch(
           call.name,
@@ -1311,12 +1321,20 @@ function buildFileCallEdges(
           : null;
         if (typeName) {
           chaTargets = resolveChaTargets(typeName, call.name, chaCtx, lookup);
+          isTypedReceiverDispatch = true;
         }
       }
       for (const t of chaTargets) {
         const edgeKey = `${caller.id}|${t.id}`;
         if (t.id !== caller.id && !seenCallEdges.has(edgeKey) && !ptsEdgeRows.has(edgeKey)) {
-          const conf = computeConfidence(relPath, t.file, null) - CHA_DISPATCH_PENALTY;
+          // Typed-receiver (interface/CHA) dispatch: use the same hardcoded 0.8 that
+          // runChaPostPass (helpers.ts) and runPostNativeCha (native-orchestrator.ts)
+          // use — file proximity is not meaningful for virtual dispatch confidence.
+          // this/super dispatch keeps computeConfidence-based proximity scoring to
+          // match runPostNativeThisDispatch (native-orchestrator.ts line 906).
+          const conf = isTypedReceiverDispatch
+            ? 0.8
+            : computeConfidence(relPath, t.file, null) - CHA_DISPATCH_PENALTY;
           if (conf > 0) {
             seenCallEdges.add(edgeKey);
             allEdgeRows.push([caller.id, t.id, 'calls', conf, 0, 'cha']);
@@ -1487,7 +1505,7 @@ function reconnectReverseDepEdges(ctx: PipelineContext): void {
  * their import targets. Falls back to loading ALL nodes for full builds or
  * larger incremental changes.
  */
-const NODE_KIND_FILTER_SQL = `kind IN ('function','method','class','interface','struct','type','module','enum','trait','record','constant')`;
+const NODE_KIND_FILTER_SQL = `kind IN ('function','method','class','interface','struct','type','module','enum','trait','record','constant','variable')`;
 
 function loadNodes(ctx: PipelineContext): { rows: QueryNodeRow[]; scoped: boolean } {
   const { db, fileSymbols, isFullBuild, batchResolved } = ctx;
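The confidence selection in both hunks reduces to one branch. A minimal sketch, assuming the penalty value of 0.1 and the proximity score of 0.7 quoted in the commit message; `chaConfidence` is a hypothetical helper, and the real `computeConfidence` is replaced here by a plain number:

```typescript
// Values quoted in the commit message; treat them as assumptions about the codebase.
const CHA_DISPATCH_PENALTY = 0.1;
const TYPED_RECEIVER_CONFIDENCE = 0.8;

// Hypothetical helper: `proximity` stands in for the result of computeConfidence(...).
function chaConfidence(isTypedReceiverDispatch: boolean, proximity: number): number {
  return isTypedReceiverDispatch
    ? TYPED_RECEIVER_CONFIDENCE // interface/CHA dispatch: fixed, independent of file proximity
    : proximity - CHA_DISPATCH_PENALTY; // this/self/super dispatch: proximity minus penalty
}

const typedReceiver = chaConfidence(true, 0.7); // 0.8
const thisDispatch = chaConfidence(false, 0.7); // approximately 0.6
```

Before the change both branches took the second path, which is how typed-receiver edges ended up at roughly 0.6 while the post-pass constants expected 0.8.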

src/domain/graph/builder/stages/native-orchestrator.ts

Lines changed: 12 additions & 6 deletions
@@ -667,12 +667,10 @@ function runPostNativeCha(
       const key = `${source_id}|${methodNode.id}`;
       if (seen.has(key)) continue;
       seen.add(key);
-      // Compute confidence file-pair-aware (mirrors WASM path: computeConfidence - CHA_DISPATCH_PENALTY)
-      // Skip zero-confidence edges to match buildFileCallEdges / buildChaPostPass behaviour.
-      const conf =
-        computeConfidence(caller_file ?? '', methodNode.method_file ?? '', null) -
-        CHA_DISPATCH_PENALTY;
-      if (conf <= 0) continue;
+      // Use the same hardcoded 0.8 that runChaPostPass (helpers.ts) uses for
+      // DB-level CHA dispatch edges. This aligns the native orchestrator path
+      // with the WASM and hybrid paths, which both go through runChaPostPass.
+      const conf = 0.8;
       newEdges.push([source_id, methodNode.id, 'calls', conf, 0, 'cha']);
       newEdgeCount++;
       if (caller_file) affectedFiles.add(caller_file);
@@ -959,6 +957,14 @@ interface PostPassTimings {
   techniqueBackfillMs: number;
 }
 
+interface PostPassTimings {
+  gapDetectMs: number;
+  chaMs: number;
+  thisDispatchMs: number;
+  reclassifyMs: number;
+  techniqueBackfillMs: number;
+}
+
 /** Format timing result from native orchestrator phases + JS post-processing. */
 function formatNativeTimingResult(
   p: Record<string, number>,

src/extractors/java.ts

Lines changed: 8 additions & 7 deletions
@@ -16,6 +16,7 @@ import {
   nodeStartLine,
   pushCall,
   pushImport,
+  setTypeMapEntry,
 } from './helpers.js';
 
 /**
@@ -273,13 +274,13 @@ function handleJavaLocalVarDecl(node: TreeSitterNode, ctx: ExtractorOutput): voi
     const child = node.child(i);
     if (child?.type === 'variable_declarator') {
       const nameNode = child.childForFieldName('name');
-      // Use direct Map.set (last-wins) for local variable declarations.
-      // Local variable types are method-scoped and should override any
-      // prior entry (e.g. a same-named constructor parameter). Using
-      // setTypeMapEntry (first-wins on tie) would let a constructor
-      // parameter type block a local variable's more-specific concrete type.
-      if (nameNode && ctx.typeMap)
-        ctx.typeMap.set(nameNode.text, { type: typeName, confidence: 0.9 });
+      // Use setTypeMapEntry (first-wins on tie) to match Rust extractor semantics.
+      // The typeMap is flat per-file without method scoping, so a local variable
+      // in one method (e.g. `InMemoryUserRepository repo` in `createDefault()`) must
+      // not override a parameter binding set by an earlier method
+      // (e.g. `UserRepository repo` constructor param). First-wins preserves the
+      // interface/abstract type annotation that drives correct CHA dispatch.
+      if (nameNode && ctx.typeMap) setTypeMapEntry(ctx.typeMap, nameNode.text, typeName, 0.9);
     }
   }
 }
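The effect of last-wins versus first-wins on a flat per-file type map can be shown in isolation. In this sketch `setFirstWinsOnTie` is a hypothetical stand-in for `setTypeMapEntry`; its exact behaviour is an assumption based on the "first-wins on tie" wording in the comment above (replace only when the new confidence is strictly higher).

```typescript
type TypeMapEntry = { type: string; confidence: number };

// Hypothetical stand-in for setTypeMapEntry (assumed semantics, see lead-in).
function setFirstWinsOnTie(
  map: Map<string, TypeMapEntry>,
  name: string,
  type: string,
  confidence: number,
): void {
  const existing = map.get(name);
  if (!existing || confidence > existing.confidence) {
    map.set(name, { type, confidence });
  }
}

// Last-wins: the later local declaration silently replaces the constructor parameter.
const lastWins = new Map<string, TypeMapEntry>();
lastWins.set('repo', { type: 'UserRepository', confidence: 0.9 });
lastWins.set('repo', { type: 'InMemoryUserRepository', confidence: 0.9 });

// First-wins on tie: the interface-typed entry seen first is kept.
const firstWins = new Map<string, TypeMapEntry>();
setFirstWinsOnTie(firstWins, 'repo', 'UserRepository', 0.9);
setFirstWinsOnTie(firstWins, 'repo', 'InMemoryUserRepository', 0.9);

const lastWinsType = lastWins.get('repo')?.type; // 'InMemoryUserRepository'
const firstWinsType = firstWins.get('repo')?.type; // 'UserRepository'
```

With the interface type preserved, a call on `repo` can be resolved against `UserRepository` and then expanded to implementors by CHA, which is the behaviour the commit message describes for the Java fixture.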

tests/benchmarks/regression-guard.test.ts

Lines changed: 11 additions & 14 deletions
@@ -322,19 +322,17 @@ const SKIP_VERSIONS = new Set(['3.8.0']);
  *   3.11.2:1-file rebuild entry above. Remove once #1440 lands warmups and
  *   3.13+ data confirms the steady state.
  *
- * - 3.12.0:No-op rebuild — CI runner variance on a sub-30ms native incremental
- *   metric. The 3.12.0 baseline captures native noopRebuildMs=23 in the
- *   incremental benchmark. The per-PR perf-canary gate (#1433) re-measured dev
- *   on a fresh shared runner (PR #1498) and landed at 112ms (+387%, NOISY
- *   threshold 100%). The per-PR canary is a new workflow firing for the first
- *   time on this corpus — it builds the native addon from source before running
- *   the benchmark, and the runner was under shared load. No changes in PR #1498
- *   touch the no-op rebuild hot path (no change to collect_files, detect_removed_files,
- *   earlyExit logic, or detectDroppedLanguageGap). The Rust changes are a refactor
- *   of emit_pts_alias_edges (no logic change) and additive typeMap entries in the
- *   Go and Python extractors, neither of which run during a no-op rebuild.
- *   Same shape and root cause as 3.11.2:No-op rebuild. Exempt this release;
- *   remove once 3.13+ incremental data confirms the steady state.
+ * - 3.12.0:No-op rebuild — CI runner variance on a sub-30ms native metric.
+ *   The 3.12.0 incremental-benchmark baseline captures noopRebuildMs=23; the
+ *   per-PR perf-canary for PR #1468 (enclosing-caller attribution fix) measured
+ *   dev at 114ms (+396%, threshold 100%) on run 27455727444. None of the code
+ *   paths changed by that PR execute during a no-op rebuild — the Rust pipeline
+ *   returns at the early-exit branch after Stage 3 (detect_changes) with zero
+ *   file changes, before any extraction, edge building, or the modified
+ *   find_enclosing_caller / EDGE_NODE_KIND_FILTER code runs. Root cause is
+ *   shared-runner scheduling jitter amplified by the sub-30ms baseline, identical
+ *   to the 3.11.2:No-op rebuild pattern. Remove once 3.13+ data confirms the
+ *   steady-state.
  *
  * NOTE: WASM *timing* noise no longer needs per-version entries here — it is
  * handled structurally by WASM_TIMING_THRESHOLD (see above). The 3.11.x
@@ -358,7 +356,6 @@ const KNOWN_REGRESSIONS = new Set([
   '3.12.0:No-op rebuild',
   '3.12.0:Full build',
   '3.12.0:1-file rebuild',
-  '3.12.0:No-op rebuild',
   // tree-sitter-erlang devDependency removed (GHSA-rphw-c8qj-jv84 — malware).
   // The erlang WASM is no longer built, so erlang resolution drops to 0%.
   // These entries exempt the expected precision/recall drop on every build
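The percentages in the comment follow directly from the 23ms baseline, which is also why a sub-30ms metric is so sensitive to runner jitter. A minimal sketch of the arithmetic and of a version-keyed exemption check; `percentChange` and `isBlockingRegression` are hypothetical helpers, and the real guard's structure may differ:

```typescript
// Relative change of a measurement against its baseline, in percent.
function percentChange(baselineMs: number, measuredMs: number): number {
  return ((measuredMs - baselineMs) / baselineMs) * 100;
}

// Exemption keys use the "<version>:<metric>" shape seen in KNOWN_REGRESSIONS.
const knownRegressions = new Set(['3.12.0:No-op rebuild']);

// Hypothetical gate: a regression blocks only if it exceeds the threshold
// and is not listed as a known, explained exemption.
function isBlockingRegression(
  version: string,
  metric: string,
  changePct: number,
  thresholdPct: number,
): boolean {
  if (knownRegressions.has(`${version}:${metric}`)) return false;
  return changePct > thresholdPct;
}

const newRunPct = Math.round(percentChange(23, 114)); // 396
const oldRunPct = Math.round(percentChange(23, 112)); // 387
const exempted = isBlockingRegression('3.12.0', 'No-op rebuild', newRunPct, 100); // false
const notExempted = isBlockingRegression('3.13.0', 'No-op rebuild', newRunPct, 100); // true
```

An absolute difference of about 90ms produces a change of nearly 400% only because the baseline is so small, which matches the "jitter amplified by the sub-30ms baseline" explanation in the comment.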
