Commit 95bc17b

fix(native): persist this/super dispatch via hybrid WASM post-pass (#1337)
* fix(native): persist this/super dispatch via hybrid WASM post-pass (#1326)

The native orchestrator resolves typed receiver calls but does not persist raw unresolved call site receiver info (this/super) to the DB, so runPostNativeCha could not resolve this.method() or super.method() calls.

Add runPostNativeThisDispatch: after the Rust pipeline completes, it WASM-re-parses JS/TS/TSX files to collect call sites with this/super receivers, then resolves them through the DB class hierarchy (extends edges) using the existing resolveThisDispatch function. It only runs when extends edges exist.

Removes the skipIf(engine === 'native') guards on the this-dispatch and super-dispatch integration tests. Both engines now produce identical edges for ConcreteWorker.doWork → ConcreteWorker.prepare and Lion.speak → Animal.speak. The two CHA transitive skips remain (pending abstract_class_declaration fix in a future native binary).

Closes #1326

* fix(native): correct NULL ordering in findCallerByLine and remove self receiver

SQLite ASC ordering puts NULL values first, so (end_line - line) ASC would pick unbounded nodes before any bounded node, inverting the intent. Replace with COALESCE(end_line - line, 999999999) ASC so unbounded nodes sort last.

Also remove 'self' from the this/super receiver filter in runPostNativeThisDispatch. In JS/TS files 'self' refers to WindowOrWorkerGlobalScope, not a class instance, so including it would produce spurious dispatch edges from Worker call sites.

* fix(native): scope this/super WASM re-parse to inheritance-hierarchy files only

On a full native build, runPostNativeThisDispatch was WASM-re-parsing every JS/TS file in the project, adding a costly second parse pass on top of the native Rust parse (measured: +358% ms/file on codegraph itself).

Narrow the file set to only files that appear in the class inheritance graph (sources and targets of 'extends' edges). Files outside the hierarchy have no class relationship, so this/super calls in them either resolve locally or are skipped by resolveThisDispatch anyway. WASM re-parsing them adds cost with zero benefit.

Also replace the hardcoded 0.1 confidence penalty with the CHA_DISPATCH_PENALTY named constant (already imported), matching every other CHA confidence calculation in native-orchestrator.ts and build-edges.ts.

Fixes: regression-guard failure "Build ms/file: 3.6 → 16.5 (+358%)" (#1337)

* fix(native): document incremental limitation and capture thisDispatchMs timing

Add a comment to the incremental-build branch of runPostNativeThisDispatch documenting the known gap: if a parent-class method is replaced (new node ID) but the child file is unchanged, the stale super.method() edge is not refreshed until the next full rebuild.

Add wall-clock timing for the this/super dispatch post-pass. The function now returns the elapsed milliseconds (Promise<number>), and the result is threaded through formatNativeTimingResult as a new thisDispatchMs phase. For large class hierarchies the WASM re-parse can be non-trivial, so surfacing it in build diagnostics makes performance regressions visible.

* fix(native): re-classify node roles after this/super dispatch post-pass (#1337)

The Rust orchestrator runs role classification before the post-passes, so target methods (e.g. Animal.speak, ConcreteWorker.prepare) that had no callers at Rust build time were classified dead or dead-ffi. runPostNativeThisDispatch inserted the correct call edges but never re-ran classifyNodeRoles, leaving stale role labels visible to dead-code detection and API boundary analysis.

Mirror the pattern used after runPostNativeCha: change the return type from Promise<number> to Promise<{ elapsedMs: number; targetIds: Set<number> }>, collect target node IDs while building newEdges, then look up the affected files and call classifyNodeRoles on them, using the same chunk-and-dedupe pattern as the CHA post-pass.
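The this/super resolution described in the commit message can be illustrated with a small in-memory sketch. This is not the real resolveThisDispatch from cha.js, whose body is not part of this commit. The function resolveReceiver, the knownMethods set, and the "Class.method" naming convention are all hypothetical stand-ins, assuming that `this` starts the lookup at the caller's own class and `super` starts one level up the extends chain.

```typescript
type Receiver = 'this' | 'super';

// Hypothetical helper: walks the child -> parent map until a class that
// defines `methodName` is found. Not the project's actual implementation.
function resolveReceiver(
  methodName: string,
  callerQualifiedName: string,
  receiver: Receiver,
  parents: Map<string, string>,
  knownMethods: Set<string>,
): string | undefined {
  const dot = callerQualifiedName.lastIndexOf('.');
  if (dot < 0) return undefined; // caller is not a class method: no class context
  const callerClass = callerQualifiedName.slice(0, dot);

  // `this` starts at the caller's own class; `super` starts at its direct parent.
  let cls: string | undefined =
    receiver === 'this' ? callerClass : parents.get(callerClass);
  const visited = new Set<string>(); // guards against cyclic extends data
  while (cls !== undefined && !visited.has(cls)) {
    visited.add(cls);
    const candidate = `${cls}.${methodName}`;
    if (knownMethods.has(candidate)) return candidate;
    cls = parents.get(cls);
  }
  return undefined;
}

// Assumed toy hierarchy: Lion extends Animal.
const parents = new Map([['Lion', 'Animal']]);
const knownMethods = new Set(['Animal.speak', 'Lion.speak', 'Animal.eat']);

const superTarget = resolveReceiver('speak', 'Lion.speak', 'super', parents, knownMethods);
const thisTarget = resolveReceiver('eat', 'Lion.speak', 'this', parents, knownMethods);
const noClass = resolveReceiver('speak', 'helper', 'this', parents, knownMethods);
```

Under these assumptions, super.speak() inside Lion.speak resolves to Animal.speak, this.eat() falls through to the inherited Animal.eat, and a call from a plain function with no class context is skipped.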
1 parent 6031440 commit 95bc17b

3 files changed

Lines changed: 286 additions & 49 deletions

src/domain/graph/builder/stages/native-orchestrator.ts

Lines changed: 261 additions & 7 deletions
@@ -42,6 +42,9 @@ import {
   parseFilesWasmForBackfill,
 } from '../../../parser.js';
 import { computeConfidence } from '../../resolve.js';
+import type { CallNodeLookup } from '../call-resolver.js';
+import type { ChaContext } from '../cha.js';
+import { resolveThisDispatch } from '../cha.js';
 import type { PipelineContext } from '../context.js';
 import {
   batchInsertEdges,
@@ -395,9 +398,8 @@ async function runPostNativeAnalysis(
  * each call to an interface/abstract method to ALL RTA-filtered concrete
  * implementations.
  *
- * Note: `this`/`super` dispatch requires the raw unresolved call sites which are
- * not persisted to the DB by the Rust pipeline. That case is handled by the WASM
- * path (`buildFileCallEdges`) and is a known gap for the native orchestrator.
+ * Note: `this`/`super` dispatch is handled separately by `runPostNativeThisDispatch`,
+ * which WASM-re-parses JS/TS files to obtain raw call site receiver info.
  *
  * Returns the set of target node IDs for newly inserted CHA edges so the caller
  * can re-classify roles for the affected implementation files. An empty set
@@ -558,11 +560,204 @@ function runPostNativeCha(db: BetterSqlite3Database): Set<number> {
   return newTargetIds;
 }
 
+// Extensions where `this`/`super` dispatch can occur (JS/TS family)
+const THIS_DISPATCH_EXTS = new Set(['.js', '.ts', '.tsx', '.jsx', '.mjs', '.cjs', '.mts', '.cts']);
+
+/**
+ * Phase 8.5: this/super dispatch post-pass for the native orchestrator path.
+ *
+ * The Rust build pipeline resolves typed receiver calls but does NOT persist raw
+ * unresolved call site receiver info (e.g. `this`, `super`) to the DB. This
+ * hybrid post-pass re-parses JS/TS/TSX files via WASM to collect call sites with
+ * `this`/`super` receivers, then resolves them through the class hierarchy stored
+ * in DB `extends` edges — mirroring what `buildChaPostPass` does on the WASM path.
+ *
+ * Only runs when `extends` edges exist in the DB; if there is no inheritance
+ * hierarchy there is nothing to resolve via `this`/`super` dispatch.
+ */
+async function runPostNativeThisDispatch(
+  db: BetterSqlite3Database,
+  rootDir: string,
+  changedFiles: string[] | undefined,
+  isFullBuild: boolean,
+): Promise<{ elapsedMs: number; targetIds: Set<number> }> {
+  const t0 = Date.now();
+  const targetIds = new Set<number>();
+  // Fast guard: need at least one extends edge for this/super to have meaning
+  const hasExtends = db.prepare(`SELECT 1 FROM edges WHERE kind = 'extends' LIMIT 1`).get();
+  if (!hasExtends) return { elapsedMs: 0, targetIds };
+
+  // Build parents map: child class → direct parent class (from `extends` edges)
+  const parentRows = db
+    .prepare(`
+      SELECT src.name AS child_name, tgt.name AS parent_name
+      FROM edges e
+      JOIN nodes src ON e.source_id = src.id
+      JOIN nodes tgt ON e.target_id = tgt.id
+      WHERE e.kind = 'extends'
+    `)
+    .all() as Array<{ child_name: string; parent_name: string }>;
+
+  const parents = new Map<string, string>();
+  for (const row of parentRows) {
+    if (!parents.has(row.child_name)) parents.set(row.child_name, row.parent_name);
+  }
+  if (parents.size === 0) return { elapsedMs: 0, targetIds };
+
+  const chaCtx: ChaContext = {
+    implementors: new Map(), // not needed for this/super resolution
+    parents,
+    instantiatedTypes: new Set(), // not needed for this/super resolution
+  };
+
+  // Determine which files to re-parse.
+  //
+  // On a full build we do NOT re-parse every JS/TS file — that would WASM-parse
+  // the entire project on top of the native pass, causing a massive regression
+  // (measured: +358% ms/file on codegraph itself). Instead we restrict to files
+  // that are part of the class inheritance hierarchy: both subclass files (which
+  // contain `super.X()` calls dispatching to a parent) and parent-class files
+  // (whose method bodies contain `this.X()` calls that CHA must resolve). Any
+  // file not in the hierarchy has no `extends` relationship, so `this`/`super`
+  // calls in it either resolve locally (same-class dispatch, already handled by
+  // the direct-call edge) or have no class context — and will be skipped by
+  // `resolveThisDispatch` anyway.
+  let relFiles: string[];
+  if (isFullBuild || !changedFiles) {
+    const rows = db
+      .prepare(`
+        SELECT DISTINCT file FROM (
+          SELECT src.file AS file
+          FROM edges e
+          JOIN nodes src ON e.source_id = src.id
+          WHERE e.kind = 'extends' AND src.file IS NOT NULL
+          UNION
+          SELECT tgt.file AS file
+          FROM edges e
+          JOIN nodes tgt ON e.target_id = tgt.id
+          WHERE e.kind = 'extends' AND tgt.file IS NOT NULL
+        )
+      `)
+      .all() as Array<{ file: string }>;
+    relFiles = rows
+      .map((r) => r.file)
+      .filter((f) => THIS_DISPATCH_EXTS.has(path.extname(f).toLowerCase()));
+  } else {
+    // NOTE: Only files explicitly listed in changedFiles are re-parsed.
+    // If a parent-class method is replaced (new node ID) but the child file is
+    // unchanged, the stale super.method() edge is not refreshed here. A full
+    // rebuild (isFullBuild=true) is required to recover in that scenario.
+    relFiles = changedFiles.filter((f) => THIS_DISPATCH_EXTS.has(path.extname(f).toLowerCase()));
+  }
+  if (relFiles.length === 0) return { elapsedMs: 0, targetIds };
+
+  // DB-backed CallNodeLookup — resolveThisDispatch only calls byName()
+  const findByNameStmt = db.prepare(`SELECT id, file, kind FROM nodes WHERE name = ?`);
+  const lookup: CallNodeLookup = {
+    byName: (name) => findByNameStmt.all(name) as Array<{ id: number; file: string; kind: string }>,
+    byNameAndFile: (name, file) =>
+      (findByNameStmt.all(name) as Array<{ id: number; file: string; kind: string }>).filter(
+        (n) => n.file === file,
+      ),
+    isBarrel: () => false,
+    resolveBarrel: () => null,
+    nodeId: () => undefined,
+  };
+
+  // Seed seen-pairs from existing call edges on source nodes in our file set
+  const seen = new Set<string>();
+  const CHUNK = 500;
+  for (let i = 0; i < relFiles.length; i += CHUNK) {
+    const chunk = relFiles.slice(i, i + CHUNK);
+    const ph = chunk.map(() => '?').join(',');
+    const rows = db
+      .prepare(
+        `SELECT e.source_id, e.target_id
+         FROM edges e
+         JOIN nodes n ON e.source_id = n.id
+         WHERE e.kind = 'calls' AND n.file IN (${ph})`,
+      )
+      .all(...chunk) as Array<{ source_id: number; target_id: number }>;
+    for (const r of rows) seen.add(`${r.source_id}|${r.target_id}`);
+  }
+
+  // Find the innermost containing method/function for a call at `line` in `file`.
+  // COALESCE maps NULL end_line to a large sentinel so unbounded nodes sort last
+  // (SQLite ASC orders NULLs first, so a raw `end_line - line` would pick them first).
+  const findCallerByLineStmt = db.prepare(`
+    SELECT id, name FROM nodes
+    WHERE file = ? AND kind IN ('method', 'function')
+      AND line <= ? AND (end_line IS NULL OR end_line >= ?)
+    ORDER BY COALESCE(end_line - line, 999999999) ASC
+    LIMIT 1
+  `);
+
+  // WASM-parse the files to obtain raw call sites with receiver info
+  const absFiles = relFiles.map((f) => path.join(rootDir, f));
+  const wasmResults = await parseFilesWasmForBackfill(absFiles, rootDir);
+
+  const newEdges: Array<[number, number, string, number, number, string]> = [];
+
+  for (const [relPath, symbols] of wasmResults) {
+    for (const call of symbols.calls) {
+      // Only 'this' and 'super' are class-instance receivers in JS/TS.
+      // 'self' refers to WindowOrWorkerGlobalScope — not a class instance — so
+      // filtering it here prevents spurious dispatch edges from Worker call sites.
+      if (call.receiver !== 'this' && call.receiver !== 'super') continue;
+
+      const callerRow = findCallerByLineStmt.get(relPath, call.line, call.line) as
+        | { id: number; name: string }
+        | undefined;
+      if (!callerRow) continue;
+
+      const targets = resolveThisDispatch(
+        call.name,
+        callerRow.name,
+        call.receiver as 'this' | 'super',
+        chaCtx,
+        lookup,
+      );
+
+      for (const t of targets) {
+        const key = `${callerRow.id}|${t.id}`;
+        if (seen.has(key)) continue;
+        seen.add(key);
+        const conf = computeConfidence(relPath, t.file, null) - CHA_DISPATCH_PENALTY;
+        if (conf <= 0) continue;
+        newEdges.push([callerRow.id, t.id, 'calls', conf, 0, 'cha']);
+        targetIds.add(t.id);
+      }
+    }
+  }
+
+  if (newEdges.length > 0) {
+    db.transaction(() => batchInsertEdges(db, newEdges))();
+    debug(`this/super dispatch post-pass: inserted ${newEdges.length} edge(s)`);
+  }
+
+  // Free WASM parse trees — mirrors the cleanup in backfillNativeDroppedFiles
+  for (const [, symbols] of wasmResults) {
+    const tree = (symbols as { _tree?: { delete?: () => void } })._tree;
+    if (tree && typeof tree.delete === 'function') {
+      try {
+        tree.delete();
+      } catch {
+        /* ignore cleanup errors */
+      }
+    }
+    (symbols as { _tree?: unknown; _langId?: unknown })._tree = undefined;
+    (symbols as { _tree?: unknown; _langId?: unknown })._langId = undefined;
+  }
+
+  return { elapsedMs: Date.now() - t0, targetIds };
+}
+
 /** Format timing result from native orchestrator phases + JS post-processing. */
 function formatNativeTimingResult(
   p: Record<string, number>,
   structurePatchMs: number,
   analysisTiming: { astMs: number; complexityMs: number; cfgMs: number; dataflowMs: number },
+  thisDispatchMs: number,
 ): BuildResult {
   return {
     phases: {
@@ -575,6 +770,7 @@ function formatNativeTimingResult(
       edgesMs: +(p.edgesMs ?? 0).toFixed(1),
       structureMs: +((p.structureMs ?? 0) + structurePatchMs).toFixed(1),
       rolesMs: +(p.rolesMs ?? 0).toFixed(1),
+      thisDispatchMs: +thisDispatchMs.toFixed(1),
       astMs: +(analysisTiming.astMs ?? 0).toFixed(1),
       complexityMs: +(analysisTiming.complexityMs ?? 0).toFixed(1),
       cfgMs: +(analysisTiming.cfgMs ?? 0).toFixed(1),
@@ -1114,7 +1310,7 @@ export async function tryNativeOrchestrator(
       ctx.nativeFirstProxy = false;
     } else if (!ctx.nativeFirstProxy && !handoffWalAfterNativeBuild(ctx)) {
       // DB reopen failed — return partial result
-      return formatNativeTimingResult(p, 0, analysisTiming);
+      return formatNativeTimingResult(p, 0, analysisTiming, 0);
     }
   }
 
@@ -1195,10 +1391,68 @@ export async function tryNativeOrchestrator(
     }
   }
 
+  // Phase 8.5: this/super dispatch — hybrid WASM re-parse to resolve call sites
+  // whose raw receiver info the Rust pipeline does not persist to DB.
+  const { elapsedMs: thisDispatchMs, targetIds: thisDispatchTargetIds } =
+    await runPostNativeThisDispatch(
+      ctx.db as unknown as BetterSqlite3Database,
+      ctx.rootDir,
+      result.changedFiles,
+      !!result.isFullBuild,
+    );
+
+  // Re-classify roles for methods that gained incoming this/super dispatch edges.
+  // The Rust orchestrator classifies roles BEFORE this post-pass, so target methods
+  // (e.g. Animal.speak, ConcreteWorker.prepare) that had no callers at Rust time
+  // are classified `dead` or `dead-ffi`. Inserting the new call edges does not
+  // automatically update those role labels — without a re-run the stale labels
+  // propagate to dead-code detection and API boundary analysis.
+  if (thisDispatchTargetIds.size > 0) {
+    try {
+      const db = ctx.db as unknown as BetterSqlite3Database;
+      const idArray = Array.from(thisDispatchTargetIds);
+      const CHUNK_SIZE = 500;
+      const seenFiles = new Set<string>();
+      const affectedFiles: Array<{ file: string }> = [];
+      for (let i = 0; i < idArray.length; i += CHUNK_SIZE) {
+        const chunk = idArray.slice(i, i + CHUNK_SIZE);
+        const placeholders = chunk.map(() => '?').join(',');
+        const rows = db
+          .prepare(
+            `SELECT DISTINCT file FROM nodes WHERE id IN (${placeholders}) AND file IS NOT NULL`,
+          )
+          .all(...chunk) as Array<{ file: string }>;
+        for (const row of rows) {
+          if (!seenFiles.has(row.file)) {
+            seenFiles.add(row.file);
+            affectedFiles.push(row);
+          }
+        }
+      }
+      if (affectedFiles.length > 0) {
+        const { classifyNodeRoles } = (await import('../../../../features/structure.js')) as {
+          classifyNodeRoles: (
+            db: BetterSqlite3Database,
+            changedFiles?: string[] | null,
+          ) => Record<string, number>;
+        };
+        classifyNodeRoles(
+          db,
+          affectedFiles.map((r) => r.file),
+        );
+        debug(
+          `this/super dispatch post-pass: re-classified roles for ${affectedFiles.length} target file(s)`,
+        );
+      }
+    } catch (err) {
+      debug(`this/super dispatch post-pass role re-classification failed: ${toErrorMessage(err)}`);
+    }
+  }
+
   // Backfill the `technique` column on `calls` edges written by the Rust
   // orchestrator, which does not write the column. Runs after all edge-writing
-  // phases (including the WASM dropped-language backfill and CHA post-pass) so
-  // every new edge in this build cycle gets a technique label.
+  // phases (including the WASM dropped-language backfill, CHA post-pass, and
+  // this/super dispatch) so every new edge in this build cycle gets a label.
   backfillEdgeTechniquesAfterNativeOrchestrator(ctx.db, !!result.isFullBuild, result.changedFiles);
 
   // ── Structure and analysis fallback (run after edge-writing so roles see full graph) ──
@@ -1223,5 +1477,5 @@ export async function tryNativeOrchestrator(
   }
 
   closeDbPair({ db: ctx.db, nativeDb: ctx.nativeDb });
-  return formatNativeTimingResult(p, structurePatchMs, analysisTiming);
+  return formatNativeTimingResult(p, structurePatchMs, analysisTiming, thisDispatchMs);
 }
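The NULL-ordering fix in findCallerByLineStmt can be checked without a database. The sketch below emulates the two ORDER BY variants in plain TypeScript, relying on the documented SQLite rule that NULL sorts before any other value in ascending order. The NodeRow shape and the pickNaive/pickCoalesced helpers are hypothetical and exist only for illustration. The sentinel value is the one used in the diff.

```typescript
interface NodeRow {
  id: number;
  line: number;
  endLine: number | null; // null models a node whose end_line is NULL
}

const SENTINEL = 999999999; // same fallback as COALESCE(end_line - line, 999999999)

// Emulates `ORDER BY end_line - line ASC LIMIT 1`: a NULL span sorts first.
function pickNaive(rows: NodeRow[]): NodeRow | undefined {
  const span = (r: NodeRow) => (r.endLine === null ? null : r.endLine - r.line);
  return [...rows].sort((a, b) => {
    const sa = span(a);
    const sb = span(b);
    if (sa === null && sb === null) return 0;
    if (sa === null) return -1; // NULL before any number, as SQLite does for ASC
    if (sb === null) return 1;
    return sa - sb;
  })[0];
}

// Emulates `ORDER BY COALESCE(end_line - line, 999999999) ASC LIMIT 1`.
function pickCoalesced(rows: NodeRow[]): NodeRow | undefined {
  const span = (r: NodeRow) => (r.endLine === null ? SENTINEL : r.endLine - r.line);
  return [...rows].sort((a, b) => span(a) - span(b))[0];
}

// Three candidates that all contain line 22.
const candidates: NodeRow[] = [
  { id: 1, line: 1, endLine: null }, // unbounded node
  { id: 2, line: 10, endLine: 40 }, // outer method
  { id: 3, line: 20, endLine: 25 }, // innermost function
];

const naivePick = pickNaive(candidates);
const coalescedPick = pickCoalesced(candidates);
```

With this data the naive ordering selects the unbounded node (id 1), while the COALESCE form selects the tightest enclosing node (id 3), which is the behaviour the second squashed commit describes.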

src/types.ts

Lines changed: 2 additions & 0 deletions
@@ -1162,6 +1162,8 @@ export interface BuildResult {
     edgesMs: number;
     structureMs: number;
     rolesMs: number;
+    /** Wall-clock time for the this/super dispatch WASM post-pass (native path only). */
+    thisDispatchMs?: number;
     astMs: number;
     complexityMs: number;
     cfgMs: number;
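Because thisDispatchMs is optional, a consumer of the phase timings has to handle builds that do not report it (for example, a non-native path). A minimal sketch, assuming a trimmed-down PhaseTimings shape and a hypothetical describeThisDispatch helper that are not part of the codebase:

```typescript
// Hypothetical subset of the `phases` object in BuildResult.
interface PhaseTimings {
  edgesMs: number;
  rolesMs: number;
  thisDispatchMs?: number; // absent when the post-pass did not report a timing
}

// Hypothetical helper: formats the phase for a diagnostics line.
function describeThisDispatch(phases: PhaseTimings): string {
  return phases.thisDispatchMs === undefined
    ? 'this/super dispatch: not reported'
    : `this/super dispatch: ${phases.thisDispatchMs.toFixed(1)} ms`;
}

const withoutTiming = describeThisDispatch({ edgesMs: 3.2, rolesMs: 1.1 });
const withTiming = describeThisDispatch({ edgesMs: 3.2, rolesMs: 1.1, thisDispatchMs: 12.5 });
```

Checking for undefined rather than truthiness keeps a legitimate 0 ms measurement (the early-return case in runPostNativeThisDispatch) distinguishable from a missing field.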
