Commit c9f2a4f

feat: track dynamic import() expressions as graph edges (#389)
* feat: track dynamic import() expressions as graph edges

  Extract dynamic import() calls in both query-based and walk-based extraction paths. Adds 'dynamic-imports' edge kind to CORE_EDGE_KINDS.

  - Add extractDynamicImportsWalk() for the query-based fast path (tree-sitter Query patterns don't match import() function type)
  - Add extractDynamicImportNames() to extract destructured names from patterns like `const { a } = await import('./foo.js')`
  - Update builder.js edge kind selection to emit 'dynamic-imports' edges including barrel resolution propagation
  - Add query-walk parity test for dynamic import expressions

  Impact: 6 functions changed, 24 affected

* fix: correct misleading comment, add debug warning for non-static imports

  - Fix comment: .then() pattern does not extract destructured names (edge has empty names)
  - Add debug-level warning when import() has a template literal or variable path that can't be statically resolved

  Impact: 3 functions changed, 3 affected

* style: fix import order and format debug() calls

  Impact: 3 functions changed, 3 affected

* docs: add backlog IDs 81-82 for dynamic import tracking gaps

* fix: handle rest_pattern in array destructuring and add aliased import test

  Address PR review feedback:

  - Extract `...rest` identifiers from array_pattern in extractDynamicImportNames
  - Add `{ readFile: rf }` aliased destructuring case to query-walk parity test

  Impact: 1 functions changed, 5 affected
1 parent 8d6b76e commit c9f2a4f

5 files changed

Lines changed: 156 additions & 7 deletions

docs/roadmap/BACKLOG.md

Lines changed: 2 additions & 0 deletions
@@ -144,6 +144,8 @@ These address fundamental limitations in the parsing and resolution pipeline tha
 | 71 | Basic type inference for typed languages | Extract type annotations from TypeScript and Java AST nodes (variable declarations, function parameters, return types, generics) to resolve method calls through typed references. Currently `const x: Router = express.Router(); x.get(...)` produces no edge because `x.get` can't be resolved without knowing `x` is a `Router`. Tree-sitter already parses type annotations — we just don't use them for resolution. Start with declared types (no flow inference), which covers the majority of TS/Java code. | Resolution | Dramatically improves call graph completeness for TypeScript and Java — the two languages where developers annotate types explicitly and expect tooling to use them. Directly prevents hallucinated "no callers" results for methods called through typed variables ||| 5 | No ||
 | 72 | Interprocedural dataflow analysis | Extend the existing intraprocedural dataflow (ID 14) to propagate `flows_to`/`returns`/`mutates` edges across function boundaries. When function A calls B with argument X, and B's dataflow shows X flows to its return value, connect A's call site to the downstream consumers of B's return. Requires stitching per-function dataflow summaries at call edges — no new parsing, just graph traversal over existing `dataflow` + `edges` tables. Start with single-level propagation (caller↔callee), not transitive closure. | Analysis | Current dataflow stops at function boundaries, missing the most important flows — data passing through helper functions, middleware chains, and factory patterns. Single-function scope means `dataflow` can't answer "where does this user input end up?" across call boundaries. Cross-function propagation is the difference between toy dataflow and useful taint-like analysis ||| 5 | No | 14 |
 | 73 | Improved dynamic call resolution | Upgrade the current "best-effort" dynamic dispatch resolution for Python, Ruby, and JavaScript. Three concrete improvements: **(a)** receiver-type tracking — when `x = SomeClass()` is followed by `x.method()`, resolve `method` to `SomeClass.method` using the assignment chain (leverages existing `ast_nodes` + `dataflow` tables); **(b)** common pattern recognition — resolve `EventEmitter.on('event', handler)` callback registration, `Promise.then/catch` chains, `Array.map/filter/reduce` with named function arguments, and decorator/annotation patterns; **(c)** confidence-tiered edges — mark dynamically-resolved edges with a confidence score (high for direct assignment, medium for pattern match, low for heuristic) so consumers can filter by reliability. | Resolution | In Python/Ruby/JS, 30-60% of real calls go through dynamic dispatch — method calls on variables, callbacks, event handlers, higher-order functions. The current best-effort resolution misses most of these, leaving massive gaps in the call graph for the languages where codegraph is most commonly used. Even partial improvement here has outsized impact on graph completeness | ✓ | ✓ | 5 | No | — |
+| 81 | Track dynamic `import()` and re-exports as graph edges | Extract `import()` expressions as `dynamic-imports` edges in both WASM extraction paths (query-based and walk-based). Destructured names (`const { a } = await import(...)`) feed into `importedNames` for call resolution. **Partially done:** WASM JS/TS extraction works (PR #389). Remaining: **(a)** native Rust engine support — `crates/codegraph-core/src/extractors/javascript.rs` doesn't extract `import()` calls; **(b)** non-static paths (`import(\`./plugins/${name}.js\`)`, `import(variable)`) are skipped with a debug warning; **(c)** re-export consumer counting in `exports --unused` only checks `calls` edges, not `imports`/`dynamic-imports` — symbols consumed only via import edges show as zero-consumer false positives. | Resolution | Fixes false "zero consumers" reports for symbols consumed via dynamic imports. 95 `dynamic-imports` edges found in codegraph's own codebase — these were previously invisible to impact analysis, exports audit, and dead-export hooks | ✓ | ✓ | 5 | No | — |
+| 82 | Extract names from `import().then()` callback patterns | `extractDynamicImportNames` only extracts destructured names from `const { a } = await import(...)` (walks up to `variable_declarator`). The `.then()` pattern — `import('./foo.js').then(({ a, b }) => ...)` — produces an edge with empty names because the destructured parameters live in the `.then()` callback, not a `variable_declarator`. Detect when an `import()` call's parent is a `member_expression` with `.then`, find the arrow/function callback in `.then()`'s arguments, and extract parameter names from its destructuring pattern. | Resolution | `.then()`-style dynamic imports are common in older codebases and lazy-loading patterns (React.lazy, Webpack code splitting). Without name extraction, these produce file-level edges only — no symbol-level `calls` edges, so the imported symbols still appear as zero-consumer false positives ||| 4 | No | 81 |
 
 ### Tier 1i — Search, navigation, and monitoring improvements
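
Backlog item 82 describes a detection strategy that this commit does not implement. The following is only a rough sketch of how that walk might look, not the project's code. `mockNode`, `firstChildOfType`, and `namesFromThenCallback` are hypothetical names, and the node types (`formal_parameters`, `arrow_function`, `function_expression`) are assumptions about the tree-sitter JavaScript grammar. The TypeScript grammar may wrap parameters differently.

```javascript
// Minimal stand-in for a tree-sitter node (hypothetical, for illustration only).
function mockNode(type, { text = '', fields = {}, children = [] } = {}) {
  const node = {
    type,
    text,
    parent: null,
    childCount: children.length,
    child: (i) => children[i] ?? null,
    childForFieldName: (name) => fields[name] ?? null,
  };
  for (const c of children) c.parent = node;
  return node;
}

// Return the first direct child whose type is in `types`, or null.
function firstChildOfType(node, types) {
  for (let i = 0; i < node.childCount; i++) {
    const c = node.child(i);
    if (types.includes(c.type)) return c;
  }
  return null;
}

// Sketch of item 82: collect names destructured in a .then() callback.
function namesFromThenCallback(importCall) {
  const member = importCall.parent;
  if (!member || member.type !== 'member_expression') return [];
  const prop = member.childForFieldName('property');
  if (!prop || prop.text !== 'then') return [];
  const thenCall = member.parent;
  if (!thenCall || thenCall.type !== 'call_expression') return [];
  const args = thenCall.childForFieldName('arguments');
  const cb = args && firstChildOfType(args, ['arrow_function', 'function_expression']);
  const params = cb && cb.childForFieldName('parameters');
  const pattern = params && firstChildOfType(params, ['object_pattern']);
  if (!pattern) return [];
  const names = [];
  for (let i = 0; i < pattern.childCount; i++) {
    const c = pattern.child(i);
    if (c.type === 'shorthand_property_identifier_pattern') names.push(c.text);
    else if (c.type === 'pair_pattern') {
      const key = c.childForFieldName('key'); // original export name, not the alias
      if (key) names.push(key.text);
    }
  }
  return names;
}

// Mock tree for: import('./foo.js').then(({ a, b }) => ...)
const importCall = mockNode('call_expression');
const thenProp = mockNode('property_identifier', { text: 'then' });
const member = mockNode('member_expression', {
  fields: { object: importCall, property: thenProp },
  children: [importCall, thenProp],
});
const a = mockNode('shorthand_property_identifier_pattern', { text: 'a' });
const b = mockNode('shorthand_property_identifier_pattern', { text: 'b' });
const pattern = mockNode('object_pattern', { children: [a, b] });
const params = mockNode('formal_parameters', { children: [pattern] });
const arrow = mockNode('arrow_function', { fields: { parameters: params }, children: [params] });
const args = mockNode('arguments', { children: [arrow] });
mockNode('call_expression', { fields: { function: member, arguments: args }, children: [member, args] });

const thenNames = namesFromThenCallback(importCall);
console.log(thenNames);
```

A standalone `import('./foo.js')` with no `.then()` parent falls through the first guard and yields an empty list, which matches the current behavior described in the table.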

src/builder.js

Lines changed: 12 additions & 2 deletions
@@ -1041,7 +1041,13 @@ export async function buildGraph(rootDir, opts = {}) {
 const resolvedPath = getResolved(path.join(rootDir, relPath), imp.source);
 const targetRow = getNodeId.get(resolvedPath, 'file', resolvedPath, 0);
 if (targetRow) {
-  const edgeKind = imp.reexport ? 'reexports' : imp.typeOnly ? 'imports-type' : 'imports';
+  const edgeKind = imp.reexport
+    ? 'reexports'
+    : imp.typeOnly
+      ? 'imports-type'
+      : imp.dynamicImport
+        ? 'dynamic-imports'
+        : 'imports';
   allEdgeRows.push([fileNodeId, targetRow.id, edgeKind, 1.0, 0]);
 
   if (!imp.reexport && isBarrelFile(resolvedPath)) {
@@ -1060,7 +1066,11 @@ export async function buildGraph(rootDir, opts = {}) {
 allEdgeRows.push([
   fileNodeId,
   actualRow.id,
-  edgeKind === 'imports-type'  ? 'imports-type' : 'imports',
+  edgeKind === 'imports-type'
+    ? 'imports-type'
+    : edgeKind === 'dynamic-imports'
+      ? 'dynamic-imports'
+      : 'imports',
   0.9,
   0,
 ]);
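
The two nested ternaries above encode a precedence order that is easier to see when written as early returns. The sketch below restates them as standalone functions. The names `selectEdgeKind` and `propagatedEdgeKind` are hypothetical, since the commit keeps both expressions inline in `buildGraph()`.

```javascript
// Edge kind for the direct file-to-file edge.
// Precedence mirrors the diff: reexport > typeOnly > dynamicImport > plain import.
function selectEdgeKind(imp) {
  if (imp.reexport) return 'reexports';
  if (imp.typeOnly) return 'imports-type';
  if (imp.dynamicImport) return 'dynamic-imports';
  return 'imports';
}

// Edge kind for the extra edge emitted when the import resolves through a
// barrel file: type-only and dynamic kinds are preserved, everything else
// collapses to a plain 'imports' edge.
function propagatedEdgeKind(edgeKind) {
  if (edgeKind === 'imports-type' || edgeKind === 'dynamic-imports') return edgeKind;
  return 'imports';
}

console.log(selectEdgeKind({ source: './foo.js', dynamicImport: true }));
console.log(propagatedEdgeKind('dynamic-imports'));
```

One consequence of this ordering is that an import entry flagged both `typeOnly` and `dynamicImport` would be recorded as `imports-type`. The extractor in this commit sets only `dynamicImport` on `import()` entries, so that combination should not arise from this code path.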

src/extractors/javascript.js

Lines changed: 130 additions & 5 deletions
@@ -1,3 +1,4 @@
+import { debug } from '../logger.js';
 import { findChild, nodeEndLine } from './helpers.js';
 
 /**
@@ -173,6 +174,9 @@ function extractSymbolsQuery(tree, query) {
   // Extract top-level constants via targeted walk (query patterns don't cover these)
   extractConstantsWalk(tree.rootNode, definitions);
 
+  // Extract dynamic import() calls via targeted walk (query patterns don't match `import` function type)
+  extractDynamicImportsWalk(tree.rootNode, imports);
+
   return { definitions, calls, imports, classes, exports: exps };
 }

@@ -224,6 +228,41 @@ function extractConstantsWalk(rootNode, definitions) {
   }
 }
 
+/**
+ * Recursive walk to find dynamic import() calls.
+ * Query patterns match call_expression with identifier/member_expression/subscript_expression
+ * functions, but import() has function type `import` which none of those patterns cover.
+ */
+function extractDynamicImportsWalk(node, imports) {
+  if (node.type === 'call_expression') {
+    const fn = node.childForFieldName('function');
+    if (fn && fn.type === 'import') {
+      const args = node.childForFieldName('arguments') || findChild(node, 'arguments');
+      if (args) {
+        const strArg = findChild(args, 'string');
+        if (strArg) {
+          const modPath = strArg.text.replace(/['"]/g, '');
+          const names = extractDynamicImportNames(node);
+          imports.push({
+            source: modPath,
+            names,
+            line: node.startPosition.row + 1,
+            dynamicImport: true,
+          });
+        } else {
+          debug(
+            `Skipping non-static dynamic import() at line ${node.startPosition.row + 1} (template literal or variable)`,
+          );
+        }
+      }
+      return; // no need to recurse into import() children
+    }
+  }
+  for (let i = 0; i < node.childCount; i++) {
+    extractDynamicImportsWalk(node.child(i), imports);
+  }
+}
+
 function handleCommonJSAssignment(left, right, node, imports) {
   if (!left || !right) return;
   const leftText = left.text;
@@ -455,11 +494,36 @@ function extractSymbolsWalk(tree) {
 case 'call_expression': {
   const fn = node.childForFieldName('function');
   if (fn) {
-    const callInfo = extractCallInfo(fn, node);
-    if (callInfo) calls.push(callInfo);
-    if (fn.type === 'member_expression') {
-      const cbDef = extractCallbackDefinition(node, fn);
-      if (cbDef) definitions.push(cbDef);
+    // Dynamic import(): import('./foo.js') → extract as an import entry
+    if (fn.type === 'import') {
+      const args = node.childForFieldName('arguments') || findChild(node, 'arguments');
+      if (args) {
+        const strArg = findChild(args, 'string');
+        if (strArg) {
+          const modPath = strArg.text.replace(/['"]/g, '');
+          // Extract destructured names from parent context:
+          //   const { a, b } = await import('./foo.js')
+          // (standalone import('./foo.js').then(...) calls produce an edge with empty names)
+          const names = extractDynamicImportNames(node);
+          imports.push({
+            source: modPath,
+            names,
+            line: node.startPosition.row + 1,
+            dynamicImport: true,
+          });
+        } else {
+          debug(
+            `Skipping non-static dynamic import() at line ${node.startPosition.row + 1} (template literal or variable)`,
+          );
+        }
+      }
+    } else {
+      const callInfo = extractCallInfo(fn, node);
+      if (callInfo) calls.push(callInfo);
+      if (fn.type === 'member_expression') {
+        const cbDef = extractCallbackDefinition(node, fn);
+        if (cbDef) definitions.push(cbDef);
+      }
     }
   }
   break;
@@ -941,3 +1005,64 @@ function extractImportNames(node) {
   scan(node);
   return names;
 }
+
+/**
+ * Extract destructured names from a dynamic import() call expression.
+ *
+ * Handles:
+ *   const { a, b } = await import('./foo.js') → ['a', 'b']
+ *   const mod = await import('./foo.js') → ['mod']
+ *   import('./foo.js') → [] (no names extractable)
+ *
+ * Walks up the AST from the call_expression to find the enclosing
+ * variable_declarator and reads the name/object_pattern.
+ */
+function extractDynamicImportNames(callNode) {
+  // Walk up: call_expression → await_expression → variable_declarator
+  let current = callNode.parent;
+  // Skip await_expression wrapper if present
+  if (current && current.type === 'await_expression') current = current.parent;
+  // We should now be at a variable_declarator (or not, if standalone import())
+  if (!current || current.type !== 'variable_declarator') return [];
+
+  const nameNode = current.childForFieldName('name');
+  if (!nameNode) return [];
+
+  // const { a, b } = await import(...) → object_pattern
+  if (nameNode.type === 'object_pattern') {
+    const names = [];
+    for (let i = 0; i < nameNode.childCount; i++) {
+      const child = nameNode.child(i);
+      if (child.type === 'shorthand_property_identifier_pattern') {
+        names.push(child.text);
+      } else if (child.type === 'pair_pattern') {
+        // { a: localName } → use localName (the alias) for the local binding,
+        // but use the key (original name) for import resolution
+        const key = child.childForFieldName('key');
+        if (key) names.push(key.text);
+      }
+    }
+    return names;
+  }
+
+  // const mod = await import(...) → identifier (namespace-like import)
+  if (nameNode.type === 'identifier') {
+    return [nameNode.text];
+  }
+
+  // const [a, b] = await import(...) → array_pattern (rare but possible)
+  if (nameNode.type === 'array_pattern') {
+    const names = [];
+    for (let i = 0; i < nameNode.childCount; i++) {
+      const child = nameNode.child(i);
+      if (child.type === 'identifier') names.push(child.text);
+      else if (child.type === 'rest_pattern') {
+        const inner = child.child(0) || child.childForFieldName('name');
+        if (inner && inner.type === 'identifier') names.push(inner.text);
+      }
+    }
+    return names;
+  }
+
+  return [];
+}
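
To make the upward walk in `extractDynamicImportNames` concrete, here is a condensed sketch that runs against hand-built stand-in nodes instead of a real tree-sitter tree. `mockNode` and `namesFromDeclarator` are hypothetical names. The sketch covers only the `identifier` and `object_pattern` branches and assumes the node shapes shown in the diff.

```javascript
// Minimal stand-in for a tree-sitter node (hypothetical, for illustration only).
function mockNode(type, { text = '', fields = {}, children = [] } = {}) {
  const node = {
    type,
    text,
    parent: null,
    childCount: children.length,
    child: (i) => children[i] ?? null,
    childForFieldName: (name) => fields[name] ?? null,
  };
  for (const c of children) c.parent = node;
  return node;
}

// Condensed version of the walk: call_expression → (await_expression) → variable_declarator.
function namesFromDeclarator(callNode) {
  let current = callNode.parent;
  if (current && current.type === 'await_expression') current = current.parent;
  if (!current || current.type !== 'variable_declarator') return [];
  const nameNode = current.childForFieldName('name');
  if (!nameNode) return [];
  if (nameNode.type === 'identifier') return [nameNode.text];
  if (nameNode.type !== 'object_pattern') return [];
  const names = [];
  for (let i = 0; i < nameNode.childCount; i++) {
    const child = nameNode.child(i);
    if (child.type === 'shorthand_property_identifier_pattern') {
      names.push(child.text);
    } else if (child.type === 'pair_pattern') {
      // Record the key (the exported name), not the local alias.
      const key = child.childForFieldName('key');
      if (key) names.push(key.text);
    }
  }
  return names;
}

// Mock tree for: const { readFile: rf, stat } = await import('node:fs/promises')
const call = mockNode('call_expression');
const awaitExpr = mockNode('await_expression', { children: [call] });
const key = mockNode('property_identifier', { text: 'readFile' });
const alias = mockNode('identifier', { text: 'rf' });
const pair = mockNode('pair_pattern', { fields: { key, value: alias }, children: [key, alias] });
const shorthand = mockNode('shorthand_property_identifier_pattern', { text: 'stat' });
const objPattern = mockNode('object_pattern', { children: [pair, shorthand] });
mockNode('variable_declarator', {
  fields: { name: objPattern, value: awaitExpr },
  children: [objPattern, awaitExpr],
});

const declNames = namesFromDeclarator(call);
console.log(declNames); // ['readFile', 'stat']
```

The aliased case records `readFile` instead of `rf`, because the name is later matched against the target module's exports. A real `object_pattern` also contains punctuation children such as braces and commas, which the type checks skip.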

src/kinds.js

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ export const ALL_SYMBOL_KINDS = CORE_SYMBOL_KINDS;
 export const CORE_EDGE_KINDS = [
   'imports',
   'imports-type',
+  'dynamic-imports',
   'reexports',
   'calls',
   'extends',

tests/engines/query-walk-parity.test.js

Lines changed: 11 additions & 0 deletions
@@ -46,6 +46,7 @@ function normalize(symbols) {
         ...(i.reexport ? { reexport: true } : {}),
         ...(i.wildcardReexport ? { wildcardReexport: true } : {}),
         ...(i.typeOnly ? { typeOnly: true } : {}),
+        ...(i.dynamicImport ? { dynamicImport: true } : {}),
       }))
       .sort((a, b) => a.line - b.line),
     classes: (symbols.classes || [])
@@ -178,6 +179,16 @@ export class Server {
 fn.call(null, arg);
 obj.apply(undefined, args);
 method.bind(ctx);
+`,
+  },
+  {
+    name: 'dynamic import() expressions',
+    file: 'test.js',
+    code: `
+const { readFile } = await import('fs/promises');
+const { readFile: rf } = await import('node:fs/promises');
+const mod = await import('./utils.js');
+import('./side-effect.js');
 `,
   },
   // TypeScript-specific
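
The `normalize()` change relies on conditional spread so that a flag appears in the normalized object only when it is truthy. That keeps the two extraction paths comparable even if one emits `flag: false` and the other omits the flag. The sketch below shows the idea in isolation. `normalizeImport` is a hypothetical, trimmed-down stand-in, and the `source`/`names`/`line` handling is an assumption, not the test file's actual code.

```javascript
// Hypothetical reduced normalizer: optional flags are included only when truthy.
function normalizeImport(i) {
  return {
    source: i.source,
    names: [...(i.names || [])].sort(),
    line: i.line,
    ...(i.typeOnly ? { typeOnly: true } : {}),
    ...(i.dynamicImport ? { dynamicImport: true } : {}),
  };
}

// One path omits `typeOnly`, the other sets it to false: both normalize identically.
const fromQuery = normalizeImport({
  source: './utils.js',
  names: ['mod'],
  line: 4,
  dynamicImport: true,
});
const fromWalk = normalizeImport({
  source: './utils.js',
  names: ['mod'],
  line: 4,
  dynamicImport: true,
  typeOnly: false,
});

const sameShape = JSON.stringify(fromQuery) === JSON.stringify(fromWalk);
console.log(sameShape);
```

Without the added `dynamicImport` spread, the flag would be dropped during normalization and the parity test could not detect a path that fails to mark an entry as dynamic.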
