Commit 204862c

feat(parity): port the JS points-to solver to native and unify engine resolution (#1465)
* refactor(native): mirror crate module layout to the src/ TypeScript tree

  Reorganize crates/codegraph-core/src/ so every module sits at the path of its TypeScript counterpart (snake_case for kebab-case): shared/, infrastructure/, db/repository/, domain/graph/builder/stages/, ast_analysis/, graph/algorithms/, graph/classifiers/, features/.

  - Pure git mv moves; only graph_algorithms.rs is split (bfs, shortest_path, centrality, louvain) along its existing section boundaries
  - lib.rs doc comment carries the full Rust<->TypeScript mapping table
  - Cross-references in TS sources, tests, and docs updated to new paths
  - Cargo.lock version synced to 3.12.0 (Cargo.toml was already bumped)
  - cargo test: 360 passed; tsc build and drift-guard test green

* feat(parity): port the JS points-to solver to the native engine and unify per-engine resolution

  Native (Rust):
  - Extract all eight pts binding kinds in the JS extractor (param, this-call, array-element, spread-arg, for-of, array-callback, object-rest-param, object-prop) and surface them on FileSymbols
  - Run the same fixed-point points-to solver as the WASM path inside build_call_edges: thisCall-to-fnRef conversion, four-case key gate with receiver-key fallback, hop-penalised alias edges with pts upgrade
  - Normalize inline-new receivers (new A().t() -> receiver "A") in extract_receiver_name, mirroring extractReceiverName in the TS extractor
  - Apply the >=0.5 confidence filter on exact cross-file lookups (#1439)

  WASM/TS:
  - Plumb params and all eight binding arrays through NativeFileEntry so the hybrid path feeds the native solver
  - Serialize thisCallBindings across the WASM worker boundary
  - Backport the native engine's class-field type-annotation extraction (private repo: Repository seeds typeMap "repo"/"this.repo")
  - Remove the four JS pts post-passes that duplicated the native solver on the hybrid path (param-flow, fnRef, thisCall, object-rest)
  - Report the native build summary and build_meta counts after the JS edge-writing post-passes so they include CHA/this-dispatch edges (#1452)

  WASM, full-native orchestrator, and hybrid builds now produce identical edge multisets on the javascript fixture (155 rows each, including confidence and dynamic flags); javascript 42/42 and pts-javascript 13/13 expected edges on both engines; 392 Rust tests, 3043 JS tests, and the 176-test resolution benchmark are green.

  Closes #1453
  Closes #1452
  Closes #1439

* fix(native): port Phase 8.2 cross-file return-type propagation to the Rust orchestrator

  The JS pipeline seeds each file's typeMap with the return types of imported factory functions (propagateReturnTypesAcrossFiles) before edge resolution, so `const svc = buildService(); svc.createUser()` resolves across files. The Rust orchestrator extracted returnTypeMap and callAssignments but never consumed them, dropping those calls and receiver edges on the native path (hybrid was unaffected because the JS pipeline pre-seeds the typeMap it sends over napi).

  Mirror the JS pass in pipeline.rs: build a per-file + global return-type index, resolve each call assignment through the file's imports (or the qualified Type.method global map), and inject typeMap entries at confidence minus PROPAGATION_HOP_PENALTY, never overwriting locally typed variables.

  Verified with scripts/parity-compare.mjs: the javascript fixture now matches exactly across wasm/native/hybrid (180 edges incl. driver.mjs conf=0.7 calls + conf=0.75 receiver edges).

* fix(native): add process/window/document/globalThis to JS_BUILTIN_GLOBALS (#1465)

* fix(native): add safety comment on max_idx usize cast guard (#1465)
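The Phase 8.2 pass described in the commit message can be illustrated with a minimal, standalone sketch. It is not the code from pipeline.rs: `FileInfo`, `propagate`, and `HOP_PENALTY` are hypothetical names, the data model is reduced to plain maps, and the penalty value 0.1 is an assumption inferred from the tests in this commit (0.85 becomes 0.75, 1.0 becomes 0.9). The qualified `Type.method` lookup is left out.

```rust
use std::collections::HashMap;

/// Assumed value, inferred from the commit's tests (0.85 -> 0.75, 1.0 -> 0.9).
const HOP_PENALTY: f64 = 0.1;

/// Hypothetical, simplified stand-in for one file's extracted symbols.
#[derive(Default)]
struct FileInfo {
    /// Function name -> (return type, confidence).
    return_types: HashMap<String, (String, f64)>,
    /// Imported name -> path of the defining file.
    imports: HashMap<String, String>,
    /// (variable, callee) pairs such as `const svc = buildService()`.
    call_assignments: Vec<(String, String)>,
    /// Variable -> (type, confidence).
    type_map: HashMap<String, (String, f64)>,
}

/// Seed each file's type map from the return types of imported functions.
fn propagate(files: &mut HashMap<String, FileInfo>) {
    // Snapshot the return types first so the mutable pass below never has
    // to borrow another file's entry.
    let index: HashMap<String, HashMap<String, (String, f64)>> = files
        .iter()
        .map(|(path, f)| (path.clone(), f.return_types.clone()))
        .collect();

    for file in files.values_mut() {
        for (var, callee) in &file.call_assignments {
            // Locally typed (or already seeded) variables are never overwritten.
            if file.type_map.contains_key(var) {
                continue;
            }
            // Resolve the callee through the file's imports, then look up
            // its return type in the defining file.
            let found = file
                .imports
                .get(callee)
                .and_then(|from| index.get(from))
                .and_then(|m| m.get(callee));
            if let Some((ty, conf)) = found {
                let propagated = conf - HOP_PENALTY;
                if propagated > 0.0 {
                    file.type_map.insert(var.clone(), (ty.clone(), propagated));
                }
            }
        }
    }
}
```

With a `service.js` entry whose `return_types` maps `buildService` to `("UserService", 0.85)` and a `driver.js` entry that imports `buildService` and records the assignment `("svc", "buildService")`, calling `propagate` leaves `svc` typed as `UserService` in the driver's `type_map`, with the confidence reduced by one hop.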
1 parent f814067 commit 204862c

13 files changed

Lines changed: 2386 additions & 528 deletions

crates/codegraph-core/src/domain/graph/builder/pipeline.rs

Lines changed: 248 additions & 6 deletions
@@ -27,7 +27,7 @@ use crate::domain::parallel;
 use crate::db::repository::ast::{self, AstInsertNode, FileAstBatch};
 use crate::graph::classifiers::roles;
 use crate::features::structure;
-use crate::types::{FileSymbols, ImportResolutionInput};
+use crate::types::{FileSymbols, ImportResolutionInput, TypeMapEntry};
 use rusqlite::Connection;
 use serde::Serialize;
 use std::collections::{HashMap, HashSet};
@@ -550,6 +550,11 @@ pub fn run_pipeline(
     let import_edge_rows = import_edges::build_import_edges(conn, &import_ctx);
     import_edges::insert_edges(conn, &import_edge_rows);
 
+    // Phase 8.2: cross-file return-type propagation — seed each file's
+    // type_map with the return types of imported functions before call-edge
+    // building, mirroring propagateReturnTypesAcrossFiles in build-edges.ts.
+    propagate_return_types_across_files(&mut file_symbols, &import_ctx);
+
     // Build call edges using existing Rust edge_builder (internal path)
     // For now, call edges are built via the existing napi-exported function's
     // internal logic. We load nodes from DB and pass to the edge builder.
@@ -1288,6 +1293,106 @@ fn collect_imported_names_for_file(
     imported_names
 }
 
+/// Phase 8.2: cross-file return-type propagation.
+///
+/// Mirrors `propagateReturnTypesAcrossFiles` in `build-edges.ts`: when a file
+/// assigns the return value of an imported function to a variable
+/// (`const svc = buildService()`), look up the callee's return type in the
+/// defining file's `return_type_map` and seed the assigning file's `type_map`
+/// so method calls and receiver edges on that variable resolve. Must run
+/// before `build_and_insert_call_edges`.
+fn propagate_return_types_across_files(
+    file_symbols: &mut HashMap<String, FileSymbols>,
+    import_ctx: &ImportEdgeContext,
+) {
+    use crate::domain::graph::builder::stages::build_edges::PROPAGATION_HOP_PENALTY;
+
+    // rel_path → (fn_name → (type_name, confidence))
+    let mut return_type_index: HashMap<String, HashMap<String, (String, f64)>> = HashMap::new();
+    for (rel_path, symbols) in file_symbols.iter() {
+        if symbols.return_type_map.is_empty() {
+            continue;
+        }
+        let per_file = return_type_index.entry(rel_path.clone()).or_default();
+        for e in &symbols.return_type_map {
+            per_file.insert(e.name.clone(), (e.type_name.clone(), e.confidence));
+        }
+    }
+    if return_type_index.is_empty() {
+        return;
+    }
+
+    // Flat map for qualified `Type.method` lookups. Higher confidence wins;
+    // ties keep the first writer. Files are visited in sorted order so the
+    // tie-break is deterministic (HashMap iteration order is not).
+    let mut global_return_types: HashMap<String, (String, f64)> = HashMap::new();
+    let mut sorted_paths: Vec<&String> = return_type_index.keys().collect();
+    sorted_paths.sort();
+    for rel_path in sorted_paths {
+        for (name, entry) in &return_type_index[rel_path] {
+            let replace = match global_return_types.get(name) {
+                Some(existing) => entry.1 > existing.1,
+                None => true,
+            };
+            if replace {
+                global_return_types.insert(name.clone(), entry.clone());
+            }
+        }
+    }
+
+    for (rel_path, symbols) in file_symbols.iter_mut() {
+        if symbols.call_assignments.is_empty() {
+            continue;
+        }
+
+        let abs_file = Path::new(&import_ctx.root_dir).join(rel_path.as_str());
+        let abs_str = abs_file.to_str().unwrap_or("");
+        let imported_names = collect_imported_names_for_file(abs_str, symbols, import_ctx);
+        // Later entries overwrite earlier ones on duplicate names — same as the
+        // HashMap collect in build_call_edges.
+        let imported_map: HashMap<String, String> = imported_names
+            .into_iter()
+            .map(|e| (e.name, e.file))
+            .collect();
+
+        let mut injections: Vec<TypeMapEntry> = Vec::new();
+        let mut injected: HashSet<String> = HashSet::new();
+        for ca in &symbols.call_assignments {
+            // Already resolved locally (JS: `typeMap.has(varName)`); first
+            // successful injection wins for repeated assignments to one name.
+            if injected.contains(&ca.var_name)
+                || symbols.type_map.iter().any(|t| t.name == ca.var_name)
+            {
+                continue;
+            }
+
+            let found = match &ca.receiver_type_name {
+                Some(receiver) => {
+                    global_return_types.get(&format!("{receiver}.{}", ca.callee_name))
+                }
+                None => imported_map.get(&ca.callee_name).and_then(|from| {
+                    return_type_index
+                        .get(from)
+                        .and_then(|m| m.get(&ca.callee_name))
+                }),
+            };
+
+            if let Some((type_name, confidence)) = found {
+                let propagated = confidence - PROPAGATION_HOP_PENALTY;
+                if propagated > 0.0 {
+                    injections.push(TypeMapEntry {
+                        name: ca.var_name.clone(),
+                        type_name: type_name.clone(),
+                        confidence: propagated,
+                    });
+                    injected.insert(ca.var_name.clone());
+                }
+            }
+        }
+        symbols.type_map.extend(injections);
+    }
+}
+
 /// Insert the edges produced by the native edge builder into the edges table.
 fn insert_call_edge_rows(conn: &Connection, edges: &[crate::domain::graph::builder::stages::build_edges::ComputedEdge]) {
     if edges.is_empty() {
@@ -1348,6 +1453,10 @@ fn build_and_insert_call_edges(
             })
             .collect();
 
+        fn non_empty<T: Clone>(v: &[T]) -> Option<Vec<T>> {
+            if v.is_empty() { None } else { Some(v.to_vec()) }
+        }
+
         file_entries.push(FileEdgeInput {
             file: rel_path.clone(),
             file_node_id,
@@ -1359,6 +1468,15 @@ fn build_and_insert_call_edges(
                     kind: d.kind.clone(),
                     line: d.line,
                     end_line: d.end_line,
+                    // Phase 8.3c: ordered parameter names for parameter-flow pts —
+                    // mirrors buildDefinitionParamsMap reading def.children.
+                    params: d.children.as_ref().map(|children| {
+                        children
+                            .iter()
+                            .filter(|c| c.kind == "parameter")
+                            .map(|c| c.name.clone())
+                            .collect()
+                    }),
                 })
                 .collect(),
             calls: symbols
@@ -1382,11 +1500,15 @@ fn build_and_insert_call_edges(
                 })
                 .collect(),
             type_map,
-            fn_ref_bindings: if symbols.fn_ref_bindings.is_empty() {
-                None
-            } else {
-                Some(symbols.fn_ref_bindings.clone())
-            },
+            fn_ref_bindings: non_empty(&symbols.fn_ref_bindings),
+            param_bindings: non_empty(&symbols.param_bindings),
+            this_call_bindings: non_empty(&symbols.this_call_bindings),
+            array_elem_bindings: non_empty(&symbols.array_elem_bindings),
+            spread_arg_bindings: non_empty(&symbols.spread_arg_bindings),
+            for_of_bindings: non_empty(&symbols.for_of_bindings),
+            array_callback_bindings: non_empty(&symbols.array_callback_bindings),
+            object_rest_param_bindings: non_empty(&symbols.object_rest_param_bindings),
+            object_prop_bindings: non_empty(&symbols.object_prop_bindings),
         });
     }
 
@@ -1798,3 +1920,123 @@ fn now_ms() -> f64 {
         .map(|d| d.as_millis() as f64)
         .unwrap_or(0.0)
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::types::{Import, PathAliases};
+
+    fn make_import_ctx(file_symbols: &HashMap<String, FileSymbols>) -> ImportEdgeContext {
+        let mut batch_resolved = HashMap::new();
+        batch_resolved.insert("/repo/driver.js|./service.js".to_string(), "service.js".to_string());
+        ImportEdgeContext {
+            batch_resolved,
+            reexport_map: HashMap::new(),
+            barrel_only_files: HashSet::new(),
+            file_symbols: file_symbols.clone(),
+            root_dir: "/repo".to_string(),
+            aliases: PathAliases { base_url: None, paths: vec![] },
+            known_files: HashSet::new(),
+        }
+    }
+
+    fn entry(name: &str, type_name: &str, confidence: f64) -> TypeMapEntry {
+        TypeMapEntry {
+            name: name.to_string(),
+            type_name: type_name.to_string(),
+            confidence,
+        }
+    }
+
+    #[test]
+    fn propagates_imported_factory_return_type_into_type_map() {
+        let mut service = FileSymbols::new("service.js".to_string());
+        service.return_type_map.push(entry("buildService", "UserService", 0.85));
+
+        let mut driver = FileSymbols::new("driver.js".to_string());
+        driver.imports.push(Import::new(
+            "./service.js".to_string(),
+            vec!["buildService".to_string()],
+            1,
+        ));
+        driver.call_assignments.push(crate::types::NativeCallAssignment {
+            var_name: "svc".to_string(),
+            callee_name: "buildService".to_string(),
+            receiver_type_name: None,
+        });
+
+        let mut file_symbols = HashMap::new();
+        file_symbols.insert("service.js".to_string(), service);
+        file_symbols.insert("driver.js".to_string(), driver);
+        let import_ctx = make_import_ctx(&file_symbols);
+
+        propagate_return_types_across_files(&mut file_symbols, &import_ctx);
+
+        let driver = &file_symbols["driver.js"];
+        let seeded = driver
+            .type_map
+            .iter()
+            .find(|t| t.name == "svc")
+            .expect("svc should be seeded from buildService's return type");
+        assert_eq!(seeded.type_name, "UserService");
+        // 0.85 (inferred `return new X()`) minus one propagation hop.
+        assert!((seeded.confidence - 0.75).abs() < 1e-9);
+    }
+
+    #[test]
+    fn qualified_receiver_lookup_uses_global_return_type_map() {
+        let mut factory = FileSymbols::new("factory.js".to_string());
+        factory.return_type_map.push(entry("Factory.create", "Widget", 1.0));
+
+        let mut driver = FileSymbols::new("driver.js".to_string());
+        driver.type_map.push(entry("factory", "Factory", 0.9));
+        driver.call_assignments.push(crate::types::NativeCallAssignment {
+            var_name: "w".to_string(),
+            callee_name: "create".to_string(),
+            receiver_type_name: Some("Factory".to_string()),
+        });
+
+        let mut file_symbols = HashMap::new();
+        file_symbols.insert("factory.js".to_string(), factory);
+        file_symbols.insert("driver.js".to_string(), driver);
+        let import_ctx = make_import_ctx(&file_symbols);
+
+        propagate_return_types_across_files(&mut file_symbols, &import_ctx);
+
+        let driver = &file_symbols["driver.js"];
+        let seeded = driver.type_map.iter().find(|t| t.name == "w").expect("w seeded");
+        assert_eq!(seeded.type_name, "Widget");
+        assert!((seeded.confidence - 0.9).abs() < 1e-9);
+    }
+
+    #[test]
+    fn locally_typed_variables_are_not_overwritten() {
+        let mut service = FileSymbols::new("service.js".to_string());
+        service.return_type_map.push(entry("buildService", "UserService", 0.85));
+
+        let mut driver = FileSymbols::new("driver.js".to_string());
+        driver.imports.push(Import::new(
+            "./service.js".to_string(),
+            vec!["buildService".to_string()],
+            1,
+        ));
+        driver.type_map.push(entry("svc", "LocalOverride", 1.0));
+        driver.call_assignments.push(crate::types::NativeCallAssignment {
+            var_name: "svc".to_string(),
+            callee_name: "buildService".to_string(),
+            receiver_type_name: None,
+        });
+
+        let mut file_symbols = HashMap::new();
+        file_symbols.insert("service.js".to_string(), service);
+        file_symbols.insert("driver.js".to_string(), driver);
+        let import_ctx = make_import_ctx(&file_symbols);
+
+        propagate_return_types_across_files(&mut file_symbols, &import_ctx);
+
+        let driver = &file_symbols["driver.js"];
+        let svc_entries: Vec<_> = driver.type_map.iter().filter(|t| t.name == "svc").collect();
+        assert_eq!(svc_entries.len(), 1, "no duplicate entry should be injected");
+        assert_eq!(svc_entries[0].type_name, "LocalOverride");
+    }
+}
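
The tests above do not exercise the tie-break in the global `Type.method` map, where higher confidence wins and equal confidence keeps the entry from the path that sorts first. The following standalone sketch isolates that rule. `merge_global` is a hypothetical helper name and the nested map type is a simplification of the index built in `propagate_return_types_across_files`, so treat it as an illustration rather than the crate's API.

```rust
use std::collections::HashMap;

/// Per-file index: path -> (name -> (type, confidence)).
type ReturnTypeIndex = HashMap<String, HashMap<String, (String, f64)>>;

/// Flatten the per-file index into one map. Paths are visited in sorted
/// order because HashMap iteration order is unspecified; without the sort,
/// equal-confidence ties could resolve differently from run to run.
fn merge_global(index: &ReturnTypeIndex) -> HashMap<String, (String, f64)> {
    let mut paths: Vec<&String> = index.keys().collect();
    paths.sort();

    let mut global: HashMap<String, (String, f64)> = HashMap::new();
    for path in paths {
        for (name, entry) in &index[path] {
            // Strictly greater confidence replaces; a tie keeps the entry
            // from the earlier (lexicographically smaller) path.
            let replace = match global.get(name) {
                Some(existing) => entry.1 > existing.1,
                None => true,
            };
            if replace {
                global.insert(name.clone(), entry.clone());
            }
        }
    }
    global
}
```

For example, if `a.js` and `b.js` both define `Factory.create` at confidence 0.9, the merged map keeps the type from `a.js`; if `b.js` reported 1.0 instead, its entry would replace the one from `a.js`.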
