diff --git a/.claude/skills/parity/SKILL.md b/.claude/skills/parity/SKILL.md
new file mode 100644
index 000000000..3592f840b
--- /dev/null
+++ b/.claude/skills/parity/SKILL.md
@@ -0,0 +1,188 @@
+---
+name: parity
+description: Audit WASM/native engine correctness parity across all resolution fixtures and fix any divergence at the root cause — both engines must produce identical graphs
+argument-hint: "[--langs js,python] [--hybrid] [--audit-only]"
+allowed-tools: Bash, Read, Write, Edit, Glob, Grep, Agent
+---
+
+# /parity — Engine Correctness Parity Audit & Fix
+
+Codegraph has two engines that MUST produce identical results (see CLAUDE.md):
+
+- **wasm** — JS pipeline + JS extractors + JS edge resolution
+- **native** — full Rust orchestrator (`crates/codegraph-core/src/domain/graph/builder/pipeline.rs`)
+- **hybrid** — JS pipeline + napi `buildCallEdges` (the fallback when the
+  orchestrator is skipped: forced full rebuilds, older addons)
+
+This skill runs `scripts/parity-compare.mjs`, which builds every
+resolution-benchmark fixture with each engine and compares the **full node and
+edge multisets** (kind, name, file, line, confidence, dynamic flag). Any
+difference is a bug in the less-accurate engine — never an acceptable gap, and
+never something to document as expected. The skill finds the root cause, fixes
+it, and re-verifies until the audit is clean.
+
+## Arguments
+
+- `$ARGUMENTS` may contain:
+  - `--langs a,b,c` — restrict to specific fixture names (e.g. `javascript,pts-javascript`)
+  - `--hybrid` — also audit the hybrid path (recommended; slower)
+  - `--audit-only` — report divergences without fixing them
+  - No arguments — full audit across all fixtures, then fix divergences
+
+## Phase 0 — Pre-flight
+
+All steps run from the repo root.
+
+1. Confirm `scripts/parity-compare.mjs` exists. If not, this repo doesn't have
+   the parity tooling — stop and report.
+2. Build the TypeScript dist (the script imports `dist/index.js`, and extractor
+   changes in `src/` are invisible until rebuilt):
+   ```bash
+   npm run build
+   ```
+3. Ensure the native addon reflects the local Rust source:
+   ```bash
+   cd crates/codegraph-core && npx napi build --platform --release && cd ../..
+   ```
+   On macOS, locally built binaries must be re-signed or Node kills the process
+   (exit 137):
+   ```bash
+   codesign --sign - --force crates/codegraph-core/*.node
+   ```
+4. Verify the loader picks up the **locally built** binary, not the published
+   package. First check which path is actually resolved:
+   ```bash
+   node -e "
+     const { createRequire } = require('node:module');
+     const r = createRequire(require.resolve('./dist/index.js'));
+     try { console.log(r.resolve('codegraph-core')); } catch { console.log('not found via require'); }
+   "
+   ```
+   If the resolved path points to
+   `node_modules/@optave/codegraph--/codegraph-core.node`
+   (the installed package), copy your freshly built binary over it:
+   ```bash
+   cp crates/codegraph-core/*.node node_modules/@optave/codegraph--/codegraph-core.node
+   ```
+   Then confirm the loader picks it up:
+   ```bash
+   node -e "import('./dist/infrastructure/native.js').then(m => console.log(m.isNativeAvailable()))"
+   ```
+   If `false`, stop and report — auditing parity without the native engine is
+   meaningless.
+
+## Phase 1 — Audit
+
+Run the comparison (pass through `--langs` / `--hybrid` from `$ARGUMENTS`):
+
+```bash
+node scripts/parity-compare.mjs [--langs ...] [--hybrid] 2>/dev/null
+```
+
+- Exit 0 → parity holds. Skip to Phase 4 and report a clean audit.
+- Exit 1 → divergences or fixture build failures. Collect every `[node]` /
+  `[edge]` diff line and any `BUILD FAILED` fixtures.
+- Exit 2 → pre-flight failure; go back to Phase 0.
+
+For machine-readable output (useful when many fixtures diverge), re-run with
+`--json` and parse `fixtures[].comparisons[].nodeDiffs/edgeDiffs`.
+
+If `--audit-only` was passed: report the diffs (Phase 4 format) and stop.
+
+## Phase 2 — Root-cause and fix
+
+For each divergence, identify which engine is wrong — the one missing edges or
+producing lower-quality resolution is usually the buggy one, but verify by
+reading the fixture source and deciding what the *correct* graph is.
+
+**Localize the bug by which paths disagree:**
+
+| wasm | hybrid | native | Bug location |
+|------|--------|--------|--------------|
+| A | A | B | Rust pipeline prep (`pipeline.rs`) or Rust extractor (`crates/.../extractors/`) — the napi solver gets correct input from JS but the orchestrator's own input differs |
+| A | B | B | Rust `build_edges.rs` solver (shared by hybrid + native) |
+| A | B | A | JS↔napi boundary: `NativeFileEntry` plumbing in `build-edges.ts` or the wasm-worker protocol |
+| B | A | A | JS extractor or JS resolution (`src/extractors/`, `src/domain/graph/builder/stages/build-edges.ts`) |
+
+**Fix rules (from CLAUDE.md — non-negotiable):**
+
+- Fix the extraction/resolution layer that produces incorrect results. Never
+  add comments, tests, or fixture exclusions that frame wrong output as
+  expected.
+- Changes may land in either language or both — create the best version based
+  on both implementations, don't restrict the fix to one side.
+- The module layout is mirrored between `src/` and `crates/codegraph-core/src/`
+  — read the TS and Rust counterparts side by side (e.g.
+  `src/domain/graph/builder/stages/build-edges.ts` ↔
+  `crates/.../domain/graph/builder/stages/build_edges.rs`).
+- Mirror *semantics exactly*: confidence constants, hop penalties, tie-breaking
+  order, first-wins vs highest-wins rules. A 0.05 confidence difference is a
+  parity failure.
+- Add a focused unit test next to the fix (Rust `#[cfg(test)]` or vitest) that
+  pins the behavior.
+
+**Gotchas that mask fixes:**
+
+- `src/` changes need `npm run build` before the script (which imports dist)
+  sees them.
+- Rust changes need the napi rebuild + macOS codesign from Phase 0.
+- New `ExtractorOutput` fields must be added to `SerializedExtractorOutput` in
+  `src/domain/wasm-worker-{protocol,entry,pool}.ts` or they are silently
+  dropped at the Worker-thread boundary.
+- New per-file fields crossing the napi boundary need: the `FileSymbols` /
+  `FileEdgeInput` structs in `crates/.../types.rs` & `build_edges.rs`, the
+  `NativeFileEntry` assembly in `build-edges.ts`, and the orchestrator's own
+  assembly in `pipeline.rs` (`build_and_insert_call_edges`). Missing the last
+  one produces hybrid-OK/native-broken splits.
+- Out-of-scope findings discovered along the way (pre-existing bugs, refactor
+  opportunities) → `gh issue create` immediately, then continue.
+
+## Phase 3 — Verify
+
+Repeat until the audit is clean — never stop at "fewer diffs than before":
+
+1. Rebuild whichever side changed (`npm run build` / napi build + codesign).
+2. Re-run the Phase 1 audit command. Any remaining divergence → back to Phase 2.
+3. Once clean, run the full verification suite — all must pass:
+   ```bash
+   cargo test --manifest-path crates/codegraph-core/Cargo.toml
+   npm test
+   npx vitest run tests/benchmarks/resolution/resolution-benchmark.test.ts
+   ```
+   (From a `.claude` worktree, vitest needs the worktree override config —
+   check memory/project notes if no tests are found.)
+4. If any verification step cannot run, STOP and report it — never proceed
+   with unverified changes.
+
+## Phase 4 — Report
+
+Print a summary:
+
+```
+PARITY AUDIT —
+Fixtures audited: N (wasm vs native[, hybrid])
+Divergences found: M
+Fixed:
+Verification: cargo test ✓ | npm test ✓ | resolution benchmark ✓
+Issues filed: #NNN (out-of-scope findings)
+```
+
+- If divergences were found and fixed, list each root cause in one line —
+  which engine was wrong, which layer, what semantic was mismatched.
+- If `--audit-only`: list divergences grouped by fixture with the
+  wasm/hybrid/native localization table applied.
+- Suggest committing engine fixes separately from unrelated work (one PR = one
+  concern).
+
+## Rules
+
+- **Zero divergence is the only passing state** — a single edge differing in
+  confidence is a failure.
+- **Never exclude a fixture or file to make the audit pass.**
+- **Never run the audit against a stale dist or stale native binary** — Phase 0
+  is mandatory after any code change.
+- **The wasm/hybrid/native disagreement pattern localizes the bug** — use the
+  table before reading code.
+- **Both engines evolve together**: a feature added to one engine without the
+  other is a parity bug from day one. New resolution techniques must land in
+  `src/` and `crates/codegraph-core/src/` in the same PR.
diff --git a/.claude/skills/titan-run/SKILL.md b/.claude/skills/titan-run/SKILL.md
index 781f25614..51e22569f 100644
--- a/.claude/skills/titan-run/SKILL.md
+++ b/.claude/skills/titan-run/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: titan-run
-description: Run the full Titan Paradigm pipeline end-to-end by dispatching each phase to sub-agents with fresh context windows. Orchestrates recon → gauntlet → sync → forge → grind automatically.
-argument-hint: <--skip-recon> <--skip-gauntlet> <--start-from recon|gauntlet|sync|forge|grind> <--gauntlet-batch-size 5> <--yes>
+description: Run the full Titan Paradigm pipeline end-to-end by dispatching each phase to sub-agents with fresh context windows. Orchestrates recon → gauntlet → sync → forge → grind (+ repo-provided parity audit) automatically.
+argument-hint: <--skip-recon> <--skip-gauntlet> <--start-from recon|gauntlet|sync|forge|grind|parity> <--gauntlet-batch-size 5> <--yes>
 allowed-tools: Agent, Read, Bash, Glob, Write, Edit
 ---
 
@@ -16,7 +16,7 @@ You are the **orchestrator** for the full Titan Paradigm pipeline. Your job is t
 - `` → target path (passed to recon)
 - `--skip-recon` → skip recon (assumes artifacts exist)
 - `--skip-gauntlet` → skip gauntlet (assumes artifacts exist)
-- `--start-from ` → jump to phase: `recon`, `gauntlet`, `sync`, `forge`, `grind`
+- `--start-from ` → jump to phase: `recon`, `gauntlet`, `sync`, `forge`, `grind`, `parity`
 - `--gauntlet-batch-size ` → batch size for gauntlet (default: 5)
 - `--yes` → skip all confirmation prompts in the orchestrator (pre-pipeline, forge checkpoint, and resume prompts) and in forge (per-phase confirmation)
 
@@ -50,7 +50,7 @@ You are the **orchestrator** for the full Titan Paradigm pipeline. Your job is t
    node -e "const fs=require('fs');const s=JSON.parse(fs.readFileSync('.codegraph/titan/titan-state.json','utf8'));s.phaseTimestamps=s.phaseTimestamps||{};s.phaseTimestamps['']=s.phaseTimestamps['']||{};s.phaseTimestamps[''].completedAt=new Date().toISOString();fs.writeFileSync('.codegraph/titan/titan-state.json',JSON.stringify(s,null,2));"
    ```
 
-   Replace `` with `recon`, `gauntlet`, `sync`, `forge`, or `close`. **Run the start command immediately before dispatching each phase's first sub-agent, and the completion command immediately after post-phase validation passes.** If resuming a phase (e.g., gauntlet loop iteration 2+), do NOT overwrite `startedAt` — only set it if it doesn't already exist.
+   Replace `` with `recon`, `gauntlet`, `sync`, `forge`, `parity`, or `close`. **Run the start command immediately before dispatching each phase's first sub-agent, and the completion command immediately after post-phase validation passes.** If resuming a phase (e.g., gauntlet loop iteration 2+), do NOT overwrite `startedAt` — only set it if it doesn't already exist.
 
    **Timestamp validation:** After recording `completedAt` for any phase, verify `startedAt < completedAt`:
    ```bash
@@ -639,7 +639,7 @@ Record `phaseTimestamps.forge.completedAt`.
 
 Grind runs after forge to close the adoption loop. Forge extracts helpers; grind wires them into consumers and removes dead code. Without grind, the dead symbol count inflates with every forge phase.
 
-**Skip if:** `--start-from` is `close`, or `titan-state.json → grind.completedPhases` already covers all forge phases.
+**Skip if:** `--start-from` is `parity` or `close`, or `titan-state.json → grind.completedPhases` already covers all forge phases.
 
 ### 4.5a. Pre-loop check
 
@@ -742,6 +742,63 @@ Record `phaseTimestamps.grind.completedAt`.
 
 ---
 
+## Step 4.6 — PARITY (conditional, repo-provided)
+
+Some repos ship multiple implementations of the same logic that must stay in lockstep (e.g. a dual native/WASM engine, a client and server copy of a validator). Forge and grind edit code across the tree; this step verifies those edits didn't leave one implementation behind.
+
+**titan-run is repo-agnostic** — never assume the target repo has engines, fixtures, or any parity surface. The contract: a repo opts in by shipping its own `/parity` skill at `.claude/skills/parity/SKILL.md` (wrapping whatever audit mechanism it uses internally). No skill → no parity phase.
+
+### 4.6a. Detect the repo's parity mechanism
+
+```bash
+test -f .claude/skills/parity/SKILL.md && echo "PARITY SKILL FOUND" || echo "NO PARITY SKILL"
+```
+
+- **NO PARITY SKILL** → print `"PARITY skipped — repo provides no /parity skill."` and continue to Step 5. Absence is normal for most repos; do not warn.
+- **PARITY SKILL FOUND** → continue below.
+
+**Skip also if:** `--start-from` is `close`, or the pipeline made no code changes this run (`titan-state.json → execution.commits` empty/absent AND no grind adoption commits) — unless `--start-from parity` was given explicitly, which always runs the audit.
+
+### 4.6b. Record phase start
+
+Record `phaseTimestamps.parity.startedAt`.
+
+```bash
+headBefore=$(git rev-parse HEAD)
+```
+
+### 4.6c. Run Pre-Agent Gate (G1-G4)
+
+### 4.6d. Dispatch sub-agent
+
+```
+Agent → "Run /parity. Read .claude/skills/parity/SKILL.md and follow it exactly.
+  Skip worktree check — already handled.
+  Audit every surface the skill covers. Fix any divergence introduced by
+  recent commits at the root cause, commit the fixes, and re-verify until
+  the audit is clean. If a divergence pre-dates this run (verify via
+  git log on the relevant files), follow the skill's and repo's rules for
+  pre-existing findings (typically: file an issue, don't expand scope)."
+```
+
+### 4.6e. Post-phase validation
+
+After the agent returns:
+
+```bash
+headAfter=$(git rev-parse HEAD)
+```
+
+- `git status --short` → the working tree must be clean. The sub-agent commits its fixes; uncommitted changes mean it stopped mid-fix → **stop** and report.
+- If the agent fixed divergences, run V16-style commit audit: `git log --oneline $headBefore..$headAfter` and print the parity-fix commits.
+- If the agent reports divergences introduced by THIS run that it could not fix → **stop**: "PARITY failed — this run introduced implementation drift. Fix before CLOSE or revert the offending commits." Pre-existing divergences filed as issues are not blockers; print the issue URLs.
+
+Print: `"PARITY complete: "`
+
+Record `phaseTimestamps.parity.completedAt`.
+
+---
+
 ## Step 5 — CLOSE (report + PRs)
 
 After forge completes, dispatch `/titan-close` to produce the final report with before/after metrics and split commits into focused PRs.
@@ -778,6 +835,7 @@ Record `phaseTimestamps.close.completedAt`.
 - **NDJSON corrupt lines:** Warn but continue — partial results are better than none. The corrupt lines are logged so the user knows which targets to re-audit.
 - **Merge conflict detected by pre-agent gate:** Stop immediately with the conflicting files listed.
 - **Tests fail after forge phase:** Stop immediately. Print the failing phase's commits so the user can revert.
+- **Parity audit fails on drift introduced by this run:** Stop before CLOSE. Retry with `/titan-run --start-from parity` after fixing or reverting.
 - **Validation failure (any V-check marked FAILED):** Stop with details. Warn-level V-checks are logged but don't stop the pipeline.
 
 ---
@@ -786,7 +844,7 @@ Record `phaseTimestamps.close.completedAt`.
 
 - **You are the orchestrator, not the executor.** Never run codegraph commands, edit source files, or make commits yourself. Only spawn sub-agents and read state files. Exceptions (pure validation/snapshot, no code changes): the post-forge test run (V13), NDJSON integrity checks, the V3 baseline snapshot check (`codegraph snapshot list`), and the pre-forge architectural snapshot capture (Step 3.5a) are run directly by the orchestrator.
 - **Run the Pre-Agent Gate (G1-G4) before EVERY sub-agent.** No exceptions.
-- **One sub-agent at a time.** Phases are sequential — recon before gauntlet, gauntlet before sync, sync before forge, forge before grind, grind before close.
+- **One sub-agent at a time.** Phases are sequential — recon before gauntlet, gauntlet before sync, sync before forge, forge before grind, grind before parity (when the repo provides one), parity before close.
 - **Fresh context per sub-agent.** This is the whole point — each sub-agent gets a clean context window.
 - **Read AND validate state files after every sub-agent.** Trust the on-disk state, not the sub-agent's text output — but verify the state is structurally sound.
 - **Back up state before every sub-agent.** The `.bak` file is your safety net against mid-write crashes.
diff --git a/scripts/parity-compare.mjs b/scripts/parity-compare.mjs
new file mode 100644
index 000000000..368f41477
--- /dev/null
+++ b/scripts/parity-compare.mjs
@@ -0,0 +1,298 @@
+#!/usr/bin/env node
+// Correctness-parity gate between the WASM and native engines.
+//
+// CLAUDE.md mandates that both engines produce identical results; this script
+// asserts it directly. For every resolution-benchmark fixture it builds the
+// graph with each engine into an isolated temp dir and compares the full node
+// and edge multisets — any difference is a bug in the less-accurate engine,
+// never an acceptable gap. Complements scripts/benchmark-parity-gate.mjs,
+// which gates performance parity (timings, DB size), not correctness.
+//
+// Build paths covered:
+//   wasm   — JS pipeline + JS extractors + JS edge resolution
+//   native — full Rust orchestrator (pipeline.rs)
+//   hybrid — JS pipeline + napi buildCallEdges (--hybrid; forced by building
+//            wasm-incremental then native-incremental in the same dir, which
+//            promotes to a full rebuild that skips the orchestrator)
+//
+// Usage:
+//   node scripts/parity-compare.mjs [--langs js,python] [--hybrid] [--json]
+//
+// Exit codes: 0 = parity, 1 = divergence or fixture build failure,
+//             2 = pre-flight failure (missing dist, native unavailable).
+
+import { cpSync, existsSync, mkdtempSync, readdirSync, rmSync, statSync } from 'node:fs';
+import { createRequire } from 'node:module';
+import { tmpdir } from 'node:os';
+import { dirname, join, resolve } from 'node:path';
+import { fileURLToPath, pathToFileURL } from 'node:url';
+
+const repoRoot = resolve(dirname(fileURLToPath(import.meta.url)), '..');
+const require = createRequire(import.meta.url);
+
+function usage() {
+  console.error(
+    [
+      'Usage: node scripts/parity-compare.mjs [options]',
+      '',
+      ' --langs a,b,c Only run the named fixtures (fixture dir names,',
+      ' e.g. javascript,pts-javascript,python)',
+      ' --hybrid Also build via the hybrid path (JS pipeline + native',
+      ' buildCallEdges) and compare it against the wasm baseline',
+      ' --json Machine-readable report on stdout (logs stay on stderr)',
+      ' -h, --help Show this help',
+    ].join('\n'),
+  );
+}
+
+// ── Argument parsing ──────────────────────────────────────────────────────
+
+const args = process.argv.slice(2);
+let langsFilter = null;
+let hybrid = false;
+let json = false;
+for (let i = 0; i < args.length; i++) {
+  const a = args[i];
+  if (a === '--langs') {
+    langsFilter = (args[++i] ?? '')
+      .split(',')
+      .map((s) => s.trim())
+      .filter(Boolean);
+  } else if (a.startsWith('--langs=')) {
+    langsFilter = a
+      .slice('--langs='.length)
+      .split(',')
+      .map((s) => s.trim())
+      .filter(Boolean);
+  } else if (a === '--hybrid') {
+    hybrid = true;
+  } else if (a === '--json') {
+    json = true;
+  } else if (a === '-h' || a === '--help') {
+    usage();
+    process.exit(0);
+  } else {
+    console.error(`Unknown argument: ${a}\n`);
+    usage();
+    process.exit(2);
+  }
+}
+
+// ── Pre-flight ────────────────────────────────────────────────────────────
+
+const distIndex = join(repoRoot, 'dist', 'index.js');
+if (!existsSync(distIndex)) {
+  console.error('parity-compare: dist/index.js not found — run `npm run build` first.');
+  process.exit(2);
+}
+
+const { buildGraph } = await import(pathToFileURL(distIndex).href);
+const { isNativeAvailable } = await import(
+  pathToFileURL(join(repoRoot, 'dist', 'infrastructure', 'native.js')).href
+);
+
+if (!isNativeAvailable()) {
+  console.error(
+    'parity-compare: native engine unavailable — install the platform package or build it locally:\n' +
+      ' cd crates/codegraph-core && npx napi build --platform --release\n' +
+      ' (macOS: codesign --sign - --force )\n' +
+      'then place the binary where infrastructure/native.ts can load it.',
+  );
+  process.exit(2);
+}
+
+const Database = require('better-sqlite3');
+
+const fixturesRoot = join(repoRoot, 'tests', 'benchmarks', 'resolution', 'fixtures');
+const allFixtures = readdirSync(fixturesRoot)
+  .filter((name) => statSync(join(fixturesRoot, name)).isDirectory())
+  .sort();
+
+let fixtures = allFixtures;
+if (langsFilter) {
+  if (langsFilter.length === 0) {
+    console.error('parity-compare: --langs requires at least one fixture name.');
+    process.exit(2);
+  }
+  const unknown = langsFilter.filter((l) => !allFixtures.includes(l));
+  if (unknown.length > 0) {
+    console.error(
+      `parity-compare: unknown fixture(s): ${unknown.join(', ')}\n` +
+        `Available: ${allFixtures.join(', ')}`,
+    );
+    process.exit(2);
+  }
+  fixtures = allFixtures.filter((f) => langsFilter.includes(f));
+}
+
+// ── Build + read helpers ──────────────────────────────────────────────────
+
+// dataflow/cfg/ast are out of scope for the node+edge parity surface; the
+// rest of the options stay at CLI defaults so the comparison reflects what a
+// real `codegraph build` produces.
+const BUILD_OPTS = {
+  incremental: false,
+  dataflow: false,
+  cfg: false,
+  ast: false,
+  skipRegistry: true,
+};
+
+async function buildEngine(fixtureDir, engine, label, tempDirs) {
+  const dir = mkdtempSync(join(tmpdir(), `parity-${label}-`));
+  tempDirs.push(dir); // register before await so cleanup runs even if buildGraph throws
+  cpSync(fixtureDir, dir, { recursive: true });
+  await buildGraph(dir, { ...BUILD_OPTS, engine });
+  return dir;
+}
+
+// Hybrid path: an incremental wasm build followed by an incremental native
+// build on the same dir triggers "Engine changed (wasm -> native), promoting
+// to full rebuild", which sets forceFullRebuild and skips the orchestrator —
+// the JS pipeline then drives the napi buildCallEdges resolver.
+async function buildHybrid(fixtureDir, label, tempDirs) {
+  const dir = mkdtempSync(join(tmpdir(), `parity-${label}-`));
+  tempDirs.push(dir); // register before await so cleanup runs even if buildGraph throws
+  cpSync(fixtureDir, dir, { recursive: true });
+  await buildGraph(dir, { ...BUILD_OPTS, incremental: true, engine: 'wasm' });
+  await buildGraph(dir, { ...BUILD_OPTS, incremental: true, engine: 'native' });
+  return dir;
+}
+
+function bump(map, key) {
+  map.set(key, (map.get(key) ?? 0) + 1);
+}
+
+function readMultisets(dir) {
+  const db = new Database(join(dir, '.codegraph', 'graph.db'), { readonly: true });
+  try {
+    const nodes = new Map();
+    const nodeRows = db.prepare('SELECT kind, name, file, line FROM nodes').all();
+    for (const r of nodeRows) bump(nodes, `${r.kind}|${r.name}|${r.file}|${r.line ?? ''}`);
+    nodes.set('__TOTAL_ROWS__', nodeRows.length);
+
+    const edges = new Map();
+    const edgeRows = db
+      .prepare(
+        `SELECT e.kind AS kind,
+                sn.name AS srcName, sn.kind AS srcKind, sn.file AS srcFile,
+                tn.name AS tgtName, tn.kind AS tgtKind, tn.file AS tgtFile,
+                e.confidence AS conf, e.dynamic AS dyn
+         FROM edges e
+         JOIN nodes sn ON sn.id = e.source_id
+         JOIN nodes tn ON tn.id = e.target_id`,
+      )
+      .all();
+    for (const r of edgeRows) {
+      bump(
+        edges,
+        `[${r.kind}] ${r.srcFile}:${r.srcName}(${r.srcKind}) -> ${r.tgtFile}:${r.tgtName}(${r.tgtKind}) conf=${r.conf} dyn=${r.dyn}`,
+      );
+    }
+    edges.set('__TOTAL_ROWS__', edgeRows.length);
+    return { nodes, edges };
+  } finally {
+    db.close();
+  }
+}
+
+function diffMultisets(base, other) {
+  const diffs = [];
+  const keys = new Set([...base.keys(), ...other.keys()]);
+  keys.delete('__TOTAL_ROWS__');
+  for (const key of keys) {
+    const a = base.get(key) ?? 0;
+    const b = other.get(key) ?? 0;
+    if (a !== b) diffs.push({ key, base: a, other: b });
+  }
+  diffs.sort((x, y) => (x.key < y.key ? -1 : x.key > y.key ? 1 : 0));
+  return diffs;
+}
+
+// ── Main loop ─────────────────────────────────────────────────────────────
+
+const report = { fixtures: [], ok: true };
+
+for (const fixture of fixtures) {
+  const fixtureDir = join(fixturesRoot, fixture);
+  const entry = { name: fixture, comparisons: [], error: null };
+  report.fixtures.push(entry);
+  const tempDirs = [];
+
+  try {
+    const wasmDir = await buildEngine(fixtureDir, 'wasm', `${fixture}-wasm`, tempDirs);
+    const base = readMultisets(wasmDir);
+
+    const nativeDir = await buildEngine(fixtureDir, 'native', `${fixture}-native`, tempDirs);
+    const variants = [['native', nativeDir]];
+    if (hybrid) {
+      const hybridDir = await buildHybrid(fixtureDir, `${fixture}-hybrid`, tempDirs);
+      variants.push(['hybrid', hybridDir]);
+    }
+
+    for (const [variantName, dir] of variants) {
+      const other = readMultisets(dir);
+      const nodeDiffs = diffMultisets(base.nodes, other.nodes);
+      const edgeDiffs = diffMultisets(base.edges, other.edges);
+      const ok = nodeDiffs.length === 0 && edgeDiffs.length === 0;
+      if (!ok) report.ok = false;
+      entry.comparisons.push({
+        baseline: 'wasm',
+        variant: variantName,
+        ok,
+        nodeCount: base.nodes.get('__TOTAL_ROWS__'),
+        edgeCount: base.edges.get('__TOTAL_ROWS__'),
+        nodeDiffs,
+        edgeDiffs,
+      });
+
+      if (!json) {
+        if (ok) {
+          console.log(
+            `=== ${fixture}: wasm vs ${variantName} OK ` +
+              `(${base.nodes.get('__TOTAL_ROWS__')} nodes, ${base.edges.get('__TOTAL_ROWS__')} edges)`,
+          );
+        } else {
+          console.log(
+            `=== ${fixture}: wasm vs ${variantName} DIVERGED ` +
+              `(${nodeDiffs.length} node diffs, ${edgeDiffs.length} edge diffs)`,
+          );
+          for (const d of nodeDiffs) {
+            console.log(` [node] ${d.key} wasm=${d.base} ${variantName}=${d.other}`);
+          }
+          for (const d of edgeDiffs) {
+            console.log(` [edge] ${d.key} wasm=${d.base} ${variantName}=${d.other}`);
+          }
+        }
+      }
+    }
+  } catch (err) {
+    entry.error = err instanceof Error ? err.message : String(err);
+    report.ok = false;
+    if (!json) console.log(`=== ${fixture}: BUILD FAILED — ${entry.error}`);
+  } finally {
+    for (const dir of tempDirs) {
+      try {
+        rmSync(dir, { recursive: true, force: true });
+      } catch {
+        // best-effort cleanup of temp dirs
+      }
+    }
+  }
+}
+
+if (json) {
+  console.log(JSON.stringify(report, null, 2));
+} else {
+  const failed = report.fixtures.filter(
+    (f) => f.error || f.comparisons.some((c) => !c.ok),
+  );
+  console.log(
+    report.ok
+      ? `\nPARITY OK — ${report.fixtures.length} fixture(s), all engines identical`
+      : `\nPARITY FAILED — ${failed.length}/${report.fixtures.length} fixture(s) diverged: ${failed.map((f) => f.name).join(', ')}`,
+  );
+}
+
+// The WASM worker pool keeps the event loop alive; exit explicitly.
+process.exit(report.ok ? 0 : 1);
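
Reviewer note on the `--json` report printed by `scripts/parity-compare.mjs`: it can be post-processed rather than scraped from the text log. The following is a minimal sketch and is not part of this PR. It assumes only the report shape the script builds (`fixtures[].name`, `fixtures[].error`, and `fixtures[].comparisons[]` with `baseline`, `variant`, `ok`, `nodeDiffs`, `edgeDiffs`, where each diff carries `key`, `base`, `other`). The helper name `summarizeParityReport` and the sample data are hypothetical, invented purely for illustration.

```javascript
// Sketch: flatten a parity-compare --json report into one line per finding.
// summarizeParityReport is a hypothetical helper, not something the repo ships.
function summarizeParityReport(report) {
  const lines = [];
  for (const fixture of report.fixtures) {
    // A fixture whose build or read threw carries the message in `error`.
    if (fixture.error) {
      lines.push(`${fixture.name} BUILD FAILED: ${fixture.error}`);
    }
    for (const cmp of fixture.comparisons) {
      if (cmp.ok) continue; // identical multisets, nothing to list
      for (const d of cmp.nodeDiffs) {
        lines.push(
          `${fixture.name} [node] ${d.key} ${cmp.baseline}=${d.base} ${cmp.variant}=${d.other}`,
        );
      }
      for (const d of cmp.edgeDiffs) {
        lines.push(
          `${fixture.name} [edge] ${d.key} ${cmp.baseline}=${d.base} ${cmp.variant}=${d.other}`,
        );
      }
    }
  }
  return lines;
}

// Invented sample data in the assumed report shape (not real audit output).
const sampleReport = {
  ok: false,
  fixtures: [
    {
      name: 'javascript',
      error: null,
      comparisons: [
        {
          baseline: 'wasm',
          variant: 'native',
          ok: false,
          nodeCount: 10,
          edgeCount: 4,
          nodeDiffs: [],
          edgeDiffs: [
            {
              key: '[calls] a.js:foo(function) -> b.js:bar(function) conf=1 dyn=0',
              base: 1,
              other: 0,
            },
          ],
        },
      ],
    },
    { name: 'python', comparisons: [], error: 'build exploded' },
  ],
};

const summary = summarizeParityReport(sampleReport);
console.log(summary.join('\n'));
```

To feed it a real report, redirecting stdout to a file (for example `node scripts/parity-compare.mjs --json > report.json`) and passing the `JSON.parse` result to the helper should be enough, since the script's usage text states that logs stay on stderr in `--json` mode.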