Closes #20. Users with
multiple Claude Code accounts (personal vs work) set CLAUDE_CONFIG_DIR to
keep their settings.json, plugins/, and projects/ separate. The plugin
previously hardcoded ~/.claude/ across ~15 call sites, so for any account
running with the override:
- hook registrations were written to a file Claude Code did not read,
- adoption files / MEMORY.md sentinels landed in the wrong project dir,
- cache cleanup and `installed_plugins.json` writes pointed at the default install, not the configured one.
Net effect: the plugin was effectively broken under multi-account isolation.
- New shared helper `claude-plugin/scripts/claude-config.js` exposes `claudeHome()` (returns `process.env.CLAUDE_CONFIG_DIR || ~/.claude`, re-read on every call).
- `lifecycle.js`, `auto-update.js`, `doctor.js`, `session-init.js`, `adopt.js` now route all `~/.claude/...` paths through the helper.
- `adopt.js: memoryDir()` keeps its `(cwd, home)` signature for back-compat; `CLAUDE_CONFIG_DIR` simply overrides the `home + .claude` join.
- `adopt.js: isPluginModeInstall()` matches both the legacy `~/.claude/plugins/` marker and `CLAUDE_CONFIG_DIR/plugins/`.

- New `claude-config.test.js`: env-var resolution + empty-string fallback.
- `adopt.test.js`: `memoryDir` + `isPluginModeInstall` honor the override.
- `lifecycle.e2e.test.js`: full install subprocess writes into `CLAUDE_CONFIG_DIR` and never touches default `~/.claude/`.

No default-behavior change: when CLAUDE_CONFIG_DIR is unset, every path
resolves exactly as before.
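
A minimal sketch of the resolution rule described above, assuming the helper simply prefers the env var and falls back to `~/.claude`. The `env` parameter and the `settingsPath` call site are hypothetical, added only so the behavior can be exercised without mutating `process.env`; the shipped `claudeHome()` may differ in detail.

```javascript
const os = require('node:os');
const path = require('node:path');

// Resolve the Claude Code config root. The environment is read on every
// call, so changing CLAUDE_CONFIG_DIR mid-process takes effect immediately.
function claudeHome(env = process.env) {
  // `||` rather than `??`: an empty string also falls back to the default.
  return env.CLAUDE_CONFIG_DIR || path.join(os.homedir(), '.claude');
}

// Hypothetical call site: every ~/.claude/... path goes through the helper.
function settingsPath(env = process.env) {
  return path.join(claudeHome(env), 'settings.json');
}
```

Using `||` instead of `??` is what gives the empty-string fallback that `claude-config.test.js` is described as covering.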
Four rounds of end-to-end dogfooding (fresh-project workflows, MCP stdio fuzz, IO edge cases) surfaced 16 places where the tool silently swallowed errors, gave misleading guidance, or returned empty results indistinguishable from a successful no-op. All four commits are pure UX/correctness — no API removals, no behavior changes for happy paths.
- `incremental-index` now reports file deletions (`src/indexer/pipeline/`): `IndexResult` gained `files_deleted`; the summary line reads "N files updated, M files removed, K nodes created" when M > 0. Previously a delete-only incremental said "0 files updated, 0 nodes created", which looked like a no-op even when nodes/edges were cascade-deleted.
- `deps <file>` distinguishes missing-file from no-imports (`src/cli.rs`): pre-check `project_root.join(file_path).is_file()` and report "File not found" before the barrel-scan fallback fires.
- `deps <file>` surfaces unresolved imports (`src/cli.rs`): when all edges point to `<external>` or cross-language targets and get filtered out, render "(no resolved deps; N unresolved outgoing/incoming)" instead of just printing the bare filename. JSON gets matching `unresolved_*` fields.
- `callgraph` suppresses no-op fuzzy resolve (`src/cli.rs`): no longer prints `[code-graph] Resolved 'X' → 'X'` when the fuzzy match returned the input verbatim.
- `incremental-index` in non-git dir prints why it skipped (`src/main.rs`): the silent-bail guard for multi-repo workspace parents now emits "Skipping index: no .git anchor or existing .code-graph/ at …" in non-quiet mode. `--quiet` (hook path) remains silent, so the PostToolUse contract is preserved.
- `map` on empty project replaces dangling header (`src/cli.rs`): "(empty project — no indexed source files)" instead of a lone "Modules:".

- `health-check --json` emits valid JSON even with no index (`src/cli.rs`): returns `{healthy:false, reason:"no_index", issue:...}` + exit 0 instead of bailing with stderr "No index found" + exit 1. Non-JSON mode keeps the stderr+exit-1 contract for interactive callers.
- `doctor` routes "No index found" to the index-empty fix path (`claude-plugin/scripts/doctor.js`): old behavior labeled it `binary-broken` (no fix handler) and reported "Fixing… 0/1 addressed". Now detects the `reason:"no_index"` JSON flag and runs `incremental-index` automatically.

- `get_call_graph` / `find_references` / `get_ast_node` / `find_similar_code` reject empty/whitespace `symbol_name`: previously `symbol_name=""` fell through to fuzzy resolve and silently matched a random `Unique()` candidate (seen returning `function:"x"` from a DB with one fn called `x`).
- `get_call_graph` / `find_references` treat empty `file_path` as absent: `Some("")` used to filter to a nonexistent path, producing "Symbol 'x' not found in file ''" or silent empty-edge results.
- `trace_http_chain(route_path="")` rejects upfront: empty pattern used to substring-match every route, returning "no routes found" indistinguishable from a typo.
- `dependency_graph(file_path="")` rejects upfront: empty path used to trigger the "looks like a directory" hint at "", giving wrong guidance.

- `module_overview` rejects absolute paths / `../` traversal / Windows drive letters (`src/mcp/server/tools/overview.rs`): the index stores relative paths from project root, so `/etc`, `../foo`, `C:\Windows` will never match. Old behavior silently returned `0 files`; now errors with an actionable message.
- `snapshot create --out` pre-flights the target path (`src/cli.rs`): rejects dir-as-out (`/tmp/`) and missing-parent (`/nonexistent/snap.db`) before SQLite `VACUUM INTO` leaks its raw "unable to open database file" error chain.
- `scan_directory` tolerates per-entry walk errors (`src/indexer/merkle.rs`): a single unreadable subdir (`chmod 000` build artifact, restricted mount, broken symlink target) used to abort the whole rebuild with `Permission denied (os error 13)`. Now skip-and-warn via tracing, so readable files still get indexed.

- 526 tests pass, `cargo +1.95.0 clippy --all-targets -- -D warnings` clean on both default and `--no-default-features`.
- No public API removals. New `IndexResult.files_deleted` field is purely additive. MCP tool schemas unchanged.

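
The `doctor` routing change can be sketched roughly as follows. This is an illustration, not the code in `doctor.js`: the function name `classifyHealth` and the `'ok'` / `'unhealthy'` labels are hypothetical, and treating non-JSON output as `binary-broken` is an assumption about the old fallback path.

```javascript
// Map the stdout of `health-check --json` to a fix path.
// 'binary-broken' and the index-empty path are named in the notes above;
// 'ok' and 'unhealthy' are placeholder labels for this sketch.
function classifyHealth(stdout) {
  let report;
  try {
    report = JSON.parse(stdout);
  } catch {
    // Output was not JSON at all (assumed to mean the binary misbehaved).
    return 'binary-broken';
  }
  if (report && report.healthy) return 'ok';
  // The machine-readable reason replaces matching on the stderr text.
  if (report && report.reason === 'no_index') return 'index-empty';
  return 'unhealthy';
}
```

Keying on `reason` rather than on the "No index found" message is what lets the fix handler run `incremental-index` without string matching on stderr.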
Five rounds of end-to-end dogfooding surfaced 12 silent-failure / mis-attribution
bugs across the parser → resolver → MCP/CLI surface. Net effect on the project's
own index: dead-code false positives dropped 21 → 6 (all remaining 6 are
documented design limitations: receiver obj.method() calls, type-as-field
references, cross-file constant access), edges restored 3030 → 3266+ (+8%
real call relations recovered). The user-visible behavior shift is the
TypeScript return_type shape (no longer leaks ": " prefix) — minor bump
flags that.
- TypeScript `return_type` strips leading `:` (`src/parser/treesitter.rs`): `extract_signature_info` read the tree-sitter `return_type` field verbatim, which on TS/JS is a `type_annotation` node whose text starts with `:`. Stored values were `": string"` not `"string"`; signatures rendered `(name: string) -> : string`. Now trimmed at extraction; Python / Rust / Go produce clean values unchanged (no-op when first char isn't `:`).
- Rust generic trait impl emits no method edges (`src/parser/relations/rust.rs`, `src/parser/treesitter.rs`): `impl<'a, W: Write> Write for CapWriter<'a, W>` stored `source_name = "CapWriter<'a, W>"` from `type_node` verbatim; Phase 2 source resolution exact-name-matched against `"CapWriter"` and dropped every method edge. Every generic trait impl's methods looked dead. Now strips generic params at both extraction sites (relations + treesitter qualified_name).
- Method-level implements edges fan out within a file (`src/parser/relations/mod.rs`, `src/indexer/pipeline/index_files.rs`): N structs each `impl Trait for StructN` in one file produced N×N×methods edge combinations because the resolver matched bare method names against every same-name node in the file. Parser now stamps `{"q":"impl_method","v":"<Type>"}`; resolver filters method candidates by `qualified_name LIKE "<Type>.%"` via the existing `self_filter_candidates`.
- Same-file targets dropped under Path qualifier (`src/indexer/pipeline/index_files.rs`): the `CalleeMeta::Path` branch excluded `local_ids` before applying the path filter, contradicting the spec's "same-file matches take precedence". `Foo::helper()` in the same file as `impl Foo { fn helper }` produced zero call edges. Now includes same-file candidates in the path-filtered pool.
- `path_filter_candidates` misses single-file Rust mods (`src/indexer/pipeline/resolve.rs`): only matched `/<seg>/` or `<seg>/` directory boundaries, so `crate::domain::foo()` resolving into `src/domain.rs` (single-file mod, no `domain/` directory) dropped. Now also accepts `path.ends_with("/<last_seg>.rs")`. This single fix eliminated 14 of the 20 dead-code false positives (`normalize_type_filter`, all `migrate_v*_to_v*`, `create_tables_sql`, etc.).
- `ast_search` / `semantic_code_search` invalid `type` filter silently empty (`src/mcp/server/tools/ast_search.rs`, `src/mcp/server/tools/search.rs`, `src/cli.rs`): `normalize_type_filter("INVALID")` returns empty `Vec`; `.any()` on empty returns false → every node filtered out → "No results" with exit 0. Now bails up-front with the valid-values list.
- `find_references` invalid `relation` silently falls back to `all` (`src/mcp/server/tools/refs.rs`): `match relation { "calls"=>..., _=>None }` treated `"call"` (typo) identical to `"all"`. Now explicit `"all" => None` + `Err` on anything else.
- `get_call_graph` `symbol_name` + `route_path` silently uses `route_path` (`src/mcp/server/tools/callgraph.rs`): schema marks them mutually exclusive but impl preferred `route_path` silently. Now errors with the conflict.
- `module_overview path=""` returns the whole project (`src/mcp/server/tools/overview.rs`, `src/cli.rs`): empty string normalized to the same "match all" prefix as `"."`. Common variable-substitution bug (`process.env.X || ""` → dumps entire repo). Now errors; `"."` still works as the documented match-all alias.
- FTS5 reserved-word queries leak raw syntax error (`src/storage/queries/search.rs`): `semantic_code_search query="NOT"` returned `Error: fts5: syntax error near "NOT"`. Each sanitized token is now wrapped in `"…"` (FTS5 phrase syntax), which is equivalent for normal tokens and defuses the NOT/AND/OR/NEAR keywords.
- `snapshot inspect` silently succeeds on truncated SQLite (`src/snapshot/mod.rs`): a 100-byte file starting with the `"SQLite format 3\0"` magic passed the header check; `Database::open` initialized empty schema; every meta lookup returned None → defaults. Inspect returned a fake "empty valid snapshot" with zeroed fields. Now bails when all meta is missing.
- Concurrent `incremental-index` shows cryptic `Error code 5: database is locked` (`src/cli.rs`): two CLI processes racing on `.code-graph/index.db` got raw rusqlite SQLITE_BUSY. New `wrap_busy()` translates to "Another `code-graph-mcp` process is writing... Wait for it to finish, then retry." while keeping the original error for debug.

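
The FTS5 keyword fix lives in Rust (`src/storage/queries/search.rs`); the idea is sketched here in JavaScript only to show the quoting rule. The function name `toFtsMatch` and the whitespace tokenization are assumptions; the real sanitizer may split tokens differently.

```javascript
// Wrap each token in double quotes so FTS5 parses it as a string/phrase,
// never as a NOT / AND / OR / NEAR operator. FTS5 escapes an embedded
// double quote by doubling it, SQL-style.
function toFtsMatch(rawQuery) {
  return rawQuery
    .split(/\s+/)
    .filter(Boolean) // drop empty pieces from leading/trailing whitespace
    .map((token) => `"${token.replace(/"/g, '""')}"`)
    .join(' ');
}
```

The resulting string would be bound as a parameter to the `MATCH` expression, so a query of `NOT` becomes the phrase `"NOT"` instead of a dangling operator.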
- `tests/cli_e2e.rs::test_cli_ast_search_invalid_type` / `test_cli_search_invalid_node_type` / `test_cli_overview_empty_path_errors`.
- `tests/integration.rs::test_module_overview_empty_path_errors` / `test_find_references_invalid_relation_errors` / `test_get_call_graph_symbol_and_route_mutually_exclusive` / `test_fts5_keyword_query_does_not_leak_syntax_error`.
- `tests/integration_call_qualifier.rs::path_qualifier_keeps_same_file_target` / `path_qualifier_resolves_single_file_rust_mod` / `same_file_generic_impl_method_edges_dont_fan_out`.
- `src/snapshot/tests.rs::inspect_rejects_truncated_sqlite_header`.
- `src/parser/relations/tests.rs::test_extract_rust_impl_trait_generic_type_strips_params`.

353+67+50+9+19 tests pass; cargo +1.95.0 clippy --no-default-features and
--all-targets both clean under -D warnings.
Six dead-code entries remain after this pass and are documented design gaps, not bugs:
- Receiver method calls (`obj.method()` where receiver type isn't statically inferable): `validate`, `file_exists`, `db()`.
- Type-as-field references (`pub foo: SomeStruct`): `SnapshotConfig`.
- Cross-file constant access via Path qualifier (extractor doesn't emit edges for non-call identifier references): `PROD_SOURCE_FILTER_AND`, `TEST_SOURCE_FILTER_OR`.

Fixing these requires either a Rust type inferencer or extending edge extraction to non-call identifier references — out of scope for this release.
Data-driven release based on a 7-day usage audit (2026-05-12 → 2026-05-14,
141 main sessions): 1972 raw bash grep vs 47 code-graph calls (the 47
all came from one dogfooding session — zero organic cg invocations across
20 sampled real sessions). Recall bench has stayed at P@1=100% throughout
2026, so the gap isn't "which tool" but "is the model reaching for a tool
at all". This release adds a measurement axis for that and widens three
hook-layer interventions that were missing real-world surface.
- `BenchMode::TriggerRate` (`tests/routing_bench.rs`): new bench mode with `tool_choice: auto` (recall benches use `required` / `any`). Surfaces the "model returned no tool / picked Bash" failure mode that forced-tool benches hide. 12-entry hard `TRIGGER_ORACLE` (10 `__CG__` + 2 `__DECOY__`) covering pure symptom prompts ("今天的报告数据不准", i.e. "today's report data is inaccurate"), misleading-grep framing ("用 grep 找一下…", i.e. "use grep to find…"), answer-flavored guesses ("应该是 cleanup 那段没等完吧", i.e. "it's probably the cleanup part not finishing its wait"), and generic feature refs. New `bash_decoy()` (TriggerRate-only; kept out of ContextRich so v0.17.x baselines stay comparable), `compute_trigger_metrics`, `matches_trigger_class`, `DECOY_NAMES`. Baseline on Sonnet 4.5: 60.0% trigger (6/10) · 30.0% no-tool (3/10) · 100% decoy boundary · 0% leak. Run with `ROUTING_BENCH_MODE=trigger-rate`.
- PreToolUse:Read fanout detector (`claude-plugin/scripts/pre-read-guide.js`): new hook fires on the 5th Read into the same source dir within a 30-min window. 7d audit found 16 sessions with 5+ Reads into one dir (top: 13 reads into `backend/app/services/`). Claude burns ~500-2000 tokens per Read when one `module_overview path=X/` returns symbols + caller counts in ~600. Per-cwd state in `/tmp/.code-graph-readfan-<hash>.json`, 5-min per-dir cooldown, 30-min state TTL. One-line hint suggests `code-graph-mcp overview <dir>/` or MCP `module_overview path=<dir>`. New `tool == "Read"` matcher in `claude-plugin/hooks/hooks.json` (3s timeout).
- UserPromptSubmit symptom-hint fallback (`claude-plugin/scripts/user-prompt-context.js`): 24-entry `SYMPTOM_PATTERNS` (`/bug/i` · `/crash/i` · `/not work/i` · `/why does/i` · `挂了` · `失败了` · `卡死` · `不准` · `缺失` · `又失败` · `为什么` · `哪里[\s\S]{0,5}(?:错|有问题|不对)` · …). When the 4 existing channels (intent / qualified symbol / file path / any symbol) all return no actionable query AND the message has symptom phrasing AND a 10-min cooldown is cold, `determineQueryType` returns `{ type: 'symptom-hint' }` and `runMain` emits ONE LINE of prose (NO CLI execution; Phase A's lesson: heavy structured injection backfires on borderline prompts). Hint format: `[code-graph:hint] indexed repo — for vague-symptom prompts, try 'semantic_code_search ""' or 'module_overview' to surface candidate code structurally. Skip if not searching code.` Actionable paths (impact / overview / callgraph / search) still take precedence; signature gains a 5th `message = ''` param keeping all legacy callers backward-compat.
- `pre-grep-guide.js` SRC_PATH expansion (`claude-plugin/scripts/pre-grep-guide.js`): added 20 backend / DDD / web convention prefixes: `backend frontend services models domain controllers views handlers middleware routes repositories entities migrations tasks jobs workers features modules api web`. Root cause: the v0.21+ regex required prefix terms in `(src|tests|lib|…|app|server|client)/` to be preceded by whitespace / quote / start-of-string, so the dominant daagu-style miss `backend/app/services/…` never fired (`app/` sat after `backend/`). Audit shows 5 of the worst-offender sessions used exactly this layout. Generic terms (`core` / `utils` / `shared` / `common` / `types`) deliberately omitted: too many non-code contexts. 10 new positive regression tests pass; 3 precision guards (`web.config` / `node_modules/` / `docs/*.md`) stay false.
- `doc_lazy_continuation` lint on `tests/routing_bench.rs:704`: `Grep \n /// + Read decoys` wrapped such that the continuation line began with `+`, which clippy 1.95 reads as a markdown bullet. Reworded to `the Grep/Read decoys alone don't model…`. Both clippy passes (`--no-default-features` and `--all-targets`) clean.

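
The Read fanout rule (5th Read into one dir within 30 minutes, then a 5-minute per-dir cooldown) can be sketched as a pure function over in-memory state. This is an illustration under assumptions: `recordRead` and the state shape are hypothetical, the source-extension whitelist and the `/tmp` state file are omitted, and re-firing after the cooldown while the window is still full is my reading of the notes, not confirmed behavior.

```javascript
const path = require('node:path');

const THRESHOLD = 5;                // fire on the 5th Read into one dir
const WINDOW_MS = 30 * 60 * 1000;   // only count Reads from the last 30 min
const COOLDOWN_MS = 5 * 60 * 1000;  // at most one hint per dir per 5 min

// state: { [dir]: { reads: number[], lastHintAt: number | null } }
// Returns the directory to hint about, or null when the hook stays silent.
function recordRead(state, filePath, now) {
  const dir = path.dirname(filePath);
  const entry = state[dir] || (state[dir] = { reads: [], lastHintAt: null });

  // Drop timestamps that have aged out of the window, then count this Read.
  entry.reads = entry.reads.filter((t) => now - t < WINDOW_MS);
  entry.reads.push(now);

  const cooledDown =
    entry.lastHintAt === null || now - entry.lastHintAt >= COOLDOWN_MS;
  if (entry.reads.length >= THRESHOLD && cooledDown) {
    entry.lastHintAt = now;
    return dir;
  }
  return null;
}
```

Keeping the state JSON-serializable (timestamps and `null` only) is what would make a load/save round-trip through a per-cwd state file straightforward.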
- Phase A: "DO NOT use when X (Grep/Read/Bash)" steering in tool descriptions. Hypothesis (mem #8234 / `feedback_negative_steering_backfire.md`): mem-style negative steering should lift trigger rate. Bench falsified the hypothesis before commit: TriggerRate baseline 60% → 40% (3-run unanimous), two brand-new misses (`今天的报告数据不准 → None`, `test 又挂了 → Bash`; the second prompt reads "the test failed again"), and the target miss (`用 grep 找一下… → Grep`) wasn't fixed. Root cause: clauses like "INSTEAD OF Grep" + "DO NOT for: literals (Grep)" are self-contradictory; when the user hints "grep", the second clause licenses the wrong tool. New rule in `feedback_negative_steering_backfire.md`: negative steering is safe only between cg-vs-cg, never pointing at native decoys. Reverted via `git restore`; no commit hit `main`.

- `tests/routing_bench.rs::scoring_tests::compute_trigger_metrics_*` (4 tests covering all-correct / mixed / empty-oracle / missing-pick), plus `decoy_tests::bash_decoy_has_required_fields_and_anchor`, `mode_tests::detect_mode_trigger_rate`, `trigger_oracle_well_formed`.
- `claude-plugin/scripts/pre-read-guide.test.js`: 34 tests covering source extension whitelist, dir extraction, cooldown / threshold logic, state load+save round-trip, TTL pruning, malformed-JSON tolerance, hint shape, silenced env, integrated 5-read flow.
- `claude-plugin/scripts/pre-grep-guide.test.js`: 10 new positive prefix regressions (`backend/app/services/`, `services/scheduler/`, `models/`, `controllers/`, `domain/`, `handlers/`, `migrations/`, `features/`, `api/`, `frontend/`) + 3 precision guards. 52/52 pass.
- `claude-plugin/scripts/user-prompt-context.test.js`: 18 new tests (`hasSymptom` positive + precision, `determineQueryType` symptom-fallback with cooldown + precedence + backward-compat, integration via `analyze`). 99/99 pass.

The strategy: don't try to argue Claude into picking better tools at the
description level — bench has already saturated there. Instead measure
"trigger rate" as a separate axis (Phase 0/F), widen the existing Bash and
Read hooks to fire on real-world layouts (Phases C/D), and add a low-noise
symptom-only fallback to the UPS hook for the bug-shaped prompts that
slipped through the existing 4 channels (Phase E). Phase A demonstrated
that re-trying the description angle backfires. Phase B (MCP instructions
symptom-mapping line) deliberately deferred until the C/D/E real-world
fire-rate data justifies it.
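
The Phase E precedence (actionable query first, symptom hint only as a cooled-down fallback) might look like the sketch below. The pattern list is a small subset of the ones quoted in these notes, and `pickQueryType` with its parameters is hypothetical; the shipped `determineQueryType` has a different signature.

```javascript
// Illustrative subset of SYMPTOM_PATTERNS; the real list has 24 entries.
const SYMPTOM_PATTERNS = [/bug/i, /crash/i, /not work/i, /why does/i, /挂了/, /不准/];
const HINT_COOLDOWN_MS = 10 * 60 * 1000;

function hasSymptom(message) {
  return SYMPTOM_PATTERNS.some((re) => re.test(message));
}

// actionableQuery: whatever the four existing channels produced, or null.
// lastHintAt: timestamp of the previous symptom hint, or null if none yet.
function pickQueryType(actionableQuery, message, lastHintAt, now) {
  if (actionableQuery) return actionableQuery; // existing channels win
  const cold = lastHintAt === null || now - lastHintAt >= HINT_COOLDOWN_MS;
  if (cold && hasSymptom(message)) return { type: 'symptom-hint' };
  return null; // stay silent: no query, no hint
}
```

The fallback only ever returns a marker object; emitting one line of prose rather than running a CLI command is left to the caller, matching the low-noise intent described above.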
Surface: plugin-shipped behavior — users with adopted CG indexes will see
new [code-graph] hints on bash grep -rn backend/services/..., 5+ Reads
into the same source dir, and bug-flavored prompts that previously got
zero UPS injection. Escape hatch unchanged: CODE_GRAPH_QUIET_HOOKS=1
(env in ~/.claude/settings.json).
Pre-push parity: `cargo +1.95.0 clippy --no-default-features --all-targets -- -D warnings` clean on both targets; `node --test claude-plugin/scripts/*.test.js`: 251 tests pass (52 pre-grep + 34 pre-read + 99 user-prompt + 66 other); `cargo test` full suite passes.
- Python call-edge extraction (P0): `src/parser/relations/mod.rs` only matched the `call_expression` arm plus a Ruby-guarded `call` arm; tree-sitter-python emits `call` nodes that fell through to a no-op, so every `.py` file produced 0 call edges despite README and `CLAUDE.md` documenting Python as Full tier. Knock-on effects: `module_overview` showed `caller_count=0` for every Python symbol, `find_dead_code` over-reported orphans, and `impact_analysis` / `get_call_graph` / `find_references` returned wrong results for any Python query. New `"call" if config.name == "python"` arm, plus `helpers::extract_callee_name` now treats Python `attribute` (field name `attribute`, not `property` / `field`) the same as JS `member_expression`. Reindexing this repo: 0 → 2969 total edges; `scripts/analyze-search-queries.py` now produces non-zero caller counts.
- `cmd_overview` JSON empty contract (`src/cli.rs`): `overview . --json` returned `[]` on stdout but exited 1 with `Error: [code-graph] No symbols found under: .` smeared on stderr by `anyhow::bail!`, breaking log consumers piping stdout to `jq`. Now `.` normalizes to project root (mirroring MCP `tool_module_overview`); JSON-mode empty path emits a clean `eprintln!` + `exit(1)` so stderr stays free of the anyhow `Error:` prefix.
- `find_dead_code` truncation guard (`src/storage/queries/dead_code.rs`): when `CODE_GRAPH_MAX_CODE_LEN` caps a long function's stored body, references in the truncated tail were invisible to the SQL `instr` fallback, falsely flagging callback targets as dead. New OR clause detects truncated hosts via two co-signals (trailing `...` sentinel, and declared-span > stored-newline-count by 5+ lines) and gives same-file names benefit of the doubt. Single signal alone is rejected (Python `def stub(): ...` has the sentinel without a span gap; compact test fixtures have a gap without the sentinel). Default 4 KB limit means real-world activation is rare.
- `snapshot::{mod,install}.rs`: three best-effort `git` invocations (`rev-parse HEAD` / `remote get-url origin` / `cat-file -e`) now redirect stderr to `Stdio::null()`. Previously `fatal: not a git repository` leaked into `cargo test` output and `snapshot create` runs on non-git roots.

- `LanguageConfig.call_node_kind` field: defined but never read. The call dispatcher in `relations/mod.rs` uses hardcoded literal match arms because per-language call shapes diverge too far for one string to drive them (Ruby's `call` doubles as `require`; PHP splits into three kinds; C# uses `invocation_expression`; Bash uses `command`; Python's `call` carries an `attribute` field for method names). Keeping the field misled contributors into thinking new languages could be added by editing config alone, which was the Python regression's root cause. Field + assertions removed; the dispatcher entry now carries a comment enumerating every language's call-node kind so the trap is visible at the dispatch site.

- `src/parser/relations/tests.rs::test_extract_python_bare_call`, `::test_extract_python_method_call`: Python `call` arm + `attribute` callee.
- `src/storage/queries/dead_code.rs::tests::test_find_dead_code_skips_when_caller_content_truncated`: truncation guard.
- `tests/cli_e2e.rs::test_cli_overview_dot_means_project_root`, `::test_cli_overview_json_empty_no_anyhow_prefix`: overview path normalization + JSON empty-stderr cleanliness.

Autonomous iteration loop (4 rounds): 1 P0 + 4 P1 + 1 P2 surfaced, all
fixed. Each change stays inside internal surface — no Δ-contract on MCP
tool schemas, CLI flags published to npm users, or SQLite schema.
Pre-push parity: cargo +1.95.0 clippy --no-default-features --all-targets -- -D warnings
clean on both targets; 467 tests pass.
- `claude-plugin/scripts/user-prompt-context.js`: `computeQuietHooks` default flipped back to noisy (push ON). The v0.21 opt-in flip cited routing-bench P@1=100% as evidence the agent already picks tools correctly without push, but that bench measures triage accuracy once the agent has decided to query a tool, not the prior question of whether the agent reaches for a tool at all. The real counter-evidence is in `pre-grep-guide.js`'s 15-day baseline: 429 raw `grep` vs 191 functional CLI calls on the same indexed source tree (~13× pre-training bias toward grep). Push is the corrective. Per-type cooldowns (impact 30s / overview 5min / callgraph 60s / search 60s) cap frequency; the 8-char message floor + `shouldSkip` filter keep confirmation chatter silent. Escape hatch: `CODE_GRAPH_QUIET_HOOKS=1`.
- `claude-plugin/scripts/pre-edit-guide.js`: caller threshold lowered from `directCallers < 2` to `< 1`. Editing any function with one or more callers now surfaces the one-line impact summary; the per-symbol 2-minute cooldown is unchanged so the noise floor stays the same.
- SessionStart `project_map` injection (`session-init.js`) stays default OFF: that hook is a static dump duplicated by `MEMORY.md`'s decision table; this hook is a reactive trigger reminder. The two defaults are intentionally asymmetric.

- `src/mcp/server/mod.rs` MCP `instructions` field gains one line of explicit scenario triggers: `"who calls X?" → get_call_graph; "impact of X?" or before editing a fn → get_ast_node include_impact=true; concept search without an exact symbol → semantic_code_search`. Compile-time `assert!(NOISY.len() <= 1500)` budget guard unchanged (now 772 / 1500 bytes).
- Project `CLAUDE.md` "Code Graph Integration" section replaced with a 5-row trigger table (who calls / impact / module overview / concept search / HTTP route). `CLAUDE.md` is loaded every session, higher priority than the invited-memory path in `MEMORY.md`.
- `claude-plugin/templates/plugin_code_graph_mcp.md` clarifies the asymmetric hook defaults and lists `CODE_GRAPH_QUIET_HOOKS=1` as the context-push escape hatch alongside the existing `VERBOSE_HOOKS` / `QUIET_HOOKS=0` flags.

- mem #8234 documents that hook content has bounded leverage when the current bench corpus is saturated (Sonnet 4.5 hits P@1=100%); bench is the right oracle for tool-description boundary disambiguation, not for server-prelude / hook-content tuning. This release therefore lands without a fresh routing-bench cycle — the changes are all hook-content surface.
- `cargo check`: clean (compile-time `assert!(len <= 1500)` on `NOISY` instructions string holds; final length ~772 bytes).
- `node --test claude-plugin/scripts/user-prompt-context.test.js`: 77/77 pass. Six `computeQuietHooks` priority-chain cases rewritten for the default-noisy invariant; one e2e check kept on the `=1` escape hatch.
- No change to `routing_bench.rs` corpus; intentionally skipped per mem #8234.

- Existing users on default env will start seeing `[code-graph:impact|overview|callgraph|search]` push lines on intent-matching prompts. Set `CODE_GRAPH_QUIET_HOOKS=1` in `~/.claude/settings.json` env to opt out.
- Adopted projects: the `plugin_code_graph_mcp.md` template auto-refreshes on next SessionStart (unless `CODE_GRAPH_NO_TEMPLATE_REFRESH=1` is set).
- No data-migration, no schema change, no MCP tool API change.

- `claude-plugin/scripts/find-binary.js`: disk cache (`~/.cache/code-graph/binary-path`) now validates the cached binary's `--version` against the package version before returning it. Previously the cache short-circuit at `findBinary()` entry only checked `isNativeBinary(cached)` (file exists + right basename), so once a stale path got written, it shadowed every newer binary on the system forever. Symptom on this dev machine: cache pinned `bin/code-graph-mcp` v0.5.28 (the un-tracked `scripts/copy-binary.js` artifact from March 17) while `~/.cargo/bin/code-graph-mcp` was the freshly installed v0.25.0, causing `incremental-index.test.js` to fail mid-pre-commit hook with the older binary's pre-v0.16.9 hard-bail behavior.

- Asymmetric version-check coverage. Auto-update cache at `find-binary.js:184-188` was already version-gated (mem #8187 fixed three install-chain bugs but landed only on the `~/.cache/.../bin/` branch). Disk cache `binary-path`, the entry-level fast-path that runs on every hook tick, had no equivalent gate. After `npm install -g` of an updated platform pkg, or any path drift in the platform-pkg layout, the disk cache would keep returning the pre-update binary until a user manually `rm`-ed the cache file.
- New `isCachedBinaryFresh(cachedPath, pkgVersion)` helper. Permissive on unknown values (missing pkg version, unreadable binary `--version` output) → trust the cache (don't refuse the only path we know about). Strict only when both versions parse and cached < pkg.

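
The permissive/strict split can be sketched as a pure comparison over version strings. This is an assumption-laden illustration: `parseVersion` and `isVersionFresh` are hypothetical names, and unlike the real `isCachedBinaryFresh(cachedPath, pkgVersion)` the sketch takes the binary's `--version` output as a string instead of spawning the binary.

```javascript
// Pull the first "x.y.z" out of arbitrary text, e.g. "code-graph-mcp 0.25.0".
function parseVersion(text) {
  const m = /(\d+)\.(\d+)\.(\d+)/.exec(String(text ?? ''));
  return m ? m.slice(1, 4).map(Number) : null;
}

// true  => keep using the cached binary
// false => cached binary is older than the package; invalidate the cache
function isVersionFresh(cachedVersionOutput, pkgVersion) {
  const cached = parseVersion(cachedVersionOutput);
  const pkg = parseVersion(pkgVersion);
  // Permissive: if either side is unknown, trust the only path we have.
  if (!cached || !pkg) return true;
  // Strict: numeric, component-wise comparison.
  for (let i = 0; i < 3; i++) {
    if (cached[i] !== pkg[i]) return cached[i] > pkg[i];
  }
  return true; // equal versions are fresh
}
```

Comparing components as numbers matters for the reported case: as plain strings, `"0.5.28"` sorts after `"0.25.0"`, which would have kept the stale binary.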
- `node --test find-binary.test.js`: 19/19 pass. 11 existing + 8 new covering THE BUG case (cached `0.5.28` vs pkg `0.25.0` → invalidate), equal versions, newer cache, missing pkg version permissive, unreadable binary permissive, non-existent path, null/undefined input, basename mismatch.
- `node --test lifecycle.test.js`: 12/12, schema regression-clean.
- `cargo +1.95.0 clippy --no-default-features --all-targets -- -D warnings`: 0 warnings.

- No user action needed. First findBinary call after upgrade detects stale cache (older than 0.25.1 cached binary) → invalidates → falls through to the rest of the discovery chain (target/release → auto-update cache → platform pkg → bundled → cargo install → PATH).
- For users on the dev branch with manually-recorded cache paths: `rm ~/.cache/code-graph/binary-path` triggers the same fresh walk.

- `claude-plugin/scripts/pre-grep-guide.js`: new PreToolUse:Bash hook that detects raw `grep` / `rg` / `ag` invocations on the indexed source tree (`src/`, `tests/`, `lib/`, `scripts/`, `claude-plugin/`, `tools/`, `pkg/`, `cmd/`, `internal/`, `app/`, `components/`, `server/`, `client/`, `crates/`, `packages/`) and emits a 6-line hint pointing at `code-graph-mcp grep / ast-search / callgraph / show`. Fires only on bare grep at command HEAD (pipe-greps like `cargo test | grep FAILED` are output filters and skipped). Per-command-hash cooldown 60s prevents repeat noise. Registered in `claude-plugin/hooks/hooks.json` with 3s timeout.

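
A rough sketch of the "bare grep at command HEAD" rule, assuming the hook looks at the first real word of the command and then checks for a source-tree path. `isHeadSearch` and the trimmed `SRC_DIR` list are hypothetical; the shipped regex covers more directories and edge cases (cooldown and index-presence checks are omitted here).

```javascript
const SEARCH_TOOLS = new Set(['grep', 'rg', 'ag']);
// Trimmed to three prefixes for the sketch; each must follow start-of-string,
// whitespace, or a quote so that e.g. "mylib/" does not match "lib/".
const SRC_DIR = /(?:^|[\s'"])(?:src|tests|lib)\//;

function isHeadSearch(command) {
  const words = command.trim().split(/\s+/);
  // Skip leading VAR=value assignments such as `LC_ALL=C grep ...`.
  let i = 0;
  while (i < words.length && /^[A-Za-z_][A-Za-z0-9_]*=/.test(words[i])) i++;
  // Only the head of the command counts: `cargo test | grep FAILED` starts
  // with `cargo`, so the grep there is an output filter and is skipped.
  return SEARCH_TOOLS.has(words[i]) && SRC_DIR.test(command);
}
```

Checking the head word instead of splitting on `|` keeps alternation patterns such as `"a|b"` from being mistaken for a pipeline.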
- 15-day session telemetry (78 sessions / 13.5K assistant turns) showed 429 raw `grep -rn` calls on source trees vs 437 `code-graph-mcp` invocations: ~1:1 overall but with severe variance (3 work days at 10:0 or worse against `code-graph-mcp`, today's 05-11 at 39:10). Pre-training bias gives `grep -rn pattern src/` an enormous default weight; tool descriptions alone can route correctly (routing_bench Opus 4.7 P@1=95.5% in tool-only mode) but don't surface the indexed alternative when Claude isn't already deciding between tools. This hook closes the loop at the Bash entry point, the same shape as the existing PreToolUse:Edit (`pre-edit-guide.js`) impact-summary hook.

- `node --test claude-plugin/scripts/pre-grep-guide.test.js`: 35/35 pass. Covers fire cases (grep/rg/ag on src + tests + lib + claude-plugin, alternation patterns, env-prefixed, head/tail pipes downstream), skip cases (pipe-grep output filters, code-graph-mcp self-invocation, config-only targets like `Cargo.toml` / `.gitignore` / `CHANGELOG.md`, non-search tools like ls/cat/git/find), and 5 regression cases lifted verbatim from 2026-05-11 session telemetry.
- `node --test claude-plugin/scripts/lifecycle.test.js`: 12/12 pass. `hooks.json` schema change accepted by lifecycle's hook-identity matcher.
- E2E sanity: piping `{"tool_input":{"command":"grep -rn ... src/storage/"}}` through `pre-grep-guide.js` emits the 6-line hint on first invocation, silent on repeat (cooldown verified), silent on `cargo test | grep FAILED` (pipe-grep correctly skipped).
- Bench unaffected: routing_bench is tool-only mode (forced `tool_choice=any`), Bash hook injection happens outside that path, so no P@1 regression is possible.

- Plugin SessionStart auto-updates the hook registration via `${CLAUDE_PLUGIN_ROOT}` path indirection. Disable per-session with `CODE_GRAPH_QUIET_HOOKS=1` (already gates the whole hook tier). No `.code-graph/index.db` in CWD → hook exits silently regardless.

- adopt: `MEMORY.md` index-line tags renamed to MCP-tool-aligned multi-word form (`impact-analysis`, `find-references`, `module-overview`, `semantic-search`, `dependency-graph`, `trace-http-chain`, `http-route`, `find-similar-code`). Previous single-word tags (`impact`, `refs`, `overview`, `semantic`, `deps`, `trace`, `route`, `similar`) collided with release-notes and commit-message prose under the claudemd §11 `read-the-file` hook's word-boundary + 0-2 char declension regex, producing false-positive denies on prose like "fail-open semantics" or "overview of changes". `callgraph`, `ast-search`, `dead-code` retained (already multi-word). Affects four index-line variants in `claude-plugin/scripts/adopt.js` (generic + web-* / frontend / rust-go-python-node) and the Rust drift mirror in `tests/routing_bench.rs`.

- Existing adopted projects auto-refresh on next plugin SessionStart: `needsRefresh` does bytewise compare of `MEMORY.md` against the new `desiredBlock`; `stripSentinelBlock` cleans the old block (still v1 sentinel, so no version bump needed) and the new block is written in place. Lock manual edits with `CODE_GRAPH_NO_TEMPLATE_REFRESH=1`.

- Hook-regex stress prose: OLD tags 3 FP (`impact`, `overview`, `semantics`) → NEW tags 0 FP; legitimate references still match.
- `adopt.test.js`: 66/66 pass. New regression case `stale INDEX_LINE → adopt rewrites in place without duplicating sentinel blocks` covers the bump-without-strip-extension failure mode (would otherwise leave orphan v1 + new v2 blocks).
- `routing_bench index_line_drift_check`: pass (Rust mirror byte-aligned with JS source).
- routing_bench context-rich (2026-05-11, OpenRouter sonnet-4.5, domain=all, 3-run majority vote, 382s): Recall 41/42 = 97.6%, FP-rate 0/10 = 0%, Overall 51/52 = 98.1%; zero regression vs v0.17.3+pm-desc-dedup baseline (Backend 22/22 = 100% kept; Frontend 19/20 = 95% kept; same residual path-anchored `src/components/` miss unrelated to this change). Confirms tag-rename preserves routing signal.

- callgraph: Rust qualified calls (`Type::method`, `crate::path::fn`, `self.method`, `Self::method`, builder chains like `OpenOptions::new().create()`) no longer route to unrelated project functions sharing the rightmost name. Eliminates phantom callers in `impact_analysis` and `find_dead_code` for short-named functions (`new` / `create` / `open` / `from`).
- parser: `impl crate::path::Type { ... }` impl-block type now strips the leading path so qualified_name and SelfRecv payloads match (was producing `crate::path::Type.method` qualified_names that broke same-type LIKE matching).

- Existing .code-graph/ databases keep working (qualifier-aware resolution is a no-op when edges.metadata IS NULL). Run code-graph-mcp index --rebuild to populate qualifier metadata on existing Rust files; incremental indexing picks it up automatically as files change.
- impact run_full_index: 36 → 33 transitive callers; the 3 documented phantoms (decompress_with_cap, try_acquire_index_lock, from_project_root) no longer appear.
- routing_bench P@1: 22/22 (no regression).
- 558 tests pass with default + --no-default-features. Clippy clean with --all-features.
Follow-up enhancements to v0.23.0 snapshot work plus an unrelated search-quality fix.
snapshot create --out <path> now auto-zstd-compresses when <path>
ends in .db.zst (level 9, matching the producer workflow template).
Raw .db output unchanged — the existing two-step --out foo.db && zstd -9 foo.db flow still works.
snapshot inspect <file> now accepts both .db and .db.zst (format
detected from magic bytes, not extension), so first-time users who run
snapshot create --out foo.db && snapshot inspect foo.db get sensible
output instead of zstd's cryptic "Unknown frame descriptor". Garbage or
wrong-format files now produce: "X is not a code-graph snapshot —
expected zstd-compressed (.db.zst) or raw SQLite (.db)". snapshot inspect <typo> also surfaces the file path in the error chain instead
of bare "No such file or directory (os error 2)".
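The magic-byte detection described above can be sketched as follows. This is a minimal illustration, not the project's actual code: `SnapshotFormat` and `detect_snapshot_format` are hypothetical names. The two signatures themselves are standard: every zstd frame starts with the bytes `28 B5 2F FD`, and every SQLite 3 database file starts with the 16-byte string `SQLite format 3\0`.

```rust
/// Container formats that `snapshot inspect` is described as accepting.
/// (Hypothetical type, introduced for this sketch only.)
#[derive(Debug, PartialEq, Eq)]
pub enum SnapshotFormat {
    Zstd,
    RawSqlite,
}

/// First four bytes of a zstd frame (magic number 0xFD2FB528, little-endian).
const ZSTD_MAGIC: [u8; 4] = [0x28, 0xB5, 0x2F, 0xFD];

/// First sixteen bytes of a SQLite 3 database file.
const SQLITE_MAGIC: &[u8; 16] = b"SQLite format 3\0";

/// Classify a file by its leading bytes, ignoring the file extension.
/// Returns `None` for anything that is neither zstd nor raw SQLite,
/// which is where a "not a code-graph snapshot" error would be raised.
pub fn detect_snapshot_format(head: &[u8]) -> Option<SnapshotFormat> {
    if head.starts_with(&ZSTD_MAGIC) {
        Some(SnapshotFormat::Zstd)
    } else if head.starts_with(SQLITE_MAGIC) {
        Some(SnapshotFormat::RawSqlite)
    } else {
        None
    }
}
```

Reading only the first 16 bytes of the file is enough to feed this function, so the check stays cheap even for large snapshots.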
Non-https [snapshot] url in .code-graph.toml now writes to stderr in
addition to tracing::warn!, so users see the rejection on CLI startup
paths that don't install a tracing subscriber.
fts5_search no longer OR-fallbacks when the user's single-word query
has zero AND-mode hits AND the original token doesn't appear anywhere
in the FTS index. This was returning noise via camelCase token splits —
a query like ZzzzNoMatchXyzzz matched any code containing the literal
--no-default-features (split on -) or the Rust match keyword.
Acronyms like RRF are unaffected: the original token is indexed, so
OR-fallback runs as before for legitimate recall expansion. Multi-word
queries are unchanged.
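The fallback rule above reduces to a small decision function. The sketch below is an assumption-laden illustration: `should_or_fallback` is a hypothetical name, and the multi-word branch assumes that "unchanged" means "fall back whenever AND mode returns nothing", which the notes do not spell out.

```rust
/// Decide whether to retry a query in OR mode (hypothetical helper).
///
/// * `query_words`   - number of whitespace-separated words in the user query
/// * `and_hits`      - number of results the AND-mode query returned
/// * `token_indexed` - whether the original, unsplit token occurs in the FTS index
pub fn should_or_fallback(query_words: usize, and_hits: usize, token_indexed: bool) -> bool {
    if and_hits > 0 {
        // AND mode already produced results; nothing to expand.
        return false;
    }
    if query_words == 1 {
        // Single word with zero hits: only expand when the original token is
        // indexed, so nonsense tokens no longer match via camelCase splits.
        token_indexed
    } else {
        // Multi-word queries: assumed to keep the pre-existing behavior.
        true
    }
}
```

Under this rule a nonsense token is rejected outright, while an indexed acronym such as RRF still gets the OR expansion.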
Team-shared graph artifact via GitHub Releases. New CLI subcommands
snapshot create and snapshot inspect. MCP server auto-fetches the
latest published snapshot on first start (when no local index exists) and
falls through to the existing full-index path on any failure — snapshot is
an optimization, not a dependency. Workflow template shipped at
claude-plugin/templates/code-graph-snapshot.yml. New CLI
reindex --from-snapshot forces a re-fetch. Snapshot status surfaces in
health-check --json. Snapshot file is symbols+edges+FTS5 only (no
node_vectors) to decouple from embedding model choice. Spec:
docs/superpowers/specs/2026-05-10-shared-graph-snapshot-design.md.
Defensive hardening for Database::open recovery. The existing
is_corruption_error retry branch covers files that error on open, but a
main DB file shorter than the SQLite header (100 bytes) can land in
SQLite-version-dependent territory — sometimes treated as fresh, sometimes
silently combined with stale .wal/.shm residue from a prior crashed
indexing pass.
The new sub_header_size_guard runs at the top of open_impl and wipes
the entire main+wal+shm triple whenever the main file exists but is < 100
bytes, so every recovery path starts from the same blank state.
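A minimal sketch of such a guard, using only the standard library, might look like the following. The function names are hypothetical and the real `sub_header_size_guard` may differ; the sketch assumes SQLite's standard sidecar naming, where the WAL and shared-memory files are the database path with `-wal` and `-shm` appended.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Size of the fixed SQLite database header in bytes.
const SQLITE_HEADER_LEN: u64 = 100;

/// Append a suffix such as "-wal" to the full file name of `db`.
fn sidecar(db: &Path, suffix: &str) -> PathBuf {
    let mut name = db.as_os_str().to_os_string();
    name.push(suffix);
    PathBuf::from(name)
}

/// Remove a file, treating "already gone" as success.
fn remove_if_present(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Err(e) if e.kind() != io::ErrorKind::NotFound => Err(e),
        _ => Ok(()),
    }
}

/// Wipe main + wal + shm when the main file exists but is shorter than the
/// SQLite header. Returns Ok(true) when the triple was wiped.
pub fn sub_header_size_guard_sketch(db: &Path) -> io::Result<bool> {
    let len = match fs::metadata(db) {
        Ok(meta) => meta.len(),
        // No main file at all: a genuinely fresh database, nothing to wipe.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(false),
        Err(e) => return Err(e),
    };
    if len >= SQLITE_HEADER_LEN {
        return Ok(false); // plausible database, leave it to the normal open path
    }
    remove_if_present(db)?;
    remove_if_present(&sidecar(db, "-wal"))?;
    remove_if_present(&sidecar(db, "-shm"))?;
    Ok(true)
}
```

Because a 0-byte file also satisfies `len < 100`, the empty-file and partial-write cases go through the same branch.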
Round 2 of the v0.22.x dogfood loop surfaced health-check exit codes that
varied across repeated runs against an interrupted indexing state. The
existing recovery branch was deterministic-by-luck — relying on a
particular SQLite version's tolerance for sub-header files. The guard
makes recovery deterministic-by-design.
Four new unit tests in src/storage/db.rs::tests document the safety
contract: 0-byte main alone, 0-byte main + stale wal/shm, partial-write
under header size, and the regression guard for valid databases. Full
suite: 303 lib + 198 integration = 501 passed, 0 failed.
fix(cli): preserve user --depth in callgraph requested_max_depth (73cd954). The CLI no longer clamps --depth before passing it to the engine; the engine's own CALL_GRAPH_MAX_DEPTH cap and the requested_max_depth / effective_max_depth envelope fields surface truncation truthfully.
Five bug fixes from a 5-round structured dogfood pass. All fixes converge on
one root pattern: the test/prod source classification was implemented in five
sites independently, and result truncation in centralized_compress was
biased against production callers when source data ordering put tests at
the array head/tail.
- get_ast_node called_by post-truncation bias (src/mcp/server/tools/ast_node.rs). When include_references=true include_tests=true, SQL row order without ORDER BY clustered test callers at array start/end, and centralized_compress kept first 10 + last 5, leaving zero production callers visible for test-heavy targets like conn (49 prod / 76 test). Stable-sort prod-first inside the tool before emitting.
- find_references references post-truncation bias (src/mcp/server/tools/refs.rs). Same pattern as above, but worse because find_references defaults include_tests=true (rename audits need test sites). 125-caller targets collapsed to a 10-prod-of-cli + 5-tests-of-tests/ window with all src/indexer/, src/mcp/, src/storage/ prod callers silently dropped. Same prod-first stable sort inside the tool.
- module_overview caller_count includes test sources (src/storage/queries/routes.rs). The get_module_exports cc LEFT JOIN counted every incoming calls edge and did not filter source-side is_test. parse_code showed caller_count=39 while find_references include_tests=false / get_ast_node impact / project_map hot_functions all reported 0 prod. Aligned with the four other prod-only counts via the same source-side filter pattern.
- ast_search ranking includes test sources (src/storage/queries/nodes.rs). get_nodes_with_files_by_filters ORDER BY (SELECT COUNT(*) FROM edges …) ranked test-only utility wrappers (e.g. extract_relations, 0 prod / 64 test) above genuinely hot prod symbols. Same source-side filter applied.
- find_references "Symbol not found" for test/bench symbols (src/mcp/server/tools/refs.rs). resolve_fuzzy_name filters test/bench candidates upstream; the previous error said "not found" even when the symbol was present. Re-query without the filter to detect the "found-but-filtered" case and surface a bypass hint with the actual file paths. Unblocks the dead-code → find_references reverse-trace flow.
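The prod-first stable sort behind the first two fixes can be illustrated with a small, self-contained sketch. `Caller`, `prod_first`, and `head_tail_window` are hypothetical stand-ins; the window function only mimics the "first 10 + last 5" truncation attributed to centralized_compress above.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Caller {
    pub name: String,
    pub is_test: bool,
}

/// Stable sort: production callers first, original order kept within each group.
pub fn prod_first(callers: &mut [Caller]) {
    // `false < true`, and `sort_by_key` is a stable sort.
    callers.sort_by_key(|c| c.is_test);
}

/// Stand-in for a head/tail truncation window: keep the first `head` and the
/// last `tail` items, dropping everything in between.
pub fn head_tail_window(items: &[Caller], head: usize, tail: usize) -> Vec<Caller> {
    if items.len() <= head + tail {
        return items.to_vec();
    }
    let mut kept = items[..head].to_vec();
    kept.extend_from_slice(&items[items.len() - tail..]);
    kept
}
```

With test callers clustered at both ends of the array, the unsorted window can contain no production callers at all. After `prod_first`, production callers occupy the head of the array, so they survive whenever there are at most `head` of them.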
src/storage/queries/{routes,nodes,project_map}.rs now share a single
SQL filter via src/domain.rs::prod_source_join_sql() +
PROD_SOURCE_FILTER_AND / TEST_SOURCE_FILTER_OR. Five duplicate LIKE
chains collapsed to one canonical source. New test/harness directory
conventions only need a single edit going forward.
- tests/mcp_stdio_integration.rs (new, 245 LOC): three end-to-end JSON-RPC stdio tests against a real spawned code-graph-mcp serve subprocess. Covers prod-first sort survival across centralized_compress truncation, caller_count prod-only correctness, and the new explanatory error message. Caught a real gap in the error-message fix during authoring (the FuzzyResolution::NotFound branch needed the same treatment as Unique).
- cargo test --release: 299 lib + 3 new mcp_stdio_integration + ~194 other integration = 496 total, 0 failed (1 pre-existing #[ignore]).
- cargo +1.95.0 clippy --no-default-features -- -D warnings clean.
- cargo +1.95.0 clippy --all-targets -- -D warnings clean.
Pure refactor release — zero behavior change, public surface preserved across all three splits. The three biggest source files (8049 lines as monoliths) are now decomposed into 26 per-concern submodules so future edits don't need to load 2000+ lines of context per touch.
| Original | Lines | New tree | Files |
|---|---|---|---|
| src/storage/queries.rs | 2892 | src/storage/queries/ | 10 |
| src/parser/relations.rs | 2783 | src/parser/relations/ | 9 |
| src/indexer/pipeline.rs | 2374 | src/indexer/pipeline/ | 7 |
Submodule items use pub(super); mod.rs re-exports the items external callers
already depend on. External call sites in cli.rs, mcp/server, tests/,
benches/, and claude-plugin/ need zero edits — paths like
crate::storage::queries::upsert_file, crate::parser::relations::ParsedRelation,
crate::indexer::pipeline::run_full_index continue to resolve.
The three orchestrator-style functions stay whole in their respective mod.rs
or index_files.rs — walk_for_relations (~650 lines) and the Phase-0..3
indexer dispatch (~770 lines) share local state across their match arms /
phases that splitting would either duplicate or thread back via large arg
lists. Splitting per-language inside walk_for_relations would lose the
shared current_scope / current_class propagation; splitting per-phase
inside index_files would break the shared tx / atomics / batch_parsed
/ name_to_ids / global_name_map state. Both are kept whole deliberately.
- cargo check clean
- cargo +1.95.0 clippy --no-default-features -- -D warnings clean
- cargo +1.95.0 clippy --all-targets -- -D warnings clean
- cargo test --release: 292 lib + 129 integration = 421 tests, 0 failed (1 pre-existing #[ignore])
- Pre-merge CI green on all three PRs (#15, #16, #17)
- Independent code-reviewer subagent passed each split with zero Critical / Important issues
- queries.rs: 657a1f9 (#15)
- relations.rs: 2dfbab9 (#16)
- pipeline.rs: aef55b2 (#17)
v0.21.0 — Opt-in plugin hooks (token discipline) + callgraph caller_count ordering + multi-model routing bench
Two LLM-visible default behaviors flipped to opt-in. Both have explicit env
opt-out paths; existing users who set the legacy CODE_GRAPH_QUIET_HOOKS=1 see
no change. Users on default settings will feel the new behavior on next session.
- user-prompt-context.js (UserPromptSubmit hook): default-quiet. Per-prompt CLI exec was costing 200–500 tokens/turn injecting outline/callgraph context the agent would have asked for via MCP itself. v0.20.0 routing-bench backend P@1 = 100% on Sonnet 4.5 proves the agent picks the right tool without push-injection. To restore the v0.20.0 noisy default, set CODE_GRAPH_VERBOSE_HOOKS=1 in the ~/.claude/settings.json env block. Legacy CODE_GRAPH_QUIET_HOOKS=0 still forces noisy for back-compat.
- incremental-index.js (PostToolUse Edit/Write hook): default-off. v0.18.0 added query-time ensure_file_indexed (single-file hash + sync reindex) inside MCP tools that take file_path, so the PostToolUse hook spawning a fresh process per edit was redundant for the MCP-driven workflow and burnt ~80ms cold-start per edit. CLI-only workflows (running code-graph-mcp search after Bash-side edits without going through MCP) need the hook for freshness; opt back in with CODE_GRAPH_HOOK_INDEX=on.
The two knobs are independent: setting one does not affect the other. CLI-only
users typically want CODE_GRAPH_HOOK_INDEX=on only; users who relied on
per-prompt outline injection want CODE_GRAPH_VERBOSE_HOOKS=1 only.
One internal-but-user-perceptible change: get_call_graph (and the underlying
get_call_graph_query) now orders results within each depth by caller_count DESC. Previously ties broke by row order, which silently dropped the most-relevant
subtree under CALL_GRAPH_ROW_LIMIT truncation. Hot functions like
conn (51 callers + 72 test in this repo) are now guaranteed to surface
their high-connectivity subtrees first. No JSON shape change — only ordering.
claude-plugin/scripts/user-prompt-context.js — replaced 6 mixed-language
intent regex piles with per-keyword weighted patterns under INTENT_PATTERNS.
Each (regex, weight) row is testable in isolation; threshold 0.5 + uniform
weight 1.0 preserves the original OR-of-alternatives behavior 1:1. Future
tuning can downweight noisy short keywords (bug, 什么) once false-positive
data accumulates. Maintenance cost: ~150 lines of table vs 6 × 200-char
regexes — the regex form had two prior silent-bug regressions (#5754, #7713).
computeQuietHooks(env) priority chain (high → low):
- CODE_GRAPH_QUIET_HOOKS=0 → forced noisy (legacy)
- CODE_GRAPH_QUIET_HOOKS=1 → forced quiet (legacy)
- CODE_GRAPH_VERBOSE_HOOKS=1 → opt-in noisy (new)
- default → quiet (v0.21 flip)
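The precedence chain above can be expressed as a single match. The real computeQuietHooks lives in the JavaScript hook script; the Rust function below is only a transliteration of the documented order, and it assumes that any value of CODE_GRAPH_QUIET_HOOKS other than 0 or 1 is ignored and falls through to the next rule.

```rust
/// Returns true when hooks should stay quiet.
///
/// `quiet` and `verbose` are the values of CODE_GRAPH_QUIET_HOOKS and
/// CODE_GRAPH_VERBOSE_HOOKS, or `None` when the variable is unset.
/// (Hypothetical Rust rendering of the documented priority chain.)
pub fn compute_quiet_hooks(quiet: Option<&str>, verbose: Option<&str>) -> bool {
    match (quiet, verbose) {
        (Some("0"), _) => false, // legacy: forced noisy, highest priority
        (Some("1"), _) => true,  // legacy: forced quiet
        (_, Some("1")) => false, // new opt-in noisy
        _ => true,               // default: quiet (v0.21 flip)
    }
}
```

Match arms are tried top to bottom, so the legacy variable always wins over the new one, matching the high → low ordering in the list.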
claude-plugin/scripts/incremental-index.js — pure passthrough refactor
behind shouldRun(env) gate. CODE_GRAPH_HOOK_INDEX=on|1|true opts in;
default and any other value skip the binary exec. module.exports = { shouldRun } exposes the gate for the test file.
Both hook scripts gain dedicated *.test.js files: 91 new lines of tests
on user-prompt-context.js (covers the env-precedence chain + per-keyword
intent table) and 55 new lines on incremental-index.js (covers the env
gate + idempotent skip).
The recursive CTE in query_direction gained a caller_counts CTE
(non-correlated GROUP BY target_id over edges WHERE relation = ?4,
covered by idx_edges_target_rel) and a LEFT JOIN into the outer SELECT.
Final ORDER BY is now depth ASC, caller_count DESC. When the result set
saturates CALL_GRAPH_ROW_LIMIT, high-connectivity subtrees survive the
truncation instead of being silently dropped. Test:
test_callees_ordered_by_caller_count (3 callees, 5/1/0 external callers,
asserts the depth-1 ordering matches caller-count rank).
caller_count is computed for every node in the result, not just the
truncation boundary — small CPU overhead, big interpretability win for
module_overview and find_references consumers downstream that read
the same field for sort ordering.
New ROUTING_BENCH_MODELS env var accepts a comma-separated model list
(sonnet-4.5,sonnet-4.6,opus-4.7,haiku-4.5) and dispatches one Backend
per name, sharing a single API key. Single-model ROUTING_BENCH_MODEL
still works (legacy callers unchanged). When more than one backend ran,
the bench prints a multi-model summary table:
=== Multi-model P@1 summary (threshold 70%) ===
sonnet-4.5 backend recall 22/22 (100.0%) fp 0/10
sonnet-4.6 backend recall 22/22 (100.0%) fp 0/10
opus-4.7 backend recall 21/22 ( 95.5%) fp 0/10
haiku-4.5 backend recall 18/22 ( 81.8%) fp 0/10
Use case: weekly CI cron walking the Anthropic family to catch routing regression when Claude Code rotates the default model. v0.20.0 measured 100% P@1 on Sonnet 4.5 only — the rest of the family had no signal until this hook existed.
detect_backend() (legacy single-model) is preserved and still backs
the default ROUTING_BENCH_MODEL path. New detect_backends() returns
a Vec<Backend>; pure helpers parse_models_env(s) and
build_backends(models, anthropic_key, openrouter_key) are unit-tested
without API keys (4 new tests under multi_model_dispatch_tests).
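As a rough sketch of what a pure parsing helper of this kind could look like: the body below is an assumption (the notes give only the name and the comma-separated format), and it is named `parse_models_env_sketch` to make clear it is not the project's function. Trimming whitespace and dropping empty entries are choices made for this sketch.

```rust
/// Split a comma-separated model list into individual model names.
/// Whitespace around names is trimmed and empty entries are dropped
/// (both are assumptions of this sketch).
pub fn parse_models_env_sketch(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .map(str::to_owned)
        .collect()
}
```

Keeping the parser free of environment and network access is what allows it to be unit-tested without API keys.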
Turns the README's "40-60% session token savings" vibe-claim into a
regression-tracked number. For each navigation task in the corpus, runs
the equivalent code-graph-mcp CLI command on a fixture project and
compares the byte count of the response to a hardcoded baseline_bytes
representing the historical Grep+Read approach. Asserts the overall
ratio stays ≤ 0.60 (matches the headline claim's worst case).
Bytes are a token proxy; for English / TS source they correlate ~1:3 with BPE tokens, so a 50% byte reduction maps to a 50% token reduction at the same ratio. The harness intentionally avoids a tokenizer dependency — bytes-as-proxy is good enough for tracking trend over releases. Run with:
cargo build --no-default-features
cargo test --test effectiveness_bench --no-default-features -- --ignored --nocapture
#[ignore]-gated like routing_bench, so it doesn't fire on default
cargo test; opt in with --ignored. To add a task, hand-count the baseline
(or run grep/Read for the same intent and sum the bytes touched), set
baseline_bytes once, and commit. Subsequent regressions move the ratio
without touching the baseline.
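The core comparison can be sketched in a few lines. `TaskResult`, `overall_ratio`, and `MAX_RATIO` are hypothetical names, and the sketch assumes the "overall ratio" is total response bytes divided by total baseline bytes across the corpus; the harness may aggregate differently.

```rust
/// One navigation task: bytes returned by the CLI versus the hardcoded
/// Grep+Read baseline. (Hypothetical shape for this sketch.)
pub struct TaskResult {
    pub name: &'static str,
    pub response_bytes: u64,
    pub baseline_bytes: u64,
}

/// Threshold taken from the notes: the overall ratio must stay at or below 0.60.
pub const MAX_RATIO: f64 = 0.60;

/// Total response bytes divided by total baseline bytes.
/// Returns `None` when the baseline total is zero (for example an empty corpus).
pub fn overall_ratio(tasks: &[TaskResult]) -> Option<f64> {
    let response: u64 = tasks.iter().map(|t| t.response_bytes).sum();
    let baseline: u64 = tasks.iter().map(|t| t.baseline_bytes).sum();
    if baseline == 0 {
        None
    } else {
        Some(response as f64 / baseline as f64)
    }
}
```

Summing before dividing weights large tasks more heavily than small ones, which is one reasonable reading of "overall"; averaging per-task ratios would be the alternative.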
v0.20.0 — Adversarial tool descriptions + single-file outline + project-typed memdir + 100% routing P@1
No breaking changes. All edits are LLM-visible metadata, additive output fields, or new feature gates that fall back to the v0.19.0 behavior when not opted into. Three behaviors users feel automatically on next session:
- 7 MCP tool descriptions rewritten in adversarial style ("INSTEAD OF Grep", "Replaces N rounds of grep+Read") to compete with Claude Code's first-class Grep/Read/LSP tool prompts.
- module_overview (and CLI overview) on a single file path now emits an outline view: L<start>-<end> type name (callers×) signature per symbol, sorted by line number. Replaces Read on 3000+ line source files.
- code-graph-mcp adopt now detects project type (Rust/web-rs/web-node/web-py/web-go/frontend/python/go/node/generic) and writes a per-type MEMORY.md index line: Web projects get HTTP-route-tracing priming, Rust CLIs get callgraph/impact priming, frontend projects get rename-audit priming.
To pin the generic INDEX_LINE behavior of v0.19.0, set
CODE_GRAPH_PROJECT_TYPE=generic in ~/.claude/settings.json env block.
To pin tool descriptions, downgrade — there is no env opt-out by design,
since LLM-visible metadata changes are the headline feature here.
src/mcp/tools.rs — all 7 visible tool descriptions rewritten following the
sdscc reference ("MCP tool description should compete with Grep/Read/LSP for
the same query"). Pattern: lead with the trigger phrase users actually type,
then state the alternative-tool replacement, then the boundary. Examples:
- get_call_graph: "Multi-hop call chain. Replaces N rounds of grep \"X(\" + Read. Pass route_path='GET /api/x' to trace HTTP handler → downstream."
- module_overview: "Symbols in a directory or file, grouped by type + caller count. Replaces Glob + Read×N for big dirs / huge files. Single file: include_deps=dep graph, include_dead=unreferenced."
- find_references: "Rename/remove audits — every site that imports/inherits/implements/calls a symbol. Repo-wide cross-language (LSP needs file open). Literals → Grep; 'who calls X?' → get_call_graph."
- ast_search: "Enumerate symbols by typed filters (type/returns/params) Grep can't express. Use for 'all fns returning Result' / 'all structs implementing X'. ONE known symbol → get_ast_node."
Server instructions field gained one line: "Repo-wide AST index (LSP only handles open files; we don't). Replaces multi-round Grep+Read for structural queries." Compile-time assert!(NOISY.len() <= 1500) budget unchanged.
test_descriptions_are_concise (≤200 char) still passes for all 7 tools.
ModuleExport struct gained start_line + end_line fields, plumbed
through both SQL queries (sql_exports + sql_fallback). When overview /
module_overview resolves to exactly one file path, output switches from
"by-type compact list" to outline:
src/mcp/server/mod.rs
L1213-1254 fn handle_initialize fn handle_initialize(&self, ...)
L1256-1265 fn handle_tools_list (3×) fn handle_tools_list(&self, ...)
...
MCP module_overview.active_exports[] JSON gained start_line + end_line
(additive — existing clients ignore unknown keys).
claude-plugin/scripts/adopt.js gains detectProjectType(cwd, env) and
buildIndexLine(projectType). Detection state machine:
- Cargo.toml: strips # ... comments, scans only the [dependencies] section (skips [dev-dependencies] / [build-dependencies] / target deps). Web frameworks: actix-web, axum, rocket, warp, poem, tide, salvo. (hyper excluded: too commonly a CLI HTTP client.)
- package.json: JSON.parse + checks only the dependencies field (skips devDependencies to avoid false-promoting React component libraries). Frontend: next/react/vue/svelte/nuxt/astro/remix/solid-js. Web-node: express/fastify/koa/hono/@nestjs/core/@hapi/hapi.
- pyproject.toml: scans [tool.poetry.dependencies] + [project.dependencies] / [project] (PEP 621 inline). Web: django/flask/fastapi/starlette/sanic/tornado/quart.
- requirements.txt fallback with comment-strip.
- go.mod: skips // indirect deps and // comment lines. Web: gin-gonic/labstack-echo/gofiber/go-chi/gorilla-mux.
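To make the Cargo.toml rule concrete, here is a simplified sketch of the scan. The real detector is JavaScript in adopt.js; `has_web_framework` is a hypothetical Rust rendering, and it takes two shortcuts that a production version would not: it assumes `#` never appears inside a quoted string, and it only recognizes `name = ...` entries directly under `[dependencies]` (not `[dependencies.name]` sub-tables).

```rust
/// Web framework crates listed in the notes above (hyper deliberately absent).
const WEB_FRAMEWORKS: [&str; 7] = ["actix-web", "axum", "rocket", "warp", "poem", "tide", "salvo"];

/// Returns true when the `[dependencies]` section names a known web framework.
pub fn has_web_framework(cargo_toml: &str) -> bool {
    let mut in_deps = false;
    for raw in cargo_toml.lines() {
        // Naive comment strip: everything after the first '#'.
        let line = raw.split('#').next().unwrap_or("").trim();
        if line.is_empty() {
            continue;
        }
        if line.starts_with('[') {
            // Only the plain [dependencies] table counts; dev, build, and
            // target-specific tables switch scanning off.
            in_deps = line == "[dependencies]";
            continue;
        }
        if in_deps {
            let key = line.split('=').next().unwrap_or("").trim().trim_matches('"');
            if WEB_FRAMEWORKS.contains(&key) {
                return true;
            }
        }
    }
    false
}
```

The section flag is what keeps a framework that appears only under [dev-dependencies] from promoting a CLI crate to a web project.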
Per-type INDEX_LINE primes the right tools and demotes irrelevant ones —
e.g. a Rust CLI's INDEX_LINE no longer mentions trace_http_chain, freeing
attention budget for callgraph/impact/dead-code routing.
detectProjectType(cwd, env) honors CODE_GRAPH_PROJECT_TYPE env var when
set to a valid bucket name (PROJECT_TYPES Set is the allow-list).
Invalid/typo'd values silently fall through to file-based detection (so a
typo doesn't classify everything as generic). Use cases: power users who
want to pin a non-default classification, CI runs that want deterministic
typing across mixed repos, or opting out via =generic.
tests/routing_bench.rs ORACLE entries can now express "either of these
tools is correct" via |-separated expected: e.g.
("Who calls X?", "get_call_graph|find_references"). New helper
matches_expected(picked, expected) splits on | and accepts membership.
Wired through compute_recall, compute_overall,
assert_oracle_covers_registry, and the main benchmark miss-detection.
Why: at depth=1, find_references with relation=calls returns the same
caller list as get_call_graph. Pinning a single answer over-fitted the
oracle to a stylistic preference rather than measuring real routing capability.
Result: routing-bench P@1 went from 95.5% (21/22) → 100% (22/22) on the Backend oracle (OpenRouter Sonnet 4.5, ToolOnly mode).
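The alternates check is small enough to show in full. The signature follows the description above (matches_expected(picked, expected), splitting on `|`); the body is a plausible sketch rather than a copy of the bench code.

```rust
/// True when `picked` equals any of the `|`-separated alternatives in `expected`.
pub fn matches_expected(picked: &str, expected: &str) -> bool {
    expected.split('|').any(|alternative| alternative == picked)
}
```

An oracle entry without a `|` still works, because splitting a string that contains no separator yields the whole string as the only alternative.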
- claude-plugin/scripts/adopt.test.js: 43 → 65 tests (+22). Covers project-typed INDEX_LINE roundtrip + 12 detection-hardening tests (commented dep, dev-deps only, build-deps only, devDependencies, // indirect, malformed JSON, PEP 621, requirements.txt, env override valid/invalid/empty/forced-generic).
- tests/routing_bench.rs scoring tests: 40 → 43 (+3 alternates path coverage).
- Rust suite: 469 → 470 passed, 0 failed, 1 ignored (routing_bench API key gate).
- ModuleExport struct: 2 new fields (start_line, end_line). SQL touched in 2 places (sql_exports + sql_fallback). All 5 ModuleExport call sites in cli.rs / overview.rs read the new fields.
- One pre-existing clippy iter_cloned_collect lint cleaned up (.iter().copied().collect() → .to_vec() in the new outline branch).
- cargo +1.95.0 clippy --all-targets -- -D warnings clean on both --no-default-features and default builds.
No breaking changes. All additions are backward-compatible. Existing indexes pick up new edges and test markers on the next incremental update — no rebuild required. Users feel three new behaviors automatically:
- New file extensions are now indexed: .sh / .bash (Bash), .json (JSON, file-FTS only).
- C/C++ #include directives now produce IMPORTS edges in the dependency graph.
- gtest macro invocations (TEST / TEST_F / TEST_P / TEST_CASE / TYPED_TEST / TYPED_TEST_P) are now marked is_test=true and named Suite.Name.
To revert any individual feature, pin to v0.18.4 (cargo install code-graph-mcp@0.18.4
or downgrade the npm-installed binary). No env-flag opt-out — the additions are
graph data shape, not behavior toggles.
- Bash (tree-sitter-bash 0.23.3): function definitions, command-style calls (with static-identifier filter rejecting $VAR / $(...) / shell built-ins like [ and :), and IMPORTS edges from source <file> / . <file> (path prefix and .sh / .bash extension stripped; dynamic paths skipped).
- JSON (tree-sitter-json 0.24.8): file-FTS indexing only. No AST symbols extracted by design (JSON has no function/class concepts); files are searchable via FTS5 like any other indexed text.
- #include "foo/bar.h" and #include <stdio.h> now emit IMPORTS edges from <module> to the bare module name. Path prefix and .h / .hpp / .hxx / .hh extensions stripped so cross-file resolution can match header file nodes. Closes a long-standing gap where C/C++ projects had near-empty import graphs.
- gtest macros parsed by tree-sitter as function_definition now extract Suite.Name (e.g. MathSuite.Addition) instead of colliding under the macro name (TEST), and force is_test=true on the resulting node. Six macros covered: TEST, TEST_F, TEST_P, TEST_CASE, TYPED_TEST, TYPED_TEST_P.
- Dart top-level function scope (silent call-graph hole): the function_body scope_name arm in relations.rs previously only matched method_signature prev-siblings. Top-level Dart functions wrap as declaration > function_signature + function_body; that AST path was silently dropped, so every call inside any top-level Dart function was missing from the call graph. Now both top-level and class-method shapes resolve correctly.
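The #include normalization mentioned in the list above (path prefix and header extension stripped to get a bare module name) can be sketched as follows. `include_module_name` is a hypothetical helper written for illustration; the parser's actual implementation may differ.

```rust
/// Reduce an include path such as "foo/bar.h" to the bare module name "bar".
/// Strips any directory prefix, then one of the header extensions
/// .h / .hpp / .hxx / .hh if present.
pub fn include_module_name(include_path: &str) -> &str {
    // Everything after the last '/' (or the whole string when there is none).
    let file = include_path.rsplit('/').next().unwrap_or(include_path);
    for ext in [".hpp", ".hxx", ".hh", ".h"] {
        if let Some(stem) = file.strip_suffix(ext) {
            return stem;
        }
    }
    // Extension-less headers such as <vector> are returned unchanged.
    file
}
```

Both the quoted and the angle-bracket forms reduce the same way once the delimiters are removed, so `"foo/bar.h"` and `<stdio.h>` yield `bar` and `stdio` respectively.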
README and project CLAUDE.md previously claimed "16 languages" as a flat list.
Reality is a continuum of extraction depth. Updated to a 5-tier breakdown:
- Full (calls + imports + inheritance + HTTP routes + test markers): TS/TSX, JS, Go, Python, Rust, Java
- Smoke-tested (calls + imports + inheritance): C#, Kotlin, Ruby, PHP, Swift, Dart
- Limited (functions + calls + #include imports + gtest test markers; Class::method scope qualification still deferred): C, C++
- Scripting: Bash (with source / . imports), Markdown (headings)
- File-FTS only (no AST symbols extracted): HTML, CSS, JSON
Parser test suite: 65 → 87 (+22). New tests:
- 6 inheritance smoke tests for C#/Kotlin/Ruby/PHP/Swift/Dart (audit confirmed baseline shapes work for delegation_specifiers / inheritance_specifier / base_clause / class_interface_clause / superclass / base_list with IFoo heuristic).
- 12 calls + imports smoke tests for the same 6 languages (Tier 2).
- 3 tests for C/C++ #include IMPORTS + gtest macro detection.
- Test infrastructure now provides regression protection for the 6 Tier 2 languages that had zero specific tests before this release.
v0.18.4 — Hidden-5 fold + tools.rs split + Cargo default lite + routing-bench CI
Cargo default features changed (direct cargo install code-graph-mcp users):
the default build is now FTS5-only (~10 MB binary). Opt back into the full
hybrid (FTS5 + vector) with cargo install code-graph-mcp --features embed-model. npm/npx/plugin users see no change — release.yml now passes
--features embed-model explicitly, so shipped binaries keep the same
capabilities they had in v0.18.3.
MCP instructions shrunk from ~700 B to ~330 B (visible in initialize
response). Removes the "5 advanced tools CLI-only" caveat that was the
v0.18.3 reality but is no longer true after the fold below. Compile-time
guard at 1500 B is unchanged.
Hidden 5 names still callable (impact_analysis, find_similar_code,
dependency_graph, find_dead_code, trace_http_chain + alias
find_http_route). Dispatcher entries kept for raw JSON-RPC / SDK clients
and existing integration tests. Claude Code is expected to use the new flag
forms below.
Fold: hidden 5 → core 7 flags
The 5 niche tools that were registered-but-hidden from tools/list (and
therefore unreachable from Claude Code, which derives its callable set from
tools/list) are now reachable as flags on the core 7. Same backing logic;
new entry path:
| Old (hidden, still callable as alias) | New flag form (preferred) |
|---|---|
| impact_analysis symbol_name=X | get_ast_node symbol_name=X include_impact=true |
| find_similar_code node_id=N | get_ast_node node_id=N include_similar=true |
| dependency_graph file_path=F | module_overview path=F include_deps=true |
| find_dead_code path=P | module_overview path=P include_dead=true |
| trace_http_chain route_path="GET /x" | get_call_graph route_path="GET /x" |
get_ast_node include_impact was already in v0.18.3 — the other four flags
are new. CLI subcommands (code-graph-mcp impact|similar|deps|dead-code|trace)
are unchanged for Bash workflows.
The 2354-line tool dispatch file is now 9 focused modules under
src/mcp/server/tools/:
tools/
├── search.rs — semantic_code_search
├── callgraph.rs — get_call_graph + format helpers + truncation flags
├── ast_node.rs — get_ast_node + ast_node_by_id + impact summary + similar attach
├── ast_search.rs — ast_search
├── refs.rs — find_references
├── overview.rs — module_overview + compact + dep/dead fold
├── project_map.rs — project_map
├── advanced.rs — backing logic for the folded 5 (still pub(in server))
└── management.rs — start/stop watch, get_index_status, rebuild_index
Visibility for handler methods is now pub(in crate::mcp::server) so the
dispatcher in mod.rs can still reach them across the new module boundary.
No public API change; the matching commit is the bisect target if you're
cherry-picking.
New .github/workflows/routing-bench.yml runs tests/routing_bench.rs
weekly (Sunday 03:17 UTC), on every release tag, and on manual dispatch.
Asserts P@1 ≥ 0.70 against the live 7-tool MCP schema using OpenRouter
(Claude Sonnet 4.5 default, override via workflow input). Cost ~$0.10 per
run. Requires OPENROUTER_API_KEY repo secret; without it the job no-ops
gracefully. Per-release P@1 lands in the GitHub Actions step summary +
artifact retention 90 days.
claude-plugin/templates/plugin_code_graph_mcp.md reflects the fold —
core-7 decision table now shows the new include_* and route_path flags
inline, and the legacy "进阶 5 走 CLI" ("advanced 5 go through the CLI") section is rewritten as "old names
still work; prefer flag form in Claude Code". Adopted projects with
CODE_GRAPH_NO_TEMPLATE_REFRESH unset will pick this up on next
SessionStart.
Maintenance release. No public-API or schema changes; CLI flags, MCP tool shapes, and SQLite schema all unchanged from v0.18.2. Output of comprehensive gstack audit run (/cso, /review, /retro) on v0.18.2 — every finding actioned or explicitly accepted with rationale.
Release pipeline — third-party action SHA pins, model revision pin
- release.yml: dtolnay/rust-toolchain@stable → @e081816 (1.95.0 branch SHA-pinned). Closes the asymmetry where release.yml built shipped binaries with whatever the latest stable Rust was at build time, while ci.yml tested with 1.95.0. Also closes the supply-chain window where a moved @stable tag would have silently affected every release.
- release.yml: Swatinem/rust-cache@v2 → @e18b497, softprops/action-gh-release@v3 → @b430933. Both third-party, in the release path; cache-action poisoning could exfiltrate NPM_TOKEN, release-action substitution could swap GH Release artifacts. Floating major-version tags are mutable; commit SHAs aren't.
- release.yml model-bundle step: HuggingFace resolve/main/$f → resolve/c9745ed1d9f207416be6d2e6f8de32d1f16199bf/$f for sentence-transformers/all-MiniLM-L6-v2. Plus curl --fail so a 404 HTML page can no longer masquerade as model.safetensors (the bundle's downstream sha256 only validated the bundle against itself, not against a known-good upstream).
Supply-chain CVE coverage — new CI job + 6 RUSTSEC fixes
- New audit job in ci.yml runs cargo audit against Cargo.lock on every push/PR. cargo-audit pinned to ^0.22 because 0.21.x panics on RustSec advisories using CVSS 4.0 (e.g. RUSTSEC-2026-0066): the fetch fails before any finding can be reported. Default behavior fails on vulnerabilities; informational unmaintained / unsound advisories print but don't block (most are transitive and not under our control until upstream replacements ship).
- cargo update -p rustls-webpki -p tar resolved the 6 advisories the new audit job surfaced on a v0.18.2 baseline:
  - rustls-webpki 0.103.9 → 0.103.13 (RUSTSEC-2026-0099 wildcard cert name acceptance, RUSTSEC-2026-0098 URI name constraint, RUSTSEC-2026-0104 CRL parsing panic, RUSTSEC-2026-0049 CRL distribution-point matching). Used transitively via reqwest / quinn.
  - tar 0.4.44 → 0.4.45 (RUSTSEC-2026-0067 unpack_in symlink chmod, RUSTSEC-2026-0068 PAX size header). Direct dep behind the embed-model feature, used to unpack the bundled HF model tarball.
Indexer perf — pending sweep no longer full-scans nodes
resolve_pending_calls in src/indexer/pipeline.rs previously did
SELECT n.id, n.name, ... FROM nodes n JOIN files f ... over the full
nodes table to build the in-memory name → [(node_id, language)] map for
resolution. Even a 1-row pending table triggered a full scan on every
incremental pass. Narrowed by adding
AND n.name IN (SELECT DISTINCT target_name FROM pending_unresolved_calls)
to the SELECT — scope drops to ≤ |pending| names per sweep. All 5
v0.18.2 regression tests still pass; resolution semantics unchanged.
Closes the bug documented in memory feedback_incremental_edge_timing.md:
incremental indexing silently dropped REL_CALLS edges in two symmetric
scenarios that only rebuild-index recovered. v0.18.1 query-time
freshness was a band-aid for the file_path-aware tools; this is the
underlying fix, in both directions.
The bug, both directions
Direction A — callee added later: file B has caller_b() { foo(); }.
At B's Phase 2, foo has no same-file or same-language target → REL_CALLS
dropped (memory feedback_edge_resolution_same_language.md correctly
forbids cross-language fallback for calls). Later, file A is added with
function foo() {}. Incremental index reindexes A only; B isn't in
changed_paths, so B's bare-name foo() is never re-resolved. Edge
caller_b → foo permanently missing until full rebuild.
Direction B — callee removed: same setup but A is deleted. Cascade
delete on target_id FK strips B's edge to A.foo automatically; B isn't
in delete_paths, so Phase 2 doesn't re-extract it. If A is then re-added
later, B has neither a stale edge nor a way to know it should re-resolve.
The fix (schema v8)
New pending_unresolved_calls table buffers REL_CALLS that Phase 2 can't
resolve at extraction time, plus inbound REL_CALLS edges Phase 0 is about
to cascade-strip. The post-Phase-2 sweep promotes pending rows to real
edges as soon as a same-language target appears.
- (source_id REFERENCES nodes ON DELETE CASCADE, target_name, source_language, metadata) with unique index on the triple, which keeps inserts idempotent across repeated Phase 2 invocations.
- ON DELETE CASCADE makes the table self-cleaning: when caller B is reindexed (Phase 1 deletes its old nodes), pending rows for B's old source_ids drain automatically.
- Sweep scope is same-language only: cross-language is never promoted (the canonical false-positive class from feedback_edge_resolution_same_language.md).
- When multiple same-language candidates exist, the existing refine_ambiguous_targets applies (path-proximity + non-test preference), so dense-fanout cases don't regress dead-code precision.
Direction A wiring (commit d172cae): at Phase 2's REL_CALLS drop
point in pipeline.rs, instead of silent continue we
insert_pending_unresolved_call. End of index_files runs
resolve_pending_calls which builds name → [(node_id, language)] and
node_id → path maps from current DB state (one indexed SELECT, not
per-row), iterates pending rows, applies refine_ambiguous_targets
where ambiguous, inserts edges, drops rows.
Direction B wiring (commit 9c27739): Phase 0 in pipeline.rs now
resolves file_ids before delete_files_by_paths drops them, calls new
queries::get_inbound_calls_for_pending to fetch inbound REL_CALLS
edges from non-deleted files, and writes pending rows for each before
letting cascade fire. Same post-Phase-2 sweep then handles the resolution.
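The sweep described above can be pictured with a small in-memory sketch. This is not the code in pipeline.rs: the PendingCall struct, the sweep_pending function, and the "take the first same-language candidate" step are hypothetical stand-ins (the real sweep hands ambiguous cases to refine_ambiguous_targets and works against the database).

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for one pending_unresolved_calls row.
#[derive(Debug, Clone, PartialEq)]
struct PendingCall {
    source_id: i64,
    target_name: String,
    source_language: String,
}

/// Promote pending rows whose target name now has a same-language definition.
/// Returns (new edges as (source_id, target_id), rows that stay buffered).
fn sweep_pending(
    pending: &[PendingCall],
    nodes_by_name: &HashMap<String, Vec<(i64, String)>>,
) -> (Vec<(i64, i64)>, Vec<PendingCall>) {
    let mut edges = Vec::new();
    let mut remaining = Vec::new();
    for row in pending {
        // Only candidates in the caller's language qualify; cross-language
        // definitions are never promoted.
        let same_lang: Vec<i64> = nodes_by_name
            .get(&row.target_name)
            .map(|cands| {
                cands
                    .iter()
                    .filter(|(_, lang)| *lang == row.source_language)
                    .map(|(id, _)| *id)
                    .collect()
            })
            .unwrap_or_default();
        // Assumption: take the first candidate. The real pipeline would
        // disambiguate with refine_ambiguous_targets here.
        match same_lang.first() {
            Some(&target_id) => edges.push((row.source_id, target_id)),
            None => remaining.push(row.clone()),
        }
    }
    (edges, remaining)
}
```

With a Rust definition of foo in the map, a pending Rust call to foo becomes an edge while a pending TypeScript call to the same name stays buffered, which is the behavior test_pending_unresolved_call_does_not_cross_language pins.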
Migration: SCHEMA_VERSION 7 → 8. INDEX_VERSION unchanged — existing edges remain valid; the pending table starts empty and fills naturally on next index pass. Migration is transactional (matches the pattern of every prior migration). Existing v0.18.1 DBs auto-upgrade on first open.
Test coverage (5 new pending-resolution tests + 1 migration test):
- test_pending_unresolved_call_resolves_when_callee_added_later (direction A round-trip)
- test_pending_buffers_on_callee_file_deletion (direction B round-trip: edge → delete → buffered → re-add → edge restored)
- test_pending_unresolved_call_does_not_cross_language (TS pending vs Rust definition stays buffered; cross-language refused)
- test_pending_resolves_multiple_calls_in_same_caller (3 undefined calls in one caller → 3 pending rows → all drain on single sweep)
- test_pending_cascade_deletes_when_caller_file_reindexed (load-bearing schema FK behavior; explicit guard so a future migration weakening the FK fails loudly here)
- test_v7_to_v8_migration_adds_pending_table (asserts table + both indexes after v7-shape DB opened via Database::open)
Bonus: 2 new plugin-script test files
- scripts/sync-versions.test.js: 4 tests, fixture-copy strategy, locks release-tooling contract (feedback_version_sync.md). Includes (9 files updated) count assertion to catch silent target drops.
- claude-plugin/scripts/mcp-launcher.test.js: 3 tests, end-to-end MCP initialize via launcher + 2 static-grep checks for plugin-env isolation (feedback_plugin_env_isolation.md) and macOS quarantine hint surface.
Test count: 272 default-features (was 265 in v0.18.1), 265 no-default-features (was 261), 182 JS tests (was 175). All clippy 1.95 clean on both feature profiles.
Compatibility: All 16 MCP tool schemas unchanged. CLI flags unchanged. Output JSON unchanged (no shape additions). Schema migration is transparent to consumers — query results match v0.18.1 plus previously-missing edges that should have been there all along.
Three additive improvements to MCP tool surfaces, no breaking changes. All output-shape changes are strictly additive — non-truncated / non-edit-aware paths return the exact prior shape.
1. Query-time freshness for file_path-aware tools (commits 30678d6,
82f1526).
When an MCP tool receives an explicit file_path argument, the agent is
signaling "I just edited this; please answer against the current bytes."
The 30s last_incremental_check debounce in the server is too coarse for
tight Edit→search loops — agents would see pre-edit call edges right after
saving a file.
- New pipeline::ensure_file_indexed(db, root, rel_path, model): single-file hash-compare reindex, no-op when on-disk hash matches stored hash. Drops stale rows when the file is gone. Skips files we wouldn't index in the first place (binary / unrecognized language). Cross-file dirty-edge handling mirrors run_incremental_index (collect dirty node IDs BEFORE re-indexing so cascade delete doesn't strip the context-string regeneration target set).
- New McpServer::ensure_file_fresh_opt(path): server-side wrapper that's a no-op on read-only secondaries, on missing/empty/directory paths, and when the embedding lock is contended. Invalidates project_map and module_overview caches only when a reindex actually fired.
- Wired into 6 file_path-aware tools: get_call_graph, get_ast_node, module_overview, find_references, dependency_graph, impact_analysis. Agents no longer have to remember which tools auto-refresh and which don't.
test_ensure_file_indexed_picks_up_post_edit_changes covers the no-op /
post-edit-pickup / repeat-no-op / file-deleted paths.
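The four paths that test covers can be read as one decision over two hashes. The sketch below is an illustration only: FreshnessAction and decide_freshness are hypothetical names, the hashes are passed in as plain strings because the hashing scheme is not stated here, and the order in which "file gone" and "not indexable" are checked is an assumption.

```rust
/// Hypothetical outcome of a single-file freshness check.
#[derive(Debug, PartialEq)]
enum FreshnessAction {
    NoOp,          // stored hash matches disk, or nothing to do
    Reindex,       // file changed or was never indexed
    DropStaleRows, // file is gone but rows still exist
    Skip,          // binary / unrecognized language
}

/// stored_hash: hash recorded at last index (None if never indexed).
/// disk_hash:   hash of current bytes (None if the file no longer exists).
/// indexable:   whether the indexer would accept this file at all.
fn decide_freshness(
    stored_hash: Option<&str>,
    disk_hash: Option<&str>,
    indexable: bool,
) -> FreshnessAction {
    match (disk_hash, stored_hash) {
        (None, Some(_)) => FreshnessAction::DropStaleRows,
        (None, None) => FreshnessAction::NoOp,
        (Some(_), _) if !indexable => FreshnessAction::Skip,
        (Some(disk), Some(stored)) if disk == stored => FreshnessAction::NoOp,
        (Some(_), _) => FreshnessAction::Reindex,
    }
}
```

Calling it twice with the same post-edit hash gives Reindex and then, once the stored hash has been updated, NoOp, which corresponds to the post-edit-pickup and repeat-no-op steps.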
2. Call-graph truncation provenance (commit fd168fd).
The recursive CTE in get_call_graph silently caps at depth 10 and 200
rows. Agents reading partial results couldn't tell when truncation fired
vs. when the graph genuinely ended — a common failure mode for "who calls
X?" on hot functions where the real answer is "200+ across the codebase,
you're seeing a slice."
- graph::query::CallGraphResult wraps Vec<CallGraphNode> with limit_hit / depth_capped / effective_max_depth / requested_max_depth flags.
- CALL_GRAPH_MAX_DEPTH (10) and CALL_GRAPH_ROW_LIMIT (200) are now public constants: a single source of truth (was a magic number in two places).
- query_direction returns (nodes, limit_hit) so the direction="both" merge can OR-combine saturation across both call directions.
- New JSON fields appear only when truncation fires: limit_hit, depth_capped, effective_max_depth, requested_max_depth, truncation_warning. The warning text gives the agent a recovery move ("pick a leaf node_id and re-query from there, or narrow with file_path").
- Wired into MCP get_call_graph (incl. rollup branch) + trace_http_chain (call_chain_truncated flag per handler), and CLI code-graph-mcp callgraph / trace.
test_depth_capped_signal verifies clamp + flag wiring with a depth=99
request.
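A rough sketch of the clamp and flag wiring follows. The constant names and values and the four flag names come from the notes above; everything else is assumed for illustration: nodes are plain strings instead of CallGraphNode, the field types are guesses, and cap_rows and build_result are hypothetical helpers (how the real query detects row saturation is not stated).

```rust
pub const CALL_GRAPH_MAX_DEPTH: u32 = 10;
pub const CALL_GRAPH_ROW_LIMIT: usize = 200;

/// Simplified stand-in for graph::query::CallGraphResult.
#[derive(Debug)]
pub struct CallGraphResult {
    pub nodes: Vec<String>,
    pub limit_hit: bool,
    pub depth_capped: bool,
    pub effective_max_depth: u32,
    pub requested_max_depth: u32,
}

/// Hypothetical per-direction query result: rows capped at the row limit,
/// plus a flag saying whether the cap was reached.
pub fn cap_rows(mut rows: Vec<String>) -> (Vec<String>, bool) {
    let limit_hit = rows.len() > CALL_GRAPH_ROW_LIMIT;
    rows.truncate(CALL_GRAPH_ROW_LIMIT);
    (rows, limit_hit)
}

/// Merge both directions: depth is clamped once, saturation is OR-combined.
pub fn build_result(
    requested_max_depth: u32,
    callers: (Vec<String>, bool),
    callees: (Vec<String>, bool),
) -> CallGraphResult {
    let (mut nodes, callers_hit) = callers;
    let (callee_nodes, callees_hit) = callees;
    nodes.extend(callee_nodes);
    CallGraphResult {
        nodes,
        limit_hit: callers_hit || callees_hit,
        depth_capped: requested_max_depth > CALL_GRAPH_MAX_DEPTH,
        effective_max_depth: requested_max_depth.min(CALL_GRAPH_MAX_DEPTH),
        requested_max_depth,
    }
}
```

A depth=99 request through this sketch yields effective_max_depth 10 with depth_capped set, the same shape of signal that test_depth_capped_signal checks.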
3. CHARS_PER_TOKEN clarified as bytes/token + CJK regression test
(commit 6dc10ff).
The constant has always been used with s.len() (UTF-8 byte count in
Rust), not Unicode char counts. The historical name suggested otherwise
and tempted "fixes" to char-count, which would silently halve the CJK
budget — one CJK char = 3 bytes ≈ 1 BPE token, so bytes/3 ≈ chars ≈ tokens (accidentally correct under the bytes interpretation).
- Doc on CHARS_PER_TOKEN now explains ASCII vs CJK behavior and the conservative-overestimation property that makes earlier-fire compression the safe error direction.
- estimate_tokens local rename total_chars → total_bytes to match.
- test_estimate_tokens_cjk_byte_based: 1000 CJK chars (3000 bytes) must estimate ~1000 tokens; ASCII 1000 chars (1000 bytes) must estimate lower, confirming the divisor is bytes-based. Regression guard against someone "fixing" the estimator to char-count.
No behavior change in this commit — doc + test only.
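The difference between the two interpretations is easy to see in a few lines. This is a sketch, not the project's estimate_tokens: the divisor value 3 is an assumption inferred from the "3000 bytes ≈ 1000 tokens" figure above, and both function names are made up for the comparison.

```rust
/// Assumed divisor; the real CHARS_PER_TOKEN value is not quoted in these notes.
const BYTES_PER_TOKEN: usize = 3;

/// Byte-based estimate: str::len() is the UTF-8 byte length in Rust.
fn estimate_tokens_bytes(s: &str) -> usize {
    s.len() / BYTES_PER_TOKEN
}

/// The tempting "fix": count Unicode scalar values instead of bytes.
fn estimate_tokens_chars(s: &str) -> usize {
    s.chars().count() / BYTES_PER_TOKEN
}
```

For 1000 copies of a 3-byte CJK character, the byte-based version estimates 1000 tokens while the char-based version drops to 333, so switching to char counts would underestimate CJK text. For 1000 ASCII characters both give 333.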
Test count: 265 default-features (was 264), 258 no-default-features (was 257). 3 new tests across the three changes.
Compatibility: All 16 MCP tool schemas unchanged. CLI flags unchanged. Output JSON additive only. Zero breaking changes for plugins or downstream consumers.
Two changes driven by the v0.17.3 30-day usage audit. The audit found
that all 728 code-graph MCP/CLI calls in 30 days came from the
plugin's own repo — frontend / non-Rust workflows had zero coverage
in the routing benchmark, so we couldn't tell whether tool descriptions
activated for them. It also found that project_map was being invoked
~11 times/30d via MCP after SessionStart had already injected the
same map at boot — pure redundancy.
1. routing_bench frontend domain (tests/routing_bench.rs).
Adds a second 20-query oracle (FRONTEND_ORACLE) covering the same 7
core tools with JS/TS/Vue/React phrasing (component / hook / Promise /
useEffect / Redux dispatch). Selectable via new env:
- ROUTING_BENCH_DOMAIN=backend (default; keeps v0.17.2/v0.17.3 baselines comparable; runs only the original 22-query Rust pool).
- ROUTING_BENCH_DOMAIN=frontend: runs only FRONTEND_ORACLE.
- ROUTING_BENCH_DOMAIN=all: both pools (42 q), with separate Backend recall / Frontend recall buckets in the report so frontend regressions don't hide behind backend wins.
The bench helpers (compute_recall, compute_overall, build_oracle)
were refactored to accept an oracle slice instead of hardcoding ORACLE,
so the same scoring path covers both domains. oracle_well_formed
guards backend coverage; new frontend_oracle_well_formed and
frontend_oracle_distinct_from_backend guard the frontend pool.
15 new tests added (42 total, was 27); no API key required for any of
them. Test count: cargo test --test routing_bench 42 passed,
1 ignored (the API-gated routing_recall_benchmark itself).
First frontend baseline (sonnet-4.5, 3-run majority vote, domain=all mode=context-rich, ~$0.80/run):
- Backend recall: 22/22 = 100% (was 21/22 in v0.17.2; the historic EmbeddingModel struct definition miss did not recur this run; v0.17.3's description tightening of get_ast_node and semantic_code_search appears to have stuck on sonnet-4.5).
- Frontend recall: 19/20 = 95.0%.
- FP-rate: 0/10 = 0%.
- Overall: 51/52 = 98.1%.
The single frontend miss is "List all React components in src/components/" →
routes to module_overview (3-run unanimous), expected ast_search. This
is borderline-by-design: the query contains a module-path prefix
(src/components/) which triggers the v0.17.0 description rule "if module
path is known, prefer module_overview" — the same rule guarded by backend
ORACLE's "How does the embedding pipeline work in src/embedding/?"
regression case. Two valid routings; model picked the path-anchored one.
Future regression gate: frontend recall ≥ 19/20.
Conclusion: frontend pool achieves near-backend recall with vanilla
sonnet — tool descriptions already activate on JS/TS/Vue/React
vocabulary. The "frontend project shows zero MCP calls" observation
from the usage audit is workflow/install shortfall (the audited project
hadn't enabled the plugin in .mcp.json), not a routing-description
failure.
2. project_map description: explicit dedup hint (src/mcp/tools.rs).
Description rewritten from
"Project architecture map. Use when: starting work on unfamiliar code, finding which module owns functionality, or needing cross-module dependency overview."
to
"Project architecture map. SessionStart hook already injects this at boot. Call only if structure changed mid-session: major refactor, rebuild-index, or many new modules."
170 bytes, fits the 200-byte per-description cap asserted by
mcp::tools::tests::test_descriptions_are_concise.
Re-bench (same methodology) post-description-change: zero regression. Backend 22/22, Frontend 19/20, FP 0/10, Overall 51/52 unchanged. Same single frontend miss. Conclusion: explicit "do-not-call-redundantly" framing in tool descriptions is regression-safe and reusable for any other MCP tool that already has SessionStart-hook coverage.
Expected impact: ~11 redundant project_map MCP calls/month
eliminated (~33K tokens/month saved) without any routing precision
trade-off. Will be visible in the next 30-day window's
code-graph-mcp stats tools.project_map.n count.
Bench-driven fix on published tool descriptions for the "named-symbol queries leak to semantic_code_search" boundary.
Background. v0.17.2's context-rich bench (haiku-4.5 stress test)
surfaced 3 systematic misses: EmbeddingModel struct definition,
weighted_rrf_fusion signature, format_call_graph_response implementation — all routing to semantic_code_search instead of
get_ast_node. The MEMORY.md hook already had 看 X 源码/签名 → get_ast_node but weak models ignored it at tool-selection.
Diagnosis. semantic_code_search's description led with "Search
code by concept" with no explicit handoff to get_ast_node for
named-symbol queries. v0.17.0 added analogous redirects for
semantic_code_search → module_overview and find_references → Grep;
the named-symbol boundary was the missing one.
Fix. Two description edits in src/mcp/tools.rs:
- semantic_code_search (197 chars): rewritten to "Concept search when no symbol/module is named. If a symbol is named (e.g., 'show X struct'), use get_ast_node; if module path is known, use module_overview. Use when grep is noisy."
- get_ast_node (200 chars): "Inspect ONE named symbol: signature, full source, optional references/impact. Use when: query names a symbol asking for its definition/body/signature/implementation. PREFER over semantic_code_search."
Both fit the project's 200-char-per-description cap (asserted by
mcp::tools::tests::test_descriptions_are_concise). Tighter
example-list patterns were tested first but exceeded the cap.
Bench results (3-run majority vote on each model):
- Sonnet 4.5 context-rich: 22/22 / 0/10 / 32/32 = 100% pre-fix and post-fix. Zero regression.
- Haiku 4.5 context-rich: 19/22 → 20/22 (Recall 86.4% → 90.9%, Overall 90.6% → 93.8%). weighted_rrf_fusion signature recovered to get_ast_node. Two queries still miss (EmbeddingModel struct, format_call_graph_response implementation); they need stronger anchor patterns than fit in the 200-char budget and are tracked as a follow-up.
Iteration history (recorded for future tuning). A higher-budget
description with three named example phrasings ('show X struct
definition', 'signature of Y', 'implementation of Z') recovered
EmbeddingModel struct on haiku but exceeded the 200-char cap and
caused a List all structs in storage module → module_overview
regression on the same model. Compressing to fit the cap dropped
the EmbeddingModel recovery but eliminated the regression. Net
haiku improvement: +1 query.
Drift test still passes — INDEX_LINE_MIRROR byte-equal to
adopt.js (no MEMORY.md hook change in this commit). Per-tool
descriptions are LLM-visible metadata (L3 published surface);
content is description-only with bench-verified outcomes on both
strong and weak models.
Adds a measurement capability the existing bench architecture lacked:
grading the MEMORY.md hook line quality (added in v0.17.1). The
existing tests/routing_bench.rs only consumed tool descriptions;
it could not detect routing changes from MEMORY.md, the adoption-
memory file, or MCP instructions. Stage 3 hook content tuning
needed an oracle.
ROUTING_BENCH_MODE=context-rich mode. Adds:
- INDEX_LINE_MIRROR Rust constant + drift-detection test that spawns Node and asserts byte-equality with adopt.js's INDEX_LINE export. Drift fails on every cargo test.
- Decoy Grep and Read tools added to the API call's tools array (descriptions calibrated with "Prefer over code-graph" anchors to be measurement-fair).
- 10-entry FP_ORACLE of strict-boundary queries (literal text, file reads by path, doc/config content) that should route to decoys, not code-graph.
- 3-run majority-vote aggregation per query (tie-break: first run); applied to both modes.
- Three reported metrics: Recall (out of 22 ORACLE), FP-rate (out of 10 FP_ORACLE), Overall (out of 32, loose summary).
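The 3-run aggregation with a first-run tie-break could look roughly like this. The function name and signature are hypothetical, not taken from tests/routing_bench.rs; each element of runs is the tool name one run picked for the query.

```rust
/// Return the tool picked most often across runs. A later candidate must be
/// strictly more frequent to displace the first run's pick, so with three
/// runs a three-way split resolves to the first run.
fn majority_vote<'a>(runs: &[&'a str]) -> Option<&'a str> {
    let first = *runs.first()?;
    let mut best = first;
    let mut best_count = runs.iter().filter(|&&r| r == first).count();
    for &cand in runs {
        let count = runs.iter().filter(|&&r| r == cand).count();
        if count > best_count {
            best = cand;
            best_count = count;
        }
    }
    Some(best)
}
```

For three runs this matches the stated rule exactly: two agreeing runs win, and a three-way split falls back to run 1. For larger run counts the strict comparison favors whichever tied candidate appears earliest, which is a choice of this sketch rather than something the notes specify.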
temperature: 0 added to both backends (Anthropic + OpenRouter)
in tool-only mode too. Pre-existing latent ±3-5pp single-run noise
masked Stage 3-level differences. Reproducible from this version on.
First baselines (2026-04-30, OpenRouter anthropic/claude-sonnet-4.5):
- Tool-only: 21/22 = 95.5% (178s, 60 calls). Same residual miss as v0.16.7 (Show me the EmbeddingModel struct definition → ast_search instead of get_ast_node; pre-existing semantic borderline). Note: feedback_routing_bench.md had been tracking 19/20, which was against the 20-entry pre-v0.17.0 oracle; v0.17.0 added 2 regression-guard queries, so the real total is 22.
- Context-rich: Recall 22/22 = 100% · FP-rate 0/10 = 0% · Overall 32/32 = 100% (255s, 90 calls). The historically-stuck EmbeddingModel struct query routes correctly here: the MEMORY.md hook + Grep/Read decoys provide enough disambiguation context to flip it. Caveat: single bench run; Stage 3 will tell us how robust this is to hook-content variations.
Default mode unchanged. With ROUTING_BENCH_MODE unset or any
value other than context-rich, the bench behaves identically to
v0.17.1 except for temperature: 0 and 3-run aggregation. The
oracle_well_formed and index_line_drift_check tests run on
every cargo test; the live benchmark stays #[ignore]'d.
Single-file structural fix. claude-plugin/scripts/adopt.js
INDEX_LINE changes from an 11-line array.join('\n') block to a
single-line string, complying with the MEMORY.md spec ("each entry
should be one line"). The sentinel block written to
~/.claude/projects/<slug>/memory/MEMORY.md shrinks 11 lines → 1
line at next SessionStart per the v0.11.0 template-refresh
contract.
No behavior change. All 12 tool names (7 core + 5 hidden), all
6 中文 scene phrases (改 X 影响面 / 谁调用 X / X 被谁用 / 看 X 源码 /
Y 模块长啥样 / 概念查询), the 优先于 Grep anchor, and the
字面匹配走 Grep reverse signal are kept verbatim. New:
spec-canonical tag syntax
[impact, callgraph, refs, overview, semantic, ast-search, dead-code, similar, deps, trace]
for explicit keyword matching. Reduces always-loaded MEMORY.md
context by ~366 chars per session.
The adoption-memory detail file (plugin_code_graph_mcp.md) is
unchanged — it already holds the full decision table the
multi-line block was duplicating.
Bench scope. tests/routing_bench.rs only consumes tool
name + description + input_schema (verified at
tests/routing_bench.rs:224-233 + :50-52); it does not consume
MEMORY.md, the adoption-memory file, or MCP instructions. So it
cannot grade adoption-memory hook quality. A context-rich bench
(MEMORY.md in system prompt + Grep-decoy false-positive corpus) is
a separate change. Existing routing_bench is unaffected by this
PR.
Tests. cargo test 26/26, node --test claude-plugin/scripts/*.test.js 132/132,
adopt.test.js 43/43.
Two-part SessionStart context-cost reduction. The plugin used to inject
a ~2.3 KB project_map every session for un-adopted projects, plus a
1418-byte MCP instructions block packing 10 per-tool decision rules.
Both are redundant with what already exists: each tool's own
description carries its routing hint, and MEMORY.md → plugin_code_graph_mcp.md already holds the full decision table for
adopted projects. v0.17.0 cuts both, and tightens the two tool
descriptions whose phrasing demonstrably mis-routed in benchmarks.
1. SessionStart project_map injection: OFF by default.
Old contract (v0.9.0): adopted → quiet, un-adopted → noisy. The
assumption was that adoption installed the MEMORY.md decision table
so the dump only became redundant after adopt.
New contract: quiet unconditionally. The decision table + the
on-demand project_map MCP tool + the per-tool descriptions cover
every workflow that the SessionStart map dump used to support, so
paying ~2.3 KB of context per session is wasteful — even pre-adopt.
- CODE_GRAPH_VERBOSE_HOOKS=1 opts in to the dump (new).
- Legacy CODE_GRAPH_QUIET_HOOKS=0 (force noisy) / =1 (force quiet) still wins, preserving the v0.9.0 escape hatches.
- computeQuietHooks({ adopted, env }) accepts but ignores adopted; it is kept only to avoid breaking call-sites.
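The precedence among these switches can be written down as a small pure function. The plugin's computeQuietHooks is JavaScript; the Rust function below is only a sketch of the same ordering, and its name and parameters are hypothetical (each argument is the raw value of the corresponding environment variable, or None when unset).

```rust
/// Returns true when the SessionStart hook should stay quiet.
/// quiet_hooks:   value of CODE_GRAPH_QUIET_HOOKS, if set.
/// verbose_hooks: value of CODE_GRAPH_VERBOSE_HOOKS, if set.
fn compute_quiet_hooks(quiet_hooks: Option<&str>, verbose_hooks: Option<&str>) -> bool {
    // The legacy variable wins whenever it carries an explicit 0 or 1.
    match quiet_hooks {
        Some("0") => return false, // force noisy
        Some("1") => return true,  // force quiet
        _ => {}
    }
    // Otherwise quiet is the default unless the new opt-in is set to 1.
    verbose_hooks != Some("1")
}
```

Treating any other value of the legacy variable as "unset" is an assumption of this sketch.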
2. MCP instructions field trimmed 1418 → ~700 B.
The old noisy block packed all 10 routing rules with CLI aliases
inline. Compile-time guard was 1500 B, against an observed Claude
Code truncation cutoff of ~2048 B. The 10 rules now live in
per-tool description strings (where clients actually read them to
pick a tool) and in the adopted-project decision table.
What remains in instructions is the boundary signal — which 5
advanced tools are CLI-only (MCP integration can't call them by
name), what to still use Grep / Read for, and where to find the
adopted decision table.
3. Tool description tightening (LLM-visible).
- semantic_code_search now adds "If module path is known, prefer module_overview", which closes the "I know the path AND a concept word" ambiguity that previously routed to semantic search and burned a vector lookup.
- find_references now adds "For plain literals (string/regex), prefer Grep": find_references only tracks defined-symbol usage sites, not raw text. Before tightening it caught literal-string queries that should have gone to Grep.
4. routing_bench: +2 regression guards.
Two new oracle items in tests/routing_bench.rs directly probe the
tightened phrasings:
- "How does the embedding pipeline work in src/embedding/?" → expects module_overview (path > concept tie-breaker).
- "I need to rename parse_tree to parse_ast — find every place I'd update." → expects find_references (rename-audit intent preserved despite the new "prefer Grep for literals" line).
Verification. OPENROUTER_API_KEY=… cargo test --release --test routing_bench -- --ignored against anthropic/claude-sonnet-4.5
returned P@1 = 21/22 = 95.5%, up from baseline v0.16.7's
19/20 = 95.0%. Both new guards passed. The single miss is "Show me
the EmbeddingModel struct definition" routing to ast_search
instead of get_ast_node — pre-existing oracle item, semantically
defensible (ast_search returns nodes by name+kind), not introduced
by this release.
Audit-driven fixes after sandboxed end-to-end testing of the install,
adopt, update, and uninstall flows. Three real bugs surfaced that the
existing 97-test suite couldn't see because none of them tested the
real user path: npm uninstall, post-upgrade binary resolution, and
adopt-from-fresh-clone. Plus a parity sweep on the MCP↔CLI surface.
1. npm uninstall left dangling hooks in ~/.claude/settings.json.
The package shipped a full lifecycle.js uninstall that strips our
hook entries from settings.json — but nothing wired it to npm. After
npm uninstall -g @sdsrs/code-graph the package files were gone but
settings.json still pointed PostToolUse / SessionStart hooks at the
deleted scripts. Claude Code subsequently failed to fire those hooks
or surfaced ENOENT spam.
Fix: added "preuninstall": "node claude-plugin/scripts/lifecycle.js uninstall || true" to package.json. npm now invokes the existing
uninstall path before removing files. The || true ensures a
lifecycle failure never blocks the uninstall itself. Verified end-to-end
in a sandboxed HOME: settings.json hooks containing code-graph paths
get stripped; foreign hooks and otherKey configuration are preserved
byte-for-byte.
2. find-binary cache shadowed fresh npm update binaries. The
cache priority was: dev mode → auto-update cache (~/.cache/code-graph/bin/) → platform npm pkg. After npm update -g 0.16.7→0.16.8 the
platform-pkg binary was refreshed, but the auto-update cache still
held 0.16.7. find-binary returned the stale cache because it only
verified the binary was executable, never that the version matched.
Users kept running 0.16.7 until auto-update fired (up to 6h later).
Fix: when the auto-update cache hits, read its --version and
compare against the npm pkg version (require('../../package.json').version). Cache wins when cache.ver >= pkg.ver (legitimate case:
auto-update fetched a newer release than npm has shipped). Cache loses
when older — find-binary falls through to platform-pkg. Includes a
3-digit semver compare helper that tolerates short / non-numeric input.
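A tolerant three-component compare of the kind described can be sketched as follows. The plugin's helper lives in JavaScript; this Rust version only illustrates the logic. Both function names are hypothetical, and treating missing or non-numeric components as 0 is an assumption about what "tolerates" means.

```rust
/// Parse "major.minor.patch" into three numbers. Missing or non-numeric
/// components become 0 (assumed behavior); a leading 'v' is ignored.
fn parse_semver3(v: &str) -> [u64; 3] {
    let mut out = [0u64; 3];
    let trimmed = v.trim().trim_start_matches('v');
    for (slot, part) in out.iter_mut().zip(trimmed.split('.')) {
        // Keep only the leading digits so "8-beta" still parses as 8.
        let digits: String = part.chars().take_while(|c| c.is_ascii_digit()).collect();
        *slot = digits.parse().unwrap_or(0);
    }
    out
}

/// Cache wins when its version is at least the npm package version.
fn cache_wins(cache_ver: &str, pkg_ver: &str) -> bool {
    // Arrays compare lexicographically, i.e. major, then minor, then patch.
    parse_semver3(cache_ver) >= parse_semver3(pkg_ver)
}
```

Comparing numerically per component matters here: a plain string compare would rank 0.9.0 above 0.10.0.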
3. adopt couldn't bootstrap a fresh clone. The path required
~/.claude/projects/<slug>/memory/ to already exist (created by
Claude Code on first session that writes memory). Fresh-cloned project
with no memory dir → adopt errored no-memory-dir and told the user
to "run claude at least once". CI / scripted setup / first-time users
on a new project all hit the wall.
Fix: introduced a project-marker check (.git, Cargo.toml,
package.json, pyproject.toml, go.mod, pom.xml, build.gradle,
.code-graph). Memory dir missing AND cwd has any marker → mkdir -p
and proceed. No marker → return not-a-project with a clearer error
("cd into a real project before running adopt"). The slug-pollution
guard remains in place for /tmp / $HOME accidents.
Claude Code's slug encoding ([^a-zA-Z0-9-]→'-') is lossy: /foo/bar
and /foo bar resolve to the same memory dir. Two projects can
silently share state with no signal. Added: adopt writes
<!-- adopted-by: <abs-cwd> --> as the first line of
plugin_code_graph_mcp.md. Re-adopt from a different cwd surfaces
result.collisionWith and a stderr warning. needsRefresh's
bytewise compare strips the marker line first, so the marker doesn't
cause false-positive drift detection on every SessionStart.
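The lossy encoding and the marker check are both small enough to sketch. The functions below are hypothetical Rust stand-ins for the plugin's JavaScript: project_slug applies the stated character class per Unicode scalar value (an assumption for non-ASCII input), and collision_with parses the adopted-by marker format quoted above.

```rust
/// Replace every character outside [a-zA-Z0-9-] with '-'.
fn project_slug(abs_path: &str) -> String {
    abs_path
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '-' { c } else { '-' })
        .collect()
}

/// If the first line is an adopted-by marker naming a different cwd,
/// return that other cwd; otherwise None.
fn collision_with(first_line: &str, cwd: &str) -> Option<String> {
    let owner = first_line
        .trim()
        .strip_prefix("<!-- adopted-by: ")?
        .strip_suffix(" -->")?;
    if owner != cwd {
        Some(owner.to_string())
    } else {
        None
    }
}
```

Both /foo/bar and /foo bar slug to -foo-bar, so the second project to adopt finds the first project's marker and can report the collision.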
Drove every MCP tool against its CLI counterpart on the same query and compared output. Four real divergences fixed:
- hot_functions: CLI used callers/test_callers, MCP used caller_count/test_caller_count; CLI cap=15, MCP cap=10. Both now use caller_count/test_caller_count. CLI honors --compact for top-10 cap (matching MCP compact:true); default returns top-15 (the underlying SQL LIMIT 15).
- module_overview compact: MCP renamed caller_count → callers in compact mode but kept caller_count in full mode. Aligned both to caller_count.
- get_call_graph self-edge: CLI included the queried symbol itself with direction=callers AND direction=callees (count off by 2 for direction=both). MCP filtered depth > 0. CLI now filters seed in JSON output too. Human renderer keeps the seed for the tree root.
- project_map compact type field: MCP non-compact had type on each hot_function, compact dropped it. Both surfaces now keep type for parity.
Real-world friction observed in another project where Claude typed
code-graph-mcp project_map --compact (the MCP tool name) verbatim
into Bash and hit "Unknown subcommand: project_map". The MCP
instructions had Start: project_map --compact without the
parens-form CLI alias hint that the other 10 rules use. Two-layer fix:
- Fixed the instructions text: Start: project_map (map --compact) follows the existing MCP-name (CLI-alias) convention.
- Defense in depth: CLI dispatch now accepts MCP tool names directly. project_map / module_overview / get_ast_node / find_references / get_call_graph / impact_analysis / find_similar_code / dependency_graph / trace_http_chain / find_dead_code / ast_search / semantic_code_search all map to existing short-name handlers. code-graph-mcp project_map --compact now works. Typo suggester also learned the MCP names so project_mapp → project_map.
scripts/release-smoke.test.js gained a test, "auto-update parses real GitHub releases/latest shape", gated on CODE_GRAPH_AUTO_UPDATE_E2E=1. The
existing 10 auto-update unit tests are all mocked, so on their own they provide no
guardrail against GitHub API shape regression. Run once per release
to validate parseLatestRelease against the real payload.
- 165 node tests pass + 1 opt-in skip across 12 suites
- 391 cargo tests pass + 1 ignored (routing_bench needs API key)
- Sandbox lifecycle E2E: 16/16 pass with HOME-isolated mkdtemp (binary smoke / adopt / re-adopt / status / check / session-init / unadopt / residue audit, no orphan plugin file)
- A end-to-end: realistic settings.json with code-graph hook paths → lifecycle uninstall || true strips ours, preserves foreign hooks + otherKey
End-to-end usability pass: simulated a Claude Code session driving every MCP tool and CLI subcommand on real symbols. Five independent fixes for issues that surfaced — none blocking on their own, but each was eroding the trust-layer agents need to act on tool output.
1. callgraph rendered depth>1 nodes under the wrong parent. The
recursive CTE was collapsing duplicates with GROUP BY MIN(depth),
which lost the actual traversal parent and made every depth-N node
appear nested under the last depth-(N-1) sibling. So A→B→C plus
D→B printed as if D lived under A once B was already shown.
Fix: the CTE now tracks parent_id (the cg row that produced each
new node) on each inductive step, and dedup uses
ROW_NUMBER() OVER (PARTITION BY node_id ORDER BY depth) so the
shortest-path parent survives. CLI renderer builds a parent_id → children map per direction and recurses, so callers/callees subtrees
stay separate under --direction=both. JSON output now includes
parent_id (null for the root) for any consumer that wants to rebuild
the tree.
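The effect of the dedup and the renderer's child map can be shown without SQL. The sketch below is an in-memory analogue, not the CTE itself: Row, dedup_shortest, and children_by_parent are hypothetical names, and ties at equal depth keep the first row seen, which the notes do not specify.

```rust
use std::collections::{BTreeMap, HashMap};

/// One traversal row: which node was reached, from which parent, at what depth.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Row {
    node_id: i64,
    parent_id: Option<i64>, // None for the root
    depth: u32,
}

/// Keep one row per node_id: the one with the smallest depth, so the
/// shortest-path parent survives.
fn dedup_shortest(rows: &[Row]) -> Vec<Row> {
    let mut best: HashMap<i64, Row> = HashMap::new();
    for &row in rows {
        best.entry(row.node_id)
            .and_modify(|cur| {
                if row.depth < cur.depth {
                    *cur = row;
                }
            })
            .or_insert(row);
    }
    let mut out: Vec<Row> = best.into_values().collect();
    out.sort_by_key(|r| (r.depth, r.node_id));
    out
}

/// Build the parent_id → children map a tree renderer can recurse over.
fn children_by_parent(rows: &[Row]) -> BTreeMap<i64, Vec<i64>> {
    let mut map = BTreeMap::new();
    for r in rows {
        if let Some(parent) = r.parent_id {
            map.entry(parent).or_insert_with(Vec::new).push(r.node_id);
        }
    }
    map
}
```

If node 3 is reached at depth 2 via node 2 and again at depth 3 via node 5, only the depth-2 row is kept, so node 3 renders under node 2 rather than under whichever sibling happened to come last.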
2. similar and deps violated the --json empty-result contract.
Both subcommands had paths that wrote nothing to stdout and exited
with stderr only — breaking machine consumers per
feedback_cli_json_empty_contract. Added: similar --json writes
[] when vector search returns no neighbors; deps --json writes a
JSON error object {"file":..., "depends_on":[], "depended_by":[], "error":"..."} when the file has no tracked imports. Two new
regression tests guard these paths.
Bonus: similar 1010 (digits as positional) used to print the
unhelpful "Symbol not found: 1010". Now nudges toward
similar --node-id 1010. And similar with an existing symbol that
hasn't been embedded yet ("No embedding for node_id 342") explains
why: "(1033/1321 nodes embedded — embeddings still generating; try again shortly or pick a node with --node-id from `show X`)".
3. MCP tool descriptions misled agents on subtle defaults. Two tools had descriptions that didn't match their actual behavior, so agents made decisions on stale info:
- module_overview: caller counts include test callers, but the description didn't say so; agents reading "5 callers" couldn't tell if a function was prod-hot or only test-driven. Description now states "callers count includes tests" so the LLM picks a different tool when it actually needs prod-only callers.
- find_references: for constants, only imports edges are recorded; usage sites where the const is read don't appear because Rust grammar emits them as identifiers without an import-context. Description now says "consts: imports only, not value-uses" so the agent escalates to grep when auditing a const for rename.
Also added one line to the MCP instructions payload telling the
agent that impact_analysis/find_dead_code/find_similar_code/
dependency_graph/trace_http_chain are CLI-only after the v0.10.0
core/advanced split — Claude Code only sees the 7 core tools, so
agents trying to invoke the advanced 5 directly via MCP would 404.
4. E2E suite was passing on dead queries. scripts/e2e-validate.js
called get_call_graph(handle_call_tool), impact_analysis(handle_call_tool), and dependency_graph(src/mcp/server.rs);
all three symbols/paths had been renamed/moved sessions ago. The
assertions only checked "response contains non-empty text", so
"[code-graph] Symbol not found: handle_call_tool" passed as
success. 24/24 green, but actually testing zero-result paths. Real
response sizes told the story: get_call_graph 221 bytes (now 2628),
impact_analysis 220 bytes (now 498), dependency_graph 304 bytes
(now 2291).
Fix: swapped the queries to stable hot symbols (handle_message,
conn, src/mcp/server/mod.rs) and added two stricter assertions:
assertNotEmptyResult(resp, label) rejects 6 known empty-result
patterns ("Symbol not found", "No callers found", etc.); the MCP
dependency_graph returns JSON, not the human "Depends on" text, so
its assertion now JSON.parses and checks depends_on is a non-empty
array.
5. dead-code falsely flagged Criterion benchmarks as orphan.
benches/indexing.rs defines three bench functions, all referenced
only via criterion_group!(benches, bench_full_index, ...). The AST
relation extractor doesn't parse macro arguments as references, so
the benches showed up as ORPHAN every time — drowning out the four
real EXPORTED-UNUSED results worth attention.
Fix: added benches/ to domain::default_dead_code_ignores(),
mirroring the existing claude-plugin/ exclusion for shell-invoked
hook scripts. The rule generalizes: any directory whose entry points
are reached through tokens the AST can't resolve (macro arguments,
shell command strings, settings.json hook definitions) belongs in
the default ignore list. CLI --no-ignore still surfaces them. New
unit test pins the policy.
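The ignore policy above amounts to a path-prefix check. The following is a minimal sketch rather than the project's code: the helper name is_dead_code_ignored and the way --no-ignore is passed in are assumptions, and only the benches/ and claude-plugin/ entries come from the text.

```rust
/// Directories whose entry points are reached through tokens the AST
/// cannot resolve (macro arguments, shell command strings).
/// Only these two entries are taken from the changelog.
const DEFAULT_DEAD_CODE_IGNORES: &[&str] = &["benches/", "claude-plugin/"];

/// Hypothetical helper: true when `rel_path` should be hidden from
/// dead-code output. `no_ignore` models the CLI `--no-ignore` flag.
fn is_dead_code_ignored(rel_path: &str, no_ignore: bool) -> bool {
    if no_ignore {
        return false;
    }
    DEFAULT_DEAD_CODE_IGNORES
        .iter()
        .any(|prefix| rel_path.starts_with(prefix))
}
```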
Together these don't change any external schema, but they materially improve the signal an agent gets per tool call — fewer phantom orphans, a callgraph tree that reads like one, and an E2E suite that actually fails when a hot symbol moves.
Reported on a fresh /plugin install code-graph-mcp on another
machine: MCP couldn't connect, the binary was nowhere to be found.
Triage found three independent breakages along the launcher chain;
each is fixed and tested separately so the chain is fault-tolerant
on first install.
1. find-binary.js: didn't search npm global node_modules.
require.resolve('@sdsrs/code-graph-{platform}-{arch}/package.json')
only walks the node_modules chain rooted at the requiring file —
it does NOT search global installs, because nvm and standard Unix
prefixes don't set NODE_PATH. So a working npm install -g @sdsrs/code-graph-linux-x64 was previously invisible to the
launcher even when the binary was sitting at
~/.nvm/.../lib/node_modules/@sdsrs/code-graph-linux-x64/code-graph-mcp.
Fix: new globalNodeModulesCandidates() probes 4 prefix
sources — process.execPath-derived (Linux/macOS:
<prefix>/lib/node_modules; Windows: next to node.exe),
NPM_CONFIG_PREFIX env, ~/.npm-global/lib/node_modules, and
npm root -g (last resort, ~50-200ms). New findPlatformBinary()
combines fast-path (require.resolve) + slow-path (global probe).
2. auto-update.js: trusted state file over filesystem. When
installedVersion === latestVersion, checkForUpdate short-circuited
to the no-update branch without verifying that
~/.cache/code-graph/bin/code-graph-mcp actually exists. Once the
state file recorded "installed v0.16.6", a wiped cache or a
silently-failed prior download would never be repaired. Real-world
artifact: update-state.json says "Up to date" while the cache
directory is empty.
Fix: new downloadBinary() helper extracted from
downloadAndInstall so the binary download can run in either
context. Throttle bypassed when cache binary is missing (a hard
failure overrides the 6h check window). No-update branch
self-heals by calling downloadBinary(latest) when binary is
absent. cachedBinaryPath() exported for test harnesses.
3. mcp-launcher.js: only one fallback strategy. When
findBinary() returned null, the launcher tried npm install -g @sdsrs/code-graph once and gave up if that didn't yield a binary.
But npm's optionalDependencies failure mode is to silently
accept partial installs (an OS-mismatch tolerance feature that
also masks transient registry/network errors), so the wrapper
package would install successfully while the platform binary
package was dropped.
Fix: second-stage fallback runs auto-update.js --silent
which downloads the platform binary directly from the GitHub
release into ~/.cache/code-graph/bin/. Bypasses npm registry
entirely. Final error message also names the platform-specific
package (@sdsrs/code-graph-{platform}-{arch}) for manual
recovery.
Tests: 7 new (find-binary.test.js × 4 covering candidate
derivation + dedup + integration; auto-update.test.js × 3
covering cachedBinaryPath + downloadBinary null safety).
117 plugin JS + 385 Rust = 502 total green.
Two MCP tool UX bugs surfaced during a user-simulation pass over the core 7 toolset on this very repo:
semantic_code_search: README headings outranked code. Query
merkle tree change detection returned README.md License
(h2, 0.45) / Features (h2, 0.44) / Build (h3, 0.42) ahead
of DirectoryCache struct in src/indexer/merkle.rs (0.37).
Root: markdown heading nodes get respectable vector-similarity
scores for unrelated queries (short heading text embeds close
to many concepts), and the re-ranker (name_boost /
size_factor) had no doc-tier preference. The tool is
semantic_code_*search*; for code-intent queries, prose should
not dominate.
Fix (src/mcp/server/tools.rs:193-209): doc_penalty = 0.4
multiplier applied when the candidate's language is markdown
AND the caller did not pass language="markdown". Same query
after fix: TOP 6 all from merkle.rs / watcher.rs, first
result DirectoryCache rose to 0.60. Explicit
language="markdown" bypasses the penalty (verified
Installation h2 comes back at 0.59 for "installation
instructions" queries).
find_references: no test-filter opt-out. upsert_file
query returned 27 references, 24 of them test_* callers,
drowning the 3 production usage sites. Inconsistent with
get_call_graph and get_ast_node include_impact=true, which
already default to hiding test callers.
Fix: new include_tests boolean parameter (default true
to preserve rename-audit semantics — tests ARE usage sites),
plus test_references_filtered count in the response when
callers opt out. Schema published in src/mcp/tools.rs:131.
Call with include_tests=false to get production-only refs;
call without the flag (or true) for the pre-v0.16.6
behavior.
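Both fixes above are small filters, sketched here under stated assumptions. The function names, the Reference struct, and the exact point where the penalty enters the re-ranker are hypothetical; only the 0.4 multiplier, the language="markdown" bypass, the include_tests flag, and the filtered count come from the text.

```rust
/// Multiplier from the changelog; applied to markdown candidates only.
const DOC_PENALTY: f64 = 0.4;

/// Hypothetical re-rank step: down-weights markdown hits unless the
/// caller explicitly asked for language="markdown".
fn apply_doc_penalty(score: f64, candidate_language: &str, requested_language: Option<&str>) -> f64 {
    let caller_wants_docs = requested_language == Some("markdown");
    if candidate_language == "markdown" && !caller_wants_docs {
        score * DOC_PENALTY
    } else {
        score
    }
}

/// Hypothetical reference record; `is_test` marks test callers.
struct Reference {
    name: String,
    is_test: bool,
}

/// Returns the references to show plus the number of test references
/// that were filtered out (0 when `include_tests` is true).
fn filter_references(refs: Vec<Reference>, include_tests: bool) -> (Vec<Reference>, usize) {
    if include_tests {
        return (refs, 0);
    }
    let total = refs.len();
    let kept: Vec<Reference> = refs.into_iter().filter(|r| !r.is_test).collect();
    let filtered = total - kept.len();
    (kept, filtered)
}
```

If the multiplier were applied directly to the 0.45 heading score quoted above, the heading would drop to 0.18, below the 0.37 struct hit.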
Three impact-analysis paths (cmd_impact, tool_impact_analysis,
append_impact_summary) each maintained their own inline list of
"non-function" node types to flag as UNKNOWN. The lists had drifted:
two only matched struct|class|enum|interface|type_alias (missing
constant and trait), and append_impact_summary — the path
reached by the core-7 get_ast_node include_impact=true that Claude
Code actually uses — had no type check at all.
Symptom: code-graph-mcp impact REL_CALLS returned
risk_level: LOW, 0 callers even though 16 importers touch the
constant. An LLM acting on that signal would confidently change the
string and break every importer.
Fix (src/domain.rs): single source of truth
is_function_node_type() + NON_FUNCTION_IMPACT_WARNING constant.
All three paths share them. Non-function symbols with zero call-graph
callers now return risk_level: UNKNOWN plus an explicit warning
directing to find_references / code-graph-mcp refs <symbol>.
Function / method impact behavior is unchanged; HIGH/MEDIUM/LOW
still flow from compute_risk_level as before.
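A minimal sketch of the shared check, assuming that "function" and "method" are the only node types with call-graph callers. The warning text, the RiskLevel enum, and the impact_risk wrapper are hypothetical; the changelog only names is_function_node_type, NON_FUNCTION_IMPACT_WARNING, and compute_risk_level.

```rust
/// Placeholder wording; the real warning text is not shown in the changelog.
const NON_FUNCTION_IMPACT_WARNING: &str =
    "call-graph impact is not meaningful for this symbol; use find_references";

/// Assumption: only these two node types carry call-graph callers.
fn is_function_node_type(node_type: &str) -> bool {
    matches!(node_type, "function" | "method")
}

#[derive(Debug, PartialEq)]
enum RiskLevel {
    Low,
    Medium,
    High,
    Unknown,
}

/// `computed` stands in for whatever compute_risk_level returned.
/// Non-function symbols with zero callers are reported as Unknown.
fn impact_risk(
    node_type: &str,
    caller_count: usize,
    computed: RiskLevel,
) -> (RiskLevel, Option<&'static str>) {
    if !is_function_node_type(node_type) && caller_count == 0 {
        (RiskLevel::Unknown, Some(NON_FUNCTION_IMPACT_WARNING))
    } else {
        (computed, None)
    }
}
```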
v0.16.3 canonicalized the watcher root on every platform to fix
macOS FSEvents; on Windows that regressed the watcher because
std::fs::canonicalize there returns UNC paths (\\?\C:\...) while
the ReadDirectoryChangesW backend emits plain C:\... — the same
strip_prefix silently-drop-all-events failure as before, mirrored.
The canonicalize step is now cfg-gated to non-Windows only.
Windows Release workflow (build + npm publish + smoke test) was always green because the watcher unit tests don't run there; this only surfaced on the CI matrix.
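The platform gate can be sketched as below. This is an illustration only: watch_root is a hypothetical name, the fallback to the original path on error is an assumption, and the sketch uses the cfg! macro (a compile-time constant) where the real code may use a #[cfg(not(windows))] attribute.

```rust
use std::path::{Path, PathBuf};

/// Returns the path to register with the file watcher.
/// On Windows the root is left untouched so it matches the plain
/// C:\... paths the backend emits; elsewhere it is canonicalized so it
/// matches realpath-style event paths.
fn watch_root(root: &Path) -> PathBuf {
    if cfg!(windows) {
        root.to_path_buf()
    } else {
        // Assumption: fall back to the given path if canonicalize fails.
        std::fs::canonicalize(root).unwrap_or_else(|_| root.to_path_buf())
    }
}
```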
Follow-up to v0.16.2. After the path-normalization fixes landed,
Windows CI turned green but the two macOS watcher tests still
timed out. Root cause: FSEvents emits every event path via realpath,
so a watch registered on a non-canonical root like
/var/folders/xx/T/foo (the tempfile::TempDir default on macOS)
could never produce a prefix match against realpath output
/private/var/folders/... — every event was silently dropped at
strip_prefix.
Fix (src/indexer/watcher.rs): FileWatcher::start canonicalizes
the root path before passing it to notify. No-op on systems without
symlinks in the path; unblocks macOS CI and also hardens production
against project roots with symlinked ancestors (home-dir on systems
where /home is a symlink to /usr/home, chrooted containers, etc.).
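The silent drop is reproducible with plain path arithmetic. In this sketch, relative_event_path is a hypothetical stand-in for the watcher's filter; the paths mirror the example above.

```rust
use std::path::Path;

/// An event survives only if its path sits under the registered root.
/// A failed strip_prefix means the event is dropped.
fn relative_event_path<'a>(root: &Path, event_path: &'a Path) -> Option<&'a Path> {
    event_path.strip_prefix(root).ok()
}
```

With the non-canonical root the call returns None for every realpath-style event; with the canonical root it returns the project-relative path.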
Follow-up to v0.16.1. That release fixed Clippy on the 1.95 toolchain,
which let the Test step run for the first time on macOS and Windows
in this repo's CI matrix — and immediately surfaced a set of
pre-existing cross-platform bugs the previous red baseline had been
hiding. v0.16.2 addresses them.
Path normalization (fixes Windows runtime + tests):
- src/indexer/merkle.rs: new internal normalize_rel_path(&Path) helper converts \ to / on Windows. All relative paths that land in the DB, CLI/MCP output, and gitignore-prefix checks now use / on every platform. Without this, starts_with(".git/") style filters only fired when the OS used /, and Windows users saw pkg\scripts\foo.js in every tool response.
- src/indexer/watcher.rs: notify events go through the same normalizer before emission.
- Fixes 4 pipeline tests and 2 merkle tests that were red on windows-latest in v0.16.1 CI.
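A minimal sketch of such a normalizer, assuming lossy UTF-8 conversion is acceptable for relative paths. The body is a guess at the shape; only the name normalize_rel_path and the \ to / conversion on Windows come from the text.

```rust
use std::path::Path;

/// Renders a relative path with forward slashes on every platform.
/// On non-Windows hosts a backslash is a legal filename character,
/// so the string is returned unchanged there.
fn normalize_rel_path(path: &Path) -> String {
    let raw = path.to_string_lossy();
    if cfg!(windows) {
        raw.replace('\\', "/")
    } else {
        raw.into_owned()
    }
}
```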
macOS FSEvents flake:
- src/indexer/watcher.rs::tests::test_watcher_detects_file_changes: recv_timeout raised from 5s to 15s. macOS FSEvents coalescing on loaded GH runners routinely exceeded 5s.
- src/mcp/server/tests::test_watcher_detects_changes_and_reindexes: replaced fixed 300ms sleep with bounded polling (40 × 200ms ≈ 8s total), which is correct on slow hosts and instant on fast ones.
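Bounded polling, as used in the second test, can be sketched as follows. poll_until is a hypothetical helper; the test in the repository may be structured differently.

```rust
use std::thread;
use std::time::Duration;

/// Checks `condition` up to `max_attempts` times, sleeping `interval`
/// between failed checks. Returns as soon as the condition holds, so fast
/// hosts pay no fixed delay while slow hosts get the full budget.
fn poll_until<F: FnMut() -> bool>(mut condition: F, max_attempts: u32, interval: Duration) -> bool {
    for _ in 0..max_attempts {
        if condition() {
            return true;
        }
        thread::sleep(interval);
    }
    false
}
```

Calling it with 40 attempts and a 200ms interval gives the roughly 8s ceiling mentioned above.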
CI:
- .github/workflows/release.yml: post-publish smoke now reads map.json via fs.readFileSync('map.json',...) instead of require('$tmpdir/map.json'). On Git Bash under Windows, mktemp -d returns a POSIX-looking /tmp/tmp.XXXX that Node.js on Win32 cannot resolve; the require was failing despite the file existing.
Parser / indexer correctness (JS/TS):
- src/parser/relations.rs: walk_for_relations no longer tags anonymous arrow functions (test(() => {...}), [1,2].map(x => x)) with the sentinel scope <anonymous>, which resolved to no source node and silently dropped every call inside such callbacks. Arrows without a variable_declarator parent now inherit the enclosing scope; JS/TS/TSX calls at module top-level fall back to <module> so they produce resolvable same-file edges. Test-file helpers like writeJson, mkHome, readCargoVersion that are referenced only from inside test(...) callbacks are no longer reported as orphan dead code.
- src/indexer/pipeline.rs: cross-file same-language resolution used to fan out an edge to every same-name target whenever no same-file match existed, turning a single readJson() call into N phantom edges across unrelated modules. New refine_ambiguous_targets prefers non-test candidates (when the caller is non-test code) and the candidate with the longest byte-common path prefix; keeps the remaining pool on true ties so Rust bare-name crate::x::foo() calls that always tie on prefix don't get dropped.
Before v0.16.1 this project indexed 28 cross-file JS call fan-out edges,
all of them pointing at the wrong target in at least one leg; after,
7 edges, each single-target and correct. refs writeJson rose from 2 to 5
(the 3 real test-callback callers previously lost).
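The tie-breaking described for refine_ambiguous_targets can be sketched as below. This is an approximation under stated assumptions: the Candidate struct and the signature are hypothetical, and the real function may order its steps differently.

```rust
/// Hypothetical candidate record for a same-name target.
struct Candidate {
    path: String,
    is_test: bool,
}

/// Number of leading bytes two paths share.
fn common_prefix_len(a: &str, b: &str) -> usize {
    a.bytes().zip(b.bytes()).take_while(|(x, y)| x == y).count()
}

/// Narrows a pool of same-name targets:
/// 1. a non-test caller prefers non-test candidates, if any exist;
/// 2. keep the candidates sharing the longest path prefix with the caller;
/// 3. true ties are all kept rather than dropped.
fn refine_ambiguous_targets<'a>(
    caller_path: &str,
    caller_is_test: bool,
    candidates: &'a [Candidate],
) -> Vec<&'a Candidate> {
    let mut pool: Vec<&Candidate> = candidates.iter().collect();
    if !caller_is_test {
        let non_test: Vec<&Candidate> = pool.iter().copied().filter(|c| !c.is_test).collect();
        if !non_test.is_empty() {
            pool = non_test;
        }
    }
    let best = pool
        .iter()
        .map(|c| common_prefix_len(caller_path, &c.path))
        .max()
        .unwrap_or(0);
    pool.into_iter()
        .filter(|c| common_prefix_len(caller_path, &c.path) == best)
        .collect()
}
```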
CI:
- .github/workflows/ci.yml: dtolnay/rust-toolchain@1.95.0 now installs the clippy component explicitly. Without this, the Clippy step failed with 'cargo-clippy' is not installed for the toolchain '1.95.0' on every OS/feature-matrix cell in v0.16.0.
v0.16.0 — production hardening pass (RRF math, schema v7 dim guard, readonly secondary, bounded watcher, CI matrix)
Architecture audit surfaced nine correctness / safety gaps — this release addresses all of them plus four items flagged in a follow-up code review. Schema bump auto-migrates; no user action required.
Algorithmic correctness:
- src/search/fusion.rs: SCORE_BLEND_FACTOR = 0.1 silently dominated RRF by ~100× at k=30 (rank-0 RRF ≈ 0.0164 vs. max blend = 0.1), inverting the docstring's own "doesn't override rank ordering" contract and effectively converting RRF into per-source-raw-score ranking. Replaced with adaptive blend_scale = 0.5 / ((k+1)(k+2)), mathematically half the smallest adjacent-rank RRF gap. Semantic search results will shift (for the better) on queries where one source returns a high-raw-score item at a late rank.
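The arithmetic behind blend_scale can be checked with a short sketch. It assumes the textbook RRF term 1/(k + rank) with 1-based ranks and raw scores normalized to [0, 1]; the function names are hypothetical and the project's fusion code may differ. Under that formula, 1/(k+1) − 1/(k+2) = 1/((k+1)(k+2)), so blend_scale is half the gap between ranks 1 and 2.

```rust
/// Textbook reciprocal-rank-fusion term for a 1-based rank (assumed formula).
fn rrf_term(k: f64, rank: f64) -> f64 {
    1.0 / (k + rank)
}

/// blend_scale as given in the changelog: 0.5 / ((k+1)(k+2)).
fn blend_scale(k: f64) -> f64 {
    0.5 / ((k + 1.0) * (k + 2.0))
}

/// RRF term plus a raw-score tiebreak bounded by blend_scale.
/// Assumption: raw_score is normalized to [0, 1].
fn blended(k: f64, rank: f64, raw_score: f64) -> f64 {
    rrf_term(k, rank) + blend_scale(k) * raw_score.clamp(0.0, 1.0)
}
```

At k = 30 a rank-2 item with the maximum raw score still scores below a rank-1 item with raw score 0, whereas the old fixed factor of 0.1 would have flipped them.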
Data safety — schema v7 embedding-dim guard:
- SCHEMA_VERSION 6 → 7. New meta table records embedding_dim. On open, mismatch → atomic DROP + rebuild node_vectors at current EMBEDDING_DIM. Prevents silent crash-on-INSERT when a user rebuilds the binary at a different dim (e.g., swaps embedding model).
- v6 → v7 upgrade path introspects the on-disk vec0 DDL via sqlite_master.sql (float[N] regex) and rebuilds if the existing table's dim ≠ current: the adversarial case where meta is empty but a pre-existing vec0 is present.
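The DDL introspection step can be approximated without a regex crate. In this sketch the helper names and the example DDL string are assumptions, and treating an unparsable DDL as "no rebuild" is a choice made here, not something the changelog states.

```rust
/// Extracts N from the first `float[N]` column in a vec0 CREATE statement.
fn vec0_dim_from_ddl(ddl: &str) -> Option<usize> {
    let start = ddl.find("float[")? + "float[".len();
    let rest = &ddl[start..];
    let end = rest.find(']')?;
    rest[..end].trim().parse().ok()
}

/// True only when the on-disk dimension is known and differs from the
/// dimension the current binary was built with.
fn needs_rebuild(ddl: &str, current_dim: usize) -> bool {
    matches!(vec0_dim_from_ddl(ddl), Some(dim) if dim != current_dim)
}
```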
Concurrency hardening:
- src/indexer/watcher.rs: bounded sync_channel(4096) with overflow-drop policy (warn!). Unbounded channel had no cap on memory during bulk fs events (branch switches, IDE reformats). Merkle rescan is idempotent so dropped events don't lose data.
- src/storage/db.rs + src/mcp/server/mod.rs: secondary instances (flock denied) now open DB with SQLITE_OPEN_READ_ONLY | query_only=ON. Eliminates race where a secondary could run migrations + INDEX_VERSION DELETE sweep against the primary's DB. Secondary polls up to 3s for the primary's bootstrap then bails with a clear error rather than falling through to read-write.
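The overflow-drop policy on the watcher channel can be sketched with the standard library alone. forward_event is a hypothetical name and eprintln! stands in for the warn! log call; the capacity is chosen by whoever creates the channel (4096 in the text).

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

/// Tries to enqueue an event without blocking.
/// Returns false when the event was dropped, either because the bounded
/// queue is full or because the receiver has gone away.
fn forward_event<T>(tx: &SyncSender<T>, event: T) -> bool {
    match tx.try_send(event) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) => {
            // Safe to drop: a later rescan is expected to pick the change up.
            eprintln!("watcher queue full, dropping event");
            false
        }
        Err(TrySendError::Disconnected(_)) => false,
    }
}
```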
Contract strengthening:
- src/parser/relations.rs: ParsedRelation carries source_language, stamped by extract_relations_from_tree. Resolver at src/indexer/pipeline.rs hard-errors on mismatch (bail!, not debug_assert!) so parser regressions fail in release builds too.
- src/mcp/server/mod.rs: start_post_index_services spawns a once-per-process Phase-3 repair thread before background embedding. README's "Startup repair for incomplete indexing" claim was documentation-only until now; repair_null_context_strings now actually fires on every session start (primary-only, idempotent).
Documentation accuracy:
- README.md: HTTP route tracing previously claimed Express, Flask/FastAPI, Go, ASP.NET, Rails, Laravel, Vapor (8 frameworks). Only 3 are actually implemented in extract_route_pattern. Corrected.
CI + release:
- .github/workflows/ci.yml: matrix {ubuntu, macos, windows} × {no-embed, with-embed} (was ubuntu-only), toolchain pinned @1.95.0.
- .github/workflows/release.yml: new smoke-verify job runs after publish on all 3 OSes (npm install with retry-backoff, --version exact match, incremental-index + map --json on a tmp git repo). Catches missing platform binaries / find-binary.js regressions / version-sync drift before users hit them.
Test delta: +18 unit tests (RRF invariants ×4, schema v7 paths ×5, readonly ×2, source_language stamp ×1, etc.). 250 unit + 56 integration + 44 hardening + 19 parser + 6 cli + 6 plugin + 1 routing = 382 tests pass. Clippy 1.95 clean on both feature modes.
Deferred to a later release (L3 refactor): tools.rs (2236 LOC),
relations.rs (2174), queries.rs (2783) file splits — flagged in the
audit but require a dedicated session with plan-mode review.
User-driven QA pass exercising every MCP tool + CLI subcommand surfaced two bugs whose contract violations were silent; both now have regression tests guarding against recurrence.
Fixes:
- src/storage/queries.rs: get_nodes_with_files_by_filters (the SQL backing ast_search / ast-search) ordered by f.path ASC only, so the LIMIT clause silently truncated alphabetically-late files (src/storage/queries.rs itself, with 54 Result-returning fns) out of the top-N. New ordering is caller_count DESC, path ASC, line ASC so high-value symbols surface first regardless of file path.
- src/cli.rs:2655: dead-code --json returned only stderr (no stdout) when all results were filtered by --ignore, breaking JSON consumers piping stdout. Now emits [] to stdout before the human stderr message, matching the established empty-result contract used by search / grep / callgraph / show / trace / overview.
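The new ordering is applied in SQL, but its effect on a LIMIT can be illustrated in memory. The Row struct and rank_rows helper below are hypothetical stand-ins for the query result.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Row {
    path: String,
    line: u32,
    caller_count: u32,
}

/// Orders rows by caller_count DESC, path ASC, line ASC, then applies the
/// limit, so truncation removes the least-called symbols first.
fn rank_rows(mut rows: Vec<Row>, limit: usize) -> Vec<Row> {
    rows.sort_by(|a, b| {
        b.caller_count
            .cmp(&a.caller_count)
            .then_with(|| a.path.cmp(&b.path))
            .then_with(|| a.line.cmp(&b.line))
    });
    rows.truncate(limit);
    rows
}
```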
New regression tests:
- test_get_nodes_with_files_by_filters_ranks_by_caller_count (src/storage/queries.rs): alphabetically-first low-caller fn must not outrank alphabetically-last high-caller fn at any LIMIT.
- test_cli_json_empty_dead_code (tests/cli_e2e.rs): stdout must be [] and stderr must still surface "No dead code" when --ignore filters all results.
371 tests pass (was 369). Clippy 1.95 clean on both feature combos.
v0.15.0 audit of JS/TS support surfaced a silent breakage for .tsx
files: LanguageConfig::for_language("tsx") hit the default arm where
_ => "unknown", so every config.name == "tsx" branch was dead code.
Ripple effect: the describe/it is_test propagation added in v0.15.0
(scoped matches!(config.name, ... | "tsx")) silently skipped TSX.
Fixes:
- src/parser/lang_config.rs: add "tsx" => "tsx" to the static-name match so config.name is preserved through the default-config branch.
- src/parser/relations.rs:101: require() arm now matches "javascript" | "typescript" | "tsx" (was js/ts only).
- src/parser/relations.rs:1172: extract_route_pattern now routes "tsx" through extract_express_route alongside js/ts.
Two new regression tests: test_extract_tsx_commonjs_require_and_route
(parser) and test_parse_tsx_describe_it_marks_nested_as_test
(treesitter). 369 total tests pass.
C/C++ coverage audit surfaced three parallel gaps — #include
not extracted, GoogleTest TEST/TEST_F/TEST_P macros not
recognized, no scope qualification for Class::method / obj.method /
obj->method. Tracked for v0.16.0.
v0.15.0 — same-language edge resolution, JS require() imports, markdown indexing, JS test-block detection
Multi-front accuracy pass motivated by user feedback that code-graph was useful in Rust projects but under-utilized in JS / mixed / claudemd projects. Traced to four compounding issues; all four fixed in this release with regression tests.
src/indexer/pipeline.rs resolved call/implements/imports target names
via a flat global bare-name lookup. In mixed-language projects this
produced catastrophic false positives: the Rust hasher.update(&buf)
call in src/indexer/merkle.rs:hash_file was resolving to the JS
function update() in claude-plugin/scripts/lifecycle.js, pulling
11 phantom Rust→JS edges into callgraph hash_file (verified via
dogfood before/after). Each same-named method (update, open,
init, run, read, write, etc.) was a collision vector.
Fix: edge resolution now uses a three-tier cascade — same-file →
same-language → (for calls: drop; for imports/implements: global
fallback to preserve the existing <external> sentinel path).
Non-call relations keep cross-language fallback because sentinel
nodes carry language "external" by design.
Mechanically, get_all_node_names_with_ids and the per-batch
node_id_to_path map now carry each node's language, enabling the
filter. Public type alias NameEntry = (i64, String, Option<String>)
added to keep clippy type_complexity happy.
Regression test test_cross_language_bare_name_call_resolution
plants an update collision across a Rust file and a JS file and
asserts that Rust caller_rs does not resolve any call edge to the
JS file.
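The three-tier cascade can be sketched as a pure function. The types and the resolve_targets name are hypothetical; the sketch also simplifies by returning node ids directly rather than building edges.

```rust
#[derive(Clone, Copy, PartialEq)]
enum RelationKind {
    Calls,
    Imports,
    Implements,
}

/// Hypothetical view of a same-name candidate node.
struct Target {
    id: i64,
    path: String,
    language: Option<String>,
}

/// Tier 1: same file. Tier 2: same language.
/// Tier 3: calls are dropped; imports/implements fall back to the global
/// pool so external sentinel nodes stay reachable.
fn resolve_targets(
    kind: RelationKind,
    source_path: &str,
    source_language: &str,
    candidates: &[Target],
) -> Vec<i64> {
    let same_file: Vec<i64> = candidates
        .iter()
        .filter(|t| t.path == source_path)
        .map(|t| t.id)
        .collect();
    if !same_file.is_empty() {
        return same_file;
    }
    let same_language: Vec<i64> = candidates
        .iter()
        .filter(|t| t.language.as_deref() == Some(source_language))
        .map(|t| t.id)
        .collect();
    if !same_language.is_empty() {
        return same_language;
    }
    match kind {
        RelationKind::Calls => Vec::new(),
        RelationKind::Imports | RelationKind::Implements => {
            candidates.iter().map(|t| t.id).collect()
        }
    }
}
```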
src/parser/relations.rs handled ES module import statements but
had no branch for require(...) calls, the canonical CommonJS form.
Consequence: Node.js code bases (including this repo's own
claude-plugin/scripts/*.js) had 3 total imports edges across 19
JS files before the fix. After the fix: 286 edges (path 27, fs 24,
child_process 18, os 17, plus local modules).
Require detection inserted into the existing call_expression arm;
handles node:fs scheme normalization and strips .js/.ts/.mjs/
.cjs suffixes so require('./utils/version-utils.js') resolves to
the same target as an ES import binding named version-utils.
Unresolved imports flow into the existing Phase 2b-ext external-
sentinel mechanism (previously only wired for implements), so
<external>/fs nodes now exist and are discoverable via deps <file>
dependency graphs.
Two new tests: test_extract_js_commonjs_require (parser level,
covers node scheme + extension stripping + relative paths) and
test_js_require_creates_external_import_edges (pipeline level,
end-to-end DB assertion).
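The specifier normalization described above (node: scheme removal and extension stripping) can be sketched as a small string function. normalize_require_specifier is a hypothetical name, and the sketch does not attempt the later step of matching the result against import binding names.

```rust
/// Strips a leading `node:` scheme and one trailing .js/.ts/.mjs/.cjs
/// suffix from a require() specifier. Other specifiers pass through.
fn normalize_require_specifier(spec: &str) -> &str {
    let without_scheme = spec.strip_prefix("node:").unwrap_or(spec);
    for suffix in [".mjs", ".cjs", ".js", ".ts"] {
        if let Some(stem) = without_scheme.strip_suffix(suffix) {
            return stem;
        }
    }
    without_scheme
}
```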
Added tree-sitter-md = "0.3" (pinned to 0.3 because 0.5.x ships
tree-sitter ABI 15 and this repo still runs tree-sitter 0.24 / ABI 14).
detect_language accepts .md / .mdx; LanguageConfig exposes
"markdown" for the default-config fallthrough; extract_nodes new arms
for atx_heading (walks marker children to infer level 1–6) and
setext_heading (paragraph + setext_h{1,2}_underline). Heading text
becomes the node name, h1..h6 the node type. Searchable via FTS;
visible in module_overview and project_map.
Dogfood: this repo's README, CHANGELOG, and 4 plugin docs now yield
145 heading nodes. code-graph-mcp search "Installation" returns
h2 Installation README.md:117 as the top hit.
Shell and JSON indexing deferred — tree-sitter-bash adds real value for hook-script projects; JSON alone is low-yield because the useful relations (hook → script name) cross file formats. Both tracked as follow-up.
LanguageConfig::has_test_attributes = false for JS/TS because the
test framework is function-call-driven, not attribute-driven. The
existing is_test_symbol file-path heuristic caught .test.js /
.spec.js / __tests__/ patterns but missed in-source test code
(Vitest in-source testing, Jest co-location without the suffix, or
any file that mixes prod + test definitions).
extract_nodes now intercepts call_expression nodes whose function
head is one of describe, it, test, suite, context,
beforeEach, beforeAll, afterEach, afterAll, before, after,
fdescribe, xdescribe, fit, xit (both bare and .only / .skip
/ .each member forms). Child argument nodes recurse with
in_test_context = true which flows into the existing is_test field
on every nested function / class / method.
Regression: test_parse_js_describe_it_marks_nested_as_test plants
6 definitions across describe / it / it.skip / beforeEach
nesting and asserts the is_test propagation is correct (plus a
top-level prod function stays is_test=false).
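The head-matching rule can be sketched as below. It assumes the call head has already been rendered as a string such as "it" or "it.skip"; the real extractor works on tree-sitter nodes, and is_test_call_head is a hypothetical name.

```rust
/// Call heads listed in the changelog.
const TEST_CALL_HEADS: &[&str] = &[
    "describe", "it", "test", "suite", "context", "beforeEach", "beforeAll",
    "afterEach", "afterAll", "before", "after", "fdescribe", "xdescribe",
    "fit", "xit",
];

/// Member forms listed in the changelog.
const TEST_MODIFIERS: &[&str] = &["only", "skip", "each"];

/// True for a bare test head (`describe`) or a single member form
/// (`it.skip`). Anything else is treated as ordinary code.
fn is_test_call_head(head: &str) -> bool {
    match head.split_once('.') {
        Some((base, modifier)) => {
            TEST_CALL_HEADS.contains(&base) && TEST_MODIFIERS.contains(&modifier)
        }
        None => TEST_CALL_HEADS.contains(&head),
    }
}
```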
367 total tests pass (+4 net new). cargo +1.95.0 clippy --all-targets -- -D warnings clean. Full rebuild on this repo: 84 files → 1295
nodes → 2590 edges (was 1068 / 2300 pre-release). Net per-dimension:
- phantom Rust→JS call edges: 11 → 0
- JS imports edges: 3 → 286
- markdown heading nodes: 0 → 145
- indexed languages: 16 → 17
Patch release. Drops six observed bug classes surfaced by a full-fleet error-rate audit over 156 MCP sessions + 55 Claude Code transcripts.
Historical transcripts showed 6 agent-side FOREIGN KEY constraint failed
errors on project_map (4), module_overview (1), and
semantic_code_search (1). Root cause: run_incremental_with_cache_restore
caught FK violations and fell back to run_full_index, but the latter
only does per-file upsert — orphan rows from the failed incremental
survived and re-triggered FK on the retry, bubbling the raw SQLite
error to tool handlers.
Fix (src/mcp/server/mod.rs:987): the FK branch now runs DELETE FROM files
in a transaction before re-running full_index. CASCADE chains nodes →
edges → node_vectors via the schema's existing ON DELETE CASCADE.
Pattern lifted verbatim from tool_rebuild_index.
Regression test (test_fk_fallback_truncate_purges_stale_state_and_rebuild_recovers)
injects a phantom file + node + edge via PRAGMA foreign_keys = OFF
and asserts truncate + full_index purge it while restoring on-disk
symbols. Guards against future removal of the truncate step.
usage.jsonl showed rebuild_index err-rate 5/9 = 55%, with all 5
failures hitting max_ms ≈ 10009, i.e. the embedding_in_progress
wait deadline, which returns {status:"busy"} and is counted by session
metrics as an error. Not a real failure mode; the deadline is now 30s,
which accommodates larger projects whose embedding pass exceeds 10s.
const _: () = assert!(...) and let _ = ... patterns are
compile-time-only bindings, never callable. They were being reported
as dead code. New filter in find_dead_code SQL: n.name != '_'.
SessionMetrics::record_tool_call now classifies failures into
ErrKind { Timeout, NotFound, Ambiguous, FkConstraint, EmptyInput, Other }
and emits per-tool breakdowns as tools.<name>.err_kinds:
"get_ast_node": {"n": 69, "ms": 4630, "err": 12, "max_ms": 2003,
"err_kinds": {"timeout": 7, "ambiguous": 3, "not_found": 2}}
Additive: readers that only consume n/ms/err/max_ms are unaffected.
Success-only tools omit the err_kinds field entirely for compact
output. Unlocks post-hoc error analysis via jq instead of manual
transcript grep.
Persistent sampler that classifies code-graph-mcp search queries
issued by the agent (extracted from Claude Code transcripts) into
keyword-like vs concept-like. Used to validate decisions about
MCP-vs-CLI routing trade-offs without needing a round-trip through
routing_bench.
Patch release. Closes a CLI/MCP behavior gap discovered in the same end-to-end audit that produced v0.14.3.
MCP get_call_graph and get_ast_node already returned an
Ambiguous symbol error with suggestion list when a bare name
resolved to ≥2 non-test definitions in different files. The CLI
counterparts (callgraph, impact) did not — they silently merged
call graphs / caller lists across all same-named definitions,
misreporting risk_level and blast radius.
Example: this repo has two open functions (Database::open in
src/storage/db.rs and CliContext::open in src/cli.rs). Before
the fix:
$ code-graph-mcp impact open
Impact: open — Risk: HIGH
26 direct callers, 31 total, 9 files ...
The 26 callers are a union of both opens. After the fix:
$ code-graph-mcp impact open
[code-graph] Ambiguous symbol 'open': 2 matches in different files.
Specify --file or --node-id:
open (function) in src/storage/db.rs [node_id 5717]
open (function) in src/cli.rs [node_id 7055]
Exit code 1 signals script-level callers that disambiguation is
required. Qualified names (Database.open), --file, and --node-id
paths still work unchanged.
New helper detect_exact_ambiguity in src/cli.rs queries
get_nodes_with_files_by_name, filters non-test definitions, and
returns Some(candidates) only when ≥2 distinct files are present
(multiple definitions in one file, e.g. overloads, stay
non-ambiguous). Shared emit_exact_ambiguity formatter handles both
--json and human modes.
Both cmd_callgraph and cmd_impact gain a file_filter.is_none()
guard that invokes the helper before the downstream query runs.
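The ambiguity rule can be sketched as a pure function over already-fetched definitions. The Definition struct is hypothetical, and the real helper queries get_nodes_with_files_by_name rather than taking a slice.

```rust
use std::collections::BTreeSet;

/// Hypothetical row for one definition of a bare name.
struct Definition {
    name: String,
    file: String,
    is_test: bool,
}

/// Returns the non-test candidates only when they span at least two
/// distinct files. Several definitions in one file stay non-ambiguous.
fn detect_exact_ambiguity(defs: &[Definition]) -> Option<Vec<&Definition>> {
    let candidates: Vec<&Definition> = defs.iter().filter(|d| !d.is_test).collect();
    let distinct_files: BTreeSet<&str> = candidates.iter().map(|d| d.file.as_str()).collect();
    if distinct_files.len() >= 2 {
        Some(candidates)
    } else {
        None
    }
}
```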
cargo test 235/235, cargo +1.95.0 clippy --all-targets clean.
Patch release. Two UX bugs found during end-to-end tool audit.
Full mode already set active_capped/showing/total_active/hint
when a module had >30 active exports, but compact_module_overview
rebuilt the response by cherry-picking known fields and silently
dropped the conditional truncation fields. Users calling with
compact=true on a large module (e.g. src/parser/ with 54 active
exports) saw "summary": "54 active + 2 inactive" and 30 items — no
signal that 24 were missing.
Fix: forward the four conditional fields at the end of
compact_module_overview with a .get().cloned() loop so any future
addition of a conditional field stays forwarded by default.
dependency_graph in the MCP handler filters the <external> pseudo-
file (a container for unresolved third-party imports) from outgoing
deps. The CLI deps subcommand had the language-compat filter but not
the <external> guard, so CLI output at depth ≥2 could show
<external> as a fake file dependency.
Fix: add the one-line guard to cmd_deps's is_compatible_lang so
both entry points apply the same filter.
cargo test 235/235, cargo +1.95.0 clippy --lib -- -D warnings
clean. Before/after:
- module_overview(path="src/parser/", compact=true) now returns active_capped: true, showing: 30, total_active: 54, hint: "..."
- deps src/mcp/server/tools.rs --json depends_on no longer contains {"file":"<external>","depth":2}
Patch release. Fixes observed silent truncation of the MCP initialize
response instructions field at Claude Code's ~2KB harness boundary — the
last 4 of 10 routing decision rules were being dropped, making Claude
fall back to Grep/Read where code-graph tools should have been invoked.
Old noisy-mode instructions were ~2.5KB with three section headers and
verbose workflow tips. Claude Code's initialize handler truncated near
~2048 bytes, cutting the "modifying a function signature", find_dead_code,
find_similar_code, dependency_graph, and get_ast_node rows, all
critical routing signals.
Rewrite compresses to 1292 bytes (~48% of original) while preserving
all 10 decision rules verbatim. Each rule now carries its CLI alias
inline (e.g. get_call_graph (callgraph X)), so the LLM learns the CLI
invocation from the same line it learns the routing intent — no separate
MEMORY.md cross-reference needed for the base case.
Also re-adds a Prompts: line enumerating the three registered MCP
prompts, and replaces the misleading "5 CLI-only tools" phrasing with
"5 advanced tools" — the hidden 5 are still callable via raw MCP
tools/call, they are just off tools/list by default to preserve
startup-token budget.
const _: () = assert!(NOISY.len() <= 1500, ...) added in
src/mcp/server/mod.rs. Any future edit that blows the budget fails
cargo check with rustc E0080: evaluation panicked — catches the
regression at build time, not debug-build test time. Verified by
tightening the cap to 1000 and observing the compile break.
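The compile-time guard has roughly the following shape. The instruction text here is a placeholder and INSTRUCTIONS_BUDGET is a hypothetical constant name; the changelog inlines the 1500 figure.

```rust
/// Placeholder standing in for the real instructions payload.
const NOISY: &str = "code-graph routing rules (placeholder, not the real text)";

/// Byte budget from the changelog.
const INSTRUCTIONS_BUDGET: usize = 1500;

// Evaluated by the compiler: if NOISY grows past the budget, the build
// fails during constant evaluation instead of at test time.
const _: () = assert!(
    NOISY.len() <= INSTRUCTIONS_BUDGET,
    "MCP instructions exceed the byte budget"
);
```

Because str::len is a const fn and assert! is allowed in const contexts, no test run is needed to catch an oversized payload.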
CLI code-graph-mcp search <q> is FTS5-only; the MCP
semantic_code_search tool adds vector similarity + RRF fusion. On
non-JSON success paths, a stderr tip now points concept-query users to
the MCP tool. --json mode is untouched so script consumers still see
clean stdout.
366 tests pass across integration suites (v0.14.1 baseline + compile-time
assert test exercised via intentional budget-cap inversion). Clippy 1.95
clean on both --no-default-features and --all-targets. Routing bench
(tests/routing_bench.rs via OpenRouter anthropic/claude-sonnet-4.5):
P@1 = 19/20 = 95.0% — unchanged from the v0.14.1 baseline, confirming
the compression did not degrade routing quality. Single miss remains the
known-borderline ast_search vs get_ast_node on a struct-def lookup.
Patch release. Six targeted accuracy/UX fixes to MCP tool responses surfaced by a 3-round smoke test. All changes are additive or remove false-positive warnings; no schema changes, no behavior regressions.
The compression trigger estimated token cost from context_string (can exceed
2000 chars) but the actual result JSON only carries code_content capped at
MAX_SEARCH_CODE_LEN = 500. Small top_k queries (3, 5) were being forced into
compressed_nodes mode unnecessarily, losing relevance and signature fields.
Estimator now mirrors the output: it measures truncated code_content +
signature + name + path + ~80 chars JSON framing per result. Small top_k
responses return full arrays again.
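The revised estimator can be sketched as below. The SearchHit struct and function names are hypothetical, and counting characters rather than bytes is an assumption; the 500 cap and the ~80-char framing allowance come from the text.

```rust
/// Output cap on code_content, from the changelog.
const MAX_SEARCH_CODE_LEN: usize = 500;
/// Approximate JSON framing overhead per result, from the changelog.
const JSON_FRAMING_CHARS: usize = 80;

/// Hypothetical subset of a search result's fields.
struct SearchHit {
    name: String,
    path: String,
    signature: String,
    code_content: String,
}

/// Estimates what one result contributes to the response, measuring the
/// code as it will be emitted (truncated) rather than the full source.
fn estimated_result_chars(hit: &SearchHit) -> usize {
    hit.code_content.chars().count().min(MAX_SEARCH_CODE_LEN)
        + hit.signature.chars().count()
        + hit.name.chars().count()
        + hit.path.chars().count()
        + JSON_FRAMING_CHARS
}

/// Sum over all results in the response.
fn estimated_response_chars(hits: &[SearchHit]) -> usize {
    hits.iter().map(estimated_result_chars).sum()
}
```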
Compressed responses (compressed_nodes / compressed_files /
compressed_directories) now include a rounded match_confidence float. When
< 0.5, a low_confidence_warning string explains that FTS found few matches
and results are likely vector-similarity noise, with advice to use concrete
identifiers or ast_search.
The FTS sparsity and source-intersection penalties used to over-fire on
precision queries (single-identifier FTS hits). The penalty now requires
fts_search.len() >= 5; below that, the query is treated as precision-mode
and not penalized.
Exact-name-match exemption: when any top-5 candidate's name or
qualified_name equals the query (case-insensitive), the warning is
suppressed. match_confidence is still returned so callers can judge.
When the target is a struct / enum / trait / type / interface /
class, the response now includes a type_definition_note explaining that
the edge index captures explicit imports/inherits/implements and
struct-literal instantiation, but NOT method-qualified calls
(Type::method()), field access, or type annotations. Guides the caller to
query each method via module_overview for a complete rename audit.
When embedding is in progress with a small fraction done (e.g. 2/1052),
integer percent rounded to 0 and looked stuck. Now floors to 1 whenever
vectors_done > 0, so embedding_status: in_progress stays consistent with
the percentage.
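The floor rule is a one-line adjustment to integer percentage. In this sketch embedding_percent is a hypothetical name, and returning 0 for an empty total is an assumption.

```rust
/// Integer progress percentage that never reports 0 once work has started.
fn embedding_percent(vectors_done: u64, vectors_total: u64) -> u64 {
    if vectors_total == 0 {
        return 0; // assumption: nothing to embed reads as 0
    }
    let pct = vectors_done * 100 / vectors_total;
    if vectors_done > 0 && pct == 0 {
        1
    } else {
        pct
    }
}
```

For the 2/1052 case above, plain integer division gives 0 and the adjusted value is 1.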
Node N not found replaced with a message that explains node_ids are
rebuild-scoped and suggests re-resolving via get_ast_node(symbol_name, file_path) or semantic_code_search.
43 mcp::server unit tests remain green. Routing bench
(tests/routing_bench.rs via OpenRouter anthropic/claude-sonnet-4.5):
P@1 = 19/20 = 95.0% (threshold 70%). Single miss is a semantic-neighbor
pick (ast_search vs get_ast_node for a struct-def lookup) unrelated to
this release.
Minor release. Addresses a long-standing fragility in the composite statusline
integration: when the user cleaned ~/.cache/code-graph/, the _previous
snapshot (pre-install statusline, e.g. GSD) was lost, leaving only code-graph
visible on the status bar.
writeRegistry() in claude-plugin/scripts/lifecycle.js now mirrors the
registry to ~/.claude/statusline-providers.json on every write. This file
lives outside the ~/.cache/ hierarchy, so routine cache cleanup no longer
strands third-party provider entries.
readRegistry() self-heals: if the primary ~/.cache/code-graph/statusline-registry.json
is missing or empty, it falls back to the durable backup and rewrites the
primary. No user action needed on upgrade — the first writeRegistry() call
after install writes both files; recovery from a prior cache wipe happens
automatically on next SessionStart.
Clearing the registry (e.g. during uninstall) clears both files.
claude-plugin/scripts/statusline-chain.js exposes a documented registration
surface for third-party plugins that want to coexist with code-graph's
composite statusline:
node <plugin-cache>/scripts/statusline-chain.js register <id> <command> [--stdin]
node <plugin-cache>/scripts/statusline-chain.js unregister <id>
node <plugin-cache>/scripts/statusline-chain.js list
Reserved ids (_previous, code-graph) are rejected with exit code 2. The
CLI uses existing registerStatuslineProvider / unregisterStatuslineProvider
so writes land in both primary + durable backup.
Motivating use case: GSD currently owns settings.json.statusLine
directly and is captured as _previous when code-graph installs. With this
CLI, GSD's install hook can instead call statusline-chain.js register gsd "<gsd-statusline-command>" --stdin and become a first-class provider in the
composite, independent of install order. Fallback path (call without --stdin
if the command doesn't read stdin; skip call entirely if code-graph isn't
installed) keeps standalone operation working.
Four new cases in lifecycle.test.js:
- writeRegistry mirrors to durable backup
- readRegistry self-heals primary from backup after simulated cache wipe
- writeRegistry([]) clears both files
- statusline-chain.js CLI register/list/unregister + reserved-id guard
12/12 lifecycle tests pass; 228/228 Rust lib tests green; clippy 1.95 clean on
both --no-default-features and --all-targets.
Minor release. Three changes driven by real-usage-data review:
code-graph-mcp stats aggregates .code-graph/usage.jsonl across sessions
and prints per-tool counts (n, avg_ms, err, max_ms), search totals
(queries, zero-result ratio, hybrid/FTS split, avg quality), and index
activity (full vs incremental, avg full-rebuild time). Flags: --last N
limits to the most recent N sessions, --json emits structured output.
Motivation: the metrics module has been writing JSONL for months (1MB
rotation), but there was no reader. Running on this repo's own history
surfaced the rebuild_index error pattern that motivates change #2.
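The per-tool columns (n, avg_ms, err, max_ms) amount to a simple fold over call records. The sketch below assumes a hypothetical `CallRecord` shape, since the usage.jsonl schema is not shown in these notes, and skips JSONL parsing entirely.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-call record; field names are assumptions.
pub struct CallRecord<'a> {
    pub tool: &'a str,
    pub elapsed_ms: u64,
    pub is_error: bool,
}

#[derive(Debug, Default, PartialEq)]
pub struct ToolStats {
    pub n: u64,
    pub total_ms: u64,
    pub err: u64,
    pub max_ms: u64,
}

impl ToolStats {
    /// Average latency; checked_div avoids dividing by zero for an empty bucket.
    pub fn avg_ms(&self) -> u64 {
        self.total_ms.checked_div(self.n).unwrap_or(0)
    }
}

/// Folds call records into per-tool counters, keyed by tool name.
pub fn aggregate<'a>(records: &[CallRecord<'a>]) -> BTreeMap<&'a str, ToolStats> {
    let mut stats: BTreeMap<&'a str, ToolStats> = BTreeMap::new();
    for r in records {
        let s = stats.entry(r.tool).or_default();
        s.n += 1;
        s.total_ms += r.elapsed_ms;
        s.max_ms = s.max_ms.max(r.elapsed_ms);
        if r.is_error {
            s.err += 1;
        }
    }
    stats
}
```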
When the server rejects a rebuild request because background embedding is
still running, it now returns Ok({status: "busy", retry_after_ms: 2000})
instead of Err("Background embedding still in progress"). This matches
the precedent in run_incremental_with_cache_restore (which returns
Ok(()) on the same condition) and keeps the usage-metrics err counter
from inflating on legitimate retry signals.
Contract change — SDK/script clients of the rebuild_index MCP tool
must now distinguish status: "busy" success payloads from actual errors.
JSON-RPC-level errors on rebuild_index now indicate real failures only
(missing confirm, no project root, DB error).
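For SDK or script clients, the contract change means a success payload has to be inspected before it is treated as a completed rebuild. A rough client-side sketch, with hypothetical `RebuildPayload` and `RebuildOutcome` types; treating every non-"busy" status as completed and falling back to 2000 ms are assumptions:

```rust
#[derive(Debug, PartialEq)]
pub enum RebuildOutcome {
    Completed,
    Busy { retry_after_ms: u64 },
    Failed(String),
}

/// Hypothetical decoded success payload of the rebuild_index tool.
pub struct RebuildPayload {
    pub status: String,
    pub retry_after_ms: Option<u64>,
}

/// Separates the "busy" retry signal from real JSON-RPC-level failures.
pub fn classify(response: Result<RebuildPayload, String>) -> RebuildOutcome {
    match response {
        Ok(p) if p.status == "busy" => RebuildOutcome::Busy {
            // Assumption: fall back to the documented 2000 ms if the field is absent.
            retry_after_ms: p.retry_after_ms.unwrap_or(2000),
        },
        Ok(_) => RebuildOutcome::Completed,
        Err(msg) => RebuildOutcome::Failed(msg),
    }
}
```

A caller would typically sleep for retry_after_ms and re-issue the request on `Busy`, and surface `Failed` to the user.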
plugin_code_graph_mcp.md template previously listed search "Z" and
semantic_code_search as equivalent intents. They are not: the CLI
search command is FTS5-only (src/cli.rs:710 → fts5_search), while
the MCP semantic_code_search tool performs RRF fusion of FTS5 + vector
similarity (src/mcp/server/tools.rs:42 → 101). The template now states
this explicitly in the core-7 decision table and the CLI cheat sheet.
Adopted memory files auto-refresh from the template on the next SessionStart (v0.11.0+ behavior).
Four clippy::manual_checked_ops and one clippy::unnecessary_sort_by
flagged by the 1.95 toolchain in the new cmd_stats code path are fixed
before push (local baseline: cargo +1.95.0 clippy --no-default-features -- -D warnings && cargo +1.95.0 clippy --all-targets -- -D warnings,
both green).
Bugfix release: the PostToolUse incremental-index hook no longer creates
.code-graph/ in directories that are not project roots. In multi-repo
workspace layouts (one parent dir containing N independent git repos, parent
not itself a repo), the hook previously materialized a stray 16 MB+ index at
the workspace parent, overlapping every child repo.
src/main.rs incremental-index arm now bails silently when the resolved
project root has neither a .git anchor nor an existing
.code-graph/index.db (the index check preserves the explicit per-dir index
case where a user deliberately ran incremental-index in a non-git folder).
Silent-skip matches the prevailing hook-layer convention:
incremental-index.js swallows errors, CliContext::try_open returns None,
session-init.js returns 'skipped'.
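The guard condition can be sketched as a single predicate on the resolved root. `should_index` is a hypothetical name, and checking `.git` with `exists()` (so that both directories and worktree-style files count) is an assumption about the real check.

```rust
use std::path::Path;

/// True when the directory looks like a project root worth indexing:
/// either a .git anchor or an index that a user created explicitly.
pub fn should_index(project_root: &Path) -> bool {
    let has_git_anchor = project_root.join(".git").exists();
    let has_existing_index = project_root.join(".code-graph").join("index.db").is_file();
    has_git_anchor || has_existing_index
}
```

When the predicate is false, the hook path simply returns without creating .code-graph/.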
claude-plugin/scripts/incremental-index.test.js adds two cases:
- non-git tmpdir → exit 0, .code-graph/ not created
- fake .git/ tmpdir → exit 0, guard does not block
Reported + fixed by @jgangemi (issue #8, PR #9). Re-landed on top of current
resolve_project_root_from helper with doc-comment scope creep removed.
Auto-adopt (claude-plugin/scripts/adopt.js) now seeds MEMORY.md's sentinel
block with a 5-row scenario→tool table in addition to the existing tool-name
list. The always-loaded context gap this closes: Claude Code knew the 7+5 tool
names but not the natural-language triggers ("who calls X?", "改 X 影响面", i.e. "impact of changing X")
that should route to them, so sessions silently slid to Grep / Read when a
code-graph tool would be more precise. The scenario phrases now live in the
200-line-capped MEMORY.md itself, not a second-hop plugin_code_graph_mcp.md.
Sentinel <!-- code-graph-mcp:begin v1 -->...<!-- code-graph-mcp:end --> grows
from 3 lines to 9. Added block (nested under the existing index entry):
- 场景速查(优先于 Grep):
- 改 X 影响面 → `get_ast_node symbol=X include_impact=true`(或 CLI `code-graph-mcp impact X`)
- 谁调用 X / X 被谁用 → `get_call_graph X` 或 `find_references X`
- 看 X 源码 / 签名 → `get_ast_node symbol=X`
- Y 模块长啥样 → `module_overview` 或 CLI `code-graph-mcp overview Y/`
- 概念查询(不知精确名)→ `semantic_code_search "Z"`;字面匹配用 Grep
needsRefresh() detects INDEX_LINE drift automatically; the sentinel block
rewrites once on next SessionStart. No user action required.
- Lock current MEMORY.md block against this refresh: CODE_GRAPH_NO_TEMPLATE_REFRESH=1 (shipped in v0.11.0)
- Disable auto-adopt entirely for new projects: CODE_GRAPH_NO_AUTO_ADOPT=1 (shipped in v0.9.0)
- Downgrade: reinstall 0.11.6 to restore the 3-line INDEX_LINE
- adopt.test.js: 37/37 green. Tests reference the INDEX_LINE constant, so the content extension is transparent.
- routing_bench: 19/20 = 95.0% on anthropic/claude-sonnet-4.5 via OpenRouter, unchanged from v0.11.6. This release doesn't touch ToolRegistry descriptions, which is what the bench measures; the adopted MEMORY.md lives outside the oracle's prompt.
First run of the routing-recall benchmark landed v0.11.4 at P@1 = 18/20 = 90.0%
(anthropic/claude-sonnet-4.5 via OpenRouter). The two misses were both semantic
overlaps between adjacent tools. This release tightens 4 tool descriptions and
re-runs the bench: P@1 = 19/20 = 95.0%, a net +5.0 points with one miss
remaining (borderline — "show me the EmbeddingModel struct" routes to ast_search
with type=struct, which returns the right answer albeit via the "enumerate"
tool rather than the "inspect ONE" tool).
All stay under the 200-char registry limit.
- get_call_graph: leads with "Who calls X, what X calls" + "Returns a graph (not a flat list)". Fixed routing for "Who calls ensure_indexed?" (was → find_references, now → get_call_graph).
- find_references: leads with "Flat enumeration of all usage sites" + explicit deflection: "For 'who calls X?', use get_call_graph.".
- get_ast_node: leads with "Inspect ONE named symbol" + "you have a symbol name (or node_id) and want its definition/body" to claim the "show me X / signature of Y" intent.
- ast_search: leads with "Enumerate MULTIPLE symbols by structural criteria" + deflection: "For ONE known symbol, use get_ast_node.".
Pattern: each description now leads with a shape verb (who calls, flat enumeration, inspect ONE, enumerate MULTIPLE) and points at the
adjacent tool when a query drifts into overlap.
Auto-detects ANTHROPIC_API_KEY (native Messages API) or OPENROUTER_API_KEY
(OpenAI-compatible /chat/completions). Tool schemas re-packaged as
{type: "function", function: {...}} for the OpenRouter path. Model default
anthropic/claude-sonnet-4.5; override with ROUTING_BENCH_MODEL. Anthropic
wins if both keys present.
| Run | Backend / Model | P@1 |
|---|---|---|
| v0.11.4 baseline | openrouter / anthropic/claude-sonnet-4.5 | 18/20 (90.0%) |
| v0.11.6 post-tightening | openrouter / anthropic/claude-sonnet-4.5 | 19/20 (95.0%) |
Cost ≈ $0.10/run. Threshold stays at 0.70; consider raising to 0.85 after two more releases confirm 95% as stable baseline (20-query sample is within model stochasticity range).
-D warnings on stable clippy 1.95 flagged the two sort_by(|a, b| b.0.cmp(&a.0))
calls added in v0.11.4 rollup. Local clippy (0.1.91, ~4 months behind stable)
accepted them. Functional behavior unchanged.
src/mcp/server/tools.rs:503-504: sort_by(|a, b| b.0.cmp(&a.0)) → sort_by_key(|e| std::cmp::Reverse(e.0)) (applied exactly as clippy suggested).
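For readers unfamiliar with the lint, the rewrite looks like this on a stand-in tuple vector. The `sort_desc_by_score` function and its element type are illustrative and not taken from tools.rs.

```rust
use std::cmp::Reverse;

/// Sorts (score, name) pairs by score, highest first.
pub fn sort_desc_by_score(mut items: Vec<(u32, String)>) -> Vec<(u32, String)> {
    // Before: items.sort_by(|a, b| b.0.cmp(&a.0));
    // After (the form clippy::unnecessary_sort_by suggests):
    items.sort_by_key(|e| Reverse(e.0));
    items
}
```

Both forms are stable sorts and produce the same order, which is why functional behavior is unchanged.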
Local pre-push ran cargo clippy --all-targets -- -D warnings — passed on 0.1.91.
CI uses dtolnay/rust-toolchain@stable which pulls whatever's latest
(1.95.0 at ship time), catching clippy::unnecessary_sort_by which landed post-0.1.91.
Functional code from v0.11.4 is unaffected; only the -D warnings gate broke.
v0.11.4 tag + release left pointing at the failing commit as a historical artifact.
Integration-test pass against Claude Code found three specific friction points where tool responses forced a second round-trip or missed relevant nodes. All three fixed. Additive — no schema change, no re-index.
- ast_search generic-fallback hint. When returns="Vec<Relation>" yields zero hits because the codebase uses Vec<ParsedRelation>, the response now carries hint + suggested_query instead of a bare count: 0. Example: { "count": 0, "hint": "No match for returns='Vec<Relation>'. Substring 'Relation' has 7 matches — try that.", "suggested_query": {"returns": "Relation", "type": "fn"} }. Strip rule: innermost <…> wins; multi-param types take the last comma-separated param. See src/mcp/server/helpers.rs::strip_outer_generic.
- Acronym query expansion. fts5_search preprocessing now expands common CS/IR/DB acronyms into full-form terms alongside the original: RRF → RRF + reciprocal + rank + fusion; same for BM25, FTS, AST, LSP, MCP, RPC, SQL, ORM, CTE, JWT, TTL, DAG, RBAC, CRUD, CORS. Benchmark before/after on query "RRF fusion BM25": weighted_rrf_fusion now appears at rank 3 (previously absent from top-5). New static dict in src/search/acronyms.rs; expansions deduped via the existing BTreeSet pass.
- semantic_code_search acronym-heavy FTS bias. Queries that are entirely short uppercase tokens (≤3 tokens, each ≤5 chars, all [A-Z0-9]) now run with fts_weight=2.0, vec_weight=0.8 instead of the default 1.0/1.2. Rationale: embeddings handle letter-exact acronyms poorly while FTS5's token-exact match is reliable; shift the weight toward the precise channel.
- get_call_graph file-level rollup replaces compressed_call_graph. When the flat node list exceeds COMPRESSION_TOKEN_THRESHOLD (previously this mode dumped the raw list anyway), group by (file_path, direction) and emit {file, count, names[], node_ids[], min_depth, max_depth} sorted by count desc. New mode string "rollup_call_graph". Measured on ensure_indexed (86 nodes): previously 86 flat entries → now 2 caller rollups + 5 callee rollups, preserving node_ids for get_ast_node drill-down. Contract Δ: consumers matching on mode == "compressed_call_graph" must update to "rollup_call_graph".
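The file-level rollup is essentially a group-by on (file_path, direction). A minimal sketch under assumed types: `CallNode`, `Rollup`, and `Direction` are hypothetical, and carrying the direction inside each rollup entry is an assumption about how callers and callees are kept apart.

```rust
use std::cmp::Reverse;
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Direction {
    Caller,
    Callee,
}

/// Hypothetical flat call-graph entry.
pub struct CallNode {
    pub node_id: i64,
    pub name: String,
    pub file_path: String,
    pub direction: Direction,
    pub depth: u32,
}

#[derive(Debug)]
pub struct Rollup {
    pub file: String,
    pub direction: Direction,
    pub count: usize,
    pub names: Vec<String>,
    pub node_ids: Vec<i64>,
    pub min_depth: u32,
    pub max_depth: u32,
}

/// Groups flat nodes by (file_path, direction) and sorts groups by count, descending.
pub fn rollup_by_file(nodes: &[CallNode]) -> Vec<Rollup> {
    let mut groups: BTreeMap<(String, Direction), Rollup> = BTreeMap::new();
    for n in nodes {
        let entry = groups
            .entry((n.file_path.clone(), n.direction))
            .or_insert_with(|| Rollup {
                file: n.file_path.clone(),
                direction: n.direction,
                count: 0,
                names: Vec::new(),
                node_ids: Vec::new(),
                min_depth: n.depth,
                max_depth: n.depth,
            });
        entry.count += 1;
        entry.names.push(n.name.clone());
        entry.node_ids.push(n.node_id); // kept so callers can drill down per node
        entry.min_depth = entry.min_depth.min(n.depth);
        entry.max_depth = entry.max_depth.max(n.depth);
    }
    let mut out: Vec<Rollup> = groups.into_values().collect();
    out.sort_by_key(|r| Reverse(r.count));
    out
}
```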
- strip_outer_generic unit tests (4/4) cover Vec<T>, nested generics, multi-param (Result<T, E>), and no-bracket cases.
- acronyms::expand_acronym unit tests (4/4) cover case-insensitivity, unknown tokens, BM25 numeric acronym, and an FTS-length-filter guardrail.
- 230 lib tests + 44 integration tests all green.
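The stated strip rule (innermost <…> wins, last comma-separated param for multi-param types) could be sketched as below. This is a guess at the behavior from the rule text alone; `strip_generic_sketch` is a hypothetical name and is not the helper in helpers.rs, whose edge-case handling may differ.

```rust
/// Returns the innermost generic argument, or None when there is no <...> pair.
/// For multi-param generics the last comma-separated parameter is used.
pub fn strip_generic_sketch(ty: &str) -> Option<&str> {
    // The last '<' opens the innermost bracket pair.
    let open = ty.rfind('<')?;
    let rest = &ty[open + 1..];
    let close = rest.find('>')?;
    let inner = &rest[..close];
    let last = inner.rsplit(',').next()?.trim();
    if last.is_empty() {
        None
    } else {
        Some(last)
    }
}
```

A caller would retry the search with the returned substring and report it as suggested_query only if that retry finds matches.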
New module src/search/acronyms.rs. strip_outer_generic in
src/mcp/server/helpers.rs. All other edits localized to tool_ast_search,
tool_semantic_search, and format_call_graph_response in
src/mcp/server/tools.rs, plus one flat_map augmentation in
storage::queries::fts5_search_impl.
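The two acronym changes can be illustrated together. The sketch below is an assumption-laden miniature: the dictionary contains only the RRF entry that these notes spell out, and the `expand_query`, `is_acronym_heavy`, and `channel_weights` names are hypothetical.

```rust
use std::collections::BTreeSet;

/// One-entry stand-in for the static dictionary; other expansions are not listed here.
fn expansion(token: &str) -> &'static [&'static str] {
    match token.to_ascii_uppercase().as_str() {
        "RRF" => &["reciprocal", "rank", "fusion"],
        _ => &[],
    }
}

/// Keeps each original token and adds its expansion; the set dedupes repeats.
pub fn expand_query(query: &str) -> BTreeSet<String> {
    let mut terms = BTreeSet::new();
    for token in query.split_whitespace() {
        terms.insert(token.to_string());
        for extra in expansion(token) {
            terms.insert((*extra).to_string());
        }
    }
    terms
}

/// At most 3 tokens, each at most 5 chars, all in [A-Z0-9].
pub fn is_acronym_heavy(query: &str) -> bool {
    let tokens: Vec<&str> = query.split_whitespace().collect();
    !tokens.is_empty()
        && tokens.len() <= 3
        && tokens.iter().all(|t| {
            t.len() <= 5 && t.chars().all(|c| c.is_ascii_uppercase() || c.is_ascii_digit())
        })
}

/// Returns (fts_weight, vec_weight) for the fusion step.
pub fn channel_weights(query: &str) -> (f64, f64) {
    if is_acronym_heavy(query) {
        (2.0, 0.8)
    } else {
        (1.0, 1.2)
    }
}
```

Note that "RRF fusion BM25" is expanded but not acronym-heavy, because "fusion" is lowercase; the weight shift only applies to queries made entirely of short uppercase tokens.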
tests/routing_bench.rs — turns "does Claude Code naturally call our tools
for the right intents?" from vibe-check into a P@1 number. 20 oracle queries
(3 per tool for 6 tools + 2 for find_references), each sent to the Claude
API with the live 7-tool schemas from ToolRegistry; asserts the picked
tool matches the oracle expectation.
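P@1 here is simply the fraction of oracle queries whose picked tool equals the expected tool. A minimal sketch with a hypothetical `precision_at_1` helper; the API call and oracle table are left out:

```rust
/// Fraction of queries where the picked tool matches the oracle's expected tool.
pub fn precision_at_1(picked: &[&str], expected: &[&str]) -> f64 {
    assert_eq!(picked.len(), expected.len(), "one pick per oracle query");
    if expected.is_empty() {
        return 0.0; // assumption: an empty oracle scores 0
    }
    let hits = picked.iter().zip(expected).filter(|(p, e)| p == e).count();
    hits as f64 / expected.len() as f64
}
```

With 20 queries, each miss moves the score by 5 points, so a single borderline routing decision is visible in the result.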
- oracle_well_formed runs in default cargo test and verifies every oracle entry references a real tool and every registered tool has at least one oracle query; this catches drift when tools are renamed/added.
- routing_recall_benchmark is #[ignore] (requires ANTHROPIC_API_KEY). Run locally: ANTHROPIC_API_KEY=sk-... cargo test --test routing_bench -- --ignored --nocapture. Cost ≈ $0.10/run with claude-sonnet-4-6 (20 queries × ~1.2K in + ~150 out). Threshold starts at P@1 ≥ 0.70; tighten as descriptions improve.
- New dev-dep reqwest (blocking + rustls-tls, no TLS-OpenSSL pulled in).
- CI wiring deliberately not added yet: run manually or add a gated step (env: ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}) when ready.
v0.11.3 — Doc: "hidden but callable" clarified (Claude Code vs. raw MCP)
User-facing: no behavior change; corrects a misleading claim in the adopted plugin memory after a 12-tool UX audit.
v0.10.0 trimmed tools/list to 7 core tools and documented the other 5
(impact_analysis, trace_http_chain, dependency_graph, find_similar_code,
find_dead_code) as "hidden but callable by name". UX audit found this holds
only for clients that invoke tools/call with a literal tool name (raw JSON-RPC,
MCP SDKs, CLI). Claude Code's MCP integration derives its callable set from
tools/list — ToolSearch returns No matching deferred tools found for the
hidden 5, and direct invocation errors with No such tool available.
- claude-plugin/templates/plugin_code_graph_mcp.md "进阶 5" ("Advanced 5") table reworded: CLI is now the primary column for Claude Code users; raw MCP name calls annotated as SDK/scripts-only. v0.11.0 template auto-refresh pushes this to previously-adopted projects on next SessionStart.
- src/mcp/tools.rs doc comment: spells out which MCP clients can reach hidden tools and points to CLI fallback for Claude Code.
Misleading docs caused agents to attempt mcp__…__impact_analysis /
mcp__…__trace_http_chain and hit a terminal "No such tool available" error
instead of routing to code-graph-mcp impact|trace|deps|similar|dead-code
via Bash.
Follow-up audit on top of v0.11.1. All additive/tightening — no schema breakage.
- module_overview no longer leaks inline #[cfg(test)] test fns. Name-heuristic is_test_symbol couldn't catch #[cfg(test)] mod tests { #[test] fn anything_goes } whose names don't prefix test_. Root fix: get_module_exports SQL now WHERE n.is_test = 0 on both the explicit-exports (JS/TS) path and the fallback (Rust / Go / Python) path; the AST-level flag propagates through.
- Disambiguation suggestions carry node_id + start_line. resolve_fuzzy_name and disambiguate_symbol suggestions now include both fields so callers can pick a specific definition when multiple same-name functions live in one file (e.g. two fn new() in different impl blocks of the same module). disambiguate_symbol also fires on same-file multi-def, not just cross-file collisions.
- find_references gains node_id parameter. Lets callers pass the node_id from a suggestion directly, skipping the ambiguous name-lookup step. When a name is ambiguous within one file, the tool now returns a per-definition suggestion list (with start_line) instead of silently merging refs across defs.
- find_dead_code gets ignore_paths (MCP) / --ignore (CLI). Shell-invoked plugin entry points (lifecycle/hook scripts in claude-plugin/) are not in the static AST call graph, so they surfaced as false-positive orphans. Added prefix-match exclusions with a sensible default (["claude-plugin/"]). Pass ignore_paths: [] or --no-ignore to see the full list. Response carries ignored_count, ignore_paths_applied, ignore_paths_defaulted for transparency.
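The prefix-match exclusion for find_dead_code can be sketched as a small filter. The `Filtered` struct and `apply_ignore_paths` function are hypothetical; only the default list and the three reported fields come from the notes above.

```rust
pub const DEFAULT_IGNORE_PATHS: &[&str] = &["claude-plugin/"];

#[derive(Debug, PartialEq)]
pub struct Filtered<'a> {
    pub kept: Vec<&'a str>,
    pub ignored_count: usize,
    pub ignore_paths_defaulted: bool,
}

/// Drops files whose path starts with any ignore prefix.
/// `None` means the caller passed nothing, so the default list applies;
/// `Some(&[])` disables filtering and shows the full list.
pub fn apply_ignore_paths<'a>(files: &[&'a str], requested: Option<&[&str]>) -> Filtered<'a> {
    let (prefixes, defaulted) = match requested {
        Some(p) => (p, false),
        None => (DEFAULT_IGNORE_PATHS, true),
    };
    let kept: Vec<&'a str> = files
        .iter()
        .copied()
        .filter(|f| !prefixes.iter().any(|p| f.starts_with(*p)))
        .collect();
    Filtered {
        ignored_count: files.len() - kept.len(),
        kept,
        ignore_paths_defaulted: defaulted,
    }
}
```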
- plugin_code_graph_mcp.md: hidden-5 tools now have an explicit required/optional parameter table (notably trace_http_chain takes route_path, not route), so users calling by name no longer need to trigger the error message to discover arg names.
+4 new (+1 unit in queries.rs, +3 integration covering Bug #1 / Issue #3 /
Bug #2). Full suite: 347 passed / 0 failed default features,
340 passed / 0 failed --no-default-features; clippy
-D warnings clean under both feature configs.
Post-audit fixes for tool output correctness. All changes are additive/tightening — no consumer schema breakage.
- find_references: critical bugfix for exact-name resolution. resolve_fuzzy_name was matching substrings before exact names, so find_references("handle_tool") falsely reported ambiguity with handle_tools_list / handle_tools_call. Now exact-name matches win first; same-name-in-multiple-files still produces Ambiguous but scoped to exact matches only. Same fix benefits impact_analysis and get_call_graph fuzzy-fallback paths.
- Centralized truncation keeps arrays homogeneous. The centralized_compress pipeline used to splice a string sentinel ("... [N items truncated]") into the middle of object arrays, breaking type consistency for strict JSON consumers and hiding how much was dropped. Arrays now truncate silently to first-10 + last-5 (15 homogeneous items), and a new _array_truncations: {<field>: {original, kept}} sibling records the true pre-truncation length so callers can reconcile count/total siblings against what was actually returned.
- project_map schema sharpened. hot_functions SQL tightened to n.type IN ('function','method') so structs/classes no longer leak into the "hot functions" bucket. entry_points[].kind added: "main" for program entry points, "http_route" for framework-registered handlers. Lets LLMs skip main when scanning the HTTP surface without sniffing the route string.

- dependency_graph filters the <external> sentinel. The synthetic bucket for unresolved imports no longer surfaces as a fake file dependency.
- find_similar_code reports cutoff-driven shortfalls. When max_distance drops candidates below top_k, the response now carries cutoff_applied: true, cutoff_dropped: N, and a hint suggesting the user widen max_distance. Also echoes top_k and max_distance in every response for transparency.
- impact_analysis on types returns risk_level: "UNKNOWN". When the target is a struct/class/enum/interface/type_alias and the call graph finds zero callers, the risk level is now UNKNOWN instead of LOW, so LLMs don't mistake "call graph can't see type usage" for "no one uses this". The existing type_warning still explains why and points to semantic_code_search for broader coverage.
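The homogeneous truncation described in the list above can be sketched as follows. The threshold of 20 is inferred from the "no-op when arrays < 20" test description and is an assumption, as are the `truncate_array` and `Truncation` names.

```rust
#[derive(Debug, PartialEq)]
pub struct Truncation {
    pub original: usize,
    pub kept: usize,
}

/// Keeps the first 10 and last 5 items of long arrays without inserting a sentinel,
/// and reports the pre-truncation length separately.
pub fn truncate_array<T: Clone>(items: &[T]) -> (Vec<T>, Option<Truncation>) {
    const THRESHOLD: usize = 20; // assumption, see lead-in
    const HEAD: usize = 10;
    const TAIL: usize = 5;
    if items.len() < THRESHOLD {
        return (items.to_vec(), None);
    }
    let mut kept = items[..HEAD].to_vec();
    kept.extend_from_slice(&items[items.len() - TAIL..]);
    let meta = Truncation {
        original: items.len(),
        kept: kept.len(),
    };
    (kept, Some(meta))
}
```

The returned metadata plays the role of one _array_truncations entry, so every element of the array keeps the same type.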
- +2 unit tests in src/mcp/server/helpers.rs (truncation homogeneity, no-op when arrays < 20).
- +6 integration tests in tests/integration.rs covering each fix above.
- Full suite: lib 221 + integration 41 + cli_e2e 50 + parser 19 + plugin 6 + hardening 6 = 343 passed, clippy clean.
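The exact-name-first ordering behind the find_references fix can be sketched like this. It is a simplified model, not resolve_fuzzy_name itself: `Symbol`, `Resolution`, and `resolve_name` are hypothetical, and plain substring matching stands in for whatever fuzzy strategy the real code uses.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Symbol<'a> {
    pub name: &'a str,
    pub file: &'a str,
}

#[derive(Debug, PartialEq)]
pub enum Resolution<'a> {
    Unique(Symbol<'a>),
    Ambiguous(Vec<Symbol<'a>>),
    NotFound,
}

/// Exact-name matches win; substring matches are only a fallback.
pub fn resolve_name<'a>(query: &str, symbols: &[Symbol<'a>]) -> Resolution<'a> {
    let exact: Vec<Symbol<'a>> = symbols.iter().copied().filter(|s| s.name == query).collect();
    let pool: Vec<Symbol<'a>> = if exact.is_empty() {
        symbols.iter().copied().filter(|s| s.name.contains(query)).collect()
    } else {
        exact
    };
    match pool.len() {
        0 => Resolution::NotFound,
        1 => Resolution::Unique(pool[0]),
        _ => Resolution::Ambiguous(pool),
    }
}
```

Because ambiguity is computed over the exact pool when one exists, the same name defined in several files is still reported as ambiguous, while near-miss names no longer interfere.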
v0.10.0 shipped the 7-core/5-hidden tool surface in the Rust binary but left the adopted plugin_code_graph_mcp.md decision table file — and the MEMORY.md sentinel block — stuck at the v0.8.x/v0.9.x 12-tool content for any project that had already auto-adopted. The plugin's maybeAutoAdopt() short-circuited on isAdopted() == true and never refreshed the template. Two related holes were also fixed:
- The shipped source template (claude-plugin/templates/plugin_code_graph_mcp.md) was not updated in v0.10.0, so new /plugin install + first-adopt users were also getting the stale 12-tool table.
- The INDEX_LINE constant in adopt.js (which drives the MEMORY.md sentinel block) was likewise still the v0.8.x 12-tool line.
- Source template synced to match the 7-core / 5-hidden surface. Fresh /plugin install gets the correct decision table on first adopt.
- INDEX_LINE synced to the v0.10.0 wording.
- Auto-refresh on drift: when a project is already adopted but the shipped template hash ≠ the project's copy (or the MEMORY.md sentinel block's content ≠ current INDEX_LINE), the next plugin SessionStart refreshes both silently. One-time stderr notice: [code-graph] Refreshed decision table to latest shipped version.
- Hand-edited decision tables are overwritten by default. To lock: CODE_GRAPH_NO_TEMPLATE_REFRESH=1 in ~/.claude/settings.json env.
- CODE_GRAPH_NO_TEMPLATE_REFRESH=1: preserves your local edits of plugin_code_graph_mcp.md; also pins the MEMORY.md sentinel to whatever it was. Does not affect first-adopt (only the refresh path).
- CODE_GRAPH_NO_AUTO_ADOPT=1: still gates the first-adopt path as in v0.9.0.
- code-graph-mcp unadopt: unchanged; strips sentinel + deletes target file.
Without this fix, an already-adopted v0.8.x/v0.9.x user who upgrades to v0.10.x gets mixed state: the Rust binary serves 7 tools in tools/list but the MEMORY.md index + decision-table file still instruct the LLM to route through the full 12-tool surface as if they were peers. Functionally nothing breaks (hidden tools remain callable by name), but the decision guidance is misaligned. v0.11.0 closes the loop so the three surfaces — binary, index pointer, decision table — all move together on upgrade.
MCP tools/list now advertises 7 tools instead of 12. The 5 hidden tools remain fully callable by name (aliases preserved) — only their visibility to the LLM at session start is removed, to shrink tools/list payload (~40% reduction) and cut decision fatigue in daily coding flows.
Core 7 (exposed in tools/list):
semantic_code_search, get_call_graph, get_ast_node, module_overview, project_map, find_references, ast_search.
Hidden but callable by name / CLI (backward-compatible aliases):
impact_analysis, trace_http_chain, dependency_graph, find_similar_code, find_dead_code.
Rationale: these 5 are niche (cleanup, duplicate detection, HTTP routing, file-level imports, blast-radius pre-check) — high value when needed, low daily frequency. For the primary blast-radius use case, prefer get_ast_node symbol_name=X include_impact=true which is in the core 7.
Reverse / opt-out: call any hidden tool by name via MCP tools/call or the matching code-graph-mcp <subcommand> CLI. All handlers, schemas, and CLI paths unchanged — only the tools/list catalog shrunk.
Memory sync: projects that auto-adopted v0.9.x will see updated plugin_code_graph_mcp.md decision tables on next session.
CI-only cleanup; no runtime behavior changes, no user-visible differences. Fixes 9 clippy errors surfaced by Rust 1.95.0's stricter lints (pre-existing since ~v0.8.1, was shipping with red CI):
- collapsible_match (4): merge match arm => if cond into match arm if cond => in src/parser/relations.rs C# arms + Python decorator scan.
- unnecessary_sort_by (4): .sort_by(|a,b| b.x.cmp(&a.x)) → .sort_by_key(|e| Reverse(e.x)) in src/mcp/server/tools.rs and src/storage/queries.rs.
- useless_conversion (1): drop redundant .into_iter() in a chained iterator in src/graph/query.rs.
Verified with cargo +1.95.0 clippy -- -D warnings on both --no-default-features and default feature sets.
Plugin-mode installs (/plugin install in Claude Code) now auto-adopt into the project's MEMORY.md on first SessionStart. Previously adoption required running the adopt script manually, which most users never discovered — so the tool-invocation contract never got loaded and MCP tools stayed underused.
What changes on first upgrade (plugin mode):
- ~/.claude/projects/<slug>/memory/plugin_code_graph_mcp.md is written (tool-decision rules).
- A sentinel-bracketed pointer line is appended to MEMORY.md.
- quietHooks flips to true automatically: per-session project_map injection (~60 lines) is skipped; tools are loaded on-demand instead.
- A single stderr notice fires on the first adoption showing how to opt out or reverse.
Opt-outs (in ~/.claude/settings.json → env):
- CODE_GRAPH_NO_AUTO_ADOPT=1: prevents future auto-adoption; does not affect already-adopted projects.
- CODE_GRAPH_QUIET_HOOKS=0: forces project_map injection back on, even if adopted.
- CODE_GRAPH_QUIET_HOOKS=1: forces silent mode, even if not adopted.
Reverse adoption: code-graph-mcp unadopt (now a real CLI subcommand — see below).
What does NOT auto-adopt:
- npm global installs (npm install -g @sdsrs/code-graph)
- npx ./tarball.tgz invocations
- Bare dev checkouts / test fixtures
- CI / agent short-session contexts
Detection uses the script's __dirname (checks for ~/.claude/plugins/ prefix), not CLAUDE_PLUGIN_ROOT — the env var leaks across concurrent plugins.
- code-graph-mcp adopt / unadopt CLI subcommands: previously only callable via node claude-plugin/scripts/adopt.js. Now uniform across plugin / npm / npx installs via bin/cli.js interception.
- CODE_GRAPH_NO_AUTO_ADOPT=1: explicit opt-out env for auto-adopt.
- code-graph-mcp show <file-path> nudge: when the positional argument is an existing code file on disk, emit a clear pointer to overview <file> instead of silently returning no rows. show is for symbols; overview is for files.
- code-graph-mcp deps barrel fallback: files with no tracked dependency edges (Rust mod.rs, index.ts barrels, Python __init__.py) now scan source for language-appropriate re-export / import lines and surface them (previously a hard error).
- Impact / references filter <external> placeholders: stub nodes synthesized for unresolved external symbols no longer surface in impact_analysis / find_references results.
The default meaning of "plugin installed but not adopted" changed from "inject project_map every session, user must find /adopt to opt into the contract" to "adopted implicitly from the install action, quiet by default". Hence the minor bump. Users who preferred the v0.8.x noisy default can pin it with CODE_GRAPH_QUIET_HOOKS=0.
See release notes.
See GitHub Releases.