CoreGraph: Overview

CoreGraph builds an in-memory code symbol graph for multi-language and monorepo codebases. It combines tree-sitter (symbol extraction) with stack-graphs (cross-file name resolution), serves the graph from a background daemon, and answers questions a single-file search cannot — who calls this, what breaks if I change it, what is dead, and where do two languages disagree about the same value.

What it answers

Ask "who calls compute_impact?" and you get the callers, not text matches:

$ coregraph query compute_impact --direction incoming --edge-kind calls --hop-limit 1

── query: compute_impact ──────────────────────────────────

✓ compute_impact [crates/query/src/impact.rs:27]
  kind: Function | package: query (cargo)

  Incoming (14):
  ├── calls ← run [Function] @ crates/cli/src/commands/diff.rs      [0.85] ✓
  ├── calls ← run [Function] @ crates/cli/src/commands/impact.rs      [0.85] ✓
  ├── calls ← cached_impact [Function] @ crates/cli/src/dispatch.rs      [0.85] ✓
  ├── calls ← api_impact [Function] @ crates/server/src/handlers.rs      [0.85] ✓
  └── ... (14 total)
  ✓ trust: all paths verified

── page 1/1 | 14 edges total | budget: 506/5600 tokens ──

Each edge carries a confidence score ([0.85]) and a trust mark (✓), so an LLM — or a human — knows how much weight to give the relationship. That trust tagging is the core idea; everything else builds on it.

Why it exists

Existing tools each miss something:

| Tool | Limitation |
| --- | --- |
| `grep` / regex | No syntax awareness. Can't follow renames, re-exports, or dynamic dispatch; matches text, not symbols. |
| `ctags` | Indexes definition locations only. No reference edges, no value-level tracking. |
| Single-language LSP | Sees one language at a time. Can't connect cross-language references or wire config to code. |

CoreGraph pairs tree-sitter parsing with stack-graphs name resolution to build a project-wide graph that spans languages, then exposes queries tuned for feeding context to an LLM.

Identity: a graph builder, not a compiler

CoreGraph is deliberately not a compiler.

A compiler's consumer is a machine, and a 0.01% error fails the build.
CoreGraph's consumer is an LLM (or a person), which reasons usefully from partial information.
Dropping a real relationship because it couldn't be proven is more harmful than surfacing it with a lower confidence score.

So CoreGraph prefers trust-tagging plus fast answers over compiler-grade certainty. Every edge is annotated with how it was derived and how much to trust it (see Confidence), instead of being silently discarded.
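The keep-and-tag policy can be illustrated with a small sketch. Everything in it is hypothetical: the `Edge` and `Derivation` types, the field names, the `?` mark, and the threshold are illustrative assumptions, not CoreGraph's actual data model or output format. It shows a consumer ranking edges by confidence and flagging weak ones instead of dropping them.

```rust
// Hypothetical types for illustration only; not CoreGraph's real data model.

/// How an edge was derived (assumed categories).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Derivation {
    /// Proven through name-resolution rules.
    Resolved,
    /// Matched heuristically, for example by name only.
    Heuristic,
}

#[derive(Debug, Clone, PartialEq)]
struct Edge {
    from: String,
    to: String,
    kind: String,
    derivation: Derivation,
    /// Trust score in the range 0.0..=1.0.
    confidence: f32,
}

/// Order edges from most to least trusted. No edge is removed.
fn rank_by_trust(mut edges: Vec<Edge>) -> Vec<Edge> {
    edges.sort_by(|a, b| b.confidence.total_cmp(&a.confidence));
    edges
}

/// Render one edge, marking it "?" when its confidence is below `threshold`.
fn render(edge: &Edge, threshold: f32) -> String {
    let mark = if edge.confidence >= threshold { "✓" } else { "?" };
    format!("{} <- {} [{:.2}] {}", edge.kind, edge.from, edge.confidence, mark)
}
```

A low-confidence heuristic edge still reaches the reader this way. It simply sorts last and carries a visible warning mark, which is the trade-off the section describes.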

What you can do that `grep` can't

| Capability | What it does | Try |
| --- | --- | --- |
| Dead-code detection | Lists public symbols with no semantic edges in either direction (no callers and no callees), accounting for renames, re-exports, and dynamic dispatch. Structural edges (Contains/BelongsTo/Documents/DescribedIn) are excluded from this count. | `coregraph orphans` |
| Cross-language linking | Connects code across language boundaries through dedicated mediators (Spring DI/config, React Router, Docker Compose, Go DI) and shared API paths (`ApiPathMatch` edges). | `coregraph query <symbol>` |
| Config ↔ code consistency | Relates config keys (`application.yml`, `docker-compose.yml`, …) to the code that reads them. | `coregraph inconsistencies --category config-key` |
| Enum / value mismatches | Flags the same logical value spelled differently across languages (e.g. a Java enum constant vs. a TypeScript string). | `coregraph inconsistencies --category enum-mismatch` |
| Impact analysis | Computes the transitive set a change can reach, with optional risk scoring and affected tests. | `coregraph impact <symbol> --risk` |
| LLM context shaping | Extracts an N-hop subgraph around a symbol and serializes it to fit a token budget. | `coregraph query <symbol> --depth 2` |
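Two rows of the table, dead-code detection and impact analysis, reduce to simple graph operations. The sketch below works on a toy edge list and is not CoreGraph's implementation. The `EdgeKind` enum, the tuple edge representation, and both function names are hypothetical, `Calls` stands in for every semantic edge kind, and treating impact as "transitive callers" is an assumption.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Hypothetical, simplified edge kinds. The structural names come from the
// table; `Calls` stands in for any semantic edge.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EdgeKind {
    Calls,
    Contains,
    BelongsTo,
    Documents,
    DescribedIn,
}

impl EdgeKind {
    fn is_structural(self) -> bool {
        !matches!(self, EdgeKind::Calls)
    }
}

/// A directed edge `from -> to`, e.g. `from` calls `to`.
type Edge = (&'static str, EdgeKind, &'static str);

/// Symbols that have no semantic edge in either direction.
fn orphans(symbols: &[&'static str], edges: &[Edge]) -> Vec<&'static str> {
    let mut connected = HashSet::new();
    for &(from, kind, to) in edges {
        if !kind.is_structural() {
            connected.insert(from);
            connected.insert(to);
        }
    }
    symbols.iter().copied().filter(|s| !connected.contains(s)).collect()
}

/// Symbols that reach `start` through semantic edges within `hop_limit` hops
/// (callers, callers of callers, and so on). Breadth-first traversal.
fn impact(start: &'static str, edges: &[Edge], hop_limit: usize) -> HashSet<&'static str> {
    // Reverse adjacency: callee -> callers.
    let mut incoming: HashMap<&'static str, Vec<&'static str>> = HashMap::new();
    for &(from, kind, to) in edges {
        if !kind.is_structural() {
            incoming.entry(to).or_default().push(from);
        }
    }
    let mut seen = HashSet::new();
    let mut queue = VecDeque::from([(start, 0usize)]);
    while let Some((node, depth)) = queue.pop_front() {
        if depth == hop_limit {
            continue; // do not expand past the hop limit
        }
        for &caller in incoming.get(node).into_iter().flatten() {
            if caller != start && seen.insert(caller) {
                queue.push_back((caller, depth + 1));
            }
        }
    }
    seen
}
```

In this toy model a symbol that is only contained in a module, but never called and never calling, counts as an orphan, because structural edges do not make it "connected".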

CoreGraph also tracks documentation: `///` and `/** */` doc comments and Markdown sections become nodes, so `coregraph inconsistencies --category doc-drift` can catch a `@param` that names an argument the signature no longer has. Doc-drift detection currently covers only the JS/TS/Java/Python `@param`/`:param` conventions; Rust rustdoc and Go doc comments are not yet checked.
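The doc-drift idea comes down to a comparison between documented parameter names and the names in the signature. The function below is a hypothetical helper, not part of CoreGraph. It assumes the plain `@param name description` form and does not handle typed JSDoc tags such as `@param {string} name` or the Python `:param` style.

```rust
/// Return the `@param` names in `doc` that do not appear in `params`.
/// Assumes each tag has the form `@param <name> <description...>`.
fn stale_params<'a>(doc: &'a str, params: &[&str]) -> Vec<&'a str> {
    let mut tokens = doc.split_whitespace();
    let mut stale = Vec::new();
    while let Some(token) = tokens.next() {
        if token == "@param" {
            // The token right after `@param` is taken as the parameter name.
            if let Some(name) = tokens.next() {
                if !params.contains(&name) {
                    stale.push(name);
                }
            }
        }
    }
    stale
}
```

Called with a doc comment that documents `user_id` and `force` and a signature that only has `user_id`, the helper returns `force`, the argument the documentation still mentions after the signature dropped it.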

All seven code languages — Rust, Java, Kotlin, TypeScript, JavaScript, Go, Python — have stack-graphs name-resolution rules (Java/TS/JS/Python from upstream, Go/Rust/Kotlin hand-authored here), so cross-file resolution works the same way across the stack.

Glossary

Evidence: the source file that justifies an edge. A `calls` edge's evidence is the file containing the call site. When that file changes, the edge becomes stale.
Stale: a symbol or edge is stale when its source file has changed and the extracted data is no longer current.
Healing: re-parsing stale files to refresh the graph. CoreGraph heals on demand at query time: before answering, it re-extracts every file known to the project graph whose content hash has changed (within a time budget), not only the files the query touches. Pass `--no-heal` to skip healing and read the graph as-is.
Epoch: a monotonic version number bumped after each invalidate-and-heal cycle. It keys the query cache and signals staleness; it is not an atomically swapped immutable graph version. The graph itself sits behind an `RwLock`: queries take the read lock, while healing and invalidation take the write lock and mutate the graph in place. See Architecture for the concurrency model.
Server (daemon): a single background process that holds the in-memory graph for one or more projects. It serves the IPC socket and the HTTP API, and backs the LSP and MCP stdio bridges (which reuse its in-memory graph when it is running).
Client (thin client): the CLI. It forwards queries to the daemon over an IPC socket and auto-starts the daemon on first use (or runs in-process with `--no-auto-start`).
ACTIVE / LOADING: a project's load state in `server status`. ACTIVE means its graph is in memory; LOADING means it is being loaded. When a project sits idle, its graph is snapshotted to disk (if dirty) and dropped from memory, and the entry is removed from the daemon entirely, so an idle project simply disappears from the listing rather than appearing as a separate state. See Architecture.
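The Epoch entry describes a graph mutated in place behind an `RwLock`, with the epoch keying the query cache. A minimal sketch of that arrangement follows. The `Graph` and `Daemon` types, their fields, and the `heal` and `query` signatures are all hypothetical, and content hashing and the healing time budget are left out. Only `std::sync::RwLock` is a real API here.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical stand-ins; not CoreGraph's real types.
#[derive(Default)]
struct Graph {
    /// callee -> callers
    callers: HashMap<String, Vec<String>>,
    /// Bumped after each invalidate-and-heal cycle.
    epoch: u64,
}

#[derive(Default)]
struct Daemon {
    graph: RwLock<Graph>,
    /// Keyed by (epoch, symbol), so entries from older epochs are never hit.
    cache: RwLock<HashMap<(u64, String), Vec<String>>>,
}

impl Daemon {
    /// Healing takes the write lock and mutates the graph in place.
    fn heal(&self, callee: &str, callers: Vec<String>) {
        let mut graph = self.graph.write().unwrap();
        graph.callers.insert(callee.to_string(), callers);
        graph.epoch += 1;
    }

    /// Queries take the read lock; results are cached per epoch.
    fn query(&self, callee: &str) -> (u64, Vec<String>) {
        let graph = self.graph.read().unwrap();
        let key = (graph.epoch, callee.to_string());
        // The cache read guard is released at the end of this statement.
        let cached = self.cache.read().unwrap().get(&key).cloned();
        if let Some(hit) = cached {
            return (graph.epoch, hit);
        }
        let result = graph.callers.get(callee).cloned().unwrap_or_default();
        self.cache.write().unwrap().insert(key, result.clone());
        (graph.epoch, result)
    }
}
```

Because the epoch is part of the cache key, a heal never has to clear the cache explicitly: the next query computes a key with the new epoch and misses. A real cache would also need to evict entries from old epochs, which this sketch does not do.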
