Commit 4555db3

Merge pull request #58 from optave/fix/default-embedding-model
feat: add Mermaid output to diff-impact
2 parents e0c2c06 + d2d767f commit 4555db3

7 files changed

Lines changed: 516 additions & 12 deletions

.github/workflows/publish.yml

Lines changed: 2 additions & 2 deletions
@@ -22,8 +22,8 @@ jobs:
   preflight:
     name: Preflight checks
     runs-on: ubuntu-latest
-    # Skip dev publish when the push is a stable release version bump
-    if: "${{ github.event_name != 'push' || !startsWith(github.event.head_commit.message, 'chore: release v') }}"
+    # Skip dev publish when the push is a stable release version bump (direct push or merged PR)
+    if: "${{ github.event_name != 'push' || (!startsWith(github.event.head_commit.message, 'chore: release v') && !contains(github.event.head_commit.message, 'release/v')) }}"
     permissions:
       contents: read
     steps:
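
To see which pushes the updated condition skips, the expression can be modelled as a small JavaScript predicate. This is only an illustration: `shouldRunPreflight` is a hypothetical helper, the sample commit messages are made up, and the lowercasing reflects that GitHub's `startsWith` and `contains` expression functions compare without regard to case.

```javascript
// Hypothetical model of the `if:` expression above; not part of the workflow.
function shouldRunPreflight(eventName, headCommitMessage) {
  // GitHub's startsWith() and contains() ignore case, so compare lowercased text.
  const message = String(headCommitMessage ?? '').toLowerCase();
  const isReleaseBump = message.startsWith('chore: release v');
  const mentionsReleaseBranch = message.includes('release/v');
  return eventName !== 'push' || (!isReleaseBump && !mentionsReleaseBranch);
}

// Direct push of a release bump: skipped.
console.log(shouldRunPreflight('push', 'chore: release v1.2.0'));
// Merge commit of a release branch (made-up message): skipped.
console.log(shouldRunPreflight('push', 'Merge pull request #12 from org/release/v1.2.0'));
// Ordinary feature push: runs.
console.log(shouldRunPreflight('push', 'feat: add Mermaid output to diff-impact'));
```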

docs/llm-integration.md

Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
# LLM Integration — Feature Planning

> **Core principle:** Compute once at build time, serve compressed at query time. The graph tells you what's connected, the LLM tells you what it means, and the consuming AI gets both without reading raw code.

## Architecture

Two layers:

1. **Build-time LLM enrichment** — during `codegraph build`, an LLM annotates each function/class with semantic metadata (summaries, purpose, side effects, etc.) and stores it in the graph DB.
2. **Query-time token savings** — the consuming AI model (via MCP) gets pre-digested context instead of raw source code.

```
Code changes → codegraph build (+ LLM enrichment) → SQLite DB with semantic metadata

AI model queries via MCP

Gets structured summaries,
not raw code → saves tokens
```
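
The "compute once at build time" idea can be sketched as a pass that only re-summarizes nodes whose content hash changed. Everything here is an assumption for illustration: `enrichNodes`, the node shape, and the `summarize` callback are hypothetical, a `Map` stands in for the SQLite DB, and a real LLM call would normally be asynchronous.

```javascript
// Sketch only: `enrichNodes`, `summarize`, and the node shape are assumptions,
// and a Map stands in for the SQLite DB.
function enrichNodes(nodes, summarize, store = new Map()) {
  let llmCalls = 0;
  for (const node of nodes) {
    const cached = store.get(node.id);
    // Skip nodes whose source hash is unchanged since the last build.
    if (cached && cached.hash === node.hash) continue;
    store.set(node.id, { hash: node.hash, summary: summarize(node) });
    llmCalls += 1;
  }
  return { store, llmCalls };
}

// Stub standing in for a real LLM call.
const fakeSummarize = (node) => `Summary of ${node.name}`;

const sampleNodes = [
  { id: 'a.js:parse', name: 'parse', hash: 'h1' },
  { id: 'a.js:emit', name: 'emit', hash: 'h2' },
];
const firstBuild = enrichNodes(sampleNodes, fakeSummarize);
// Second build: only `emit` changed, so only one node is re-summarized.
sampleNodes[1].hash = 'h3';
const secondBuild = enrichNodes(sampleNodes, fakeSummarize, firstBuild.store);
console.log(firstBuild.llmCalls, secondBuild.llmCalls);
```

At query time the consumer would read `store.get(id).summary` instead of the function body, which is where the token savings come from.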

---

## Features by Category

### Understanding & Documentation

#### "What problem does this function solve?"
- `summaries` table — LLM-generated one-liner per node, stored at build time
- MCP tool: `explain_purpose <name>` — returns summary + caller context ("it's called by X to do Y")

#### "Summarize this module in plain English"
- Module-level rollup summaries — aggregate function summaries + dependency direction into a module narrative
- MCP tool: `explain_module <file>` — returns module purpose, key exports, role in the system

#### "Auto-generate meaningful docstrings"
- `docstrings` column on nodes — LLM-generated, aware of callers/callees/types
- CLI command: `codegraph annotate` — generates or updates docstrings for changed functions
- Diff-aware: only regenerate for functions whose code or dependencies changed

---

### Code Review & Quality

#### "Is this function doing too much?"
- `complexity_notes` column — LLM assessment stored at build time: responsibility count, cohesion rating
- Graph metrics feed into the assessment: fan-in, fan-out, edge count
- MCP tool: `assess <name>` — returns complexity rating + specific concerns

#### "Are there naming inconsistencies?"
- `naming_conventions` metadata per module — detected patterns (camelCase, snake_case, verb-first, etc.)
- CLI command: `codegraph lint-names` — LLM compares names against detected conventions, flags outliers
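
The pattern-detection half of this idea does not need an LLM. A minimal sketch, assuming a regex classifier and a "most frequent style wins" rule (both are illustrative choices, and `classifyName` and `findNamingOutliers` are hypothetical names):

```javascript
// Sketch only: the classifier, its regexes, and `findNamingOutliers` are assumptions.
function classifyName(name) {
  if (/^[a-z][a-z0-9]*$/.test(name)) return 'neutral'; // single word fits any style
  if (/^[a-z][a-z0-9]*(_[a-z0-9]+)+$/.test(name)) return 'snake_case';
  if (/^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$/.test(name)) return 'camelCase';
  if (/^[A-Z][a-z0-9]+([A-Z][a-z0-9]*)*$/.test(name)) return 'PascalCase';
  return 'other';
}

function findNamingOutliers(names) {
  const counts = new Map();
  for (const name of names) {
    const style = classifyName(name);
    if (style !== 'neutral') counts.set(style, (counts.get(style) ?? 0) + 1);
  }
  // The most frequent style is treated as the module's convention.
  const ranked = [...counts.entries()].sort((a, b) => b[1] - a[1]);
  const dominant = ranked.length > 0 ? ranked[0][0] : null;
  const outliers = names.filter((name) => {
    const style = classifyName(name);
    return dominant !== null && style !== 'neutral' && style !== dominant;
  });
  return { dominant, outliers };
}

// camelCase dominates here, so `diff_impact` is reported as the outlier.
const namingReport = findNamingOutliers(['buildGraph', 'findCycles', 'diff_impact', 'parse']);
console.log(namingReport.dominant, namingReport.outliers);
```

An LLM pass could then be limited to the flagged outliers, for example to judge whether a name is a deliberate exception.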

#### "Smart PR review"
- `diff-review` command — takes a diff, walks the graph for affected nodes, fetches their summaries
- Returns: what changed, what's affected, risk assessment, suggested review focus areas
- MCP tool: `review_diff <ref>` — structured review the consuming AI can relay to the user

#### "Show me a visual impact graph for this PR"
- **Foundation (implemented):** `codegraph diff-impact <base> --format mermaid -T` generates a Mermaid flowchart showing changed functions, transitive callers, and blast radius — color-coded by new/modified/blast-radius
- **CI automation:** GitHub Action that runs on every PR:
  1. `codegraph build .` (incremental, fast on CI cache)
  2. `codegraph diff-impact $BASE_REF --format mermaid -T` to generate the graph
  3. Post as a PR comment — GitHub renders Mermaid natively in markdown
  4. Update on new pushes (edit the existing comment)
- **LLM-enriched annotations:** Overlay the graph with semantic context:
  - For each changed function: one-line summary of WHAT changed (from diff hunks)
  - For each affected caller: WHY it's affected — what behavior might change downstream
  - Risk labels per node: `low` (cosmetic / internal), `medium` (behavior change), `high` (breaking / public API)
  - Node colors shift from green → yellow → red based on risk, replacing the static new/modified styling
- **Diff-aware narrative:** LLM reads the diff + graph and generates a structured PR summary:
  - "What changed and why it matters" per function
  - Potential breaking changes and side effects (from `side_effects` metadata)
  - Overall PR risk score (aggregate of node risks weighted by centrality)
- **Review focus:** Prioritize reviewer attention:
  - Rank affected files by risk × blast radius — "review this file first"
  - Highlight critical paths: the shortest path from a changed function to a high-fan-in entry point
  - Flag test coverage gaps for affected code (cross-reference with test file graph edges)
- **Historical context overlay:**
  - Annotate nodes with churn data: "this function changed 12 times in the last 30 days"
  - Highlight fragile nodes: high churn + high fan-in = high breakage risk
  - Track blast radius trends over time: "this PR's blast radius is 2× larger than your average"
- **Interactive rendering (stretch):**
  - Render as SVG with clickable nodes linking to file:line in the PR diff view
  - Collapse/expand depth levels to manage large graphs
  - Filter by risk level or file path
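
The green → yellow → red risk styling described above could be produced with Mermaid's `classDef` and the `:::` class shorthand. The sketch below is illustrative only: `renderRiskMermaid`, the node and edge shapes, and the colors are assumptions, and it does not reproduce the output of the real `diffImpactMermaid`.

```javascript
// Sketch only: `renderRiskMermaid`, the node/edge shapes, and the colors are assumptions;
// this is not the output format of the real `diffImpactMermaid`.
const RISK_COLORS = { low: '#c8e6c9', medium: '#fff9c4', high: '#ffcdd2' };

function renderRiskMermaid(nodes, edges) {
  const lines = ['flowchart TB'];
  // Short synthetic ids avoid problems with characters Mermaid would reject in ids.
  const ids = new Map(nodes.map((node, index) => [node.name, `n${index}`]));
  for (const node of nodes) {
    // Quoted labels keep names with punctuation from breaking the parser.
    lines.push(`  ${ids.get(node.name)}["${node.name}"]:::${node.risk}`);
  }
  for (const [source, target] of edges) {
    lines.push(`  ${ids.get(source)} --> ${ids.get(target)}`);
  }
  for (const [risk, fill] of Object.entries(RISK_COLORS)) {
    lines.push(`  classDef ${risk} fill:${fill},stroke:#333`);
  }
  return lines.join('\n');
}

// Made-up sample: a changed function and the callers its change reaches.
const riskDiagram = renderRiskMermaid(
  [
    { name: 'parseConfig', risk: 'high' },
    { name: 'loadProject', risk: 'medium' },
    { name: 'main', risk: 'low' },
  ],
  [
    ['parseConfig', 'loadProject'],
    ['loadProject', 'main'],
  ],
);
console.log(riskDiagram);
```

Wrapping the returned text in a `mermaid` fenced block inside a PR comment is what lets GitHub render it.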

**Infrastructure needed:**
| What | Where | Depends on |
|------|-------|------------|
| GitHub Action workflow | `.github/workflows/impact-graph.yml` | `diff-impact --format mermaid` (done) |
| LLM diff summarizer | `llm.js` + `queries.js` | LLM provider abstraction, `summaries` table |
| Risk scoring per node | `nodes` table column | LLM assessment + graph centrality metrics |
| Churn tracking | `metadata` table | Git log integration at build time |
| SVG renderer | New module or external tool | Mermaid CLI (`mmdc`) or D3-based renderer |
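
One possible reading of "aggregate of node risks weighted by centrality" is a weighted average. The sketch below assumes a 1 to 3 numeric mapping for the risk labels and uses fan-in as a stand-in for centrality; both are illustrative assumptions, as is the name `prRiskScore`.

```javascript
// Sketch only: the numeric mapping and the weighting scheme are assumptions.
const RISK_VALUE = { low: 1, medium: 2, high: 3 };

function prRiskScore(nodes) {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const node of nodes) {
    // Fan-in serves as a simple centrality proxy; +1 keeps leaf nodes from vanishing.
    const weight = node.fanIn + 1;
    weightedSum += RISK_VALUE[node.risk] * weight;
    totalWeight += weight;
  }
  return totalWeight === 0 ? 0 : weightedSum / totalWeight;
}

// Made-up sample: a rarely used helper and a widely called loader.
const sampleScore = prRiskScore([
  { name: 'formatDate', risk: 'low', fanIn: 0 },
  { name: 'loadConfig', risk: 'high', fanIn: 9 },
]);
console.log(sampleScore.toFixed(2)); // 2.82, i.e. (1*1 + 3*10) / 11
```

The high-fan-in node dominates the result, which matches the intent that a risky change to a central function should outweigh several cosmetic ones.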

---

### Refactoring Assistance

#### "Can I safely split this file?"
- `split_analysis <file>` — graph identifies clusters of tightly-coupled functions within the file, LLM suggests groupings
- Returns: proposed split, edges that would cross file boundaries, risk of circular imports

#### "Which functions are extraction candidates?"
- `extraction_candidates` query — find functions called from multiple modules (high fan-in, low internal coupling)
- LLM ranks them by utility: "this is a pure helper" vs "this has side effects, risky to move"
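
The graph side of this query can be sketched as counting how many distinct other files call each function. The edge shape (`callerFile`, `callee`, `calleeFile`), the `minModules` threshold, and the sample edges are assumptions; the sketch covers only the cross-module fan-in criterion, not internal coupling.

```javascript
// Sketch only: the edge shape ({ callerFile, callee, calleeFile }) is an assumption.
function extractionCandidates(callEdges, minModules = 2) {
  const externalCallers = new Map(); // callee -> Set of other files that call it
  for (const { callerFile, callee, calleeFile } of callEdges) {
    if (callerFile === calleeFile) continue; // calls within the same file do not count
    if (!externalCallers.has(callee)) externalCallers.set(callee, new Set());
    externalCallers.get(callee).add(callerFile);
  }
  return [...externalCallers.entries()]
    .filter(([, files]) => files.size >= minModules)
    .map(([name, files]) => ({ name, callerModules: files.size }))
    .sort((a, b) => b.callerModules - a.callerModules);
}

// Made-up call edges for illustration.
const candidates = extractionCandidates([
  { callerFile: 'cli.js', callee: 'findDbPath', calleeFile: 'db.js' },
  { callerFile: 'mcp.js', callee: 'findDbPath', calleeFile: 'db.js' },
  { callerFile: 'db.js', callee: 'openDb', calleeFile: 'db.js' },
  { callerFile: 'cli.js', callee: 'openDb', calleeFile: 'db.js' },
]);
console.log(candidates); // only findDbPath has callers in two other files
```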

#### "Suggest backward-compatible signature change"
- `signature_impact <name>` — graph provides all call sites, LLM reads each one
- Returns: suggested new signature, adapter pattern if needed, list of call sites that need updating

---

### Architecture & Design

#### "Why does module A depend on module B?"
- `dependency_path <A> <B>` — graph finds shortest path(s), LLM narrates each hop
- Returns: "A imports X from B because A needs to validate tokens, and B owns the token schema"
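
The "graph finds shortest path" step is a breadth-first search over import edges. A minimal sketch, assuming the graph is available as a `Map` from each file to the files it imports (the real query would read edges from the DB; `dependencyPath` and the sample files are hypothetical):

```javascript
// Sketch only: `imports` is an assumed adjacency Map (file -> files it imports).
function dependencyPath(imports, from, to) {
  const previous = new Map([[from, null]]); // doubles as the visited set
  const queue = [from];
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === to) {
      // Walk the `previous` links back to the start to rebuild the path.
      const path = [];
      for (let node = to; node !== null; node = previous.get(node)) path.unshift(node);
      return path;
    }
    for (const next of imports.get(current) ?? []) {
      if (!previous.has(next)) {
        previous.set(next, current);
        queue.push(next);
      }
    }
  }
  return null; // no dependency path exists
}

// Made-up import graph for illustration.
const importGraph = new Map([
  ['auth.js', ['session.js', 'utils.js']],
  ['session.js', ['tokens.js']],
  ['utils.js', []],
]);
console.log(dependencyPath(importGraph, 'auth.js', 'tokens.js'));
// auth.js -> session.js -> tokens.js
```

Each consecutive pair in the returned path is one hop the LLM would be asked to narrate.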

#### "What's the most fragile part of the codebase?"
- `fragility_report` — combines graph metrics (high fan-in + high fan-out + on many paths) with LLM reasoning
- `risk_score` column per node — computed at build time from graph centrality + LLM complexity assessment
- CLI command: `codegraph hotspots` — ranked list of riskiest nodes with explanations

#### "Suggest better module boundaries"
- `boundary_analysis` — graph clustering algorithm identifies tightly-coupled groups that span modules
- LLM suggests reorganization: "these 4 functions in 3 different files all deal with auth, consider consolidating"

---

### Onboarding & Navigation

#### "Where should I start reading?"
- `entry_points` query — graph finds roots (high fan-out, low fan-in) + LLM ranks by importance
- `onboarding_guide` command — generates a reading order based on dependency layers
- MCP tool: `get_started` — returns ordered list: "start here, then read this, then this"
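
Finding roots by fan-in and fan-out can be sketched directly from a list of call edges. The `[caller, callee]` edge shape, the default thresholds, and the sample edges are assumptions chosen for illustration; `findEntryPoints` is a hypothetical name.

```javascript
// Sketch only: the edge shape and the fan-in/fan-out thresholds are assumptions.
function findEntryPoints(edges, { maxFanIn = 0, minFanOut = 2 } = {}) {
  const fanIn = new Map();
  const fanOut = new Map();
  const bump = (map, key, by) => map.set(key, (map.get(key) ?? 0) + by);
  for (const [caller, callee] of edges) {
    bump(fanOut, caller, 1);
    bump(fanIn, callee, 1);
    bump(fanIn, caller, 0); // make sure every node appears in both maps
    bump(fanOut, callee, 0);
  }
  return [...fanOut.keys()]
    .filter((name) => fanIn.get(name) <= maxFanIn && fanOut.get(name) >= minFanOut)
    .sort((a, b) => fanOut.get(b) - fanOut.get(a));
}

// Made-up call edges: `main` and `watch` are never called, `build` is called twice.
const roots = findEntryPoints([
  ['main', 'parseArgs'],
  ['main', 'build'],
  ['main', 'report'],
  ['watch', 'build'],
  ['watch', 'report'],
  ['build', 'parse'],
  ['build', 'resolve'],
]);
console.log(roots); // main first (fan-out 3), then watch (fan-out 2)
```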

#### "What's the flow when a user clicks submit?"
- `trace_flow <entry_point>` — graph walks the call chain, LLM narrates each step
- Returns sequential narrative: "1. handler validates input → 2. calls createOrder → 3. writes to DB → 4. emits event"
- `flow_narratives` table — pre-computed for key entry points at build time

#### "What would I need to change to add feature X?"
- `change_plan <description>` — LLM reads the description, graph identifies relevant modules, LLM maps out touch points
- Returns: files to modify, functions to change, new functions needed, test coverage gaps

---

### Bug Investigation

#### "What upstream functions could cause this bug?"
- `trace_upstream <name>` — graph walks callers recursively, LLM reads each and flags suspects
- `side_effects` column per node — pre-computed: "mutates state", "writes DB", "calls external service"
- Returns ranked list: "most likely cause is X because it modifies the same state"

#### "What are the side effects of calling this function?"
- `effect_analysis <name>` — graph walks the full callee tree, aggregates `side_effects` from every descendant
- Returns: "calling X will: write to DB (via Y), send email (via Z), log to file (via W)"
- Pre-computed at build time, invalidated when any descendant changes
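
Aggregating `side_effects` over the callee tree is a graph traversal with a visited set, so recursive call cycles terminate. In this sketch `callees` and `ownEffects` are assumed in-memory views of the graph (function name to direct callees, and function name to its own effect tags); `effectAnalysis` and the sample data are hypothetical.

```javascript
// Sketch only: `callees` and `ownEffects` are assumed in-memory views of the graph.
function effectAnalysis(name, callees, ownEffects) {
  const effects = new Map(); // effect -> first function found to cause it
  const visited = new Set();
  const stack = [name];
  while (stack.length > 0) {
    const current = stack.pop();
    if (visited.has(current)) continue; // guards against call cycles
    visited.add(current);
    for (const effect of ownEffects.get(current) ?? []) {
      if (!effects.has(effect)) effects.set(effect, current);
    }
    stack.push(...(callees.get(current) ?? []));
  }
  return effects;
}

// Made-up graph: `notify` calls back into `checkout`, forming a cycle.
const sampleCallees = new Map([
  ['checkout', ['saveOrder', 'notify']],
  ['notify', ['checkout']],
]);
const sampleEffects = new Map([
  ['saveOrder', ['writes DB']],
  ['notify', ['sends email']],
]);
for (const [effect, via] of effectAnalysis('checkout', sampleCallees, sampleEffects)) {
  console.log(`${effect} (via ${via})`);
}
```

The `(via …)` attribution mirrors the "write to DB (via Y)" phrasing in the bullet above.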

---

## New Infrastructure Required

| What | Where | When computed |
|------|-------|---------------|
| `summaries` — one-line purpose per node | `nodes` table column | Build time, incremental |
| `side_effects` — mutation/IO tags | `nodes` table column | Build time, incremental |
| `complexity_notes` — risk assessment | `nodes` table column | Build time, incremental |
| `risk_score` — fragility metric | `nodes` table column | Build time, from graph + LLM |
| `flow_narratives` — traced call stories | New table | Build time for entry points |
| `module_summaries` — file-level rollups | New table | Build time, re-rolled on change |
| `naming_conventions` — detected patterns | Metadata table | Build time per module |
| LLM provider abstraction | `llm.js` | Config: local/API/none |
| Cascade invalidation | `builder.js` | When a node changes, mark dependents for re-enrichment |
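
Cascade invalidation amounts to a reverse traversal: start from the changed nodes and follow caller edges until no new dependents appear. The sketch assumes a `callers` map from each function to its direct callers; `markStale` and the sample data are hypothetical and do not describe how `builder.js` is actually written.

```javascript
// Sketch only: `callers` (callee -> direct callers) and `markStale` are assumptions
// about how dependents might be tracked.
function markStale(changed, callers) {
  const stale = new Set(changed);
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const dependent of callers.get(current) ?? []) {
      if (!stale.has(dependent)) {
        stale.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return stale;
}

// Made-up caller map for illustration.
const sampleCallers = new Map([
  ['hashToken', ['login', 'refresh']],
  ['login', ['handleRequest']],
  ['formatDate', ['render']],
]);
console.log([...markStale(['hashToken'], sampleCallers)]);
// hashToken, login, refresh, handleRequest need re-enrichment; formatDate and render do not
```

An unbounded cascade can touch much of the graph after a change to a central function, so a depth limit similar to `--depth` on `diff-impact` may be worth considering.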

src/cli.js

Lines changed: 2 additions & 0 deletions
@@ -216,13 +216,15 @@ program
   .option('--depth <n>', 'Max transitive caller depth', '3')
   .option('-T, --no-tests', 'Exclude test/spec files from results')
   .option('-j, --json', 'Output as JSON')
+  .option('-f, --format <format>', 'Output format: text, mermaid, json', 'text')
   .action((ref, opts) => {
     diffImpact(opts.db, {
       ref,
       staged: opts.staged,
       depth: parseInt(opts.depth, 10),
       noTests: !opts.tests,
       json: opts.json,
+      format: opts.format,
     });
   });

src/index.js

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ export {
   ALL_SYMBOL_KINDS,
   contextData,
   diffImpactData,
+  diffImpactMermaid,
   explainData,
   FALSE_POSITIVE_CALLER_THRESHOLD,
   FALSE_POSITIVE_NAMES,

src/mcp.js

Lines changed: 21 additions & 7 deletions
@@ -8,7 +8,7 @@
 import { createRequire } from 'node:module';
 import { findCycles } from './cycles.js';
 import { findDbPath } from './db.js';
-import { ALL_SYMBOL_KINDS } from './queries.js';
+import { ALL_SYMBOL_KINDS, diffImpactMermaid } from './queries.js';

 const REPO_PROP = {
   repo: {
@@ -201,6 +201,11 @@ const BASE_TOOLS = [
         ref: { type: 'string', description: 'Git ref to diff against (default: HEAD)' },
         depth: { type: 'number', description: 'Transitive caller depth', default: 3 },
         no_tests: { type: 'boolean', description: 'Exclude test files', default: false },
+        format: {
+          type: 'string',
+          enum: ['json', 'mermaid'],
+          description: 'Output format (default: json)',
+        },
       },
     },
   },
@@ -467,12 +472,21 @@ export async function startMCPServer(customDbPath, options = {}) {
           });
           break;
         case 'diff_impact':
-          result = diffImpactData(dbPath, {
-            staged: args.staged,
-            ref: args.ref,
-            depth: args.depth,
-            noTests: args.no_tests,
-          });
+          if (args.format === 'mermaid') {
+            result = diffImpactMermaid(dbPath, {
+              staged: args.staged,
+              ref: args.ref,
+              depth: args.depth,
+              noTests: args.no_tests,
+            });
+          } else {
+            result = diffImpactData(dbPath, {
+              staged: args.staged,
+              ref: args.ref,
+              depth: args.depth,
+              noTests: args.no_tests,
+            });
+          }
           break;
         case 'semantic_search': {
           const { searchData } = await import('./embedder.js');
