Commit a6b65f1

feat: prepare v1.5.0 trust and indexing foundation (#21)
* feat: prepare v1.5.0 reliability test build

  Add evidence-locked preflight, memory confidence/failure signals, git-derived memories, and manifest-based incremental indexing with coverage tests so branch testing can validate trust claims before release.
* chore: keep internal-docs submodule pointer on master baseline
* fix: mark invalid memory dates as stale evidence
* chore: format indexer and manifest for CI checks
1 parent 93ee42f commit a6b65f1

22 files changed, +2037 -119 lines changed

AGENTS.md

Lines changed: 12 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,15 @@
11
# Agent Instructions
22

3-
## Internal Documentation
3+
## Codebase Context
4+
5+
**At start of each task:** Call `get_memory` to load team conventions.
6+
7+
**CRITICAL:** When user says "remember this" or "record this":
8+
- STOP immediately and call `remember` tool FIRST
9+
- DO NOT proceed with other actions until memory is recorded
10+
- This is a blocking requirement, not optional
11+
12+
## Internal Documentation (Submodule)
413

514
This repository uses a private git submodule for internal notes.
615

@@ -20,6 +29,6 @@ git pull --recurse-submodules
2029
git submodule update --remote --merge
2130
```
2231

23-
### Privacy & Security
32+
### Privacy
2433

25-
The `internal-docs` repository is **Private**. It returns a 404 to unauthenticated users/APIs. Access requires a GitHub PAT or SSH keys with repository permissions.
34+
The `internal-docs` repository is private. It returns a 404 to unauthenticated users. Access requires a GitHub PAT or SSH keys with repository permissions.

MOTIVATION.md

Lines changed: 7 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
# Motivation: Why This Exists
22

3-
> **TL;DR**: AI coding assistants are smart but dangerous. Without guidance, they "vibe code" their way into technical debt. This MCP gives them **Context** (to know your patterns) and **Wisdom** (to keep your codebase healthy).
3+
> **TL;DR**: AI coding assistants increase throughput but often degrade stability. Without codebase context, they generate code that works but violates team conventions and architectural rules. This MCP provides structured pattern data and recorded rationale so agents produce code that fits.
44
55
---
66

@@ -28,7 +28,7 @@ AI drastically increases **Throughput** (more code/hour) but often kills **Stabi
2828

2929
## What This Does
3030

31-
We provide **Active Context**—not just raw data, but the *judgment* of a Senior Engineer.
31+
This MCP provides **active context** - not raw data, but structured intelligence derived from actual codebase state.
3232

3333
### 1. Pattern Discovery (The "Map")
3434
- **Frequency Detection**: "97% use `inject()`, 3% use `constructor`." (Consensus)
@@ -40,19 +40,18 @@ We provide **Active Context**—not just raw data, but the *judgment* of a Senio
4040
- **Health Context**: "⚠️ Careful, `UserService.ts` is a high-churn hotspot with circular dependencies. Add tests."
4141

4242
### Works with AGENTS.md
43-
> **AGENTS.md is the Law. MCP is the Map.**
44-
- **AGENTS.md** says: "We prefer functional functional programming."
45-
- **MCP** shows: "Here are the 5 most recent functional patterns we actually used."
43+
- **AGENTS.md** defines intent: "Use functional patterns."
44+
- **MCP** provides evidence: "Here are the 5 most recent functional patterns actually used."
4645

4746
---
4847

4948
## Known Limitations
5049

5150
| Limitation | Mitigation |
5251
|------------|--------|
53-
| **Pattern frequency ≠ pattern quality** | We added **Pattern Momentum** (Rise/Fall trends) to fix this. |
52+
| **Pattern frequency ≠ pattern quality** | **Pattern Momentum** (Rise/Fall trends) distinguishes adoption direction from raw count. |
5453
| **Stale index risk** | Manual re-indexing required for now. |
55-
| **Framework coverage** | Angular-specialized. React/Vue analyzers extensible. |
54+
| **Framework coverage** | Deep analysis for Angular. Generic analyzer covers 30+ languages. React/Vue specialized analyzers extensible. |
5655
| **File-level trend detection** | Trend is based on file modification date, not line-by-line content. A recently modified file may still contain legacy patterns on specific lines. Future: AST-based line-level detection. |
5756

5857
---
@@ -61,7 +60,7 @@ We provide **Active Context**—not just raw data, but the *judgment* of a Senio
6160

6261
1. **Context alone is dangerous**: Giving AI "all the context" just confuses it or teaches it bad habits (Search Contamination).
6362
2. **Decisions > Data**: AI needs *guidance* ("Use X"), not just *options* ("Here is X and Y").
64-
3. **Governance through Discovery**: We don't need to block PRs to be useful. If we show the AI that a pattern is "Declining" and "Dangerous," it self-corrects.
63+
3. **Governance through Discovery**: Blocking PRs is not required. If the AI sees that a pattern is "Declining" and "Dangerous," it self-corrects.
6564

6665
---
6766

@@ -76,7 +75,3 @@ We provide **Active Context**—not just raw data, but the *judgment* of a Senio
7675
- **Search Contamination**: Without MCP, models copied legacy patterns 40% of the time.
7776
- **Momentum Success**: With "Trending" signals, models adopted modern patterns even when they were the minority (3%).
7877

79-
---
80-
81-
*Last updated: December 2025*
82-

README.md

Lines changed: 192 additions & 45 deletions
Original file line numberDiff line numberDiff line change
@@ -1,45 +1,183 @@
11
# codebase-context
22

3-
**AI coding agents don't know your codebase. This MCP fixes that.**
3+
[![npm version](https://img.shields.io/npm/v/codebase-context)](https://www.npmjs.com/package/codebase-context) [![license](https://img.shields.io/npm/l/codebase-context)](./LICENSE) [![node](https://img.shields.io/node/v/codebase-context)](./package.json)
44

5-
Your team has internal libraries, naming conventions, and patterns that external AI models have never seen. This MCP server gives AI assistants real-time visibility into your codebase: which libraries your team actually uses, how often, and where to find canonical examples.
5+
A second brain for AI coding agents. MCP server that remembers team decisions, tracks pattern evolution, and guides every edit with evidence.
66

77
## Quick Start
88

9-
Add this to your MCP client config (Claude Desktop, VS Code, Cursor, etc.).
9+
### Claude Desktop
10+
11+
Add to `claude_desktop_config.json`:
12+
13+
```json
14+
{
15+
"mcpServers": {
16+
"codebase-context": {
17+
"command": "npx",
18+
"args": ["-y", "codebase-context", "/path/to/your/project"]
19+
}
20+
}
21+
}
22+
```
23+
24+
### VS Code (Copilot)
25+
26+
Add `.vscode/mcp.json` to your project root:
1027

1128
```json
12-
"mcpServers": {
13-
"codebase-context": {
14-
"command": "npx",
15-
"args": ["codebase-context", "/path/to/your/project"]
29+
{
30+
"servers": {
31+
"codebase-context": {
32+
"command": "npx",
33+
"args": ["-y", "codebase-context", "${workspaceFolder}"]
34+
}
1635
}
1736
}
1837
```
1938

20-
If your environment prompts on first run, use `npx --yes ...` (or `npx -y ...`) to auto-confirm.
39+
### Cursor
40+
41+
Add to `.cursor/mcp.json` in your project:
42+
43+
```json
44+
{
45+
"mcpServers": {
46+
"codebase-context": {
47+
"command": "npx",
48+
"args": ["-y", "codebase-context", "/path/to/your/project"]
49+
}
50+
}
51+
}
52+
```
53+
54+
### Windsurf
55+
56+
Open Settings > MCP and add:
57+
58+
```json
59+
{
60+
"mcpServers": {
61+
"codebase-context": {
62+
"command": "npx",
63+
"args": ["-y", "codebase-context", "/path/to/your/project"]
64+
}
65+
}
66+
}
67+
```
68+
69+
### Claude Code
70+
71+
No config file needed. Add to `.claude/settings.json` or run:
72+
73+
```bash
74+
claude mcp add codebase-context -- npx -y codebase-context /path/to/your/project
75+
```
76+
77+
## What Makes It a Second Brain
78+
79+
Other tools help AI find code. This one helps AI make the right decisions — by remembering what your team does, tracking how patterns evolve, and warning before mistakes repeat.
2180

22-
## What You Get
81+
### Remembers
2382

24-
- **Internal library discovery** → `@mycompany/ui-toolkit`: 847 uses vs `primeng`: 3 uses
25-
- **Pattern frequencies** → `inject()`: 97%, `constructor()`: 3%
26-
- **Pattern momentum** → `Signals`: Rising (last used 2 days ago) vs `RxJS`: Declining (180+ days)
27-
- **Golden file examples** → Real implementations showing all patterns together
28-
- **Testing conventions** → `Jest`: 74%, `Playwright`: 6%
29-
- **Framework patterns** → Angular signals, standalone components, etc.
30-
- **Circular dependency detection** → Find toxic import cycles between files
31-
- **Memory system** → Record "why" behind choices so AI doesn't repeat mistakes
83+
Decisions, rationale, and past failures persist across sessions. Not just what the team does — why.
84+
85+
- Internal library usage: `@mycompany/ui-toolkit` (847 uses) vs `primeng` (3 uses) — and _why_ the wrapper exists
86+
- "Tried direct PrimeNG toast, broke event system" — recorded as a failure memory, surfaced before the next agent repeats it
87+
- Conventions from git history auto-extracted: `refactor:`, `migrate:`, `fix:`, `revert:` commits become memories with zero manual effort
88+
89+
### Reasons
90+
91+
Quantified pattern analysis with trend direction. Not "use inject()" — "97% of the team uses inject(), and it's rising."
92+
93+
- `inject()`: 97% adoption vs `constructor()`: 3% — with trend direction (rising/declining)
94+
- `Signals`: rising (last used 2 days ago) vs `RxJS BehaviorSubject`: declining (180+ days)
95+
- Golden files: real implementations scoring highest on modern pattern density — canonical examples to follow
96+
- Pattern conflicts detected: when two approaches in the same category both exceed 20% adoption
97+
98+
### Protects
99+
100+
Before an edit happens, the agent gets a preflight briefing: what to use, what to avoid, what broke last time.
101+
102+
- Preflight card on `search_codebase` with `intent: "edit"` — risk level, preferred/avoid patterns, failure warnings, golden files, impact candidates
103+
- Failure memories bump risk level and surface as explicit warnings
104+
- Confidence decay: memories age (90-day or 180-day half-life). Stale guidance gets flagged, not blindly trusted
105+
- Epistemic stress detection: when evidence is contradictory, stale, or too thin, the preflight card says "insufficient evidence" instead of guessing
106+
107+
### Discovers
108+
109+
Hybrid search (BM25 keyword 30% + vector embeddings 70%) with structured filters across 30+ languages:
110+
111+
- **Framework**: Angular, React, Vue
112+
- **Language**: TypeScript, JavaScript, Python, Go, Rust, and 25+ more
113+
- **Component type**: component, service, directive, guard, interceptor, pipe
114+
- **Architectural layer**: presentation, business, data, state, core, shared
115+
- Circular dependency detection, style guide auto-detection, architectural layer classification
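The 30/70 weighting described above can be illustrated with a simple weighted sum. This is a minimal sketch, not the project's implementation: it assumes both scores are already normalized to the range 0 to 1, and `ScoredChunk`, `hybridScore`, and `rankHybrid` are hypothetical names.

```typescript
interface ScoredChunk {
  id: string;
  keywordScore: number; // BM25 score, assumed normalized to [0, 1]
  vectorScore: number; // embedding similarity, assumed normalized to [0, 1]
}

// Weights taken from the README text: keyword 30%, vector 70%.
const KEYWORD_WEIGHT = 0.3;
const VECTOR_WEIGHT = 0.7;

function hybridScore(chunk: ScoredChunk): number {
  return KEYWORD_WEIGHT * chunk.keywordScore + VECTOR_WEIGHT * chunk.vectorScore;
}

function rankHybrid(chunks: ScoredChunk[]): ScoredChunk[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...chunks].sort((a, b) => hybridScore(b) - hybridScore(a));
}
```

With these weights, a chunk that matches only semantically outranks one that matches only on keywords, which is the practical effect of favoring the vector side.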
116+
117+
## Measured Results
118+
119+
Tested against a real enterprise Angular codebase (~30k files):
120+
121+
| What was measured | Result |
122+
| ---------------------------------- | -------------------------------------------------------- |
123+
| Internal library detection | 336 uses of `@company/ui-toolkit` vs 3 direct PrimeNG |
124+
| DI pattern consensus | 98% `inject()` adoption detected, constructor DI flagged |
125+
| Test framework detection | 74% Jest, 26% Jasmine/Karma, per-module awareness |
126+
| Wrapper discovery | `ToastEventService`, `DialogComponent` surfaced over raw |
127+
| Golden file identification | Top 5 files scoring 4-6 modern patterns each |
128+
129+
Without this context, AI agents default to generic patterns: raw PrimeNG imports, constructor injection, Jasmine syntax. With the second brain active, generated code matches the existing codebase on first attempt.
32130

33131
## How It Works
34132

35-
When generating code, the agent checks your patterns first:
133+
The difference in practice:
36134

37-
| Without MCP | With MCP |
135+
| Without second brain | With second brain |
38136
| ---------------------------------------- | ------------------------------------ |
39137
| Uses `constructor(private svc: Service)` | Uses `inject()` (97% team adoption) |
40138
| Suggests `primeng/button` directly | Uses `@mycompany/ui-toolkit` wrapper |
41139
| Generic Jest setup | Your team's actual test utilities |
42140

141+
### Preflight Card
142+
143+
When using `search_codebase` with `intent: "edit"`, `"refactor"`, or `"migrate"`, the response includes a preflight card alongside search results:
144+
145+
```json
146+
{
147+
"preflight": {
148+
"intent": "refactor",
149+
"riskLevel": "medium",
150+
"confidence": "fresh",
151+
"evidenceLock": {
152+
"mode": "triangulated",
153+
"status": "pass",
154+
"readyToEdit": true,
155+
"score": 100,
156+
"sources": [
157+
{ "source": "code", "strength": "strong", "count": 5 },
158+
{ "source": "patterns", "strength": "strong", "count": 3 },
159+
{ "source": "memories", "strength": "strong", "count": 2 }
160+
]
161+
},
162+
"preferredPatterns": [
163+
{ "pattern": "inject() function", "category": "dependencyInjection", "adoption": "98%", "trend": "Rising" }
164+
],
165+
"avoidPatterns": [
166+
{ "pattern": "Constructor injection", "category": "dependencyInjection", "adoption": "2%", "trend": "Declining" }
167+
],
168+
"goldenFiles": [
169+
{ "file": "src/features/auth/auth.service.ts", "score": 6 }
170+
],
171+
"failureWarnings": [
172+
{ "memory": "Direct PrimeNG toast broke event system", "reason": "Must use ToastEventService" }
173+
]
174+
},
175+
"results": [...]
176+
}
177+
```
178+
179+
One call. The second brain composes patterns, memories, failures, and risk into a single response.
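A client could gate edits on the card shown above. The sketch below is hypothetical: the interfaces are inferred only from the example response (the real types may differ or carry more fields), and `decideEdit` is an illustrative helper, not part of the MCP server.

```typescript
// Shapes inferred from the example response; treat them as assumptions.
interface EvidenceLock {
  status: string;
  readyToEdit: boolean;
  score: number;
}

interface FailureWarning {
  memory: string;
  reason: string;
}

interface Preflight {
  riskLevel: string;
  evidenceLock: EvidenceLock;
  failureWarnings?: FailureWarning[];
}

interface EditDecision {
  proceed: boolean;
  notes: string[];
}

// Only proceed when the evidence lock allows it, and carry failure
// warnings forward so they can be shown alongside the edit.
function decideEdit(preflight: Preflight): EditDecision {
  const notes = (preflight.failureWarnings ?? []).map(
    (w) => `Avoid: ${w.memory} (${w.reason})`
  );
  if (!preflight.evidenceLock.readyToEdit) {
    notes.unshift(
      `Evidence lock status "${preflight.evidenceLock.status}": gather more context first`
    );
  }
  return { proceed: preflight.evidenceLock.readyToEdit, notes };
}
```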
180+
43181
### Tip: Auto-invoke in your rules
44182

45183
Add this to your `.cursorrules`, `CLAUDE.md`, or `AGENTS.md`:
@@ -59,18 +197,22 @@ Now the agent checks patterns automatically instead of waiting for you to ask.
59197

60198
## Tools
61199

62-
| Tool | Purpose |
63-
| ------------------------------ | --------------------------------------------- |
64-
| `search_codebase` | Semantic + keyword hybrid search |
65-
| `get_component_usage` | Find where a library/component is used |
66-
| `get_team_patterns` | Pattern frequencies + canonical examples |
67-
| `get_codebase_metadata` | Project structure overview |
68-
| `get_indexing_status` | Indexing progress + last stats |
69-
| `get_style_guide` | Query style guide rules |
70-
| `detect_circular_dependencies` | Find import cycles between files |
71-
| `remember` | Record memory (conventions/decisions/gotchas) |
72-
| `get_memory` | Query recorded memory by category/keyword |
73-
| `refresh_index` | Re-index the codebase |
200+
| Tool | Purpose |
201+
| ------------------------------ | -------------------------------------------------------------------- |
202+
| `search_codebase` | Hybrid search with filters. Pass `intent: "edit"` for preflight card |
203+
| `get_component_usage` | Find where a library/component is used |
204+
| `get_team_patterns` | Pattern frequencies, golden files, conflict detection |
205+
| `get_codebase_metadata` | Project structure overview |
206+
| `get_indexing_status` | Indexing progress + last stats |
207+
| `get_style_guide` | Query style guide rules |
208+
| `detect_circular_dependencies` | Find import cycles between files |
209+
| `remember` | Record memory (conventions/decisions/gotchas/failures) |
210+
| `get_memory` | Query memory with confidence decay scoring |
211+
| `refresh_index` | Re-index the codebase + extract git memories |
212+
213+
## Language Support
214+
215+
The Angular analyzer provides deep framework-specific analysis (signals, standalone components, control flow syntax, lifecycle hooks, DI patterns). A generic analyzer covers 30+ languages and file types as a fallback: JavaScript, TypeScript, Python, Java, Kotlin, C/C++, C#, Go, Rust, PHP, Ruby, Swift, Scala, Shell, and common config/markup formats.
74216

75217
## File Structure
76218

@@ -97,22 +239,27 @@ The MCP creates the following structure in your project:
97239
Patterns tell you _what_ the team does ("97% use inject"), but not _why_ ("standalone compatibility"). Use `remember` to capture rationale that prevents repeated mistakes:
98240

99241
```typescript
100-
// AI won't change this again after recording the decision
101242
remember({
102243
type: 'decision',
103244
category: 'dependencies',
104245
memory: 'Use node-linker: hoisted, not isolated',
105-
reason:
106-
"Some packages don't declare transitive deps. Isolated forces manual package.json additions."
246+
reason: "Some packages don't declare transitive deps."
107247
});
108248
```
109249

110-
Memories surface automatically in `search_codebase` results and `get_team_patterns` responses.
250+
**Memory types:** `convention` (style rules), `decision` (architecture choices), `gotcha` (things that break), `failure` (tried X, failed because Y).
251+
252+
**Confidence decay:** Memories age. Conventions never decay. Decisions have a 180-day half-life. Gotchas and failures have a 90-day half-life. Memories below 30% confidence are flagged as stale in `get_memory` responses.
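The half-lives above can be read as plain exponential decay. The README does not state the exact formula, so the sketch below is an assumption; `HALF_LIFE_DAYS`, `confidenceAt`, and `isStale` are hypothetical names. Treating an invalid age as zero confidence follows the commit message ("mark invalid memory dates as stale evidence").

```typescript
type MemoryType = 'convention' | 'decision' | 'gotcha' | 'failure';

// Half-lives in days as stated in the README; null means "never decays".
const HALF_LIFE_DAYS: Record<MemoryType, number | null> = {
  convention: null,
  decision: 180,
  gotcha: 90,
  failure: 90
};

// Memories below 30% confidence are flagged as stale.
const STALE_THRESHOLD = 0.3;

function confidenceAt(type: MemoryType, ageDays: number): number {
  // An unparseable date yields a non-finite age; treat it as no confidence.
  if (!Number.isFinite(ageDays)) return 0;
  const halfLife = HALF_LIFE_DAYS[type];
  if (halfLife === null) return 1;
  // Confidence halves once per half-life; negative ages are clamped to 0.
  return Math.pow(0.5, Math.max(0, ageDays) / halfLife);
}

function isStale(type: MemoryType, ageDays: number): boolean {
  return confidenceAt(type, ageDays) < STALE_THRESHOLD;
}
```

Under this assumed formula, a decision drops below the 30% threshold after roughly 313 days and a gotcha or failure after roughly 156 days.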
253+
254+
**Git auto-extraction:** During indexing, conventional commits (`refactor:`, `migrate:`, `fix:`, `revert:`) from the last 90 days are auto-recorded as memories. Zero manual effort.
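Recognizing those commit prefixes could look like the sketch below. It is illustrative only: `parseCommitSubject` and `CommitMemoryCandidate` are hypothetical names, and it assumes commit subjects have already been collected (for example with `git log --since="90 days ago" --format=%s`). How the project maps a prefix to a memory type is not stated in the README, so that step is left out.

```typescript
type MemoryPrefix = 'refactor' | 'migrate' | 'fix' | 'revert';

interface CommitMemoryCandidate {
  prefix: MemoryPrefix;
  scope?: string;
  summary: string;
}

// Matches "fix: msg", "fix(scope): msg", and "fix!: msg".
const SUBJECT_PATTERN = /^(refactor|migrate|fix|revert)(?:\(([^)]+)\))?!?:\s*(.+)$/;

function parseCommitSubject(subject: string): CommitMemoryCandidate | null {
  const match = SUBJECT_PATTERN.exec(subject.trim());
  if (!match) return null;
  return {
    prefix: match[1] as MemoryPrefix,
    scope: match[2], // undefined when the subject has no "(scope)"
    summary: (match[3] ?? '').trim()
  };
}
```

Applied to this commit's own message, `fix: mark invalid memory dates as stale evidence` would be picked up, while the `feat:` and `chore:` lines would be ignored.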
255+
256+
**Pattern conflicts:** `get_team_patterns` detects when two patterns in the same category are both above 20% adoption with different trends, and surfaces them as conflicts with both sides.
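The conflict rule described above can be sketched as a pairwise check. This is an assumption-laden illustration rather than the project's code: `PatternStat`, `PatternConflict`, and `findConflicts` are hypothetical names, and the `'Stable'` trend value is assumed (only "Rising" and "Declining" appear in the README).

```typescript
type Trend = 'Rising' | 'Stable' | 'Declining';

interface PatternStat {
  name: string;
  category: string;
  adoption: number; // percentage, 0 to 100
  trend: Trend;
}

interface PatternConflict {
  category: string;
  sides: [PatternStat, PatternStat];
}

// Both sides must exceed 20% adoption for the pair to count as a conflict.
const CONFLICT_THRESHOLD = 20;

function findConflicts(patterns: PatternStat[]): PatternConflict[] {
  const conflicts: PatternConflict[] = [];
  patterns.forEach((a, i) => {
    // Compare each pattern only with the ones after it, so every pair is checked once.
    for (const b of patterns.slice(i + 1)) {
      const sameCategory = a.category === b.category;
      const bothSignificant =
        a.adoption > CONFLICT_THRESHOLD && b.adoption > CONFLICT_THRESHOLD;
      if (sameCategory && bothSignificant && a.trend !== b.trend) {
        conflicts.push({ category: a.category, sides: [a, b] });
      }
    }
  });
  return conflicts;
}
```

A 97% versus 3% split would not be reported, since the minority side is below the threshold; the rule targets categories where the team is genuinely divided.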
257+
258+
Memories surface automatically in `search_codebase` results, `get_team_patterns` responses, and preflight cards.
111259

112-
**Early baseline — known quirks:**
260+
**Known quirks:**
113261

114262
- Agents may bundle multiple things into one entry
115-
- Duplicates can happen if you record the same thing twice
116263
- Edit `.codebase-context/memory.json` directly to clean up
117264
- Be explicit: "Remember this: use X not Y"
118265

@@ -125,19 +272,19 @@ Memories surface automatically in `search_codebase` results and `get_team_patter
125272
| `CODEBASE_ROOT` | - | Project root to index (CLI arg takes precedence) |
126273
| `CODEBASE_CONTEXT_DEBUG` | - | Set to `1` to enable verbose logging (startup messages, analyzer registration) |
127274

128-
## Performance Note
275+
## Performance
129276

130-
This tool runs **locally** on your machine using your hardware.
277+
This tool runs locally on your machine.
131278

132-
- **Initial Indexing**: The first run works hard. It may take several minutes (e.g., ~2-5 mins for 30k files) to compute embeddings for your entire codebase.
133-
- **Caching**: Subsequent queries are instant (milliseconds).
134-
- **Updates**: Currently, `refresh_index` re-scans the codebase. True incremental indexing (processing only changed files) is on the roadmap.
279+
- **Initial indexing**: First run may take several minutes (e.g., 2-5 min for 30k files) to compute embeddings.
280+
- **Subsequent queries**: Instant (milliseconds) from cache.
281+
- **Updates**: `refresh_index` re-scans the codebase. True incremental indexing (processing only changed files) is on the roadmap.
135282

136283
## Links
137284

138-
- 📄 [Motivation](./MOTIVATION.md) — Why this exists, research, learnings
139-
- 📋 [Changelog](./CHANGELOG.md) — Version history
140-
- 🤝 [Contributing](./CONTRIBUTING.md) — How to add analyzers
285+
- [Motivation](./MOTIVATION.md) — Research and design rationale
286+
- [Changelog](./CHANGELOG.md) — Version history
287+
- [Contributing](./CONTRIBUTING.md) — How to add analyzers
141288

142289
## License
143290
