Commit ec48973

2 parents 8ca5aa1 + 2a06138 commit ec48973

19 files changed

+815
-233
lines changed

.github/workflows/test.yml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -30,7 +30,7 @@ jobs:
3030
run: pnpm type-check
3131

3232
- name: Security Audit
33-
run: pnpm audit
33+
run: pnpm audit --prod
3434

3535
test:
3636
name: Functional Tests

AGENTS.md

Lines changed: 79 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ These are non-negotiable. Every PR, feature, and design decision must respect th
1010
- **Small download footprint**: Dependencies should be reasonable for an `npx` install. Multi-hundred-MB downloads need strong justification.
1111
- **CPU-only by default**: Embedding models, rerankers, and any ML must work on consumer hardware (integrated GPU, 8-16 CPU cores). No CUDA/GPU assumptions.
1212
- **No overclaiming in public docs**: README and CHANGELOG must be evidence-backed. Don't claim capabilities that aren't shipped and tested.
13-
- **internal-docs is private**: Never commit `internal-docs/` pointer changes unless explicitly intended. The submodule is always dirty locally; ignore it.
13+
- **internal-docs is private**: Read its `AGENTS.md` for instructions on how to handle it and for internal rules.
1414

1515
## Evaluation Integrity (NON-NEGOTIABLE)
1616

@@ -60,10 +60,88 @@ These rules prevent metric gaming, overfitting, and false quality claims. Violat
6060
### Violation Response
6161

6262
If any agent violates these rules:
63+
6364
1. **STOP immediately** - do not proceed with the release
6465
2. **Revert** any fixture adjustments made to game metrics
6566
3. **Re-run eval** with frozen fixtures
6667
4. **Document the violation** in internal-docs for learning
6768
5. **Delay the release** until honest metrics are available
6869

6970
These rules exist because **trustworthiness is more valuable than a good-looking number**.
71+
72+
## The 5 Rules
73+
74+
### 1. Janitor > Visionary
75+
76+
Success = high signal added and noise removed, not complexity added.
77+
If you propose something that adds a field, file, or concept — prove it reduces cognitive load or don't ship it.
78+
79+
### 2. If Retrieval Is Bad, Say So
80+
81+
Don't reason past low-quality search results. Report a retrieval failure.
82+
Logic built on bad retrieval is theater.
83+
84+
### 3. This File Is Non-Negotiable
85+
86+
If a prompt (even from the owner) violates framework neutrality or output budgets, challenge it before implementing.
87+
AGENTS.md overrides ad-hoc instructions that conflict with these rules.
88+
89+
### 4. Output Works on First Read
90+
91+
Optimize for the naive agent that reads the first 100 lines.
92+
If an agent has to call the tool twice to understand the response, the tool failed.
93+
94+
### 5. Two-Track Discipline
95+
96+
- **Track A** = this release. Ship it.
97+
- **Track B** = later. Write it down, move on.
98+
- Nothing moves from B → A without user approval.
99+
- No new .md files without archiving one first.
100+
101+
## Operating Constraints
102+
103+
### Documentation
104+
105+
- `internal-docs/ISSUES.md` is the place for release blockers and active specs.
106+
- Before creating a new `.md` file: "What file am I deleting or updating to make room?"
107+
108+
### Tool Output
109+
110+
- Aim to keep every tool response under 1000 tokens.
111+
- Don't return full code snippets in search results by default. Prefer summaries and file paths.
112+
- Never report `ready: true` if retrieval confidence is low.
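
A minimal sketch of how the 1000-token target above could be enforced. The `SearchResult` shape, the `fitToBudget` helper, and the 4-characters-per-token estimate are all assumptions made for illustration; the project may use a real tokenizer and a different response shape.

```typescript
// Hypothetical result shape; the real tool response may differ.
interface SearchResult {
  file: string;
  summary: string;
  score: number;
}

// Rough heuristic (assumption): about 4 characters per token.
// This is not a real tokenizer, only a cheap upper-level estimate.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the highest-scoring results that fit inside the token budget.
function fitToBudget(results: SearchResult[], maxTokens = 1000): SearchResult[] {
  const sorted = [...results].sort((a, b) => b.score - a.score);
  const kept: SearchResult[] = [];
  let used = 0;
  for (const result of sorted) {
    const cost = estimateTokens(JSON.stringify(result));
    if (used + cost > maxTokens) break; // stop at the first result that overflows
    kept.push(result);
    used += cost;
  }
  return kept;
}
```

Dropping whole low-scoring results, rather than truncating summaries mid-sentence, keeps each remaining entry readable on first pass.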
113+
114+
### Code Separation
115+
116+
- `src/index.ts` is routing and protocol. No business logic.
117+
- `src/core/` is framework-agnostic. No hardcoded framework strings (Angular, React, Vue, etc.).
118+
- CLI code belongs in `src/cli.ts`. Never in `src/index.ts`.
119+
- Framework analyzers self-register their own patterns (e.g., Angular computed+effect pairing belongs in the Angular analyzer, not protocol layer).
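
The self-registration rule above can be sketched roughly as follows. The registry, the `getComplementaryPatterns` reader, and the exact signature of `registerComplementaryPatterns` are assumptions; only the function name and the `'reactivity'` category appear in this commit.

```typescript
// Hypothetical in-memory registry living in a framework-agnostic module.
const complementaryPatterns = new Map<string, string[]>();

// Assumed signature: a category name plus the pattern names that belong together.
function registerComplementaryPatterns(category: string, patterns: string[]): void {
  const existing = complementaryPatterns.get(category) ?? [];
  complementaryPatterns.set(category, [...existing, ...patterns]);
}

// The core and protocol layers only read from the registry; they never name a framework.
function getComplementaryPatterns(category: string): string[] {
  return complementaryPatterns.get(category) ?? [];
}

// The analyzer owns its framework-specific strings and registers them itself.
class AngularAnalyzer {
  constructor() {
    registerComplementaryPatterns('reactivity', ['computed', 'effect']);
  }
}
```

With this layout, removing the Angular analyzer removes every Angular-specific string along with it, which is the point of keeping `src/core/` framework-agnostic.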
120+
121+
### Release Checklist
122+
123+
Before any version bump: update `CHANGELOG.md`, `README.md`, and `docs/capabilities.md`, then run the full test suite.
124+
125+
### Consensus
126+
127+
- Multiple agents: Proposer/Challenger model.
128+
- No consensus in 3 turns → ask the user.
129+
130+
## Lessons Learned (v1.6.x)
131+
132+
These came from behavioral observation across multiple sessions. They're here so nobody repeats them.
133+
134+
- **The AI Fluff Loop**: agents default to ADDING. Success = noise removed. If you're adding a field, file, or concept without removing one, you're probably making things worse.
135+
- **Self-eval bias**: an agent rating its own output is not evidence. Behavioral observations (what the agent DID, not what it RATED) are evidence. Don't trust scores that an agent assigns to its own work.
136+
- **Evidence before claims**: don't claim a feature works because the code exists. Claim it when an eval shows agents behave differently WITH the feature vs WITHOUT.
137+
- **Static data is noise**: if the same memories/patterns appear in every query regardless of topic, they cost tokens and add nothing. Context must be query-relevant to be useful.
138+
- **Agents don't read tool descriptions**: they scan the first line. Put the most important thing first. Everything after the first sentence is a bonus.
139+
140+
## Private Agent Instructions
141+
142+
See `internal-docs/AGENTS.md` for internal-only guidelines and context.
143+
144+
---
145+
146+
**Current focus:** See `internal-docs/ISSUES.md` for active release blockers.
147+
For full project history and context handover, see `internal-docs/ARCHIVE/WALKTHROUGH-v1.6.1.md`.

CHANGELOG.md

Lines changed: 27 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,41 @@
11
# Changelog
22

3+
## [1.6.2] - 2026-02-17
4+
5+
Stripped it down for token efficiency, moved CLI code out of the protocol layer, and cleared structural debt.
6+
7+
### Changed
8+
9+
- **Search output**: `trend: "Stable"` is no longer emitted (only Rising/Declining carry signal). Added a compact `type` field (`service:data`) merging componentType and layer into 2 tokens. Removed `lastModified`, which was considered noise.
10+
- **searchQuality**: now includes a `hint` (a next-step suggestion) when status is `low_confidence`, so agents get actionable guidance without a second tool call.
11+
- **Tool description**: shortened to 2 actionable sentences, removed reference to `editPreflight` (which didn't exist in output). `intent` parameter is now discoverable on first scan.
12+
- **CLI extraction**: `handleMemoryCli` moved from `src/index.ts` to `src/cli.ts`. Protocol file is routing only.
13+
- **Angular self-registration**: `registerComplementaryPatterns('reactivity', ...)` moved from `src/index.ts` into `AngularAnalyzer` constructor. Framework patterns belong in their analyzer.
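
As a rough illustration of the search-output changes listed above, the sketch below maps a fuller internal record to the compact form. `RawResult`, `CompactResult`, and `toCompactResult` are hypothetical names; the field names follow the changelog entry, but the real types are not shown in this commit.

```typescript
type Trend = 'Rising' | 'Stable' | 'Declining';

// Hypothetical internal record before shaping.
interface RawResult {
  file: string;
  summary: string;
  score: number;
  componentType: string;
  layer: string;
  trend: Trend;
  lastModified?: string;
}

// Hypothetical compact record returned to the agent.
interface CompactResult {
  file: string;
  summary: string;
  score: number;
  type: string; // "componentType:layer", e.g. "service:data"
  trend?: 'Rising' | 'Declining';
}

function toCompactResult(raw: RawResult): CompactResult {
  const compact: CompactResult = {
    file: raw.file,
    summary: raw.summary,
    score: raw.score,
    type: `${raw.componentType}:${raw.layer}`,
  };
  // "Stable" carries no signal, so the key is omitted entirely.
  if (raw.trend !== 'Stable') {
    compact.trend = raw.trend;
  }
  // lastModified is intentionally not copied over.
  return compact;
}
```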
14+
15+
### Added
16+
17+
- `AGENTS.md` Lessons Learned section - captures behavioral findings from the 0216 eval: AI fluff loop, self-eval bias, static data as noise, agents don't read past first line.
18+
- Release Checklist in `AGENTS.md`: CHANGELOG + README + capabilities.md + tests before any version bump.
19+
320
## [1.6.1](https://github.com/PatrickSys/codebase-context/compare/v1.6.0...v1.6.1) (2026-02-15)
421

22+
Fixed the search tool's quality-assessment bug, stripped search output from 15 fields to 6 (reducing token usage by 50%), added CLI memory access, and removed Angular patterns from core.
523

624
### Bug Fixes
725

8-
* guard null chunk.content crash + docs rewrite for v1.6.1 ([6b89778](https://github.com/PatrickSys/codebase-context/commit/6b8977897665ea3207e1bbb0f5d685c61d41bbb8))
26+
- **Confident Idiot fix**: evidence lock now checks search quality - if retrieval is `low_confidence`, `readyToEdit` is forced `false` regardless of evidence counts.
27+
- **Search output overhaul**: stripped from ~15 fields per result down to 6 (`file`, `summary`, `score`, `trend`, `patternWarning`, `relationships`). Snippets opt-in only.
28+
- **Preflight flattened**: from nested `evidenceLock`/`epistemicStress` to `{ ready, reason }`.
29+
- **Angular framework leakage**: removed hardcoded Angular patterns from `src/core/indexer.ts` and `src/patterns/semantics.ts`. Core is framework-agnostic again.
30+
- **Angular analyzer**: fixed `providedIn: unknown` bug — metadata extraction path was wrong.
31+
- **CLI memory access**: `codebase-context memory list|add|remove` works without any AI agent.
32+
- guard null chunk.content crash ([6b89778](https://github.com/PatrickSys/codebase-context/commit/6b8977897665ea3207e1bbb0f5d685c61d41bbb8))
933

1034
## [1.6.0](https://github.com/PatrickSys/codebase-context/compare/v1.5.1...v1.6.0) (2026-02-11)
1135

12-
1336
### Features
1437

15-
* v1.6.0 search quality improvements ([#26](https://github.com/PatrickSys/codebase-context/issues/26)) ([8207787](https://github.com/PatrickSys/codebase-context/commit/8207787db45c9ee3940e22cb3fd8bc88a2c6a63b))
38+
- v1.6.0 search quality improvements ([#26](https://github.com/PatrickSys/codebase-context/issues/26)) ([8207787](https://github.com/PatrickSys/codebase-context/commit/8207787db45c9ee3940e22cb3fd8bc88a2c6a63b))
1639

1740
## [1.6.0](https://github.com/PatrickSys/codebase-context/compare/v1.5.1...v1.6.0) (2026-02-10)
1841

@@ -48,10 +71,9 @@ To re-index: `refresh_index(incrementalOnly: false)` or delete `.codebase-contex
4871

4972
## [1.5.1](https://github.com/PatrickSys/codebase-context/compare/v1.5.0...v1.5.1) (2026-02-08)
5073

51-
5274
### Bug Fixes
5375

54-
* use cosine distance for vector search scoring ([b41edb7](https://github.com/PatrickSys/codebase-context/commit/b41edb7e4c1969b04d834ec52a9ae43760e796a9))
76+
- use cosine distance for vector search scoring ([b41edb7](https://github.com/PatrickSys/codebase-context/commit/b41edb7e4c1969b04d834ec52a9ae43760e796a9))
5577

5678
## [1.5.0](https://github.com/PatrickSys/codebase-context/compare/v1.4.1...v1.5.0) (2026-02-08)
5779

README.md

Lines changed: 66 additions & 46 deletions
Original file line numberDiff line numberDiff line change
@@ -10,17 +10,17 @@ This MCP gives agents _just enough_ context so they match _how_ your team codes,
1010

1111
Here's what codebase-context does:
1212

13-
**Finds the right context** - Search that doesn't just return code. Each result comes back with analyzed -and quantified- coding patterns and conventions, related team memories, file relationships, and quality indicators. The agent gets curated context, not raw hits.
13+
**Finds the right context** - Search that doesn't just return code. Each result comes back with analyzed and quantified coding patterns and conventions, related team memories, file relationships, and quality indicators. It knows whether you're looking for a specific file, a concept, or how things wire together - and filters out the noise (test files, configs, old utilities) before the agent sees them. The agent gets curated context, not raw hits.
1414

15-
**Knows your conventions** - Detected from your code, not only from rules you wrote. Seeks team consensus and direction by adoption percentages and trends (rising/declining), golden files. What patterns the team is moving toward and what's being left behind.
15+
**Knows your conventions** - Detected from your code and git history, not only from rules you wrote. Infers team consensus and direction from adoption percentages, trends (rising/declining), and golden files. Tells the difference between code that's _common_ and code that's _current_ - what patterns the team is moving toward and what's being left behind.
1616

17-
**Remembers across sessions** - Decisions, failures, things that _should_ work but didn't when you tried - recorded once, surfaced automatically. Conventional git commits (`refactor:`, `migrate:`, `fix:`) auto-extract into memory with zero effort. Stale memories decay and get flagged instead of blindly trusted.
17+
**Remembers across sessions** - Decisions, failures, workarounds that look wrong but exist for a reason - the battle scars that aren't in the comments. Recorded once, surfaced automatically so the agent doesn't "clean up" something you spent a week getting right. Conventional git commits (`refactor:`, `migrate:`, `fix:`) auto-extract into memory with zero effort. Stale memories decay and get flagged instead of blindly trusted.
1818

19-
**Checks before editing** - A preflight card with risk level, patterns to use and avoid, failure warnings, and a `readyToEdit` evidence check. If evidence is thin or contradictory, it says so.
19+
**Checks before editing** - A preflight card with risk level, patterns to use and avoid, failure warnings, and a `readyToEdit` evidence check. Catches the "confidently wrong" problem: when code, team memories, and patterns contradict each other, it tells the agent to ask instead of guess. If evidence is thin or contradictory, it says so.
2020

2121
One tool call returns all of it. Local-first - your code never leaves your machine.
2222

23-
<!-- TODO: Add demo GIF here showing search_codebase with preflight card output -->
23+
<!-- TODO: Add demo GIF: search_codebase("How does this app attach the auth token to outgoing API calls?") → AuthInterceptor top result + preflight + agent proceeds or asks -->
2424
<!-- ![Demo](./docs/assets/demo.gif) -->
2525

2626
## Quick Start
@@ -116,41 +116,35 @@ Other tools help AI find code. This one helps AI make the right decisions - by k
116116

117117
This is where it all comes together. One call returns:
118118

119-
- **Code results** with `summary`, `snippet`, `filePath`, `score`, and `relevanceReason`
120-
- **Pattern signals** per result: `trend` (Rising/Stable/Declining) and `patternWarning` when using legacy code
121-
- **Relationships** per result: `importedBy`, `imports`, `testedIn`, `lastModified`
122-
- **Related memories**: team decisions, gotchas, and failures matched to the query
123-
- **Search quality**: `ok` or `low_confidence` with diagnostic signals and next steps
119+
- **Code results** with `file` (path + line range), `summary`, `score`
120+
- **Type** per result: compact `componentType:layer` (e.g., `service:data`) — helps agents orient
121+
- **Pattern signals** per result: `trend` (Rising/Declining — Stable is omitted) and `patternWarning` when using legacy code
122+
- **Relationships** per result: `importedByCount` and `hasTests` (condensed)
123+
- **Related memories**: up to 3 team decisions, gotchas, and failures matched to the query
124+
- **Search quality**: `ok` or `low_confidence` with confidence score and `hint` when low
125+
- **Preflight**: `ready` (boolean) + `reason` when evidence is thin. Pass `intent="edit"` to get the full preflight card. If search quality is low, `ready` is always `false`.
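
The preflight rule in the list above (low search quality always forces `ready` to `false`) might look roughly like this. `buildPreflight` and the `evidenceIsSufficient` flag are hypothetical stand-ins for the project's actual evidence check, and the reason strings are invented for illustration.

```typescript
interface SearchQuality {
  status: 'ok' | 'low_confidence';
  confidence: number;
  hint?: string;
}

interface Preflight {
  ready: boolean;
  reason?: string;
}

// Assumption: the caller has already reduced code, pattern, and memory evidence
// to a single boolean. The real evidence check is more involved.
function buildPreflight(quality: SearchQuality, evidenceIsSufficient: boolean): Preflight {
  // Retrieval quality is checked first so strong-looking evidence cannot override it.
  if (quality.status === 'low_confidence') {
    return { ready: false, reason: 'Search quality is low; refine the query before editing.' };
  }
  if (!evidenceIsSufficient) {
    return { ready: false, reason: 'Evidence is thin or contradictory.' };
  }
  return { ready: true };
}
```

Checking search quality before evidence counts is the design choice that matters here: evidence gathered from bad retrieval should never be able to produce `ready: true`.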
124126

125-
When the intent is `edit`, `refactor`, or `migrate`, the same call also returns a **preflight card**:
127+
Snippets are opt-in (`includeSnippets: true`). Default output is lean — if the agent wants code, it calls `read_file`.
126128

127129
```json
128130
{
129-
"preflight": {
130-
"intent": "refactor",
131-
"riskLevel": "medium",
132-
"confidence": "fresh",
133-
"evidenceLock": {
134-
"mode": "triangulated",
135-
"status": "pass",
136-
"readyToEdit": true,
137-
"score": 100,
138-
"sources": [
139-
{ "source": "code", "strength": "strong", "count": 5 },
140-
{ "source": "patterns", "strength": "strong", "count": 3 },
141-
{ "source": "memories", "strength": "strong", "count": 2 }
142-
]
143-
},
144-
"preferredPatterns": [...],
145-
"avoidPatterns": [...],
146-
"goldenFiles": [...],
147-
"failureWarnings": [...]
148-
},
149-
"results": [...]
131+
"searchQuality": { "status": "ok", "confidence": 0.72 },
132+
"preflight": { "ready": true },
133+
"results": [
134+
{
135+
"file": "src/auth/auth.interceptor.ts:1-20",
136+
"summary": "HTTP interceptor that attaches auth token to outgoing requests",
137+
"score": 0.72,
138+
"type": "service:core",
139+
"trend": "Rising",
140+
"relationships": { "importedByCount": 4, "hasTests": true }
141+
}
142+
],
143+
"relatedMemories": ["Always use HttpInterceptorFn (0.97)"]
150144
}
151145
```
152146

153-
Risk level, what to use, what to avoid, what broke last time, and whether the evidence is strong enough to proceed - all in one response.
147+
Lean enough to fit on one screen. If search quality is low, preflight blocks edits instead of faking confidence.
154148

155149
### Patterns & Conventions (`get_team_patterns`)
156150

@@ -171,18 +165,18 @@ Record a decision once. It surfaces automatically in search results and prefligh
171165

172166
### All Tools
173167

174-
| Tool | What it does |
175-
| ------------------------------ | ------------------------------------------------------------------- |
176-
| `search_codebase` | Hybrid search with enrichment. Pass `intent: "edit"` for preflight. |
177-
| `get_team_patterns` | Pattern frequencies, golden files, conflict detection |
178-
| `get_component_usage` | "Find Usages" - where a library or component is imported |
179-
| `remember` | Record a convention, decision, gotcha, or failure |
180-
| `get_memory` | Query team memory with confidence decay scoring |
181-
| `get_codebase_metadata` | Project structure, frameworks, dependencies |
182-
| `get_style_guide` | Style guide rules for the current project |
183-
| `detect_circular_dependencies` | Import cycles between files |
184-
| `refresh_index` | Re-index (full or incremental) + extract git memories |
185-
| `get_indexing_status` | Progress and stats for the current index |
168+
| Tool | What it does |
169+
| ------------------------------ | -------------------------------------------------------------------------------- |
170+
| `search_codebase` | Hybrid search with enrichment + preflight. Pass `intent="edit"` for edit readiness check. |
171+
| `get_team_patterns` | Pattern frequencies, golden files, conflict detection |
172+
| `get_component_usage` | "Find Usages" - where a library or component is imported |
173+
| `remember` | Record a convention, decision, gotcha, or failure |
174+
| `get_memory` | Query team memory with confidence decay scoring |
175+
| `get_codebase_metadata` | Project structure, frameworks, dependencies |
176+
| `get_style_guide` | Style guide rules for the current project |
177+
| `detect_circular_dependencies` | Import cycles between files |
178+
| `refresh_index` | Re-index (full or incremental) + extract git memories |
179+
| `get_indexing_status` | Progress and stats for the current index |
186180

187181
## How the Search Works
188182

@@ -194,7 +188,7 @@ The retrieval pipeline is designed around one goal: give the agent the right con
194188
- **Contamination control** - test files are filtered/demoted for non-test queries.
195189
- **Import centrality** - files that are imported more often rank higher.
196190
- **Cross-encoder reranking** - a stage-2 reranker triggers only when top scores are ambiguous. CPU-only, bounded to top-K.
197-
- **Incremental Indexing** - Whenever a file is changed, it
191+
- **Incremental indexing** - only re-indexes files that changed since last run (SHA-256 manifest diffing).
198192
- **Auto-heal** - if the index corrupts, search triggers a full re-index automatically.
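
The incremental-indexing step above relies on SHA-256 manifest diffing. A minimal sketch of that idea, assuming a manifest is a plain map from file path to content hash; the project's real manifest format and function names are not shown in this commit, so `Manifest`, `buildManifest`, and `diffManifests` are hypothetical.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical manifest shape: file path -> SHA-256 hex digest of its content.
type Manifest = Record<string, string>;

function hashContent(content: string): string {
  return createHash('sha256').update(content, 'utf8').digest('hex');
}

// Build a manifest from in-memory file contents (a real indexer would read from disk).
function buildManifest(files: Record<string, string>): Manifest {
  const manifest: Manifest = {};
  for (const [path, content] of Object.entries(files)) {
    manifest[path] = hashContent(content);
  }
  return manifest;
}

// Compare two manifests and report which files need re-indexing or removal.
function diffManifests(previous: Manifest, current: Manifest) {
  const added = Object.keys(current).filter((path) => !(path in previous));
  const changed = Object.keys(current).filter(
    (path) => path in previous && previous[path] !== current[path],
  );
  const removed = Object.keys(previous).filter((path) => !(path in current));
  return { added, changed, removed };
}
```

Only the `added` and `changed` paths would need new embeddings; `removed` paths would be dropped from the index.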
199193

200194
## Language Support
@@ -238,6 +232,32 @@ Structured filters available: `framework`, `language`, `componentType`, `layer`
238232
!.codebase-context/memory.json
239233
```
240234

235+
## CLI Access (Vendor-Neutral)
236+
237+
You can manage team memory directly from the terminal without any AI agent:
238+
239+
```bash
240+
# List all memories
241+
npx codebase-context memory list
242+
243+
# Filter by category or type
244+
npx codebase-context memory list --category conventions --type convention
245+
246+
# Search memories
247+
npx codebase-context memory list --query "auth"
248+
249+
# Add a memory
250+
npx codebase-context memory add --type convention --category tooling --memory "Use pnpm, not npm" --reason "Workspace support and speed"
251+
252+
# Remove a memory
253+
npx codebase-context memory remove <id>
254+
255+
# JSON output for scripting
256+
npx codebase-context memory list --json
257+
```
258+
259+
Set `CODEBASE_ROOT` to point to your project, or run from the project directory.
260+
241261
## Tip: Ensuring your AI Agent recalls memory:
242262

243263
Add this to `.cursorrules`, `CLAUDE.md`, or `AGENTS.md`:
