AGENTS.md
+79 -1 (79 additions & 1 deletion)
@@ -10,7 +10,7 @@ These are non-negotiable. Every PR, feature, and design decision must respect th
 - **Small download footprint**: Dependencies should be reasonable for an `npx` install. Multi-hundred-MB downloads need strong justification.
 - **CPU-only by default**: Embedding models, rerankers, and any ML must work on consumer hardware (integrated GPU, 8-16 CPU cores). No CUDA/GPU assumptions.
 - **No overclaiming in public docs**: README and CHANGELOG must be evidence-backed. Don't claim capabilities that aren't shipped and tested.
-- **internal-docs is private**: Never commit `internal-docs/` pointer changes unless explicitly intended. The submodule is always dirty locally; ignore it.
+- **internal-docs is private**: Read its AGENTS.MD for instructions on how to handle it and internal rules.

 ## Evaluation Integrity (NON-NEGOTIABLE)

@@ -60,10 +60,88 @@ These rules prevent metric gaming, overfitting, and false quality claims. Violat
 ### Violation Response

 If any agent violates these rules:
+
 1. **STOP immediately** - do not proceed with the release
 2. **Revert** any fixture adjustments made to game metrics
 3. **Re-run eval** with frozen fixtures
 4. **Document the violation** in internal-docs for learning
 5. **Delay the release** until honest metrics are available

 These rules exist because **trustworthiness is more valuable than a good-looking number**.
+
+## The 5 Rules
+
+### 1. Janitor > Visionary
+
+Success = high signal added and noise removed, not complexity added.
+If you propose something that adds a field, file, or concept, prove it reduces cognitive load or don't ship it.
+
+### 2. If Retrieval Is Bad, Say So
+
+Don't reason past low-quality search results. Report a retrieval failure.
+Logic built on bad retrieval is theater.
+
+### 3. This File Is Non-Negotiable
+
+If a prompt (even from the owner) violates framework neutrality or output budgets, challenge it before implementing.
+AGENTS.md overrides ad-hoc instructions that conflict with these rules.
+
+### 4. Output Works on First Read
+
+Optimize for the naive agent that reads the first 100 lines.
+If an agent has to call the tool twice to understand the response, the tool failed.
+
+### 5. Two-Track Discipline
+
+- **Track A** = this release. Ship it.
+- **Track B** = later. Write it down, move on.
+- Nothing moves from B → A without user approval.
+- No new .md files without archiving one first.
+
+## Operating Constraints
+
+### Documentation
+
+- `internal-docs/ISSUES.md` is the place for release blockers and active specs.
+- Before creating a new `.md` file: "What file am I deleting or updating to make room?"
+
+### Tool Output
+
+- Aim to keep every tool response under 1000 tokens.
+- Don't return full code snippets in search results by default. Prefer summaries and file paths.
+- Never report `ready: true` if retrieval confidence is low.
+
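To illustrate the last Tool Output rule, here is a minimal TypeScript sketch of a readiness gate that forces `ready` to `false` whenever retrieval is `low_confidence`. The `{ ready, reason }` shape and the `ok` / `low_confidence` statuses come from this PR; the function name `buildPreflight`, the `evidenceCount` input, and the `MIN_EVIDENCE` threshold are assumptions made for the example, not the project's actual code.

```typescript
// Hypothetical sketch: names and the threshold below are illustrative assumptions.
type SearchQualityStatus = "ok" | "low_confidence";

interface Preflight {
  ready: boolean;
  reason?: string;
}

// Assumed minimum number of supporting evidence items before an edit is considered safe.
const MIN_EVIDENCE = 2;

function buildPreflight(status: SearchQualityStatus, evidenceCount: number): Preflight {
  // Retrieval quality is checked first: low confidence always blocks,
  // regardless of how much evidence was counted.
  if (status === "low_confidence") {
    return { ready: false, reason: "Search quality is low; refine the query before editing." };
  }
  if (evidenceCount < MIN_EVIDENCE) {
    return { ready: false, reason: "Evidence is thin; gather more context before editing." };
  }
  return { ready: true };
}
```

Checking search quality before anything else means a large evidence count can never mask a bad retrieval.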
+### Code Separation
+
+- `src/index.ts` is routing and protocol. No business logic.
+- `src/core/` is framework-agnostic. No hardcoded framework strings (Angular, React, Vue, etc.).
+- CLI code belongs in `src/cli.ts`. Never in `src/index.ts`.
+- Framework analyzers self-register their own patterns (e.g., Angular computed+effect pairing belongs in the Angular analyzer, not protocol layer).
+
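The self-registration rule above can be pictured with a small TypeScript sketch. It shows a framework-agnostic registry that knows no framework names, and an analyzer that registers its own pattern pair when constructed. `PatternRegistry`, `ExampleAngularAnalyzer`, and the `register` / `get` methods are hypothetical names chosen for illustration; the real `registerComplementaryPatterns` signature is not shown in this PR and is not reproduced here.

```typescript
// Hypothetical sketch of self-registration; none of these names are the project's real API.
type PatternPair = [string, string];

// Framework-agnostic registry, as might live under src/core/. It holds no framework strings.
class PatternRegistry {
  private readonly pairs = new Map<string, PatternPair[]>();

  register(category: string, pair: PatternPair): void {
    const existing = this.pairs.get(category) ?? [];
    existing.push(pair);
    this.pairs.set(category, existing);
  }

  get(category: string): readonly PatternPair[] {
    return this.pairs.get(category) ?? [];
  }
}

// The framework-specific analyzer owns its strings and registers them on construction.
class ExampleAngularAnalyzer {
  constructor(registry: PatternRegistry) {
    registry.register("reactivity", ["computed", "effect"]);
  }
}

const registry = new PatternRegistry();
new ExampleAngularAnalyzer(registry);
console.log(registry.get("reactivity")); // logs the pair registered by the analyzer
```

With this arrangement the protocol layer only wires analyzers to the registry and never mentions a framework by name.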
+### Release Checklist
+
+Before any version bump: update CHANGELOG.md, README.md, docs/capabilities.md. Run full test suite.
+
+### Consensus
+
+- Multiple agents: Proposer/Challenger model.
+- No consensus in 3 turns → ask the user.
+
+## Lessons Learned (v1.6.x)
+
+These came from behavioral observation across multiple sessions. They're here so nobody repeats them.
+
+- **The AI Fluff Loop**: agents default to ADDING. Success = noise removed. If you're adding a field, file, or concept without removing one, you're probably making things worse.
+- **Self-eval bias**: an agent rating its own output is not evidence. Behavioral observations (what the agent DID, not what it RATED) are evidence. Don't trust scores that an agent assigns to its own work.
+- **Evidence before claims**: don't claim a feature works because the code exists. Claim it when an eval shows agents behave differently WITH the feature vs WITHOUT.
+- **Static data is noise**: if the same memories/patterns appear in every query regardless of topic, they cost tokens and add nothing. Context must be query-relevant to be useful.
+- **Agents don't read tool descriptions**: they scan the first line. Put the most important thing first. Everything after the first sentence is a bonus.
+
+## Private Agent Instructions
+
+See `internal-docs/AGENTS.md` for internal-only guidelines and context.
+
+---
+
+**Current focus:** See `internal-docs/ISSUES.md` for active release blockers.
+For full project history and context handover, see `internal-docs/ARCHIVE/WALKTHROUGH-v1.6.1.md`.
CHANGELOG.md
+27 -5 (27 additions & 5 deletions)
@@ -1,18 +1,41 @@
 # Changelog

+## [1.6.2] - 2026-02-17
+
+Stripped it down for token efficiency, moved CLI code out of the protocol layer, and cleared structural debt.
+
+### Changed
+
+- **Search output**: `trend: "Stable"` is no longer emitted (only Rising/Declining carry signal). Added a compact `type` field (`service:data`) merging componentType and layer into 2 tokens. Removed `lastModified`, which was considered noise.
+- **searchQuality**: now includes `hint` (for next-step suggestion) when status is `low_confidence`, so agents get actionable guidance without a second tool call.
+- **Tool description**: shortened to 2 actionable sentences, removed reference to `editPreflight` (which didn't exist in output). `intent` parameter is now discoverable on first scan.
+- **CLI extraction**: `handleMemoryCli` moved from `src/index.ts` to `src/cli.ts`. Protocol file is routing only.
+- **Angular self-registration**: `registerComplementaryPatterns('reactivity', ...)` moved from `src/index.ts` into `AngularAnalyzer` constructor. Framework patterns belong in their analyzer.
+
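The Search output entry above can be sketched in TypeScript as a small mapping step. It shows `componentType` and `layer` merged into a compact `type` string and the `trend` key dropped when it is `Stable`. The interfaces `RawResult` and `CompactResult` and the function `compactResult` are assumptions made for illustration; only the field names and the `service:data` format come from the changelog.

```typescript
// Hypothetical sketch; the interfaces and function name are assumptions for illustration.
type Trend = "Rising" | "Stable" | "Declining";

interface RawResult {
  file: string;
  summary: string;
  score: number;
  componentType: string;
  layer: string;
  trend: Trend;
}

interface CompactResult {
  file: string;
  summary: string;
  score: number;
  type: string; // merged form, e.g. "service:data"
  trend?: "Rising" | "Declining";
}

function compactResult(raw: RawResult): CompactResult {
  const compact: CompactResult = {
    file: raw.file,
    summary: raw.summary,
    score: raw.score,
    type: `${raw.componentType}:${raw.layer}`,
  };
  // "Stable" carries no signal, so the key is left out entirely rather than emitted.
  if (raw.trend !== "Stable") {
    compact.trend = raw.trend;
  }
  return compact;
}
```

Omitting the key, instead of emitting a default value, is what saves tokens on results that have nothing notable to report.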
+### Added
+
+- `AGENTS.md` Lessons Learned section - captures behavioral findings from the 0216 eval: AI fluff loop, self-eval bias, static data as noise, agents don't read past first line.
+- Release Checklist in `AGENTS.md`: CHANGELOG + README + capabilities.md + tests before any version bump.
Fixed the quality assessment on the search tool bug, stripped search output from 15 fields to 6 reducing token usage by 50%, added CLI memory access, removed Angular patterns from core.
-**Confident Idiot fix**: evidence lock now checks search quality - if retrieval is `low_confidence`, `readyToEdit` is forced `false` regardless of evidence counts.
+- **Search output overhaul**: stripped from ~15 fields per result down to 6 (`file`, `summary`, `score`, `trend`, `patternWarning`, `relationships`). Snippets opt-in only.
+- **Preflight flattened**: from nested `evidenceLock`/`epistemicStress` to `{ ready, reason }`.
+- **Angular framework leakage**: removed hardcoded Angular patterns from `src/core/indexer.ts` and `src/patterns/semantics.ts`. Core is framework-agnostic again.
-* use cosine distance for vector search scoring ([b41edb7](https://github.com/PatrickSys/codebase-context/commit/b41edb7e4c1969b04d834ec52a9ae43760e796a9))
+- use cosine distance for vector search scoring ([b41edb7](https://github.com/PatrickSys/codebase-context/commit/b41edb7e4c1969b04d834ec52a9ae43760e796a9))
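For readers unfamiliar with the term in the last entry, the following TypeScript sketch shows the standard definition of cosine distance between two embedding vectors. It is not the code from commit b41edb7; the function names, and in particular the `scoreFromDistance` mapping from distance to a score, are assumptions for illustration.

```typescript
// Illustrative sketch only; not the project's implementation.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine distance is undefined for a zero vector");
  }
  // Distance is 1 minus cosine similarity: 0 for the same direction, 2 for opposite directions.
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumed mapping from distance to a score where higher means more similar.
function scoreFromDistance(distance: number): number {
  return 1 - distance;
}
```

Because cosine distance ignores vector magnitude, two embeddings pointing in the same direction score the same whatever their length.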
README.md
+66 -46 (66 additions & 46 deletions)
@@ -10,17 +10,17 @@ This MCP gives agents _just enough_ context so they match _how_ your team codes,

 Here's what codebase-context does:

-**Finds the right context** - Search that doesn't just return code. Each result comes back with analyzed -and quantified- coding patterns and conventions, related team memories, file relationships, and quality indicators. The agent gets curated context, not raw hits.
+**Finds the right context** - Search that doesn't just return code. Each result comes back with analyzed and quantified coding patterns and conventions, related team memories, file relationships, and quality indicators. It knows whether you're looking for a specific file, a concept, or how things wire together - and filters out the noise (test files, configs, old utilities) before the agent sees them. The agent gets curated context, not raw hits.

-**Knows your conventions** - Detected from your code, not only from rules you wrote. Seeks team consensus and direction by adoption percentages and trends (rising/declining), golden files. What patterns the team is moving toward and what's being left behind.
+**Knows your conventions** - Detected from your code and git history, not only from rules you wrote. Seeks team consensus and direction by adoption percentages and trends (rising/declining), golden files. Tells the difference between code that's _common_ and code that's _current_ - what patterns the team is moving toward and what's being left behind.

-**Remembers across sessions** - Decisions, failures, things that _should_ work but didn't when you tried - recorded once, surfaced automatically. Conventional git commits (`refactor:`, `migrate:`, `fix:`) auto-extract into memory with zero effort. Stale memories decay and get flagged instead of blindly trusted.
+**Remembers across sessions** - Decisions, failures, workarounds that look wrong but exist for a reason - the battle scars that aren't in the comments. Recorded once, surfaced automatically so the agent doesn't "clean up" something you spent a week getting right. Conventional git commits (`refactor:`, `migrate:`, `fix:`) auto-extract into memory with zero effort. Stale memories decay and get flagged instead of blindly trusted.

-**Checks before editing** - A preflight card with risk level, patterns to use and avoid, failure warnings, and a `readyToEdit` evidence check. If evidence is thin or contradictory, it says so.
+**Checks before editing** - A preflight card with risk level, patterns to use and avoid, failure warnings, and a `readyToEdit` evidence check. Catches the "confidently wrong" problem: when code, team memories, and patterns contradict each other, it tells the agent to ask instead of guess. If evidence is thin or contradictory, it says so.

 One tool call returns all of it. Local-first - your code never leaves your machine.

-<!-- TODO: Add demo GIF here showing search_codebase with preflight card output-->
+<!-- TODO: Add demo GIF: search_codebase("How does this app attach the auth token to outgoing API calls?") → AuthInterceptor top result + preflight + agent proceeds or asks-->
 <!--  -->

 ## Quick Start
@@ -116,41 +116,35 @@ Other tools help AI find code. This one helps AI make the right decisions - by k

 This is where it all comes together. One call returns:

-- **Code results** with `summary`, `snippet`, `filePath`, `score`, and `relevanceReason`
-- **Pattern signals** per result: `trend` (Rising/Stable/Declining) and `patternWarning` when using legacy code
-- **Relationships** per result: `importedBy`, `imports`, `testedIn`, `lastModified`
-- **Related memories**: team decisions, gotchas, and failures matched to the query
-- **Search quality**: `ok` or `low_confidence` with diagnostic signals and next steps
+- **Code results** with `file` (path + line range), `summary`, `score`
+- **Pattern signals** per result: `trend` (Rising/Declining - Stable is omitted) and `patternWarning` when using legacy code
+- **Relationships** per result: `importedByCount` and `hasTests` (condensed)
+- **Related memories**: up to 3 team decisions, gotchas, and failures matched to the query
+- **Search quality**: `ok` or `low_confidence` with confidence score and `hint` when low
+- **Preflight**: `ready` (boolean) + `reason` when evidence is thin. Pass `intent="edit"` to get the full preflight card. If search quality is low, `ready` is always `false`.

-When the intent is `edit`, `refactor`, or `migrate`, the same call also returns a **preflight card**:
+Snippets are opt-in (`includeSnippets: true`). Default output is lean - if the agent wants code, it calls `read_file`.
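As a rough picture of how an agent might consume the response described in the new README text, here is a TypeScript sketch. The field names (`file`, `summary`, `score`, `trend`, `patternWarning`, `importedByCount`, `hasTests`, `hint`, `ready`, `reason`) are taken from the list above. Everything else is an assumption: the nesting, the `results` / `memories` / `searchQuality` / `preflight` keys, the `SearchResponse` and `NextAction` types, and the `nextAction` function are hypothetical and may not match the actual payload.

```typescript
// Hypothetical types assembled from the field names in the README; the nesting is assumed.
interface SearchResultItem {
  file: string; // path + line range
  summary: string;
  score: number;
  trend?: "Rising" | "Declining"; // Stable is omitted
  patternWarning?: string;
  relationships?: { importedByCount: number; hasTests: boolean };
}

interface SearchResponse {
  results: SearchResultItem[];
  memories: string[]; // up to 3 related memories
  searchQuality: { status: "ok" | "low_confidence"; confidence: number; hint?: string };
  preflight?: { ready: boolean; reason?: string };
}

type NextAction = "refine_query" | "ask_user" | "proceed";

// One possible way for an agent to act on a single response without a second tool call.
function nextAction(response: SearchResponse): NextAction {
  // Low-quality retrieval is reported first; the hint says what to try next.
  if (response.searchQuality.status === "low_confidence") {
    return "refine_query";
  }
  // Thin or contradictory evidence: ask instead of guessing.
  if (response.preflight && !response.preflight.ready) {
    return "ask_user";
  }
  return "proceed";
}
```

The ordering mirrors the README's stated guarantee that low search quality always implies `ready: false`, so the quality check can safely come first.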