Commit fadeecb

sanitize docs, one last moment fix
1 parent 397d509 commit fadeecb

5 files changed: +50 -35 lines changed

AGENTS.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ These are non-negotiable. Every PR, feature, and design decision must respect th
 - **Never stage/commit `.planning/**`\*\* (or any other local workflow artifacts) unless the user explicitly asks in that message.
 - **Never use `gsd-tools ... commit` wrappers** in this repo. Use plain `git add <exact files>` and `git commit -m "..."`.
 - **Before every commit:** run `git status --short` and confirm staged files match intent; abort if any `.planning/**` is staged.
-
+- **Avoid using `any` Type AT ALL COSTS.
 ## Evaluation Integrity (NON-NEGOTIABLE)

 These rules prevent metric gaming, overfitting, and false quality claims. Violation of these rules means the feature CANNOT ship.

CHANGELOG.md

Lines changed: 12 additions & 15 deletions
@@ -8,30 +8,27 @@
 - **Scope headers in code snippets**: When requesting snippets (`includeSnippets: true`), each code block now starts with a comment like `// UserService.login()` so agents know where the code lives without extra file reads.
 - **Edit decision card**: When searching with `intent="edit"`, `intent="refactor"`, or `intent="migrate"`, results now include a decision card telling you whether there's enough evidence to proceed safely. The card shows: whether you're ready (`ready: true/false`), what to do next if not (`nextAction`), relevant team patterns to follow, a top example file, how many callers appear in results (`impact.coverage`), and what searches would help close gaps (`whatWouldHelp`).
 - **Caller coverage tracking**: The decision card shows how many of a symbol's callers are in your search results. Low coverage (less than 40% when there are lots of callers) triggers an alert so you know to search more before editing.
+- **Index versioning**: Index artifacts are versioned via `index-meta.json`. Mixed-version indexes are never served; version mismatches or corruption trigger automatic rebuild.
+- **Crash-safe rebuilds**: Full rebuilds write to `.staging/` and swap atomically only on success. Failed rebuilds don't corrupt the active index.
+- **Relationship sidecar**: New `relationships.json` artifact containing file import graph, reverse imports, and symbol export index. Updated incrementally alongside the main index.
+- **References confidence + hints**: `get_symbol_references` now includes `confidence: "syntactic"` and `isComplete: boolean` to help agents assess result completeness. `search_codebase` results now include a structured `hints` object (capped callers/consumers/tests ranked by frequency) drawn from the relationships sidecar. **`get_component_usage` removed from MCP surface (11→10 tools).** If you previously used `get_component_usage`, use `get_symbol_references` for symbol usage evidence (usageCount, top snippets, callers/consumers).
+- Tree-sitter-backed symbol extraction is now used by the Generic analyzer when available (with safe fallbacks).
+- Expanded language/extension detection to improve indexing coverage (e.g. `.pyi`, `.php`, `.kt`/`.kts`, `.cc`/`.cxx`, `.cs`, `.swift`, `.scala`, `.toml`, `.xml`).
+- New tool: `get_symbol_references` for concrete symbol usage evidence (usageCount + top snippets).
+- Multi-codebase eval runner: `npm run eval -- <codebaseA> <codebaseB>` with per-codebase reports and combined summary.
+- Shared eval scoring/reporting module (`src/eval/*`) used by both the CLI runner and the test suite.
+- Second frozen eval fixture plus an in-repo controlled TypeScript codebase for fully-offline eval runs.
+- Regression tests covering Tree-sitter Unicode slicing, parser cleanup/reset behavior, and large/generated file skipping.

 ### Changed

 - **Preflight response shape**: Renamed `reason` to `nextAction` for clarity. Removed internal fields (`evidenceLock`, `riskLevel`, `confidence`) so the output is stable and doesn't change shape unexpectedly.
-
+
 ### Fixed

 - Null-pointer crash in GenericAnalyzer when chunk content is undefined.
 - Tree-sitter symbol extraction now treats node offsets as UTF-8 byte ranges and evicts cached parsers on failures/timeouts.

-### More improvements (Phases 06–08)
-
-- **Index versioning (Phase 06)**: Index artifacts are versioned via `index-meta.json`. Mixed-version indexes are never served; version mismatches or corruption trigger automatic rebuild.
-- **Crash-safe rebuilds (Phase 06)**: Full rebuilds write to `.staging/` and swap atomically only on success. Failed rebuilds don't corrupt the active index.
-- **Relationship sidecar (Phase 07)**: New `relationships.json` artifact containing file import graph, reverse imports, and symbol export index. Updated incrementally alongside the main index.
-- **References confidence + hints (Phase 08)**: `get_symbol_references` now includes `confidence: "syntactic"` and `isComplete: boolean` to help agents assess result completeness. `search_codebase` results now include a structured `hints` object (capped callers/consumers/tests ranked by frequency) drawn from the relationships sidecar. `get_component_usage` removed from MCP surface (11→10 tools).
-- Tree-sitter-backed symbol extraction is now used by the Generic analyzer when available (with safe fallbacks).
-- Expanded language/extension detection to improve indexing coverage (e.g. `.pyi`, `.php`, `.kt`/`.kts`, `.cc`/`.cxx`, `.cs`, `.swift`, `.scala`, `.toml`, `.xml`).
-- New tool: `get_symbol_references` for concrete symbol usage evidence (usageCount + top snippets).
-- Multi-codebase eval runner: `npm run eval -- <codebaseA> <codebaseB>` with per-codebase reports and combined summary.
-- Shared eval scoring/reporting module (`src/eval/*`) used by both the CLI runner and the test suite.
-- Second frozen eval fixture plus an in-repo controlled TypeScript codebase for fully-offline eval runs.
-- Regression tests covering Tree-sitter Unicode slicing, parser cleanup/reset behavior, and large/generated file skipping.
-
 ## [1.6.2] - 2026-02-17

 Stripped it down for token efficiency, moved CLI code out of the protocol layer, and cleared structural debt.
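
Review note on the "Caller coverage tracking" entry in this changelog hunk: the rule it describes (alert when fewer than 40% of a symbol's callers appear in the results and the symbol has many callers) could look roughly like the sketch below. This is illustrative only, not the project's implementation. `assessCallerCoverage`, `CoverageReport`, and the `minCallersForAlert` cutoff of 5 are assumptions, since the changelog does not say how many callers count as "lots".

```typescript
// Hypothetical shape; the real `impact.coverage` payload may differ.
interface CoverageReport {
  callersInResults: number;
  totalCallers: number;
  ratio: number;
  lowCoverage: boolean;
}

function assessCallerCoverage(
  allCallers: string[],
  resultFiles: string[],
  minCallersForAlert = 5, // assumed cutoff for "lots of callers"
  threshold = 0.4 // the 40% threshold named in the changelog entry
): CoverageReport {
  const seen = new Set(resultFiles);
  const uniqueCallers = Array.from(new Set(allCallers));
  const callersInResults = uniqueCallers.filter((file) => seen.has(file)).length;
  const totalCallers = uniqueCallers.length;
  // With no known callers there is nothing left to cover.
  const ratio = totalCallers === 0 ? 1 : callersInResults / totalCallers;
  const lowCoverage = totalCallers >= minCallersForAlert && ratio < threshold;
  return { callersInResults, totalCallers, ratio, lowCoverage };
}

const report = assessCallerCoverage(
  ['a.ts', 'b.ts', 'c.ts', 'd.ts', 'e.ts', 'f.ts', 'g.ts', 'h.ts', 'i.ts', 'j.ts'],
  ['a.ts', 'b.ts', 'c.ts', 'unrelated.ts']
);
console.log(report.lowCoverage); // true: 3 of 10 callers (30%) is below 40%
```

Gating the alert on a minimum caller count keeps symbols with one or two callers from raising it on every search.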

README.md

Lines changed: 19 additions & 6 deletions
@@ -119,12 +119,21 @@ This is where it all comes together. One call returns:
 - **Code results** with `file` (path + line range), `summary`, `score`
 - **Type** per result: compact `componentType:layer` (e.g., `service:data`) — helps agents orient
 - **Pattern signals** per result: `trend` (Rising/Declining — Stable is omitted) and `patternWarning` when using legacy code
-- **Relationships** per result: `importedByCount` and `hasTests` (condensed) + **hints** (capped ranked callers, consumers, tests)
+- **Relationships** per result: `importedByCount` and `hasTests` (condensed) + **hints** (capped ranked callers, consumers, tests) — so you see suggested next reads and know what you haven't looked at yet
 - **Related memories**: up to 3 team decisions, gotchas, and failures matched to the query
 - **Search quality**: `ok` or `low_confidence` with confidence score and `hint` when low
 - **Preflight**: `ready` (boolean) with decision card when `intent="edit"|"refactor"|"migrate"`. Shows `nextAction` (if not ready), `warnings`, `patterns` (do/avoid), `bestExample`, `impact` (caller coverage), and `whatWouldHelp` (next steps). If search quality is low, `ready` is always `false`.

-Snippets are opt-in (`includeSnippets: true`). Default output is lean — if the agent wants code, it calls `read_file`.
+Snippets are optional (`includeSnippets: true`). When enabled, snippets that have symbol metadata (e.g. from the Generic analyzer's AST chunking or Angular component chunks) start with a scope header so you know where the code lives (e.g. `// AuthService.getToken()` or `// SpotifyApiService`). Example:
+
+```ts
+// AuthService.getToken()
+getToken(): string {
+return this.token;
+}
+```
+
+Default output is lean — if the agent wants code, it calls `read_file`.

 ```json
 {
@@ -189,7 +198,7 @@ Record a decision once. It surfaces automatically in search results and prefligh
 | ------------------------------ | ------------------------------------------------------------------------------------------- |
 | `search_codebase` | Hybrid search + decision card. Pass `intent="edit"` to get `ready`, `nextAction`, patterns, caller coverage, and `whatWouldHelp`. |
 | `get_team_patterns` | Pattern frequencies, golden files, conflict detection |
-| `get_symbol_references` | Find concrete references to a symbol (usageCount + top snippets + confidence + completeness) |
+| `get_symbol_references` | Find concrete references to a symbol (usageCount + top snippets). `confidence: "syntactic"` = static/source-based only; no runtime or dynamic dispatch. |
 | `remember` | Record a convention, decision, gotcha, or failure |
 | `get_memory` | Query team memory with confidence decay scoring |
 | `get_codebase_metadata` | Project structure, frameworks, dependencies |
@@ -200,7 +209,7 @@ Record a decision once. It surfaces automatically in search results and prefligh

 ## Evaluation Harness (`npm run eval`)

-Reproducible evaluation with frozen fixtures so ranking/chunking changes are measured honestly and regressions get caught.
+Reproducible evaluation with frozen fixtures so ranking/chunking changes are measured honestly and regressions get caught. **For contributors and CI:** run before releases or after changing search/ranking/chunking to guard against regressions.

 - Two codebases: `npm run eval -- <codebaseA> <codebaseB>`
 - Defaults: fixture A = `tests/fixtures/eval-angular-spotify.json`, fixture B = `tests/fixtures/eval-controlled.json`
@@ -214,11 +223,13 @@ npm run eval -- tests/fixtures/codebases/eval-controlled tests/fixtures/codebase
 ```

 - Flags: `--help`, `--fixture-a`, `--fixture-b`, `--skip-reindex`, `--no-rerank`, `--no-redact`
+- To save a report for later comparison, redirect stdout (e.g. `pnpm run eval -- <path-to-angular-spotify> --skip-reindex > internal-docs/tests/eval-runs/angular-spotify-YYYY-MM-DD.txt`).

 ## How the Search Works

 The retrieval pipeline is designed around one goal: give the agent the right context, not just any file that matches.

+- **Definition-first ranking** - for exact-name lookups (e.g. a symbol name), the file that *defines* the symbol ranks above files that only use it.
 - **Intent classification** - knows whether "AuthService" is a name lookup or "how does auth work" is conceptual. Adjusts keyword/semantic weights accordingly.
 - **Hybrid fusion (RRF)** - combines keyword and semantic search using Reciprocal Rank Fusion instead of brittle score averaging.
 - **Query expansion** - conceptual queries automatically expand with domain-relevant terms (auth → login, token, session, guard).
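
Review note on the "Hybrid fusion (RRF)" bullet in this hunk: for readers unfamiliar with Reciprocal Rank Fusion, the general technique can be sketched as below. It illustrates RRF itself, not this repository's code. The function name, the input shape, and the constant `k = 60` (a value commonly used with RRF) are assumptions.

```typescript
// Fuse several ranked lists of ids (best first) into one scored ranking.
// Each list contributes 1 / (k + rank) per id, with rank starting at 1.
function reciprocalRankFusion(rankings: string[][], k = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Hypothetical file names, used only to exercise the function.
const keywordHits = ['auth.service.ts', 'auth.guard.ts', 'login.component.ts'];
const semanticHits = ['auth.guard.ts', 'login.component.ts', 'auth.service.ts'];
const fused = reciprocalRankFusion([keywordHits, semanticHits]);
console.log(fused.map(([id]) => id));
```

Because only ranks enter the formula, the keyword and semantic scores never have to share a scale, which is where plain score averaging tends to break down.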
@@ -229,13 +240,15 @@ The retrieval pipeline is designed around one goal: give the agent the right con
 - **Version gating** - index artifacts are versioned; mismatches trigger automatic rebuild so mixed-version data is never served.
 - **Auto-heal** - if the index corrupts, search triggers a full re-index automatically.

+**Index reliability:** Rebuilds write to a staging directory and swap atomically only on success, so a failed rebuild never corrupts the active index. Version mismatches or corruption trigger an automatic full re-index (no user action required).
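
Review note on the "Index reliability" paragraph: a minimal sketch of the version gate it describes, under stated assumptions. The `IndexMeta` shape, the `formatVersion` and `artifacts` fields, the artifact names, and `decideIndexAction` are all hypothetical; the real `index-meta.json` schema may differ.

```typescript
// Hypothetical schema for index-meta.json.
interface IndexMeta {
  formatVersion: number;
  artifacts: Record<string, { version: number }>;
}

type IndexAction = 'serve' | 'rebuild';

// Serve only when the metadata parses and every artifact matches the expected
// version; corruption, a mismatch, or mixed versions all lead to a rebuild.
function decideIndexAction(rawMeta: string, expectedVersion: number): IndexAction {
  let meta: IndexMeta;
  try {
    meta = JSON.parse(rawMeta) as IndexMeta;
  } catch {
    return 'rebuild'; // unreadable metadata is treated as corruption
  }
  if (!meta || typeof meta !== 'object' || meta.formatVersion !== expectedVersion) {
    return 'rebuild';
  }
  if (!meta.artifacts || typeof meta.artifacts !== 'object') {
    return 'rebuild';
  }
  const versions = Object.keys(meta.artifacts).map((name) => meta.artifacts[name]?.version);
  return versions.every((v) => v === expectedVersion) ? 'serve' : 'rebuild';
}

const healthy = JSON.stringify({
  formatVersion: 3,
  artifacts: { index: { version: 3 }, relationships: { version: 3 } }
});
const mixed = JSON.stringify({
  formatVersion: 3,
  artifacts: { index: { version: 3 }, relationships: { version: 2 } }
});
console.log(decideIndexAction(healthy, 3)); // serve
console.log(decideIndexAction(mixed, 3)); // rebuild
console.log(decideIndexAction('{not json', 3)); // rebuild
```

The staging half of the design is usually completed with a directory rename, which is typically atomic when source and target sit on the same filesystem; whether this project swaps that way is not stated in the diff.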
+
 ## Language Support

-Over **30+ languages** are supported for indexing + retrieval: TypeScript/JavaScript, Python (incl `.pyi`), PHP, Ruby, Java, Kotlin (`.kt`/`.kts`), Go, Rust, C/C++ (incl `.cc`/`.cxx`), C#, Swift, Scala, Shell, plus common config/markup formats (JSON/YAML/TOML/XML, etc.).
+**10 languages** have full symbol extraction (Tree-sitter): TypeScript, JavaScript, Python, Java, Kotlin, C, C++, C#, Go, Rust. **30+ languages** have indexing and retrieval coverage (keyword + semantic), including PHP, Ruby, Swift, Scala, Shell, and config/markup (JSON/YAML/TOML/XML, etc.).

 Enrichment is framework-specific: right now only **Angular** has a dedicated analyzer for rich conventions/context (signals, standalone components, control flow, DI patterns).

-For non-Angular projects, the **Generic** analyzer still provides broad coverage, and will use Tree-sitter symbol extraction when a grammar is available (otherwise it falls back to safe parsing).
+For non-Angular projects, the **Generic** analyzer uses **AST-aligned chunking** when a Tree-sitter grammar is available: symbol-bounded chunks with **scope-aware prefixes** (e.g. `// ClassName.methodName`) so snippets show where code lives. Without a grammar it falls back to safe line-based chunking.

 Structured filters available: `framework`, `language`, `componentType`, `layer` (presentation, business, data, state, core, shared).

docs/capabilities.md

Lines changed: 4 additions & 4 deletions
@@ -4,15 +4,15 @@ Technical reference for what `codebase-context` ships today. For the user-facing

 ## Tool Surface

-10 MCP tools + 1 optional resource (`codebase://context`).
+10 MCP tools + 1 optional resource (`codebase://context`). **Migration:** `get_component_usage` was removed; use `get_symbol_references` for symbol usage evidence.

 ### Core Tools

 | Tool | Input | Output |
 | ----------------------- | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | `search_codebase` | `query`, optional `intent`, `limit`, `filters`, `includeSnippets` | Ranked results (`file`, `summary`, `score`, `type`, `trend`, `patternWarning`, `relationships`, `hints`) + `searchQuality` + decision card (`ready`, `nextAction`, `patterns`, `bestExample`, `impact`, `whatWouldHelp`) when `intent="edit"`. Hints capped at 3 per category. |
 | `get_team_patterns` | optional `category` | Pattern frequencies, trends, golden files, conflicts |
-| `get_symbol_references` | `symbol`, optional `limit` | Concrete symbol usage evidence: `usageCount` + top usage snippets + `confidence` ("syntactic") + `isComplete` boolean |
+| `get_symbol_references` | `symbol`, optional `limit` | Concrete symbol usage evidence: `usageCount` + top usage snippets + `confidence` + `isComplete`. `confidence: "syntactic"` means static/source-based only (no runtime or dynamic dispatch). Replaces the removed `get_component_usage`. |
 | `remember` | `type`, `category`, `memory`, `reason` | Persists to `.codebase-context/memory.json` |
 | `get_memory` | optional `category`, `type`, `query`, `limit` | Memories with confidence decay scoring |

@@ -121,12 +121,12 @@ Returned as `preflight` when search `intent` is `edit`, `refactor`, or `migrate`
 ## Analyzers

 - **Angular**: signals, standalone components, control flow syntax, lifecycle hooks, DI patterns, component metadata
-- **Generic**: 30+ languages TypeScript, JavaScript, Python, Java, Kotlin, C/C++, C#, Go, Rust, PHP, Ruby, Swift, Scala, Shell, config/markup formats
+- **Generic**: 30+ have indexing/retrieval coverage including PHP, Ruby, Swift, Scala, Shell, config/markup., 10 languages have full symbol extraction (Tree-sitter: TypeScript, JavaScript, Python, Java, Kotlin, C, C++, C#, Go, Rust).

 Notes:

 - Language detection covers common extensions including `.pyi`, `.kt`/`.kts`, `.cc`/`.cxx`, and config formats like `.toml`/`.xml`.
-- When Tree-sitter grammars are present, the Generic analyzer can derive symbol components from Tree-sitter extraction (with fallbacks).
+- When Tree-sitter grammars are present, the Generic analyzer uses AST-aligned chunking and scope-aware prefixes for symbol-aware snippets (with fallbacks).

 ## Evaluation Harness

src/tools/search-codebase.ts

Lines changed: 14 additions & 9 deletions
@@ -176,9 +176,8 @@ export async function handle(
 text: JSON.stringify(
 {
 status: 'error',
-message: `Auto-heal retry failed: ${
-retryError instanceof Error ? retryError.message : String(retryError)
-}`
+message: `Auto-heal retry failed: ${retryError instanceof Error ? retryError.message : String(retryError)
+}`
 },
 null,
 2
@@ -313,11 +312,13 @@ export async function handle(

 function buildRelationshipHints(result: SearchResult): RelationshipHints {
 const rPath = result.filePath;
+// Graph keys are relative paths with forward slashes; normalize for comparison
+const rPathNorm = path.relative(ctx.rootPath, rPath).replace(/\\/g, '/') || rPath.replace(/\\/g, '/');

 // importedBy: files that import this result (reverse lookup), collect with counts
 const importedByMap = new Map<string, number>();
 for (const [dep, importers] of reverseImports) {
-if (dep.endsWith(rPath) || rPath.endsWith(dep)) {
+if (dep === rPathNorm || dep.endsWith(rPathNorm) || rPathNorm.endsWith(dep)) {
 for (const importer of importers) {
 importedByMap.set(importer, (importedByMap.get(importer) || 0) + 1);
 }
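
Review note on the hunk above: the normalization it adds can be illustrated in isolation. The sketch below is a simplified stand-in, not the code from the commit. `toPosix`, `toGraphKey`, and `matchesGraphKey` are hypothetical names, and plain prefix stripping replaces `path.relative` so the example has no Node dependency.

```typescript
// Convert Windows-style backslashes to forward slashes.
function toPosix(p: string): string {
  return p.replace(/\\/g, '/');
}

// Simplified stand-in for path.relative(root, file): strips the root prefix
// when the file lies under it, otherwise returns the normalized path as-is.
function toGraphKey(rootPath: string, filePath: string): string {
  const root = toPosix(rootPath).replace(/\/+$/, '');
  const file = toPosix(filePath);
  return file.startsWith(root + '/') ? file.slice(root.length + 1) : file;
}

// Same three-way comparison as the changed condition in the hunk.
function matchesGraphKey(dep: string, key: string): boolean {
  return dep === key || dep.endsWith(key) || key.endsWith(dep);
}

const key = toGraphKey('C:\\repo', 'C:\\repo\\src\\auth\\auth.service.ts');
console.log(key); // src/auth/auth.service.ts
console.log(matchesGraphKey('src/auth/auth.service.ts', key)); // true
console.log(matchesGraphKey('auth.service.ts', 'src/my-auth.service.ts')); // true
```

The last call points at a possible limitation of the suffix comparison: `endsWith` does not respect path-segment boundaries, so two unrelated files whose names share a suffix can match. Requiring a `/` before the matched suffix would be one way to tighten it, if such collisions turn out to matter in practice.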
@@ -326,7 +327,7 @@ export async function handle(

 // testedIn: heuristic — same basename with .spec/.test extension
 const testedIn: string[] = [];
-const baseName = path.basename(rPath).replace(/\.[^.]+$/, '');
+const baseName = path.basename(rPathNorm).replace(/\.[^.]+$/, '');
 if (importsGraph) {
 for (const file of Object.keys(importsGraph)) {
 const fileBase = path.basename(file);
@@ -616,8 +617,8 @@ export async function handle(
 }

 // Add patterns (do/avoid, capped at 3 each, with adoption %)
-const doPatterns = preferredPatternsForOutput.slice(0, 3).map((p) => `${p.pattern}${p.frequency || 'N/A'}`);
-const avoidPatterns = avoidPatternsForOutput.slice(0, 3).map((p) => `${p.pattern}${p.frequency || 'N/A'} (declining)`);
+const doPatterns = preferredPatternsForOutput.slice(0, 3).map((p) => `${p.pattern}${p.adoption ? ` ${p.adoption}% adoption` : ''}`);
+const avoidPatterns = avoidPatternsForOutput.slice(0, 3).map((p) => `${p.pattern}${p.adoption ? ` ${p.adoption}% adoption` : ''} (declining)`);
 if (doPatterns.length > 0 || avoidPatterns.length > 0) {
 decisionCard.patterns = {
 ...(doPatterns.length > 0 && { do: doPatterns }),
@@ -688,6 +689,10 @@ export async function handle(
 if (metadata?.functionName) {
 return metadata.functionName;
 }
+// component chunk fallback (component or pipe name)
+if (metadata?.componentName) {
+return metadata.componentName;
+}
 return null;
 }

@@ -712,8 +717,8 @@ export async function handle(
 confidence: searchQuality.confidence,
 ...(searchQuality.status === 'low_confidence' &&
 searchQuality.nextSteps?.[0] && {
-hint: searchQuality.nextSteps[0]
-})
+hint: searchQuality.nextSteps[0]
+})
 },
 ...(preflightPayload && { preflight: preflightPayload }),
 results: results.map((r) => {
