Commit 4e359f5

SimplyLiz and claude committed
docs(site): refresh MCP tool list and fix stale version claims
- mcp.mdx: align tool list with the 34 tools advertised by `lip mcp`.
  Remove 5 phantom tools (lip_reindex_files, lip_similarity, lip_query_expansion, lip_cluster, lip_export_embeddings) that never existed in tools/lip-cli/src/cmd/mcp.rs.
  Add 15 missing tools (lip_coverage, lip_explain_match, lip_extract_terminology, lip_find_boundaries, lip_find_counterpart, lip_get_centroid, lip_nearest_by_contrast, lip_nearest_in_store, lip_novelty_score, lip_outliers, lip_prune_deleted, lip_semantic_diff, lip_semantic_drift, lip_similarity_matrix, lip_stale_embeddings).
  Group the table by category (structural / semantic / observability).
  Fix CKB → LIP mapping table to use real tool names.
- index.mdx: tool count 19 → 34.
- comparisons.mdx: drop stale "v2.0 roadmap" claim for Go/Java/Kotlin/C#; reflect the 8 Tier 2 backends actually shipped.
- getting-started.mdx: remove hardcoded `# lip 1.3.0` comment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 9308f5c commit 4e359f5

4 files changed

Lines changed: 65 additions & 71 deletions

website/src/pages/docs/comparisons.mdx

Lines changed: 1 addition & 1 deletion
@@ -150,7 +150,7 @@ The remaining gaps after v1.5:

 - **Data-flow / taint analysis** — requires full CPG; not in scope for LIP's current architecture. SCIP remains the right tool for taint tracking and security audit workflows.
 - **Generics / trait bounds / overload resolution for TypeScript, Python, Dart** — rust-analyzer exposes this via Tier 2 for Rust. The other three languages rely on hover text parsing; deep generic instantiation is not yet extracted as structured relationships.
-- **Language coverage** — Tier 2 covers 4 languages; SCIP indexers exist for 15+. Go, Java, Kotlin, and C# are on the v2.0 roadmap.
+- **Language coverage** — Tier 2 covers 8 languages (Rust, Go, TypeScript, Python, Dart, Kotlin, Swift, C/C++); SCIP indexers exist for 15+. Java and C# remain on the roadmap.

 Where LIP now **exceeds** SCIP (v1.6):

website/src/pages/docs/getting-started.mdx

Lines changed: 0 additions & 1 deletion
@@ -35,7 +35,6 @@ Verify:

 ```bash
 lip --version
-# lip 1.3.0
 ```

 ---

website/src/pages/docs/index.mdx

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ Full documentation for every CLI command, daemon configuration option, and proto
 |---|---|
 | [CLI Reference](/docs/cli-reference) | All commands: `daemon`, `query`, `import`, `export`, `lsp`, `mcp`, `slice`, `fetch`, `push`, `annotate` |
 | [The Daemon](/docs/daemon) | Persistence, file watcher, confidence tiers, wire protocol, service setup |
-| [MCP Integration](/docs/mcp) | 19 MCP tools for AI agents — full schema reference and agent workflows |
+| [MCP Integration](/docs/mcp) | 34 MCP tools for AI agents — full schema reference and agent workflows |
 | [Registry & Slices](/docs/registry) | Build, share, and consume content-addressed dependency slices |
 | [LSP, SCIP & LIP](/docs/comparisons) | What each protocol does and when to use which |

website/src/pages/docs/mcp.mdx

Lines changed: 63 additions & 68 deletions
@@ -1,7 +1,7 @@
 ---
 layout: ../../layouts/DocsLayout.astro
 title: MCP Integration
-description: 24 MCP tools for AI agents — full schema reference, annotation patterns, and agent workflows.
+description: 34 MCP tools for AI agents — full schema reference, annotation patterns, and agent workflows.
 ---

 # MCP Integration
@@ -12,6 +12,8 @@ description: 24 MCP tools for AI agents — full schema reference, annotation pa

 ## Tools

+**Structural / graph (14):**
+
 | Tool | Description |
 |------|-------------|
 | `lip_blast_radius` | Which files are affected if this symbol changes |
@@ -28,16 +30,36 @@ description: 24 MCP tools for AI agents — full schema reference, annotation pa
 | `lip_stale_files` | Merkle sync probe — which files need re-indexing |
 | `lip_load_slice` | Mount a pre-built dependency slice into the daemon graph |
 | `lip_batch_query` | Execute multiple queries in one round-trip |
+
+**Semantic / embeddings (17):**
+
+| Tool | Description |
+|------|-------------|
 | `lip_embedding_batch` | Compute and cache file embeddings via HTTP endpoint |
 | `lip_nearest` | Top-K files most similar to a given file (cosine similarity) |
 | `lip_nearest_by_text` | Top-K files most similar to a free-text query |
+| `lip_nearest_by_contrast` | Contrastive search — files like X but unlike Y |
+| `lip_nearest_in_store` | Nearest-neighbour search against a caller-provided embedding store (cross-repo federation) |
+| `lip_similarity_matrix` | Pairwise cosine similarities for a list of files in one call |
+| `lip_semantic_drift` | Cosine distance between two files (0.0 = identical, 2.0 = opposite) |
+| `lip_semantic_diff` | Drift distance between two versions of a file |
+| `lip_outliers` | Identify semantically misplaced files within a set |
+| `lip_novelty_score` | How semantically novel a set of files is relative to the rest of the codebase |
+| `lip_find_boundaries` | Detect semantic boundaries within a file via chunked windowing |
+| `lip_find_counterpart` | Given a source file + candidate pool, return the closest matches |
+| `lip_extract_terminology` | Domain vocabulary most semantically central to a set of files |
+| `lip_get_centroid` | Component-wise mean embedding of a set of files |
+| `lip_explain_match` | Explain *why* a result was a strong semantic match for a query |
+| `lip_coverage` | Embedding coverage report under a filesystem path |
+| `lip_stale_embeddings` | Files whose stored embedding is older than the file's mtime |
+
+**Observability / ops (3):**
+
+| Tool | Description |
+|------|-------------|
 | `lip_index_status` | Daemon health: indexed count, embedding coverage, last update |
 | `lip_file_status` | Per-file indexing status and embedding age |
-| `lip_reindex_files` | Force re-index of specific file URIs from disk |
-| `lip_similarity` | Pairwise cosine similarity of two stored embeddings |
-| `lip_query_expansion` | Expand a query string into related symbol names |
-| `lip_cluster` | Group URIs by embedding proximity within a given radius |
-| `lip_export_embeddings` | Return raw stored vectors for external pipelines |
+| `lip_prune_deleted` | Remove index entries for files no longer on disk |

 All tools are backed by the live LIP daemon — results are always current, never a stale snapshot.

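The table rows for `lip_semantic_drift` and `lip_get_centroid` describe two small vector operations. As a minimal sketch, assuming drift is computed as 1 minus cosine similarity (this matches the documented 0.0 to 2.0 range, but the commit does not confirm the formula), they could look like the following. The function names and toy vectors are illustrative only and are not LIP's API.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed drift metric: 0.0 for identical direction, 2.0 for opposite."""
    return 1.0 - cosine_similarity(a, b)


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("vectors must have the same dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


if __name__ == "__main__":
    print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # same direction
    print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
    print(centroid([[1.0, 2.0], [3.0, 4.0]]))
```

Under this assumption, orthogonal embeddings sit at distance 1.0, the midpoint of the documented range.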
@@ -333,108 +355,81 @@ file:///src/auth.rs indexed=true has_embedding=true age=42s

 ---

-## v1.6 embedding integration tools
+## Advanced semantic tools

-These tools require `LIP_EMBEDDING_URL` to be set unless otherwise noted.
+These tools require `LIP_EMBEDDING_URL` to be set unless otherwise noted. The full table above lists every tool; the most commonly-used ones are spelled out below.

-### lip_reindex_files
+### lip_nearest_by_contrast

-Force a re-index of specific file URIs from disk. Does not require embeddings — reads each file, detects its language, and updates the symbol graph.
+Contrastive search — find files similar to `like_uri` but different from `unlike_uri`. Useful when two concepts are close in raw vector space and you want to disambiguate.

 **Input:**
 ```json
-{ "uris": ["file:///src/auth.rs", "file:///src/session.rs"] }
+{ "like_uri": "file:///src/auth.rs", "unlike_uri": "file:///src/session.rs", "top_k": 5 }
 ```

-Use this after out-of-band changes that the daemon's file watcher didn't catch (e.g. selective `git checkout` or CI-generated files). Returns `DeltaAck`.
-
 ---

-### lip_similarity
+### lip_similarity_matrix

-Pairwise cosine similarity of two stored embeddings.
+Pairwise cosine similarities for a list of files in a single call. Returns a labelled N×N matrix.

 **Input:**
 ```json
-{ "uri_a": "file:///src/auth.rs", "uri_b": "file:///src/session.rs" }
+{ "uris": ["file:///src/auth.rs", "file:///src/session.rs", "file:///src/tokens.rs"] }
 ```

-**Output:**
-```
-score=0.9214
-```
-
-Returns `null` when either URI has no cached embedding — call `lip_embedding_batch` first. Accepts both `file://` (file embeddings) and `lip://` (symbol embeddings) URIs. Safe inside `lip_batch_query`.
-
 ---

-### lip_query_expansion
+### lip_outliers

-Embed a short query string and return the display names of the nearest symbols — useful before `lip_workspace_symbols` when the exact symbol name isn't known.
+Identify semantically misplaced files within a set. For each URI, computes its leave-one-out mean cosine similarity to the other set members and ranks the lowest as outliers.

 **Input:**
 ```json
-{ "query": "token validation", "top_k": 5 }
-```
-
-**Output:**
-```
-verifyToken
-validateSession
-checkJwt
-parseBearer
-refreshToken
+{ "uris": ["file:///src/auth.rs", "file:///src/session.rs", "file:///src/payments.rs"], "top_k": 3 }
 ```

-Requires symbols to have embeddings in the symbol store (populate with `lip_embedding_batch` using `lip://` URIs).
-
 ---

-### lip_cluster
+### lip_find_boundaries

-Group a list of URIs into clusters based on embedding proximity.
+Detect semantic boundaries within a single file by chunking it into line-windows and embedding each window. Returns the line positions where adjacent windows diverge most — useful for splitting overgrown files or finding logical sections.

 **Input:**
 ```json
-{
-  "uris": [
-    "file:///src/auth.rs",
-    "file:///src/session.rs",
-    "file:///src/payments.rs",
-    "file:///src/invoices.rs"
-  ],
-  "radius": 0.85
-}
+{ "uri": "file:///src/giant_module.rs", "window_lines": 60, "top_k": 3 }
 ```

-**Output:**
-```
-Group 1: file:///src/auth.rs file:///src/session.rs
-Group 2: file:///src/payments.rs file:///src/invoices.rs
-```
-
-`radius` is the cosine-similarity threshold. Two URIs land in the same group when their similarity is ≥ the radius. URIs without a cached embedding are silently excluded.
-
 ---

-### lip_export_embeddings
+### lip_explain_match

-Return the raw stored embedding vectors for a list of URIs. Useful for passing to external re-ranking, custom clustering, or visualization tools.
+Explain *why* a result was a strong semantic match for a query. Chunks the result file into windows, embeds each, and returns the top-scoring chunks with their line ranges.

 **Input:**
 ```json
-{ "uris": ["file:///src/auth.rs", "file:///src/session.rs"] }
+{ "query": "rate limiter token bucket", "result_uri": "file:///src/middleware/throttle.rs", "top_k": 3 }
 ```

-**Output:**
+---
+
+### lip_coverage
+
+Report embedding coverage under a filesystem path: how many indexed files have embeddings, how many don't. Useful as a CI gate or before running semantic queries.
+
+**Input:**
 ```json
-{
-  "file:///src/auth.rs": [0.021, -0.044, 0.117, ...],
-  "file:///src/session.rs": [0.019, -0.051, 0.109, ...]
-}
+{ "root": "file:///repo/src" }
 ```

-URIs with no cached embedding are omitted from the result. Safe inside `lip_batch_query`.
+---
+
+### lip_prune_deleted
+
+Remove index entries for files that no longer exist on disk. On long-running daemons in repos with high churn, ghost embeddings accumulate and pollute nearest-neighbour searches.
+
+**Input:** `{}` (no arguments)

 ---

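The `lip_outliers` description in the hunk above (leave-one-out mean cosine similarity, lowest scores ranked first) can be sketched as follows. This is an illustration under stated assumptions, not the daemon's implementation: `rank_outliers` is a hypothetical helper, and the two-dimensional embeddings are made-up toy values keyed by the URIs from the documented input example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_outliers(
    embeddings: Dict[str, Sequence[float]], top_k: int
) -> List[Tuple[str, float]]:
    """Return up to top_k (uri, score) pairs, least similar to the rest first.

    score is the mean cosine similarity between one file's embedding and the
    embeddings of every other file in the set (the file itself is left out).
    """
    if len(embeddings) < 2:
        raise ValueError("need at least two embeddings to compare")
    scored = []
    for uri, vec in embeddings.items():
        sims = [
            cosine_similarity(vec, other_vec)
            for other_uri, other_vec in embeddings.items()
            if other_uri != uri
        ]
        scored.append((uri, sum(sims) / len(sims)))
    scored.sort(key=lambda pair: pair[1])  # lowest mean similarity first
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-D vectors (assumed values); real embeddings have far more dimensions.
    toy = {
        "file:///src/auth.rs": [1.0, 0.1],
        "file:///src/session.rs": [0.9, 0.2],
        "file:///src/payments.rs": [0.0, 1.0],
    }
    for uri, score in rank_outliers(toy, top_k=3):
        print(f"{uri} mean_similarity={score:.3f}")
```

With these toy vectors, `file:///src/payments.rs` ranks first because it points away from the other two files.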
@@ -503,8 +498,8 @@ lip_file_status(target_uri) # confirm specific file is indexed and fresh
 | `findReferences` | `lip_references` | Same semantics |
 | `getCallGraph` | `lip_references` + CPG edges | LIP CPG via Tier 2 |
 | `prepareChange` | `lip_blast_radius` + `lip_annotation_get` | LIP adds annotation check |
-| `batchSearch` | `lip_query_expansion` + `lip_workspace_symbols` | Expand → search replaces compound text scan |
-| `explore` | `lip_cluster` + `lip_nearest_by_text` | Group and navigate by semantic proximity |
-| `recentlyRelevant` | `lip_similarity` + `lip_export_embeddings` | Feed raw vectors into CKB's re-ranking tier |
+| `batchSearch` | `lip_nearest_by_text` + `lip_workspace_symbols` | Semantic top-K → name search replaces compound text scan |
+| `explore` | `lip_find_boundaries` + `lip_nearest_by_text` | Navigate by semantic proximity and section boundaries |
+| `recentlyRelevant` | `lip_similarity_matrix` + `lip_get_centroid` | Feed pairwise scores or centroid into CKB's re-ranking tier |

 With LIP as CKB's backend, `analyzeImpact` and `prepareChange` are always current — no more `ckb index` needed.
