description: Install Semble, set up the MCP server, and scaffold a sub-agent
4
4
sidebar:
5
5
icon: seti:config
6
6
---
7
7
8
+
There are three ways to set up Semble, and they are independent of each other. We recommend doing all three, but you can pick and choose based on your needs:
9
+
10
+
1. [Install Semble](#1-install-semble) (for the CLI and AGENTS.md flow).
11
+
2. [Set up the MCP server](#2-mcp-server) (so your top-level agent can call Semble as a tool).
12
+
3. [Install the sub-agent](#3-sub-agent) (so sub-agents, which can't call MCP tools, can still search).
13
+
8
14
## Requirements
9
15
10
-
- Python 3.10 or higher
16
+
- Python 3.10 or higher.
17
+
- [uv](https://docs.astral.sh/uv/getting-started/installation/) (recommended for all three flows).
11
18
- No GPU, API keys, or external services required. Runs fully on CPU.
12
19
13
-
## Install
20
+
## 1. Install Semble
21
+
22
+
Install Semble with [`uv`](https://docs.astral.sh/uv/) (recommended) or `pip`:
14
23
15
24
```bash
16
-
pip install semble
25
+
uv tool install semble # Recommended
26
+
pip install semble # Or with pip
17
27
```
18
28
19
-
Or with [uv](https://docs.astral.sh/uv/):
29
+
This gives you the [`semble` CLI](/packages/semble/usage/).
30
+
31
+
### Optional: wire it into AGENTS.md
32
+
33
+
Once installed, drop the [AGENTS.md snippet](/packages/semble/usage/#agentsmd-snippet) into your `AGENTS.md`, `CLAUDE.md`, `GEMINI.md`, or equivalent. This teaches any agent (including sub-agents) when to reach for `semble` instead of grep, and is the only setup needed for harnesses without MCP support.
34
+
35
+
## 2. MCP Server
36
+
37
+
Install Semble as an [MCP server](/packages/semble/mcp-server/) for Claude Code:
20
38
21
39
```bash
22
-
uv add semble
40
+
claude mcp add semble -s user -- uvx --from "semble[mcp]" semble
23
41
```
24
42
25
-
## MCP Server Extra
43
+
For other agents (Cursor, Codex, OpenCode, VS Code, Copilot CLI, Windsurf, Gemini, Kiro, Zed), see [MCP Server](/packages/semble/mcp-server/) for the per-harness config snippet.
26
44
27
-
To use Semble as an [MCP server](/packages/semble/mcp-server/) with agents like Claude Code, Cursor, or OpenCode, install the `mcp` extra:
45
+
## 3. Sub-agent
46
+
47
+
Sub-agents typically cannot call MCP tools directly. To give a sub-agent access to Semble, run `semble init` once in your project root to scaffold a dedicated search sub-agent for your harness:
28
48
29
49
```bash
30
-
pip install "semble[mcp]"
50
+
semble init # Claude Code → .claude/agents/semble-search.md
src/content/docs/packages/semble/introduction.mdx
Lines changed: 43 additions & 40 deletions
@@ -5,65 +5,68 @@ sidebar:
5
5
icon: open-book
6
6
---
7
7
8
-
[Semble](https://github.com/MinishLab/semble) is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read and cutting latency on every step. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](/packages/semble/benchmarks/)). Everything runs on CPU with no API keys, GPU, or external services.
8
+
[Semble](https://github.com/MinishLab/semble) is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](/packages/semble/benchmarks/)). Everything runs on CPU with no API keys, GPU, or external services. Run it as an [MCP server](/packages/semble/mcp-server/) or call it from the shell via [AGENTS.md](/packages/semble/usage/) and any agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
9
9
10
-
Run it as an [MCP server](/packages/semble/mcp-server/) and any agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo, cloned and indexed on demand.
10
+
## Quickstart
11
11
12
-
## Quick Start
12
+
Your agent queries Semble in natural language (e.g. `"How is authentication handled?"`) and gets back only the relevant code snippets, without grepping or reading full files. You can set it up as an MCP server or via AGENTS.md. First, install [uv](https://docs.astral.sh/uv/getting-started/installation/) if you don't have it yet.
13
13
14
-
Install Semble:
15
14
16
-
```bash
17
-
pip install semble # Install with pip
18
-
uv add semble # Install with uv
19
-
```
20
-
21
-
Index a repo and search it:
15
+
### MCP (Claude Code)
22
16
23
-
```python
24
-
from semble import SembleIndex
17
+
Add Semble to Claude Code (requires [uv](https://docs.astral.sh/uv/getting-started/installation/)):
25
18
26
-
# Index a local directory
27
-
index = SembleIndex.from_path("./my-project")
19
+
```bash
20
+
claude mcp add semble -s user -- uvx --from "semble[mcp]" semble
21
+
```
28
22
29
-
# Index a remote git repository
30
-
index = SembleIndex.from_git("https://github.com/MinishLab/model2vec")
23
+
Using another agent harness? See [MCP Server](/packages/semble/mcp-server/) for per-agent setup.
31
24
32
-
# Search with a natural-language or code query
33
-
results = index.search("save model to disk", top_k=3)
25
+
### Bash / AGENTS.md
34
26
35
-
# Find code similar to a specific result
36
-
related = index.find_related(results[0], top_k=3)
27
+
[Install Semble](/packages/semble/installation/), then add the [AGENTS.md snippet](/packages/semble/usage/#agentsmd-snippet) to your `AGENTS.md`, `CLAUDE.md`, or equivalent. This works for any agent and is the only option for sub-agents, which typically cannot call MCP tools directly.
uv tool install semble # Install with uv (recommended)
31
+
pip install semble # Or install with pip
44
32
```
45
33
34
+
46
35
## Main Features
47
36
48
-
- **Fast**: indexes a repo in ~250 ms and answers queries in ~1.5 ms, all on CPU.
37
+
- **Fast**: indexes an average repo in ~250 ms and answers queries in ~1.5 ms, all on CPU.
49
38
- **Accurate**: NDCG@10 of 0.854 on the [benchmarks](/packages/semble/benchmarks/), on par with code-specialized transformer models at a fraction of the size and cost.
50
-
- **Local and remote**: pass a local path or a git URL; indexes are cached for the session.
51
-
- **MCP server**: drop-in tool for Claude Code, Cursor, Codex, OpenCode, and any other MCP-compatible agent.
39
+
- **Token-efficient**: returns only the relevant chunks, using [~98% fewer tokens than grep+read](/packages/semble/benchmarks/#token-efficiency).
52
40
- **Zero setup**: runs on CPU with no API keys, GPU, or external services required.
41
+
- **MCP server**: works with Claude Code, Cursor, Codex, OpenCode, VS Code, and any other MCP-compatible agent.
42
+
- **Local and remote**: pass a local path or a git URL.
53
43
54
-
## How It Works
55
-
56
-
Semble splits each file into code-aware chunks using [Chonkie](https://github.com/chonkie-inc/chonkie), then scores every query with two complementary retrievers:
44
+
## How it works
57
45
58
-
- **Semantic**: static [Model2Vec](https://github.com/MinishLab/model2vec) embeddings from the code-specialized [potion-code-16M](https://huggingface.co/minishlab/potion-code-16M) model.
59
-
- **Lexical**: [BM25](https://github.com/xhluca/bm25s) for exact matches on identifiers and API names.
46
+
Semble splits each file into code-aware chunks using [tree-sitter](https://github.com/tree-sitter/py-tree-sitter), then scores every query against the chunks with two complementary retrievers: static [Model2Vec](https://github.com/MinishLab/model2vec) embeddings using the code-specialized [potion-code-16M](https://huggingface.co/minishlab/potion-code-16M) model for semantic similarity, and [BM25](https://github.com/xhluca/bm25s) for lexical matches on identifiers and API names. The two score lists are fused with Reciprocal Rank Fusion (RRF).
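The fusion step named above can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not Semble's actual implementation: the function name `reciprocal_rank_fusion`, the chunk ids, and the constant `k=60` (the value conventionally used for RRF) are assumptions made for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one list, best first.

    Each list contributes 1 / (k + rank) for every chunk it contains.
    k=60 is the conventional RRF constant, assumed here; the value Semble
    uses is not stated in the docs.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical ranked outputs of the semantic and lexical retrievers.
semantic = ["auth.py:10", "login.py:3", "util.py:42"]
lexical = ["login.py:3", "session.py:7", "auth.py:10"]
fused = reciprocal_rank_fusion([semantic, lexical])
print(fused)
```

Because RRF only looks at ranks, the embedding similarities and BM25 scores never need to be normalized onto a common scale before they are combined.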
60
47
61
-
The two score lists are fused with Reciprocal Rank Fusion (RRF) and then reranked with a set of code-aware signals:
48
+
After fusing, results are reranked with a set of code-aware signals:
62
49
63
-
- **Adaptive weighting**: symbol-like queries (`Foo::bar`, `getUserById`) get more lexical weight; natural-language queries stay balanced.
64
-
- **Definition boosts**: a chunk that defines the queried symbol (`class`, `def`, `func`) ranks above chunks that merely reference it.
65
-
- **Identifier stems**: query tokens are stemmed and matched against identifier stems, so `parse config` boosts chunks containing `parseConfig`, `ConfigParser`, or `config_parser`.
66
-
- **File coherence**: when multiple chunks from the same file match, the file is boosted so the top result reflects broad file-level relevance.
67
-
- **Noise penalties**: test files, `compat`/`legacy` shims, example code, and `.d.ts` stubs are down-ranked so canonical implementations surface first.
50
+
- **Adaptive weighting.** Symbol-like queries (`Foo::bar`, `_private`, `getUserById`) get more lexical weight, while natural-language queries stay balanced between semantic and lexical retrievers.
51
+
- **Definition boosts.** A chunk that defines the queried symbol (a `class`, `def`, `func`, etc.) is ranked above chunks that merely reference it.
52
+
- **Identifier stems.** Query tokens are stemmed and matched against identifier stems in a chunk, giving additional weight to chunks that contain them. For example, querying `parse config` boosts chunks containing `parseConfig`, `ConfigParser`, or `config_parser`.
53
+
- **File coherence.** When multiple chunks from the same file match the query, the file is boosted so the top result reflects broad file-level relevance rather than a single out-of-context chunk.
54
+
- **Noise penalties.** Test files, `compat/`/`legacy/` shims, example code, and `.d.ts` declaration stubs are down-ranked so canonical implementations surface first.
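
As a rough illustration of the adaptive-weighting signal in the list above, the sketch below classifies a query as symbol-like or natural-language and picks retriever weights accordingly. The regular expression, the helper names `looks_like_symbol` and `retriever_weights`, and the 0.7 lexical weight are all hypothetical; the docs do not specify how Semble detects symbol-like queries or what weights it applies.

```python
import re

# Hypothetical heuristic: scope operators, dots, underscores, or a
# lower-to-upper camelCase transition suggest a code symbol.
SYMBOL_PATTERN = re.compile(r"::|\.|_|[a-z][A-Z]")


def looks_like_symbol(query):
    """Return True for single-token queries that resemble identifiers."""
    query = query.strip()
    return " " not in query and bool(SYMBOL_PATTERN.search(query))


def retriever_weights(query, symbol_lexical_weight=0.7):
    """Return (semantic_weight, lexical_weight) for a query.

    The 0.7 default is an illustrative assumption, not a documented value.
    """
    if looks_like_symbol(query):
        return 1.0 - symbol_lexical_weight, symbol_lexical_weight
    return 0.5, 0.5  # natural-language queries stay balanced


for q in ["Foo::bar", "_private", "getUserById", "How is authentication handled?"]:
    print(q, retriever_weights(q))
```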
68
55
69
56
Because the embedding model is static with no transformer forward pass at query time, all of this runs in milliseconds on CPU.
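
The identifier-stem signal from the list above can be sketched as follows. This is a toy version under stated assumptions: the suffix-stripping `stem` function stands in for whatever stemmer Semble actually uses, and `split_identifier`, `identifier_stems`, and `stem_overlap` are hypothetical helper names.

```python
import re


def split_identifier(identifier):
    """Split camelCase, PascalCase, and snake_case into word parts."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)


def stem(token):
    """Toy stemmer (an assumption): lowercase and strip one common suffix."""
    token = token.lower()
    for suffix in ("ing", "er", "es", "s", "e"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def identifier_stems(chunk_text):
    """Collect the stems of every identifier part found in a chunk."""
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", chunk_text)
    return {stem(part) for ident in identifiers for part in split_identifier(ident)}


def stem_overlap(query, chunk_text):
    """Fraction of query stems that also occur among the chunk's stems."""
    query_stems = {stem(token) for token in query.split()}
    if not query_stems:
        return 0.0
    return len(query_stems & identifier_stems(chunk_text)) / len(query_stems)


for chunk in ["def parseConfig(path):", "class ConfigParser:", "config_parser = load()"]:
    print(chunk, stem_overlap("parse config", chunk))
```

In this sketch the overlap would be added to a chunk's score as a boost; how much weight Semble gives the signal is not stated in the docs.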
57
+
58
+
## Citing
59
+
60
+
If you use Semble in your research, please cite the following:
61
+
62
+
```bibtex
63
+
@software{minishlab2026semble,
64
+
author = {{van Dongen}, Thomas and Stephan Tulkens},
65
+
title = {Semble: Fast and Accurate Code Search for Agents},