Commit 7ebe08c

BrainSlugs83 and Copilot committed
v1.4.0: npx install, non-blocking startup, server instructions
- README: step-by-step install guide with explicit mcp-config.json paths per OS
- README: first-run warmup warning, lightweight CPU note, no build tools needed
- README: all config examples use npx format (no more manual node paths)
- package.json: add keywords, homepage, bump to v1.4.0
- package.json: add zod as direct dep (was transitive-only, broke npx installs)
- package.json: bump better-sqlite3 to ^12.1.0 (Node 24 prebuilds)
- index.js: non-blocking startup — MCP transport connects immediately, server warms up in background (fixes first-run timeout on slow connections)
- index.js: tools wait up to 5 min for server on first run instead of failing
- index.js: add MCP server instructions field with vector search usage guide

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 8caa1a2 commit 7ebe08c
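The "tools wait up to 5 min" behavior in this commit boils down to a readiness poll: instead of failing a tool call while the server is still warming up, the proxy retries until a deadline passes. A minimal sketch (hypothetical names, not the package's actual code):

```javascript
// Poll an async readiness probe until it succeeds or the deadline passes.
// The commit gives first-run tool calls up to 5 minutes like this, so they
// can report "still warming up" instead of timing out.
async function waitForReady(isReady, { timeoutMs = 5 * 60_000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await isReady()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false; // caller surfaces a friendly "warming up" message
}
```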

File tree

4 files changed: +264 −151 lines changed

README.md (163 additions, 102 deletions)
@@ -1,153 +1,202 @@
 # vector-memory
 
-An [MCP](https://modelcontextprotocol.io/) server that adds semantic vector search to [**GitHub Copilot CLI**](https://docs.github.com/en/copilot/github-copilot-in-the-cli) (`github-copilot-cli`). Gives Copilot persistent long-term memory across sessions using local embeddings and vector search.
+An [MCP](https://modelcontextprotocol.io/) server that adds **persistent long-term memory** to [**GitHub Copilot CLI**](https://docs.github.com/en/copilot/github-copilot-in-the-cli) via local semantic vector search. Copilot can recall past conversations, code changes, and decisions across all sessions — by meaning, not just keywords.
 
 > **Note:** This is a community project and is not affiliated with or endorsed by GitHub. [GitHub Copilot CLI](https://docs.github.com/en/copilot/github-copilot-in-the-cli) is a product of GitHub / Microsoft.
 
-## Architecture
+---
 
-```
-copilot.exe ──STDIO──▶ index.js (proxy) ──HTTP──▶ vector-memory-server.js (singleton)
-                                                        │
-                                                  embed-worker.js (worker thread)
-                                                        │
-                                                  Xenova/gte-small (ONNX, 34MB)
-```
+## Installation
 
-- **index.js** — Thin STDIO MCP proxy. One per copilot instance. Checks if the HTTP server is running, launches it if not, then ferries tool calls over HTTP.
-- **vector-memory-server.js** — Singleton HTTP server. Owns the embedding model (one copy in memory), SQLite vector DB, and background indexing. Port is auto-assigned per user via a deterministic hash of the username.
-- **embed-worker.js** — Worker thread that loads the ONNX model and handles embedding inference off the main thread.
-- **lib.js** — Pure logic extracted for testability: filtering, dedup, post-processing, process detection.
+### Prerequisites
 
-### Key design decisions
+You need **Node.js ≥18** installed. This gives you `node`, `npm`, and `npx`.
 
-- **Singleton**: Only one server runs regardless of how many copilot instances are open. Saves ~200MB RAM per additional instance.
-- **Race condition hardened**: EADDRINUSE detection with full diagnostics — distinguishes between healthy singleton, zombie process, and foreign port conflict.
-- **No duplicates**: `UNIQUE` constraint + `INSERT OR IGNORE` + `isIndexing` guard prevents duplicate embeddings even under concurrent access.
-- **Lazy init**: ONNX model only loads after winning the singleton race. Losers exit in ~500ms.
-- **Idle shutdown**: Server exits after 5 minutes of inactivity (no requests and no new session content). The proxy re-launches it on next use.
-- **Self-healing**: Detects and deletes corrupt/truncated model files, re-downloads automatically. Retries with backoff for Windows Defender file locks.
+- **Windows:** `winget install OpenJS.NodeJS.LTS`
+- **macOS:** `brew install node` or download from [nodejs.org](https://nodejs.org)
+- **Linux:** Use your package manager or [nodejs.org](https://nodejs.org)
 
-## Prerequisites
+That's it. The native SQLite modules (`better-sqlite3`, `sqlite-vec`) ship prebuilt binaries for Windows (x64), macOS (x64, ARM), and Linux (x64, ARM) — no compiler or build tools needed.
 
-| Requirement | Version | Install |
-|---|---|---|
-| Node.js | ≥18.x | `winget install OpenJS.NodeJS.LTS` or [nodejs.org](https://nodejs.org) |
-| npm | (comes with Node) ||
-| Python build tools¹ || `npm install -g windows-build-tools` (Windows) |
-| C++ compiler¹ || Visual Studio Build Tools with "Desktop C++" workload |
+> <details><summary>Build tools only needed if prebuilds aren't available for your platform</summary>
+>
+> If you're on an unusual platform and the prebuilt binaries aren't available, `better-sqlite3` falls back to compiling from source. In that case you'll need:
+> - **Windows:** [Visual Studio Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) with the "Desktop development with C++" workload
+> - **macOS:** `xcode-select --install`
+> - **Linux:** `sudo apt install build-essential python3` (or equivalent)
+>
+> </details>
 
-¹ Required by `better-sqlite3` and `sqlite-vec` native modules. On macOS, `xcode-select --install` covers both.
+### Step 1: Find (or create) your MCP config file
 
-## Installation
+GitHub Copilot CLI reads MCP server definitions from a JSON config file. The **user-level** config lives at:
 
-```bash
-cd ~/.copilot/mcp-servers/vector-memory
-npm install
-```
+| OS | Path |
+|---|---|
+| **Windows** | `%USERPROFILE%\.copilot\mcp-config.json` (e.g. `C:\Users\YourName\.copilot\mcp-config.json`) |
+| **macOS / Linux** | `~/.copilot/mcp-config.json` |
 
-The ONNX embedding model (`Xenova/gte-small`, ~34MB) downloads automatically on first run.
+> **If this file doesn't exist yet**, create it. If the `.copilot` folder doesn't exist either, create that too — Copilot CLI will use it.
+>
+> You can also place a **project-level** config at `.copilot/mcp-config.json` in any repo root, but user-level is recommended for this server since it provides memory across all projects.
 
-### Register with Copilot CLI
+### Step 2: Add the vector-memory server
 
-Add to `~/.copilot/mcp-config.json`:
+**If the file doesn't exist or is empty**, create it with this content:
 
 ```json
 {
   "mcpServers": {
     "vector-memory": {
-      "command": "node",
-      "args": ["C:/Users/<you>/.copilot/mcp-servers/vector-memory/index.js"]
+      "type": "stdio",
+      "command": "npx",
+      "args": ["-y", "ghcp-cli-vector-memory-mcp"]
     }
   }
 }
 ```
 
-### Environment variables
-
-| Variable | Default | Description |
-|---|---|---|
-| `VECTOR_MEMORY_PORT` | *(auto)* | HTTP port for the singleton server. By default, a deterministic port is computed from your OS username (FNV-1a hash, range 31337–35432). Only set this if two users happen to collide. |
-| `VECTOR_MEMORY_IDLE_TIMEOUT` | `5` | Minutes of inactivity before the server shuts down. `0` or negative = never shut down. |
-
-Set these in the `env` block of `mcp-config.json` (only if needed):
+**If you already have an `mcp-config.json`** with other servers, add the `"vector-memory"` entry inside the existing `"mcpServers"` object:
 
 ```json
 {
   "mcpServers": {
+    "your-existing-server": { "...": "..." },
     "vector-memory": {
-      "command": "node",
-      "args": ["C:/Users/<you>/.copilot/mcp-servers/vector-memory/index.js"],
-      "env": {
-        "VECTOR_MEMORY_IDLE_TIMEOUT": "10"
-      }
+      "type": "stdio",
+      "command": "npx",
+      "args": ["-y", "ghcp-cli-vector-memory-mcp"]
     }
   }
 }
 ```
 
-#### Auto-port assignment
+> **You do not need to clone this repo or run `npm install` yourself.** The `npx -y` command automatically downloads, installs, and runs the package from the npm registry. It caches the package locally so subsequent launches are fast.
 
-Each user automatically gets a deterministic port based on their OS username (FNV-1a hash with rotation, range 31337–35432). This means multi-user setups work **zero-config** — no manual port assignment needed.
+### Step 3: Restart Copilot CLI
 
-In the rare case of a hash collision (two usernames mapping to the same port), the server detects the conflict at startup and tells the affected user to set `VECTOR_MEMORY_PORT` manually.
+Close any running Copilot CLI session and start a new one. The MCP server will launch automatically in the background.
 
-#### Multi-user setup
+> [!IMPORTANT]
+> **The very first launch takes a few minutes.** On first run, `npx` installs the package and its
+> native dependencies, then the server downloads a small machine learning model (~34 MB,
+> [Xenova/gte-small](https://huggingface.co/Xenova/gte-small)). This is a **one-time cost** —
+> subsequent starts are near-instant.
+>
+> The MCP proxy connects immediately and won't block Copilot CLI from starting. If you try to
+> use vector search before the model is ready, it will tell you it's still warming up.
 
-On a shared machine, each user's server runs on their auto-assigned port. Just point both configs at the same codebase — no `env` block needed:
+> [!NOTE]
+> **Runs comfortably on any laptop.** The ONNX embedding model is tiny (~34 MB in memory) and
+> inference is fast even on CPU. There is no GPU requirement. You should not notice any impact on
+> battery life or system performance. The server also idles down and exits automatically after
+> 5 minutes of inactivity, so it consumes no resources when you're not using Copilot.
 
-**User A** (`~/.copilot/mcp-config.json`):
-```json
-{
-  "mcpServers": {
-    "vector-memory": {
-      "command": "node",
-      "args": ["D:/shared/vector-memory/index.js"]
-    }
-  }
-}
+### Step 4: Verify it's working
+
+In a new Copilot CLI session, ask:
+
+```
+Do you have vector search available?
 ```
 
-**User B** (`~/.copilot/mcp-config.json`) — same config, different port automatically:
+Copilot should confirm it has the `vector_search` and `vector_reindex` tools. If it's the first launch and the model is still downloading, it will tell you — just wait a minute and try again.
+
+---
+
+## What it does
+
+Once installed, Copilot CLI gains two new tools:
+
+| Tool | Description |
+|---|---|
+| `vector_search` | Semantic search across all past session history. Find conversations, code changes, and decisions by meaning — not just keywords. Returns ranked results with similarity scores. |
+| `vector_reindex` | Force a full rebuild of the vector index. Normally not needed — search auto-indexes new content. Use if the index seems stale. |
+
+Copilot will use `vector_search` automatically when it needs to recall past context. You can also prompt it directly: *"search your memory for..."* or *"do you remember when we..."*
+
+### Data flow
+
+1. Copilot CLI writes session data to `~/.copilot/session-store.db` (this already exists)
+2. vector-memory reads from that DB (read-only) and creates embeddings
+3. Embeddings are stored in `~/.copilot/vector-index.db`
+4. Indexing triggers: on startup, on each search (if new content exists), and every 15 minutes
+
+All data stays local. Nothing is sent to any external service.
+
+---
+
+## Configuration
+
+### Environment variables
+
+| Variable | Default | Description |
+|---|---|---|
+| `VECTOR_MEMORY_PORT` | *(auto)* | HTTP port for the singleton server. A deterministic port is computed from your OS username (FNV-1a hash, range 31337–35432). Only set this if two users collide. |
+| `VECTOR_MEMORY_IDLE_TIMEOUT` | `5` | Minutes of inactivity before the server shuts down. `0` or negative = never shut down. |
+
+Set these in the `env` block of your config (only if needed):
+
 ```json
 {
   "mcpServers": {
     "vector-memory": {
-      "command": "node",
-      "args": ["D:/shared/vector-memory/index.js"]
+      "type": "stdio",
+      "command": "npx",
+      "args": ["-y", "ghcp-cli-vector-memory-mcp"],
+      "env": {
+        "VECTOR_MEMORY_IDLE_TIMEOUT": "10"
+      }
     }
   }
 }
 ```
 
-Each user gets their own singleton server, their own vector index, and their own session history. The idle timeout ensures servers shut down automatically when not in use.
+### Multi-user setup
 
-## Tools
+On a shared machine, each user's server runs on a unique auto-assigned port. No extra config needed — just use the same `mcp-config.json` entry above and each user gets their own singleton server, vector index, and session history.
 
-| Tool | Description |
-|---|---|
-| `vector_search` | Semantic search across all past session history. Returns ranked results with similarity scores. |
-| `vector_reindex` | Force a full rebuild of the vector index. Normally not needed — search auto-indexes new content. |
+In the rare case of a port hash collision, the server detects it at startup and tells the affected user to set `VECTOR_MEMORY_PORT` manually.
 
-## Data flow
+---
 
-1. Copilot CLI writes session data to `~/.copilot/session-store.db` (FTS5 search index)
-2. vector-memory reads from that DB (read-only) and creates embeddings
-3. Embeddings are stored in `~/.copilot/vector-index.db` (sqlite-vec)
-4. Indexing triggers: on startup, on each search (if idle), and every 15 minutes
+## Architecture
+
+```
+copilot.exe ──STDIO──▶ index.js (proxy) ──HTTP──▶ vector-memory-server.js (singleton)
+                                                        │
+                                                  embed-worker.js (worker thread)
+                                                        │
+                                                  Xenova/gte-small (ONNX, 34MB)
+```
+
+- **index.js** — Thin STDIO MCP proxy. One per copilot instance. Checks if the HTTP server is running, launches it if not, then ferries tool calls over HTTP.
+- **vector-memory-server.js** — Singleton HTTP server. Owns the embedding model (one copy in memory), SQLite vector DB, and background indexing. Port is auto-assigned per user via a deterministic hash of the username.
+- **embed-worker.js** — Worker thread that loads the ONNX model and handles embedding inference off the main thread.
+- **lib.js** — Pure logic extracted for testability: filtering, dedup, post-processing, process detection.
+
+### Key design decisions
+
+- **Singleton**: Only one server runs regardless of how many copilot instances are open. Saves ~200MB RAM per additional instance.
+- **Race condition hardened**: EADDRINUSE detection with full diagnostics — distinguishes between healthy singleton, zombie process, and foreign port conflict.
+- **No duplicates**: `UNIQUE` constraint + `INSERT OR IGNORE` + `isIndexing` guard prevents duplicate embeddings even under concurrent access.
+- **Lazy init**: ONNX model only loads after winning the singleton race. Losers exit in about 500ms.
+- **Idle shutdown**: Server exits after 5 minutes of inactivity (no requests and no new session content). The proxy re-launches it on next use.
+- **Self-healing**: Detects and deletes corrupt/truncated model files, re-downloads automatically. Retries with backoff for Windows Defender file locks.
+
+## Development
 
-## Scripts
+### Scripts
 
 ```bash
 npm run lint    # ESLint on all source files
 npm test        # 38 unit tests (node:test, zero external deps)
 npm run check   # lint + test
 ```
 
-## Running tests
+### Running tests
 
 ```bash
-cd ~/.copilot/mcp-servers/vector-memory
 npm test
 ```
 
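The idle-shutdown behavior configured by `VECTOR_MEMORY_IDLE_TIMEOUT` in the diff above is essentially a debounced timer that every request resets. A sketch with illustrative names, not the server's actual code:

```javascript
// Every incoming request calls touch(); if `minutes` pass with no further
// activity, onIdle fires (the real server exits; the proxy relaunches it
// on next use). Zero or negative disables shutdown, matching the docs.
function makeIdleTimer(onIdle, minutes = 5) {
  let timer = null;
  return function touch() {
    if (timer) clearTimeout(timer);
    if (minutes <= 0) return; // never shut down
    timer = setTimeout(onIdle, minutes * 60_000);
    timer.unref?.(); // don't keep the process alive just for this timer
  };
}
```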
@@ -157,7 +206,18 @@ With coverage:
 node --test --experimental-test-coverage test.js
 ```
 
-## Manual server management
+### File overview
+
+| File | Purpose |
+|---|---|
+| `index.js` | STDIO MCP proxy — what copilot.exe launches via npx |
+| `vector-memory-server.js` | HTTP singleton — owns model, DB, indexing |
+| `embed-worker.js` | Worker thread for ONNX embedding inference |
+| `lib.js` | Pure logic: filtering, dedup, scoring, handler factory |
+| `test.js` | 38 unit tests with DI mocks |
+| `eslint.config.js` | Lint config |
+
+### Manual server management
 
 ```bash
 # Start server directly (normally done by the proxy)
@@ -175,31 +235,31 @@ curl -X POST http://127.0.0.1:<PORT>/search \
 cat ~/.copilot/vector-memory.pid
 ```
 
-## File overview
-
-| File | Purpose |
-|---|---|
-| `index.js` | STDIO MCP proxy — what copilot.exe launches |
-| `vector-memory-server.js` | HTTP singleton — owns model, DB, indexing |
-| `embed-worker.js` | Worker thread for ONNX embedding inference |
-| `lib.js` | Pure logic: filtering, dedup, scoring, handler factory |
-| `test.js` | 38 unit tests with DI mocks |
-| `eslint.config.js` | Lint config |
+---
 
 ## Troubleshooting
 
+### First run is slow
+
+This is expected! On first launch, the server needs to:
+1. Install native SQLite extensions (`better-sqlite3`, `sqlite-vec`)
+2. Download the embedding model (~34 MB from Hugging Face)
+
+This can take **2–5 minutes** depending on your connection speed and whether native compilation is needed. Subsequent launches start in seconds.
+
 ### Port collision with another user
 
 **Error:** `Port 31796 is owned by user "X" (expected "Y")`
 
-Two usernames hashed to the same port (rare — FNV-1a produces very few collisions). One user needs to set a manual override. Edit `~/.copilot/mcp-config.json` and add a `VECTOR_MEMORY_PORT` env var:
+Two usernames hashed to the same port (rare). One user needs to set a manual override:
 
 ```json
 {
   "mcpServers": {
     "vector-memory": {
-      "command": "node",
-      "args": ["<path-to>/index.js"],
+      "type": "stdio",
+      "command": "npx",
+      "args": ["-y", "ghcp-cli-vector-memory-mcp"],
       "env": {
         "VECTOR_MEMORY_PORT": "31338"
       }
@@ -208,8 +268,6 @@ Two usernames hashed to the same port (rare — FNV-1a produces very few collisi
 }
 ```
 
-See [Multi-user setup](#multi-user-setup) for full details.
-
 ### Port occupied by another service
 
 **Error:** `Vector memory server failed to start — port XXXXX may be in use by another service`
@@ -243,7 +301,7 @@ The next copilot launch will start the updated server automatically.
 
 **Error:** `Session store not found`
 
-The file `~/.copilot/session-store.db` doesn't exist yet. This is normal on a fresh install — Copilot CLI creates it after your first conversation. Use copilot for a bit, then try again.
+The file `~/.copilot/session-store.db` doesn't exist yet. This is normal on a fresh Copilot CLI install — it creates the file after your first conversation. Use Copilot for a bit, then try again.
 
 ### Embedding model corrupt
 
@@ -256,13 +314,16 @@ cat ~/.copilot/vector-memory.pid
 kill <PID>
 ```
 
-If it persists, delete the model cache and reinstall:
+If it persists, clear the model cache:
 
 ```bash
 rm -rf node_modules/@huggingface/transformers/.cache
-npm run postinstall
 ```
 
+The model will re-download on next launch.
+
+---
+
 ## License
 
 MIT — see [LICENSE](LICENSE).
