v1.4.0: npx install, non-blocking startup, server instructions
- README: step-by-step install guide with explicit mcp-config.json paths per OS
- README: first-run warmup warning, lightweight CPU note, no build tools needed
- README: all config examples use npx format (no more manual node paths)
- package.json: add keywords, homepage, bump to v1.4.0
- package.json: add zod as direct dep (was transitive-only, broke npx installs)
- package.json: bump better-sqlite3 to ^12.1.0 (Node 24 prebuilds)
- index.js: non-blocking startup — MCP transport connects immediately, server
warms up in background (fixes first-run timeout on slow connections)
- index.js: tools wait up to 5 min for server on first run instead of failing
- index.js: add MCP server instructions field with vector search usage guide
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
An [MCP](https://modelcontextprotocol.io/) server that adds **persistent long-term memory** to [**GitHub Copilot CLI**](https://docs.github.com/en/copilot/github-copilot-in-the-cli) via local semantic vector search. Copilot can recall past conversations, code changes, and decisions across all sessions — by meaning, not just keywords.
> **Note:** This is a community project and is not affiliated with or endorsed by GitHub. [GitHub Copilot CLI](https://docs.github.com/en/copilot/github-copilot-in-the-cli) is a product of GitHub / Microsoft.
### Prerequisites
You need **Node.js ≥18** installed. This gives you `node`, `npm`, and `npx`.
- **Windows:** `winget install OpenJS.NodeJS.LTS`
- **macOS:** `brew install node` or download from [nodejs.org](https://nodejs.org)
- **Linux:** use your package manager or [nodejs.org](https://nodejs.org)
That's it. The native SQLite modules (`better-sqlite3`, `sqlite-vec`) ship prebuilt binaries for Windows (x64), macOS (x64, ARM), and Linux (x64, ARM) — no compiler or build tools needed.
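To double-check your toolchain before continuing (any Node 18+ release works):

```shell
node --version   # expect v18 or newer
npx --version    # ships with npm, used to launch the server
```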
> <details><summary>Build tools only needed if prebuilds aren't available for your platform</summary>
>
> If you're on an unusual platform and the prebuilt binaries aren't available, `better-sqlite3` falls back to compiling from source. In that case you'll need:
>
> - **Windows:** [Visual Studio Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) with the "Desktop development with C++" workload

The ONNX embedding model (`Xenova/gte-small`, ~34MB) downloads automatically on first run.

> **If this file doesn't exist yet**, create it. If the `.copilot` folder doesn't exist either, create that too — Copilot CLI will use it.
>
> You can also place a **project-level** config at `.copilot/mcp-config.json` in any repo root, but user-level is recommended for this server since it provides memory across all projects.
### Step 2: Add the vector-memory server
**If the file doesn't exist or is empty**, create it with this content:
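For example, using the `npx` launch format this release standardizes on (`ghcp-cli-vector-memory-mcp` is the package name on the npm registry):

```json
{
  "mcpServers": {
    "vector-memory": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "ghcp-cli-vector-memory-mcp"]
    }
  }
}
```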
**If you already have an `mcp-config.json`** with other servers, add the `"vector-memory"` entry inside the existing `"mcpServers"` object:
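For instance (`some-other-server` here is just a placeholder for whatever entries you already have):

```json
{
  "mcpServers": {
    "some-other-server": { "command": "node", "args": ["server.js"] },
    "vector-memory": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "ghcp-cli-vector-memory-mcp"]
    }
  }
}
```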
> **You do not need to clone this repo or run `npm install` yourself.** The `npx -y` command automatically downloads, installs, and runs the package from the npm registry. It caches the package locally so subsequent launches are fast.
### Step 3: Restart Copilot CLI
Close any running Copilot CLI session and start a new one. The MCP server will launch automatically in the background.
> [!IMPORTANT]
> **The very first launch takes a few minutes.** On first run, `npx` installs the package and its
> native dependencies, then the server downloads a small machine learning model (~34 MB,
> [Xenova/gte-small](https://huggingface.co/Xenova/gte-small)). This is a **one-time cost** —
> subsequent starts are near-instant.
>
> The MCP proxy connects immediately and won't block Copilot CLI from starting. If you try to
> use vector search before the model is ready, it will tell you it's still warming up.
> [!NOTE]
> **Runs comfortably on any laptop.** The ONNX embedding model is tiny (~34 MB in memory) and
> inference is fast even on CPU; there is no GPU requirement. You're unlikely to notice any impact
> on battery life or system performance. The server also idles down and exits automatically after
> 5 minutes of inactivity, so it costs nothing when you're not using Copilot.
### Step 4: Verify it's working
In a new Copilot CLI session, ask:
```
Do you have vector search available?
```
Copilot should confirm it has the `vector_search` and `vector_reindex` tools. If it's the first launch and the model is still downloading, it will tell you — just wait a minute and try again.
---
## What it does
Once installed, Copilot CLI gains two new tools:
| Tool | Description |
|---|---|
| `vector_search` | Semantic search across all past session history. Find conversations, code changes, and decisions by meaning — not just keywords. Returns ranked results with similarity scores. |
| `vector_reindex` | Force a full rebuild of the vector index. Normally not needed — search auto-indexes new content. Use it if the index seems stale. |
Copilot will use `vector_search` automatically when it needs to recall past context. You can also prompt it directly: *"search your memory for..."* or *"do you remember when we..."*
### Data flow
1. Copilot CLI writes session data to `~/.copilot/session-store.db` (this already exists)
2. vector-memory reads from that DB (read-only) and creates embeddings
3. Embeddings are stored in `~/.copilot/vector-index.db`
4. Indexing triggers: on startup, on each search (if new content exists), and every 15 minutes
All data stays local. Nothing is sent to any external service.
---
## Configuration
### Environment variables
| Variable | Default | Description |
|---|---|---|
| `VECTOR_MEMORY_PORT` | *(auto)* | HTTP port for the singleton server. A deterministic port is computed from your OS username (FNV-1a hash, range 31337–35432). Only set this if two users collide. |
| `VECTOR_MEMORY_IDLE_TIMEOUT` | `5` | Minutes of inactivity before the server shuts down. `0` or negative = never shut down. |
Set these in the `env` block of your config (only if needed):
```json
{
  "mcpServers": {
    "vector-memory": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "ghcp-cli-vector-memory-mcp"],
      "env": {
        "VECTOR_MEMORY_IDLE_TIMEOUT": "10"
      }
    }
  }
}
```
122
154
123
-
Each user gets their own singleton server, their own vector index, and their own session history. The idle timeout ensures servers shut down automatically when not in use.
155
+
### Multi-user setup
On a shared machine, each user's server runs on a unique auto-assigned port. No extra config needed — just use the same `mcp-config.json` entry above and each user gets their own singleton server, vector index, and session history.
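The idea behind the port derivation can be sketched as follows. This is a simplified illustration only — the real implementation also applies a rotation step to the hash, so actual port numbers will differ:

```javascript
// Sketch: deterministic per-user port via FNV-1a (simplified; no rotation step).
function fnv1a(str) {
  let hash = 0x811c9dc5;                      // FNV-1a 32-bit offset basis
  for (const ch of str) {
    hash ^= ch.codePointAt(0);                // xor in each character
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep 32 bits
  }
  return hash >>> 0;
}

function portForUser(username) {
  const base = 31337;
  const range = 35432 - base + 1;             // documented range: 31337-35432
  return base + (fnv1a(username) % range);
}
```

Because the hash depends only on the username, every launch by the same user lands on the same port, which is what lets the proxy find an already-running singleton.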
In the rare case of a port hash collision, the server detects it at startup and tells the affected user to set `VECTOR_MEMORY_PORT` manually.
---
- **index.js** — Thin STDIO MCP proxy. One per copilot instance. Checks if the HTTP server is running, launches it if not, then ferries tool calls over HTTP.
- **vector-memory-server.js** — Singleton HTTP server. Owns the embedding model (one copy in memory), SQLite vector DB, and background indexing. Port is auto-assigned per user via a deterministic hash of the username.
- **embed-worker.js** — Worker thread that loads the ONNX model and handles embedding inference off the main thread.
- **lib.js** — Pure logic extracted for testability: filtering, dedup, post-processing, process detection.
### Key design decisions
- **Singleton**: Only one server runs regardless of how many copilot instances are open. Saves ~200MB RAM per additional instance.
- **Race condition hardened**: `EADDRINUSE` detection with full diagnostics — distinguishes between healthy singleton, zombie process, and foreign port conflict.
- **No duplicates**: `UNIQUE` constraint + `INSERT OR IGNORE` + `isIndexing` guard prevents duplicate embeddings even under concurrent access.
- **Lazy init**: ONNX model only loads after winning the singleton race. Losers exit in about 500ms.
- **Idle shutdown**: Server exits after 5 minutes of inactivity (no requests and no new session content). The proxy re-launches it on next use.
- **Self-healing**: Detects and deletes corrupt/truncated model files, re-downloads automatically. Retries with backoff for Windows Defender file locks.
## Development
### Scripts
```bash
npm run lint    # ESLint on all source files
npm test        # 38 unit tests (node:test, zero external deps)
npm run check   # lint + test
```
### Running tests
```bash
npm test
```
With coverage:

```bash
node --test --experimental-test-coverage test.js
```
### File overview
| File | Purpose |
|---|---|
| `index.js` | STDIO MCP proxy — what copilot.exe launches via npx |
2. Download the embedding model (~34 MB from Hugging Face)
This can take **2–5 minutes** depending on your connection speed and whether native compilation is needed. Subsequent launches start in seconds.
### Port collision with another user
**Error:** `Port 31796 is owned by user "X" (expected "Y")`
Two usernames hashed to the same port (rare). One user needs to set a manual override:
```json
{
  "mcpServers": {
    "vector-memory": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "ghcp-cli-vector-memory-mcp"],
      "env": {
        "VECTOR_MEMORY_PORT": "31338"
      }
    }
  }
}
```
### Port occupied by another service
**Error:** `Vector memory server failed to start — port XXXXX may be in use by another service`
The next copilot launch will start the updated server automatically.

**Error:** `Session store not found`
The file `~/.copilot/session-store.db` doesn't exist yet. This is normal on a fresh Copilot CLI install — it creates the file after your first conversation. Use Copilot for a bit, then try again.