Commit dfa1f98

Add local MCP server for the docs corpus
Add tools/mcp: a local, stdio Model Context Protocol server that exposes the OpenVox documentation to MCP-aware tools (Claude Code, Cursor, Claude Desktop) so an assistant can search and read the docs in-workflow. It is the self-hosted counterpart to a hosted "Ask AI" widget: no third party, no API key, no quota, and queries stay on the machine.

The corpus is the site's llms.txt / llms-full.txt files, which already ship as machine-readable plain text with stable per-project and per-doc delimiters, so the server parses them into one document per page without scraping HTML.

- corpus.py: fetch llms.txt / llms-full.txt from the live site (default) with on-disk caching and conditional requests, falling back to the last good copy when offline; OPENVOX_DOCS_SOURCE reads a local _site build instead. Parses both into structured Doc records.
- search.py: BM25 keyword search (rank-bm25) over the parsed bodies, with the title weighted and a query-centered snippet.
- server.py: FastMCP app exposing list_projects, list_docs, search_docs, get_doc, and refresh_corpus.

Exclude tools/ from the Jekyll build so the server isn't published with the site.

Signed-off-by: Michael Harp <mike@mikeharp.com>
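
The caching behaviour described for corpus.py (conditional requests, reuse on 304, last good copy when offline) can be sketched roughly as follows. This is not the code from the commit: `load_cached`, `urllib_fetch`, the `.meta.json` sidecar file, and the use of ETags are all assumptions made for illustration, and the real module may organise its cache differently.

```python
import json
import urllib.error
import urllib.request
from pathlib import Path


def urllib_fetch(url, etag):
    """Fetch url, sending If-None-Match when an ETag is known.

    Returns (status, body, etag). urllib raises HTTPError for a 304,
    so that case is translated into a normal return value here.
    """
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = response.read().decode("utf-8")
            return response.status, body, response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return 304, None, etag
        raise


def load_cached(url, cache_dir, fetch=urllib_fetch):
    """Return the text at url, preferring a fresh copy and falling back to the cache."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    name = url.rsplit("/", 1)[-1]
    body_path = cache_dir / name
    meta_path = cache_dir / (name + ".meta.json")  # hypothetical sidecar file

    etag = None
    if body_path.exists() and meta_path.exists():
        etag = json.loads(meta_path.read_text(encoding="utf-8")).get("etag")

    try:
        status, body, new_etag = fetch(url, etag)
    except OSError:
        # Offline or server error: serve the last good copy if there is one.
        if body_path.exists():
            return body_path.read_text(encoding="utf-8")
        raise

    if status == 200 and body is not None:
        body_path.write_text(body, encoding="utf-8")
        meta_path.write_text(json.dumps({"etag": new_etag}), encoding="utf-8")
        return body
    if body_path.exists():
        # 304 Not Modified (or any other status): reuse the cached copy.
        return body_path.read_text(encoding="utf-8")
    raise RuntimeError(f"unexpected status {status} for {url} and no cached copy")
```

Passing `fetch` as a parameter keeps the cache logic testable without network access, which matches the spirit of the network-free tests mentioned in the README.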
1 parent a4a95d6 commit dfa1f98

17 files changed

Lines changed: 1381 additions & 0 deletions

.github/workflows/mcp.yml

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
---
name: MCP server

on:
  pull_request:
    paths:
      - 'tools/mcp/**'
  workflow_call: {}

permissions:
  contents: read

jobs:
  test:
    name: Python tests (${{ matrix.python-version }})
    runs-on: ubuntu-24.04
    strategy:
      fail-fast: false
      matrix:
        python-version: ['3.10', '3.13']
    defaults:
      run:
        working-directory: tools/mcp
    steps:
      - name: Checkout current PR
        uses: actions/checkout@v7
      - name: Install Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: pip
          cache-dependency-path: tools/mcp/pyproject.toml
      - name: Install with dev extras
        run: pip install -e '.[dev]'
      - name: Run pytest
        run: pytest
      - name: Run the stdio smoke test
        run: python smoke_test.py

_config.yml

Lines changed: 1 addition & 0 deletions
@@ -173,6 +173,7 @@ exclude:
 - Rakefile
 - README_WRITING.markdown
 - README.markdown
+- tools
 - util
 - vendor
 - WORKFLOW.md

tools/mcp/.gitignore

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
.venv/
.sample-corpus/
__pycache__/
.pytest_cache/
*.egg-info/
dist/
build/
.coverage
htmlcov/

tools/mcp/README.md

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
# OpenVox Docs MCP server

[![MCP server](https://github.com/OpenVoxProject/openvox-docs/actions/workflows/mcp.yml/badge.svg)](https://github.com/OpenVoxProject/openvox-docs/actions/workflows/mcp.yml)

A local [Model Context Protocol](https://modelcontextprotocol.io/) server that
exposes the OpenVox documentation to MCP-aware tools (Claude Code, Cursor, Claude
Desktop, …) so an assistant can search and read the docs in-workflow.

It is the local, self-hosted counterpart to a hosted "Ask AI" widget: no third
party, no API key, no usage quota, and queries never leave your machine.

![Claude Code answering a question by calling the openvox-docs MCP server](demo/demo.gif)

> Recorded with [VHS](https://github.com/charmbracelet/vhs) from
> [`demo/demo.tape`](demo/demo.tape); regenerate with `vhs tools/mcp/demo/demo.tape`.

## How it works

The server's corpus is the two machine-readable files the docs site publishes
(see the repository README's "LLM-friendly documentation files" section):

- [`llms.txt`](https://docs.openvoxproject.org/llms.txt) — the index of pages,
  grouped by project.
- [`llms-full.txt`](https://docs.openvoxproject.org/llms-full.txt) — the full
  text of every current ("latest") page, including the generated reference pages
  (configuration, function, type, man pages).

On first use it fetches both from `https://docs.openvoxproject.org`, caches them
under `~/.cache/openvox-docs-mcp/` with conditional requests (so refreshes are
cheap and it still works offline from the last copy), parses them into one
document per page, and builds a BM25 keyword index.
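
To make the "BM25 keyword index" step concrete, here is a minimal, dependency-free sketch of Okapi BM25 ranking with a weighted title. It is only an illustration: the commit message says the server uses the `rank-bm25` package, whose IDF formula and defaults may differ, and `TinyBM25`, `search`, and the `TITLE_WEIGHT` value are hypothetical names and choices, not taken from the server's code.

```python
import math
import re
from collections import Counter

TITLE_WEIGHT = 3  # assumption: the title counts as if it appeared three times


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyBM25:
    """Okapi BM25 over pre-tokenized documents (stand-in for rank-bm25)."""

    def __init__(self, corpus, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.freqs = [Counter(tokens) for tokens in corpus]
        self.lengths = [len(tokens) for tokens in corpus]
        self.avgdl = sum(self.lengths) / len(corpus) if corpus else 0.0
        doc_freq = Counter(term for tokens in corpus for term in set(tokens))
        n = len(corpus)
        # A common IDF variant that stays positive for very frequent terms.
        self.idf = {
            term: math.log(1 + (n - df + 0.5) / (df + 0.5))
            for term, df in doc_freq.items()
        }

    def score(self, query_tokens, index):
        freqs, length = self.freqs[index], self.lengths[index]
        total = 0.0
        for term in query_tokens:
            f = freqs.get(term, 0)
            if f:
                # Term frequency saturates (k1) and is length-normalized (b).
                norm = f + self.k1 * (1 - self.b + self.b * length / self.avgdl)
                total += self.idf[term] * f * (self.k1 + 1) / norm
        return total


def search(pages, query, limit=5):
    """Rank (title, body) pairs; pages that match no query term are dropped."""
    corpus = [tokenize(title) * TITLE_WEIGHT + tokenize(body) for title, body in pages]
    index = TinyBM25(corpus)
    query_tokens = tokenize(query)
    scored = [(index.score(query_tokens, i), pages[i][0]) for i in range(len(pages))]
    hits = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [(title, round(score, 3)) for score, title in hits[:limit]]
```

Repeating the title tokens is one simple way to weight the title; a field-aware scorer would be another option.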

## Tools

| Tool | Purpose |
|------|---------|
| `list_projects` | List the documentation projects (openvox, openvox-server, …). |
| `list_docs(project?)` | List pages as `{project, title, url}`, optionally filtered to one project. |
| `search_docs(query, project?, limit=5)` | BM25 keyword search; returns `{title, url, project, score, snippet}`. |
| `get_doc(ref, max_chars=40000)` | Full text of one page, resolved by URL, URL path, or exact title. The body is capped at `max_chars` (the single-page function/type references are very large); raise it to fetch more. |
| `refresh_corpus()` | Force a re-fetch of the llms files and rebuild the index. |
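
As a rough illustration of how `get_doc` could resolve a `ref` "by URL, URL path, or exact title" and cap the body at `max_chars`, consider the sketch below. The `Doc` fields, the lookup order, the `truncated` flag, and the error shape are assumptions for this example; the actual tool may return something different.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Doc:
    # Hypothetical record shape; the real Doc may carry other fields.
    project: str
    title: str
    url: str
    body: str


def resolve_doc(docs, ref):
    """Try an exact URL match, then a URL path match, then an exact title match."""
    ref = ref.strip()
    if not ref:
        return None
    for doc in docs:
        if doc.url == ref:
            return doc
    for doc in docs:
        if urlparse(doc.url).path.rstrip("/") == ref.rstrip("/"):
            return doc
    for doc in docs:
        if doc.title == ref:
            return doc
    return None


def get_doc(docs, ref, max_chars=40000):
    """Return one page as a dict, with the body cut off at max_chars."""
    doc = resolve_doc(docs, ref)
    if doc is None:
        return {"error": f"no document matches {ref!r}"}
    return {
        "project": doc.project,
        "title": doc.title,
        "url": doc.url,
        "body": doc.body[:max_chars],
        "truncated": len(doc.body) > max_chars,  # lets the caller raise max_chars
    }
```

Reporting whether the body was truncated gives the assistant a cue to call again with a larger `max_chars` for the very large reference pages.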

## Install

Requires Python 3.10+.

```console
cd tools/mcp
python -m venv .venv && . .venv/bin/activate
pip install -e .
```

## Register with an MCP client

All clients launch the same stdio entry point. Use the **absolute path** to the
installed executable — `tools/mcp/.venv/bin/openvox-docs-mcp` after the install
above (substitute your checkout path). Each client's config also accepts an `env`
block if you want to set the variables from [Configuration](#configuration)
(for example `OPENVOX_DOCS_SOURCE`).

### Claude Code

```console
claude mcp add openvox-docs -- /path/to/tools/mcp/.venv/bin/openvox-docs-mcp
```

Or add it to a project-scoped `.mcp.json`:

```json
{
  "mcpServers": {
    "openvox-docs": {
      "command": "/path/to/tools/mcp/.venv/bin/openvox-docs-mcp"
    }
  }
}
```
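
If you would rather generate the entry than hand-edit it, a small helper along these lines can produce the absolute path and the optional `env` block. This is a convenience sketch, not part of the server: `mcp_server_entry`, `merge_server`, and `render` are hypothetical names, and it assumes the `mcpServers` layout shown above.

```python
import json
import os


def mcp_server_entry(executable, env=None):
    """Build one server entry with an absolute command path and optional env block."""
    entry = {"command": os.path.abspath(os.path.expanduser(executable))}
    if env:
        entry["env"] = dict(env)
    return entry


def merge_server(config, name, entry):
    """Return a copy of config with the server added, keeping existing servers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def render(config):
    """Serialize the config the way it would be written to .mcp.json."""
    return json.dumps(config, indent=2) + "\n"
```

For example, `render(merge_server({}, "openvox-docs", mcp_server_entry("tools/mcp/.venv/bin/openvox-docs-mcp")))` yields JSON in the same shape as the block above, with the path resolved against the current directory.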

### Cursor

Add it to `~/.cursor/mcp.json` (global) or `.cursor/mcp.json` (project). Cursor
uses the same `mcpServers` schema as Claude Code:

```json
{
  "mcpServers": {
    "openvox-docs": {
      "command": "/path/to/tools/mcp/.venv/bin/openvox-docs-mcp"
    }
  }
}
```

### GitHub Copilot (VS Code)

Add it to `.vscode/mcp.json` in your workspace (or run **MCP: Add Server** from
the Command Palette). VS Code uses a top-level `servers` key, and stdio is the
default for a `command`:

```json
{
  "servers": {
    "openvox-docs": {
      "type": "stdio",
      "command": "/path/to/tools/mcp/.venv/bin/openvox-docs-mcp"
    }
  }
}
```

Or from the command line:

```console
code --add-mcp '{"name":"openvox-docs","command":"/path/to/tools/mcp/.venv/bin/openvox-docs-mcp"}'
```

### Codex CLI

Add it to `~/.codex/config.toml` (or a project-scoped `.codex/config.toml`).
Codex uses a `mcp_servers` TOML table:

```toml
[mcp_servers.openvox-docs]
command = "/path/to/tools/mcp/.venv/bin/openvox-docs-mcp"
```

## Testing

Install the dev extras, then run the suite:

```console
pip install -e '.[dev]'
pytest
```

`pytest` reports coverage and fails under 90% (configured in `pyproject.toml`).
The unit tests cover corpus parsing, remote fetch/caching (offline fallback and
304 reuse), BM25 ranking, and every tool. For an end-to-end check that launches
the server over stdio and exercises all five tools against a throwaway corpus
(network- and model-free):

```console
python smoke_test.py
```

## Configuration

The server is configured entirely through environment variables:

| Variable | Default | Effect |
|----------|---------|--------|
| `OPENVOX_DOCS_BASE_URL` | `https://docs.openvoxproject.org` | Site to fetch the llms files from. |
| `OPENVOX_DOCS_SOURCE` | _(unset)_ | Read `llms.txt` / `llms-full.txt` from a local directory instead of fetching. Point it at a Jekyll `_site/` build for offline or pre-release use. |

For example, to run against a local build of this repo:

```console
bundle exec jekyll build
OPENVOX_DOCS_SOURCE="$PWD/_site" openvox-docs-mcp
```

> **Note:** until the `llms.txt` / `llms-full.txt` feature is deployed to the live
> site, set `OPENVOX_DOCS_SOURCE` to a local `_site/` build — the live URLs will
> 404 otherwise.
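
The way the two variables interact can be sketched as a small resolver. This assumes that `OPENVOX_DOCS_SOURCE`, when set, takes precedence over `OPENVOX_DOCS_BASE_URL` (the table says it reads locally "instead of fetching"); `corpus_locations` is a hypothetical helper for illustration, not a function from the server.

```python
import os
from pathlib import Path

DEFAULT_BASE_URL = "https://docs.openvoxproject.org"
CORPUS_FILES = ("llms.txt", "llms-full.txt")


def corpus_locations(environ=None):
    """Map each corpus file to the local path or URL it would be read from."""
    env = os.environ if environ is None else environ
    source = env.get("OPENVOX_DOCS_SOURCE")
    if source:
        # Assumed precedence: a local directory wins over the base URL.
        root = Path(source).expanduser()
        return {name: str(root / name) for name in CORPUS_FILES}
    base = env.get("OPENVOX_DOCS_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return {name: f"{base}/{name}" for name in CORPUS_FILES}
```

Accepting an explicit `environ` mapping makes the behaviour easy to check without touching the real process environment.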

tools/mcp/demo/demo.gif

364 KB

tools/mcp/demo/demo.tape

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# VHS tape: the OpenVox docs MCP server used live from Claude Code.
#
# Prereqs:
# - vhs installed (https://github.com/charmbracelet/vhs)
# - the server registered with Claude Code (see ../README.md), e.g. pointing at
# a local corpus via OPENVOX_DOCS_SOURCE
# - the mcp__openvox-docs__* tools allowed (so the run isn't blocked on a
# permission prompt)
#
# Render from the repo root:
# vhs tools/mcp/demo/demo.tape

Output tools/mcp/demo/demo.gif

Require claude

Set Shell zsh
Set FontSize 16
Set Width 1200
Set Height 860
Set Padding 18
Set Theme "Catppuccin Mocha"
Set Framerate 12
Set TypingSpeed 50ms

# Quietly land in the repo root with a clean screen.
Hide
Type "cd /Users/michaelharp/projects/openvox-docs && clear"
Enter
Sleep 1s
Show

Type "claude"
Sleep 1s
Enter
Sleep 8s

Type@35ms "Using the openvox-docs MCP server's search_docs tool, find the OpenVoxDB PostgreSQL configuration page. Reply with only the page title and URL on one line."
Sleep 1s
Enter

# Wait for the model to call search_docs and write its (one-line) answer.
Sleep 35s

# Hold on the finished answer.
Sleep 3s
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
"""Local MCP server exposing the OpenVox documentation corpus."""

from .server import main, mcp

__all__ = ["main", "mcp"]
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
from .server import main

if __name__ == "__main__":
    main()
