| 1 | +--- |
| 2 | +title: Run vMCP locally with the CLI |
| 3 | +description: |
| 4 | + Run Virtual MCP Server locally with the thv vmcp command to aggregate a |
| 5 | + ToolHive group without Kubernetes. |
| 6 | +--- |
| 7 | + |
| 8 | +Virtual MCP Server (vMCP) is usually deployed on Kubernetes through the |
| 9 | +`VirtualMCPServer` custom resource, but you can also run it locally from the |
| 10 | +ToolHive CLI. The `thv vmcp` subcommands aggregate the MCP servers in a local |
| 11 | +[ToolHive group](../concepts/groups.mdx) behind a single endpoint, without a |
| 12 | +cluster or operator. |
| 13 | + |
| 14 | +Use this mode for local development, quick evaluation, or any case where you |
| 15 | +want vMCP's aggregation, tool routing, and optimizer capabilities without the |
| 16 | +operational overhead of Kubernetes. |
| 17 | + |
| 18 | +## When to use the local CLI |
| 19 | + |
| 20 | +- You are developing or evaluating vMCP on your workstation. |
| 21 | +- You run MCP servers locally with `thv run` and want to expose them through a |
| 22 | + single endpoint. |
| 23 | +- You want to use the vMCP [optimizer](./optimizer.mdx) to reduce token usage |
| 24 | + across a local group. |
| 25 | +- You don't yet need the clustered, operator-managed deployment model covered in |
| 26 | + the [Quickstart](./quickstart.mdx). |
| 27 | + |
| 28 | +For production and multi-tenant deployments, use the Kubernetes |
| 29 | +[`VirtualMCPServer`](./quickstart.mdx) resource instead. |
| 30 | + |
| 31 | +## Prerequisites |
| 32 | + |
| 33 | +- ToolHive CLI v0.24.0 or later. Check with `thv version`. |
| 34 | +- A container runtime (Docker, Podman, or OrbStack) available to ToolHive. |
| 35 | +- A ToolHive group with one or more running MCP servers. To create one: |
| 36 | + |
| 37 | + ```bash |
| 38 | + thv group create my-group |
| 39 | + thv run --group my-group fetch |
| 40 | + thv run --group my-group github |
| 41 | + ``` |
| 42 | + |
| 43 | + See [Manage ToolHive groups](../guides-cli/group-management.mdx) for details. |
| 44 | + |
| 45 | +## Subcommands at a glance |
| 46 | + |
| 47 | +The `thv vmcp` command has three subcommands: |
| 48 | + |
| 49 | +| Subcommand | Purpose | |
| 50 | +| ------------------- | ----------------------------------------------------- | |
| 51 | +| `thv vmcp init` | Generate a starter YAML config from a running group | |
| 52 | +| `thv vmcp validate` | Validate a YAML config for syntax and semantic errors | |
| 53 | +| `thv vmcp serve` | Start the aggregated vMCP server | |
| 54 | + |
| 55 | +There are two ways to run the server: |
| 56 | + |
| 57 | +- **Quick mode** uses `thv vmcp serve --group <name>` to generate an in-memory |
| 58 | + config from a group. No YAML file is required. |
| 59 | +- **Config-file mode** uses `thv vmcp init` → edit → `thv vmcp validate` → |
| 60 | + `thv vmcp serve --config vmcp.yaml` for reproducible or customized setups. |
| 61 | + |
| 62 | +## Quick mode |
| 63 | + |
| 64 | +Quick mode is the fastest way to aggregate a local group. Run the server with |
| 65 | +just a group name: |
| 66 | + |
| 67 | +```bash |
| 68 | +thv vmcp serve --group my-group |
| 69 | +``` |
| 70 | + |
| 71 | +By default, the server binds to `127.0.0.1:4483`. Point your MCP client at |
| 72 | +`http://127.0.0.1:4483` to access all tools from the group through a single |
| 73 | +endpoint. |
| 74 | + |
| 75 | +:::note[Loopback-only] |
| 76 | + |
| 77 | +Quick mode always uses anonymous authentication, so `thv vmcp serve --group` |
| 78 | +only accepts loopback bind addresses (`127.0.0.1`, `::1`, `localhost`, or the |
| 79 | +default empty value). Binding to a non-loopback interface is rejected to avoid |
| 80 | +exposing an unauthenticated server on the network. To bind to a non-loopback |
| 81 | +address, use [config-file mode](#config-file-mode) and configure client |
| 82 | +authentication. |
| 83 | + |
| 84 | +::: |
| 85 | + |
| 86 | +### Enable the optimizer in quick mode |
| 87 | + |
| 88 | +Add `--optimizer` or `--optimizer-embedding` to replace the full tool list with |
| 89 | +`find_tool` and `call_tool` primitives: |
| 90 | + |
| 91 | +```bash |
| 92 | +# Tier 1: FTS5 keyword search (no external container) |
| 93 | +thv vmcp serve --group my-group --optimizer |
| 94 | + |
| 95 | +# Tier 2: FTS5 + semantic search using a managed TEI container |
| 96 | +thv vmcp serve --group my-group --optimizer-embedding |
| 97 | +``` |
| 98 | + |
| 99 | +See [Optimizer tiers](#optimizer-tiers) for the full comparison. |
| 100 | + |
| 101 | +## Config-file mode |
| 102 | + |
| 103 | +Config-file mode is recommended when you need to customize backend settings, |
| 104 | +authentication, or aggregation rules, or when you want a reproducible setup |
| 105 | +checked into version control. |
| 106 | + |
| 107 | +### Step 1: Generate a starter config |
| 108 | + |
| 109 | +`thv vmcp init` discovers running workloads in a group and writes a starter YAML |
| 110 | +file with one backend entry per accessible workload: |
| 111 | + |
| 112 | +```bash |
| 113 | +thv vmcp init --group my-group --output vmcp.yaml |
| 114 | +``` |
| 115 | + |
| 116 | +Omit `--output` to write the generated YAML to standard output instead. |
| 117 | + |
| 118 | +The generated file includes inline comments describing each section. A minimal |
| 119 | +example looks like this: |
| 120 | + |
| 121 | +```yaml title="vmcp.yaml" |
| 122 | +# Generated by `thv vmcp init`. Review and customize before use. |
| 123 | + |
| 124 | +name: my-group-vmcp |
| 125 | +groupRef: my-group |
| 126 | + |
| 127 | +incomingAuth: |
| 128 | + type: anonymous |
| 129 | + |
| 130 | +outgoingAuth: |
| 131 | + source: inline |
| 132 | + |
| 133 | +aggregation: |
| 134 | + conflictResolution: prefix |
| 135 | + conflictResolutionConfig: |
| 136 | + prefixFormat: '{workload}_' |
| 137 | + |
| 138 | +backends: |
| 139 | + - name: fetch |
| 140 | + url: http://127.0.0.1:12345/sse |
| 141 | + transport: sse |
| 142 | + - name: github |
| 143 | + url: http://127.0.0.1:12346/mcp |
| 144 | + transport: streamable-http |
| 145 | +``` |
| 146 | + |
| 147 | +### Step 2: Review and edit |
| 148 | + |
| 149 | +Customize the generated config. Common edits include: |
| 150 | + |
| 151 | +- Changing `incomingAuth` from `anonymous` to `oidc` to require authenticated |
| 152 | + clients. |
| 153 | +- Adding tool filters, renames, or overrides under each backend. |
| 154 | +- Configuring the [optimizer](./optimizer.mdx) under an `optimizer` section. |
| 155 | + |
| 156 | +See [Configure vMCP](./configuration.mdx) for the full schema. |
| 157 | + |
| 158 | +### Step 3: Validate the config |
| 159 | + |
| 160 | +```bash |
| 161 | +thv vmcp validate --config vmcp.yaml |
| 162 | +``` |
| 163 | + |
| 164 | +Validation checks YAML syntax, required fields, middleware configuration, and |
| 165 | +backend settings. It exits `0` on success and non-zero with a descriptive |
| 166 | +message otherwise. |
| 167 | + |
| 168 | +### Step 4: Start the server |
| 169 | + |
| 170 | +```bash |
| 171 | +thv vmcp serve --config vmcp.yaml |
| 172 | +``` |
| 173 | + |
| 174 | +When both `--config` and `--group` are set, `--config` takes precedence. |
| 175 | + |
| 176 | +## Optimizer tiers |
| 177 | + |
| 178 | +`thv vmcp serve` supports four tiers of tool optimization. Tier 0 is the |
| 179 | +default; tiers 1 through 3 replace the full backend tool list with `find_tool` |
| 180 | +and `call_tool` primitives that search the aggregated tool set. Tier 1 uses FTS5 |
| 181 | +keyword search only; tiers 2 and 3 add semantic embeddings on top for hybrid |
| 182 | +search. |
| 183 | + |
| 184 | +| Tier | Flag or setting | Search | External service | |
| 185 | +| ---- | ------------------------------------------- | ------------------------------- | ----------------------------- | |
| 186 | +| 0 | (none) | None - all tools passed through | None | |
| 187 | +| 1 | `--optimizer` | FTS5 keyword (in-process) | None | |
| 188 | +| 2 | `--optimizer-embedding` | FTS5 + TEI semantic | Managed TEI container | |
| 189 | +| 3 | `optimizer.embeddingService` in config YAML | FTS5 + external embedding | User-managed embedding server | |
| 190 | + |
| 191 | +Tier 2 implies Tier 1: `--optimizer-embedding` also enables the keyword index. |
| 192 | +For Tier 2, ToolHive starts and stops a Hugging Face Text Embeddings Inference |
| 193 | +(TEI) container named `thv-embedding-<hash>` automatically. Customize the model |
| 194 | +and image with `--embedding-model` and `--embedding-image`. |
| 195 | + |
| 196 | +For the conceptual background and tuning parameters, see |
| 197 | +[Optimize tool discovery](./optimizer.mdx) and |
| 198 | +[Tool optimization](../concepts/tool-optimization.mdx). |
| 199 | + |
| 200 | +## Enable audit logging |
| 201 | + |
| 202 | +Add `--enable-audit` to `thv vmcp serve` to turn on audit logging with default |
| 203 | +settings when the loaded config doesn't already define an audit section: |
| 204 | + |
| 205 | +```bash |
| 206 | +thv vmcp serve --group my-group --enable-audit |
| 207 | +``` |
| 208 | + |
| 209 | +For audit configuration options, see [Audit logging](./audit-logging.mdx). |
| 210 | + |
| 211 | +## Command reference |
| 212 | + |
| 213 | +All `thv vmcp` flags, with their defaults: |
| 214 | + |
| 215 | +### `thv vmcp serve` |
| 216 | + |
| 217 | +| Flag | Default | Description | |
| 218 | +| ----------------------- | ---------------------------------------------------------- | -------------------------------------------------------------------- | |
| 219 | +| `--config`, `-c` | (empty) | Path to a vMCP configuration file | |
| 220 | +| `--group` | (empty) | ToolHive group name for quick mode (used when `--config` is not set) | |
| 221 | +| `--host` | `127.0.0.1` | Bind address (quick mode requires a loopback address) | |
| 222 | +| `--port` | `4483` | TCP port to listen on | |
| 223 | +| `--enable-audit` | `false` | Enable audit logging with default configuration | |
| 224 | +| `--optimizer` | `false` | Enable Tier 1 FTS5 keyword optimizer | |
| 225 | +| `--optimizer-embedding` | `false` | Enable Tier 2 semantic optimizer (implies `--optimizer`) | |
| 226 | +| `--embedding-model` | `BAAI/bge-small-en-v1.5` | Hugging Face model name for the managed TEI container | |
| 227 | +| `--embedding-image` | `ghcr.io/huggingface/text-embeddings-inference:cpu-latest` | TEI container image | |
| 228 | + |
| 229 | +### `thv vmcp init` |
| 230 | + |
| 231 | +| Flag | Default | Description | |
| 232 | +| ---------------- | ---------- | -------------------------------------------------- | |
| 233 | +| `--group`, `-g` | (required) | ToolHive group name whose workloads are discovered | |
| 234 | +| `--output`, `-o` | stdout | Output file path for the generated config | |
| 235 | +| `--config`, `-c` | stdout | Alias for `--output` | |
| 236 | + |
| 237 | +### `thv vmcp validate` |
| 238 | + |
| 239 | +| Flag | Default | Description | |
| 240 | +| ---------------- | ---------- | ----------------------------------------------- | |
| 241 | +| `--config`, `-c` | (required) | Path to the vMCP configuration file to validate | |
| 242 | + |
| 243 | +For full CLI help, run `thv vmcp --help` or see |
| 244 | +[`thv vmcp`](../reference/cli/thv_vmcp.md) in the reference. |
| 245 | + |
| 246 | +## Compared to the Kubernetes deployment |
| 247 | + |
| 248 | +| Aspect | Local CLI (`thv vmcp`) | Kubernetes (`VirtualMCPServer` CRD) | |
| 249 | +| ----------------- | ---------------------------------------------- | ----------------------------------------- | |
| 250 | +| Runtime | Foreground process | Pod managed by the operator | |
| 251 | +| Configuration | CLI flags or local YAML file | `VirtualMCPServer` custom resource | |
| 252 | +| Backend discovery | Reads ToolHive groups on the local machine | Reads `MCPGroup` resources in the cluster | |
| 253 | +| Authentication | Anonymous in quick mode; configurable in config-file mode | Full OIDC integration via CRD fields | |
| 254 | +| Lifecycle | Tied to the terminal session | Managed declaratively, survives restarts | |
| 255 | +| Embedding server | Managed TEI container (Tier 2) | `EmbeddingServer` custom resource | |
| 256 | + |
| 257 | +The underlying aggregation, tool routing, and optimizer logic are the same. Use |
| 258 | +the local CLI for development and single-user workflows; use the Kubernetes |
| 259 | +deployment for shared, production, or multi-user environments. |
| 260 | + |
| 261 | +## Next steps |
| 262 | + |
| 263 | +- [Configure vMCP](./configuration.mdx) to customize backends, authentication, |
| 264 | + and aggregation rules. |
| 265 | +- [Optimize tool discovery](./optimizer.mdx) to tune `find_tool` and `call_tool` |
| 266 | + for large toolsets. |
| 267 | +- [Deploy vMCP on Kubernetes](./quickstart.mdx) when you're ready to move to a |
| 268 | + production-grade deployment. |
| 269 | + |
| 270 | +## Related information |
| 271 | + |
| 272 | +- [Understanding Virtual MCP Server](../concepts/vmcp.mdx) |
| 273 | +- [Manage ToolHive groups](../guides-cli/group-management.mdx) |
| 274 | +- [`thv vmcp` CLI reference](../reference/cli/thv_vmcp.md) |
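
Steps 3 and 4 of config-file mode in the diff above can be chained so the server only starts when validation passes. The following is a minimal sketch, not part of the change itself. It assumes `thv` is on `PATH` and relies only on the documented behavior that `thv vmcp validate` exits `0` on success and non-zero otherwise. The function name `validate_then_serve` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: validate a vMCP config, then serve it if validation passes.
# `validate_then_serve` is a hypothetical helper name, not a ToolHive command.
validate_then_serve() {
  # Path to the config file; defaults to vmcp.yaml in the current directory.
  local config="${1:-vmcp.yaml}"

  # `thv vmcp validate` exits 0 on success and non-zero otherwise.
  if ! thv vmcp validate --config "$config"; then
    echo "validation failed for $config; not starting the server" >&2
    return 1
  fi

  # Starts the aggregated server using the validated config.
  thv vmcp serve --config "$config"
}
```

Calling `validate_then_serve vmcp.yaml` would run both steps in order. Keeping validation as a separate gate means a broken edit to the YAML file surfaces as a clear error before the server tries to load it.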