# Using OSS Models via Ollama, vLLM, and other LLM servers

VibePod can connect agents to external LLM servers that expose OpenAI- or Anthropic-compatible APIs. This lets you run agents like Claude Code and Codex against open-source models served by [Ollama](https://ollama.com), [vLLM](https://docs.vllm.ai), or any compatible endpoint.

## Supported agents

| Agent | Env vars injected | CLI flags appended |
|-------|------------------|--------------------|
| claude | `ANTHROPIC_BASE_URL`, `ANTHROPIC_API_KEY` | `--model <model>` |
| codex | `CODEX_OSS_BASE_URL` | `--oss -m <model>` |

Other agents do not yet have an LLM mapping and receive no LLM configuration.

## Quick start with Ollama

### 1. Start Ollama and pull a model

```bash
# The Ollama server must be running before you pull (for example via `ollama serve`).
ollama pull qwen3:14b
```

### 2. Configure VibePod

Add the following to your global or project config:

```yaml
# ~/.config/vibepod/config.yaml
llm:
  enabled: true
  base_url: "http://host.docker.internal:11434"
  api_key: "ollama"
  model: "qwen3:14b"
```

!!! note
    Use `host.docker.internal` (not `localhost`) so the Docker container can reach Ollama on the host machine.

### 3. Run an agent

```bash
vp run claude
# Starts Claude Code with:
#   ANTHROPIC_BASE_URL=http://host.docker.internal:11434
#   ANTHROPIC_API_KEY=ollama
#   claude --model qwen3:14b

vp run codex
# Starts Codex with:
#   CODEX_OSS_BASE_URL=http://host.docker.internal:11434
#   codex --oss -m qwen3:14b
```

## Using environment variables

You can also set LLM options at runtime through `VP_LLM_*` environment variables, without editing config files.

**Claude Code with a remote Ollama server:**

```bash
VP_LLM_ENABLED=true VP_LLM_MODEL=qwen3.5:9b VP_LLM_BASE_URL=https://ollama.example.com vp run claude
```

**Codex with a remote Ollama server (note the `/v1` suffix):**

```bash
VP_LLM_ENABLED=true VP_LLM_MODEL=qwen3.5:9b VP_LLM_BASE_URL=https://ollama.example.com/v1 vp run codex
```

**Local Ollama with an API key:**

```bash
VP_LLM_ENABLED=true VP_LLM_BASE_URL=http://host.docker.internal:11434 VP_LLM_API_KEY=ollama VP_LLM_MODEL=qwen3:14b vp run claude
```

!!! note
    Claude Code uses the Anthropic-compatible endpoint (no `/v1` suffix), while Codex uses the OpenAI-compatible endpoint (with `/v1` suffix). Adjust `VP_LLM_BASE_URL` accordingly, or use per-agent overrides if you need both agents to work from the same config.

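The `/v1` rule from the note above can be captured in a small shell function. This is only a sketch: `llm_base_url` is a hypothetical helper, not part of VibePod, and it assumes you pass the server root without a `/v1` suffix.

```bash
# Hypothetical helper (not part of VibePod): derive the base URL each agent
# expects from a single server root, following the /v1 rule described above.
llm_base_url() {
  local agent="$1"
  local root="${2%/}"   # drop one trailing slash, if present
  case "$agent" in
    claude) printf '%s\n' "$root" ;;       # Anthropic-compatible: no /v1
    codex)  printf '%s\n' "$root/v1" ;;    # OpenAI-compatible: append /v1
    *) printf 'unsupported agent: %s\n' "$agent" >&2; return 1 ;;
  esac
}

# Example use (commented out so sourcing this file does not start an agent):
# VP_LLM_ENABLED=true VP_LLM_MODEL=qwen3.5:9b \
#   VP_LLM_BASE_URL="$(llm_base_url codex https://ollama.example.com)" vp run codex
```

Keeping the rule in one place means the same server root can be reused for both agents without hand-editing the URL each time.
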
See [Configuration > Environment variables](configuration.md#environment-variables) for the full list.

## Using vLLM or other OpenAI-compatible servers

Point `base_url` at any server that speaks the OpenAI or Anthropic API. The example uses the `/v1` path of an OpenAI-compatible server, which is the form Codex expects; for Claude Code, use the server's Anthropic-compatible base URL without `/v1`:

```yaml
llm:
  enabled: true
  base_url: "http://my-vllm-server:8000/v1"
  api_key: "my-api-key"
  model: "meta-llama/Llama-3-8B-Instruct"
```

## Per-agent overrides

If you need different LLM settings for a specific agent, use the per-agent `env` config. Per-agent env vars take precedence over the `llm` section:

```yaml
llm:
  enabled: true
  base_url: "http://host.docker.internal:11434"
  api_key: "ollama"
  model: "qwen3:14b"

agents:
  claude:
    env:
      ANTHROPIC_BASE_URL: "http://different-server:11434"
```

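The same mechanism could let both agents share one config despite the `/v1` difference. The fragment below is a sketch, not a tested configuration: it assumes the `agents.<name>.env` override shown for `claude` applies to `codex` in the same way, and that setting `CODEX_OSS_BASE_URL` there replaces the injected value.

```yaml
# Sketch only: assumes per-agent env overrides work for codex as they do for claude.
llm:
  enabled: true
  base_url: "http://host.docker.internal:11434"   # no /v1: the form Claude Code expects
  api_key: "ollama"
  model: "qwen3:14b"

agents:
  codex:
    env:
      # Assumed override: gives Codex the OpenAI-compatible path with /v1
      CODEX_OSS_BASE_URL: "http://host.docker.internal:11434/v1"
```

With this layout, `vp run claude` would use the `llm` values as-is, while `vp run codex` would pick up the `/v1` URL from its own `env` block.
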
## Disabling

To turn off LLM injection without removing the config:

```yaml
llm:
  enabled: false
```

Or at runtime:

```bash
VP_LLM_ENABLED=false vp run claude
```