Access the full OpenRouter catalog — chat, TTS, audio, image-generation, and embedding models — directly inside Open WebUI, with provider routing, reasoning tokens, streaming, fallbacks, and cache control out of the box.
Models from OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek and more — each with its provider icon.
<think> blocks streamed in real time with configurable effort levels.
Sort, prefer, exclude and require parameters across providers per request.
- Features
- Requirements
- Installation
- Usage
- Configuration
- Architecture
- Development
- Contributing
- Troubleshooting
- FAQ
- License
- Manifold pipe — exposes the full OpenRouter catalog (chat, TTS, audio, image, embeddings) as native Open WebUI models in the model selector. Configurable via
OUTPUT_MODALITIESandMODEL_CATEGORY. - Web search plugin — attach OpenRouter's
webplugin to any model with domain allow/deny lists, custom search prompt, and result-count limits. - Variant routing — surface virtual
:nitro/:exacto/:thinking/:online/:free/:extendedmodel entries that route to OpenRouter's specialized profiles. - Service tier hint — forward OpenAI-style
flex/priority/scaletiers to compatible providers. - Generation auditability — optional generation ID footer maps each response to OpenRouter's
/generation?id=activity API. - Cached-input savings — surface cached vs. non-cached prompt tokens in the cost footer (Anthropic prompt caching, OpenAI implicit caching, Gemini context caching).
- Deprecation visibility — models with an
expiration_dateare tagged with ⚠ in the selector (or hidden viaHIDE_DEPRECATED_MODELS). - Provider routing — sort by
price,throughput, orlatency; prefer or exclude specific providers; enforcerequire_parameters. - Reasoning tokens —
<think>blocks streamed in real time with configurable effort (low,medium,high). - Streaming — full SSE streaming with mid-stream error handling and automatic
<think>closure on error. - Model fallbacks — automatic failover to one or more backup models via
FALLBACK_MODELS. - Middle-out compression — fits long prompts within context windows (
transforms: ["middle-out"]). - Cache control — Anthropic-style
cache_controlinjection on the longest message chunk. - Citations —
[n]references from web-search-enabled models are converted to markdown links. - Provider icons — 13 hardcoded fast-path logos plus auto-discovered icons for ~20 more providers (xAI, Inflection, NVIDIA, Arcee, Morph, Cerebras, …) lazy-loaded from OpenRouter's provider registry, all synced directly into Open WebUI's model database.
- Retry logic — exponential backoff with jitter on timeout and connection errors.
- FREE_ONLY mode — filter to show only free-tier models (
:freesuffix or0/0pricing). - Pre-flight validation — invalid API keys are caught at model-fetch time, not after sending a message.
- Open WebUI ≥ 0.4.0 running locally or in Docker.
- OpenRouter API key — free account, key starts with
sk-or-. - Python ≥ 3.10 (managed by Open WebUI; no separate install needed for the pipe).
Search for "OpenRouter Pipe" on openwebui.com and install it directly from the community hub — no copy-paste required.
- Copy the full content of
openrouter_pipe.py. - In Open WebUI, navigate to Admin Panel → Functions.
- Click + Add Function (or Import).
- Paste the code and save.
- Enable the function using the toggle.
- Click the ⚙️ Valves icon and enter your
OPENROUTER_API_KEY.
All OpenRouter models will appear in the model selector immediately.
Note: You can also set
OPENROUTER_API_KEYas a server environment variable instead of entering it in Valves.
git clone https://github.com/sena-labs/Open-WebUI-Pipe-OpenRouter.git
cd Open-WebUI-Pipe-OpenRouter
pip install -r requirements.txt
python test_pipe.py # 431 tests — verify everything is greenAll behavior is controlled through Valves in the Open WebUI admin panel. Every valve accepts an environment variable fallback (see Configuration).
| Goal | Valves to set |
|---|---|
| Show only OpenAI and Anthropic models | MODEL_PROVIDERS = openai,anthropic |
| Show only free models | FREE_ONLY = true |
| Use DeepSeek for reasoning | select deepseek/deepseek-r1, INCLUDE_REASONING = true |
| Route cheapest provider first | PROVIDER_SORT = price |
| Add a fallback model | FALLBACK_MODELS = anthropic/claude-3.5-sonnet |
When INCLUDE_REASONING is enabled (default), the pipe requests reasoning tokens from models that
support them. The internal reasoning appears inside <think>…</think> blocks before the main
response.
Set REASONING_EFFORT to low, medium, or high to control how much compute the model
allocates to reasoning. Leave it empty to let the model decide.
Models with web-search capabilities return citation annotations. The pipe automatically converts
[1], [2] references to [[1]](url) markdown links and appends a numbered Citations:
section at the end of the response.
Every valve accepts an environment variable fallback. The table below lists both.
| Valve | Env Var | Default | Description |
|---|---|---|---|
OPENROUTER_API_KEY |
OPENROUTER_API_KEY |
"" |
Your OpenRouter API key |
OPENROUTER_BASE_URL |
OPENROUTER_BASE_URL |
https://openrouter.ai/api/v1 |
API endpoint |
| Valve | Env Var | Default | Description |
|---|---|---|---|
INCLUDE_REASONING |
OPENROUTER_INCLUDE_REASONING |
true |
Request reasoning tokens (<think> blocks) |
REASONING_EFFORT |
OPENROUTER_REASONING_EFFORT |
"" |
Effort level: minimal, low, medium, high, xhigh, or empty |
REASONING_SUMMARY_MODE |
OPENROUTER_REASONING_SUMMARY_MODE |
disabled |
Reasoning-summary verbosity: auto, concise, detailed, disabled |
REASONING_MAX_TOKENS |
OPENROUTER_REASONING_MAX_TOKENS |
0 |
Hard cap on reasoning tokens per response (0 disables the cap) |
ENABLE_ANTHROPIC_INTERLEAVED_THINKING |
OPENROUTER_ANTHROPIC_INTERLEAVED_THINKING |
true |
Auto-inject anthropic-beta: interleaved-thinking-2025-05-14 for anthropic/* models |
| Valve | Env Var | Default | Description |
|---|---|---|---|
MODEL_PREFIX |
— | None |
Custom prefix for model names (e.g. 🔥 ) |
MODEL_PROVIDERS |
OPENROUTER_MODEL_PROVIDERS |
ALL |
Provider filter (e.g. openai,anthropic). ALL means no filter |
INVERT_PROVIDER_LIST |
OPENROUTER_INVERT_PROVIDER_LIST |
false |
Treat MODEL_PROVIDERS as an exclusion list |
FREE_MODEL_FILTER |
OPENROUTER_FREE_MODEL_FILTER |
all |
Free-tier filter: all / only / exclude |
TOOL_CALLING_FILTER |
OPENROUTER_TOOL_CALLING_FILTER |
all |
Tool-capable filter (reads supported_parameters): all / only / exclude |
OUTPUT_MODALITIES |
OPENROUTER_OUTPUT_MODALITIES |
all |
Output modalities to fetch from /models. all (default) lists every model. Restrict with text, image, audio, embeddings, or a comma list (e.g. text,audio) |
MODEL_VARIANTS |
OPENROUTER_MODEL_VARIANTS |
"" |
Comma-separated base_id:tag entries that surface virtual variant models (e.g. openai/gpt-4o:nitro). Tags: free, thinking, online, nitro, exacto, extended |
MODEL_CATEGORY |
OPENROUTER_MODEL_CATEGORY |
"" |
Server-side category filter (?category=). Common values: programming, roleplay, marketing, science, legal, finance, health, academia |
HIDE_DEPRECATED_MODELS |
OPENROUTER_HIDE_DEPRECATED_MODELS |
false |
Hide models with a non-null expiration_date. When False, deprecated models are tagged ⚠ {name} (deprecated) |
ZDR_MODELS_ONLY |
OPENROUTER_ZDR_MODELS_ONLY |
false |
Catalog-side: hide models without a ZDR endpoint (reads /endpoints/zdr) |
| Valve | Env Var | Default | Description |
|---|---|---|---|
PROVIDER_SORT |
OPENROUTER_PROVIDER_SORT |
"" |
Sort: price, throughput, latency |
PROVIDER_ORDER |
OPENROUTER_PROVIDER_ORDER |
"" |
Preferred providers (comma-separated) |
PROVIDER_IGNORE |
OPENROUTER_PROVIDER_IGNORE |
"" |
Excluded providers (comma-separated) |
PROVIDER_ONLY |
OPENROUTER_PROVIDER_ONLY |
"" |
Provider allowlist (comma-separated). Merged with account-wide settings |
PROVIDER_QUANTIZATIONS |
OPENROUTER_PROVIDER_QUANTIZATIONS |
"" |
Allowed quantizations (comma-separated, e.g. bf16,fp8) |
PROVIDER_ALLOW_FALLBACKS |
OPENROUTER_PROVIDER_ALLOW_FALLBACKS |
true |
When False, OpenRouter fails fast on the primary/ordered provider instead of falling back |
PROVIDER_MAX_PRICE_PROMPT |
OPENROUTER_PROVIDER_MAX_PRICE_PROMPT |
"" |
Maximum prompt price (USD per 1M tokens) |
PROVIDER_MAX_PRICE_COMPLETION |
OPENROUTER_PROVIDER_MAX_PRICE_COMPLETION |
"" |
Maximum completion price (USD per 1M tokens) |
SERVICE_TIER |
OPENROUTER_SERVICE_TIER |
"" |
OpenAI-style service tier: auto, default, flex, priority, scale |
REQUIRE_PARAMETERS |
OPENROUTER_REQUIRE_PARAMETERS |
false |
Only use providers that support all request parameters |
DATA_COLLECTION |
OPENROUTER_DATA_COLLECTION |
allow |
Data policy: allow or deny |
ZDR_ENFORCE |
OPENROUTER_ZDR_ENFORCE |
false |
Send provider.zdr=true so OpenRouter routes only to ZDR endpoints (request fails if none available) |
| Valve | Env Var | Default | Description |
|---|---|---|---|
FALLBACK_MODELS |
OPENROUTER_FALLBACK_MODELS |
"" |
Fallback model IDs (comma-separated) |
ENABLE_MIDDLE_OUT |
OPENROUTER_ENABLE_MIDDLE_OUT |
false |
Middle-out compression for long prompts |
ENABLE_WEB_SEARCH |
OPENROUTER_ENABLE_WEB_SEARCH |
false |
Attach OpenRouter's web plugin so any model can ground answers in fresh web results |
WEB_SEARCH_MAX_RESULTS |
OPENROUTER_WEB_SEARCH_MAX_RESULTS |
5 |
Max search results passed to the model (1-20) |
WEB_SEARCH_PROMPT |
OPENROUTER_WEB_SEARCH_PROMPT |
"" |
Optional custom search prompt forwarded to the search engine |
WEB_SEARCH_INCLUDE_DOMAINS |
OPENROUTER_WEB_SEARCH_INCLUDE_DOMAINS |
"" |
Domain allowlist (supports wildcards & paths) |
WEB_SEARCH_EXCLUDE_DOMAINS |
OPENROUTER_WEB_SEARCH_EXCLUDE_DOMAINS |
"" |
Domain denylist |
ENABLE_CACHE_CONTROL |
OPENROUTER_ENABLE_CACHE_CONTROL |
false |
Inject Anthropic cache_control on the longest message |
ANTHROPIC_PROMPT_CACHE_TTL |
OPENROUTER_ANTHROPIC_PROMPT_CACHE_TTL |
5m |
TTL for the Anthropic ephemeral cache breakpoint: 5m or 1h |
SHOW_GENERATION_ID |
OPENROUTER_SHOW_GENERATION_ID |
false |
Append the OpenRouter generation ID to each response (for GET /generation?id= lookups) |
SYNC_PROVIDER_ICONS |
OPENROUTER_SYNC_ICONS |
true |
Sync provider icons into Open WebUI's model database |
| Valve | Env Var | Default | Description |
|---|---|---|---|
REQUEST_TIMEOUT |
OPENROUTER_REQUEST_TIMEOUT |
90 |
HTTP timeout in seconds |
MAX_RETRIES |
— | 2 |
Auto-retry count on transient errors |
HTTP_REFERER_OVERRIDE |
OPENROUTER_HTTP_REFERER |
"" |
Override the HTTP-Referer header sent to OpenRouter (must include scheme). Empty falls back to WEBUI_URL |
The pipe implements the Manifold pattern: one pipe entry point that surfaces multiple models.
| Layer | Files | Responsibility |
|---|---|---|
| Entry points | Pipe.pipes(), Pipe.pipe() |
Model listing and chat routing |
| Payload | _prepare_payload() |
Sanitize OWUI internals, inject routing and reasoning |
| Transport | _retryable_request() |
Retry wrapper with exponential backoff |
| Streaming | _stream_response() |
SSE parser, <think> management, mid-stream errors |
| Non-streaming | _non_stream_response() |
JSON response, body-level error detection |
| Enrichment | _inject_cache_control(), _insert_citations() |
Post-processing |
Open-WebUI-Pipe-OpenRouter/
├── openrouter_pipe.py # Main pipe source — install this in Open WebUI
├── function.json # Open WebUI community manifest
├── test_pipe.py # Unit test suite (431 tests)
├── integration_test.py # Live API integration tests (43 assertions)
├── TESTING.md # Manual pre-release checklist
├── SECURITY.md # Security policy
├── CONTRIBUTING.md # Contribution guidelines
├── CHANGELOG.md # Version history
├── LICENSE # MIT License
├── requirements.txt # Python dependencies
└── .github/
├── workflows/
│ └── tests.yml # CI pipeline (Python 3.10–3.13)
└── ISSUE_TEMPLATE/
├── bug_report.yml
└── feature_request.yml
The pipe strips these Open WebUI-internal keys before forwarding to OpenRouter:
_OWUI_INTERNAL_KEYS = {
"chat_id", "title", "task", "task_id", "features", "citations",
"metadata", "files", "tool_ids", "session_id", "message_id"
}It also removes user when sent as a dict (Open WebUI format) since OpenRouter expects a string.
python test_pipe.py # Unit tests (431 tests)
python integration_test.py # Live API tests (requires OPENROUTER_API_KEY)The unit test suite covers: valve defaults, payload preparation, streaming and non-streaming
responses, retry logic, citation injection, model listing, and pipe() routing.
Contributions are welcome. See CONTRIBUTING.md for the full playbook.
Set your API key in Admin Panel → Functions → OpenRouter Pipe → Valves (⚙️), or set the
OPENROUTER_API_KEY environment variable on the server and restart Open WebUI.
Your key is incorrect or malformed. Retrieve a valid key from
openrouter.ai/keys — it should start with sk-or-.
Wait a moment and retry. MAX_RETRIES only retries on network timeouts and connection failures —
HTTP 429 errors are returned immediately. Consider upgrading your OpenRouter plan for higher limits.
Add credits at openrouter.ai/credits.
Increase REQUEST_TIMEOUT in Valves (default: 90 seconds), or try a faster model. Some large
reasoning models can take over a minute for complex prompts.
- Verify your API key is valid (a single "error" model appears if it is not).
- If
MODEL_PROVIDERSis set, confirm the provider names are lowercase:openai,anthropic,google. - If
FREE_ONLYis enabled, some providers may have no free models — try disabling it. - Set
MODEL_PROVIDERS = ALLto show the full catalog.
Some models may be temporarily unavailable. Try a different model or check status.openrouter.ai.
Q: Does this work with Open WebUI's native tool calling?
A: Open WebUI manages tool calling in an iterative loop: when the pipe's response contains
tool calls, Open WebUI executes them, appends the results as role: "tool" messages, and
re-invokes the pipe with the updated thread. The pipe forwards the full message list to
OpenRouter on each invocation. Whether a model can generate tool calls depends on
OpenRouter's provider support for that model.
Q: Why does FREE_ONLY include models without a :free suffix?
A: Some models are listed as free on OpenRouter without carrying a :free suffix in their
ID. The pipe uses a two-pass check: first it looks for the :free suffix, then it falls
back to inspecting the pricing.prompt and pricing.completion fields returned by the
OpenRouter /models endpoint — if both are 0, the model is treated as free.
Q: Can I use multiple provider filters at once?
A: MODEL_PROVIDERS accepts a comma-separated list (e.g. openai,anthropic). Enable
INVERT_PROVIDER_LIST to turn it into an exclusion list instead.
Q: How do fallback models work?
A: FALLBACK_MODELS adds extra model IDs to the models array in the OpenRouter request. If the
primary model fails, OpenRouter automatically tries the next one. Non-streaming responses include
a "Responded by: model-id" attribution when a fallback handled the request.
Q: I selected a TTS / embeddings / image-generation model and got an error — why?
A: The pipe routes every request through OpenRouter's /chat/completions endpoint. Models that
only expose a non-chat endpoint (e.g. pure TTS models served via /audio/speech) return an
"endpoint not supported" error from OpenRouter. The pipe surfaces that error verbatim. Chat
completion models that output audio or images (e.g. openai/gpt-audio) work normally — their
audio transcript and generated images are rendered inline. To hide non-chat models from the
selector entirely, set OUTPUT_MODALITIES = text.
This project is licensed under the MIT License — see the LICENSE file for details.