You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: CHANGELOG.md
+56Lines changed: 56 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -7,6 +7,62 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
8
8
## [Unreleased]
9
9
10
+
## [1.6.0] — 2026-05-08
11
+
12
+
### Added
13
+
14
+
-**Web search plugin** — five new valves (`ENABLE_WEB_SEARCH`, `WEB_SEARCH_MAX_RESULTS`, `WEB_SEARCH_PROMPT`, `WEB_SEARCH_INCLUDE_DOMAINS`, `WEB_SEARCH_EXCLUDE_DOMAINS`) attach OpenRouter's `web` plugin to every request so any model can ground answers in fresh web results, with domain allow/deny lists and a custom search prompt
-**Deprecation handling** — models with a non-null `expiration_date` are tagged `⚠ {name} (deprecated)` in the selector. New `HIDE_DEPRECATED_MODELS` valve removes them entirely
17
+
-**`REASONING_MAX_TOKENS` valve** — hard cap on reasoning tokens per response (sent as `reasoning.max_tokens`) for budget control on deep-thinking models
18
+
-**Provider preferences extras** — `PROVIDER_ONLY` (allowlist), `PROVIDER_QUANTIZATIONS` (e.g. `bf16,fp8`), `PROVIDER_ALLOW_FALLBACKS`, `PROVIDER_MAX_PRICE_PROMPT`, `PROVIDER_MAX_PRICE_COMPLETION`. Translates to `provider.only/quantizations/allow_fallbacks/max_price` per the OpenRouter SDK schema
19
+
-**`SERVICE_TIER` valve** — OpenAI-style tier hint (`auto`/`default`/`flex`/`priority`/`scale`) forwarded to compatible providers
20
+
-**`SHOW_GENERATION_ID` valve** — captures the `id` field from chat-completion responses (works in both streaming and non-streaming modes) and appends `*Generation ID: gen-…*` so users can later call `GET /api/v1/generation?id={id}` for audit trails and per-request usage details
21
+
-**Cached prompt-token cost breakdown** — when the provider reports `prompt_tokens_details.cached_tokens` (Anthropic prompt caching, OpenAI implicit caching, Gemini context caching), the `SHOW_COST_INFO` footer splits out cached vs. non-cached prompt tokens so users can see the savings (Anthropic caches save up to 90% on input cost)
22
+
-**`_build_web_search_plugin()`**, **`_format_generation_id()`** — new helpers on `Pipe`
23
+
24
+
### Changed
25
+
26
+
- Model-list cache fingerprint now also includes `MODEL_CATEGORY` and `HIDE_DEPRECATED_MODELS` so toggling either invalidates the cached list
27
+
-`pipes()` now sends `params={"output_modalities": ..., "category": ...}` when a category is set
28
+
-`_prepare_payload()` now emits `service_tier`, `provider.only`, `provider.quantizations`, `provider.allow_fallbacks=false`, `provider.max_price.{prompt,completion}`, `reasoning.max_tokens`, and a `web` entry in `plugins` (without overwriting any user-supplied plugins)
29
+
30
+
## [1.5.0] — 2026-05-07
31
+
32
+
### Added
33
+
34
+
-**Variant model routing** — new `MODEL_VARIANTS` valve (env: `OPENROUTER_MODEL_VARIANTS`). Comma-separated `base_id:variant` entries surface as virtual catalog rows that inherit the base model's display name and provider icon while OpenRouter routes the suffixed ID via its variant logic. Recognised tags: `free`, `thinking`, `online`, `nitro`, `exacto`, `extended`. Example: `MODEL_VARIANTS=openai/gpt-4o:nitro,anthropic/claude-3.5-sonnet:thinking`
35
+
-**Reasoning effort: `minimal` and `xhigh`** — extends `REASONING_EFFORT` with two new levels for fastest/maximum-depth thinking on supporting models
36
+
-**`REASONING_SUMMARY_MODE` valve** (env: `OPENROUTER_REASONING_SUMMARY_MODE`, default `disabled`) — requests a `reasoning.summary` block from supporting models. Options: `auto`, `concise`, `detailed`, `disabled`
37
+
-**Anthropic interleaved thinking** — new `ENABLE_ANTHROPIC_INTERLEAVED_THINKING` valve (default on, env: `OPENROUTER_ANTHROPIC_INTERLEAVED_THINKING`). When the selected model is `anthropic/...`, automatically injects the `anthropic-beta: interleaved-thinking-2025-05-14` header so Claude interleaves reasoning with tool use
38
+
-**`ANTHROPIC_PROMPT_CACHE_TTL` valve** (env: `OPENROUTER_ANTHROPIC_PROMPT_CACHE_TTL`, default `5m`) — extends `ENABLE_CACHE_CONTROL` so the ephemeral cache breakpoint can be set to either `5m` (default) or `1h` for longer cache lifetimes between turns
39
+
-**`TOOL_CALLING_FILTER` valve** (env: `OPENROUTER_TOOL_CALLING_FILTER`, default `all`) — catalog filter for tool-capable models. Options: `all`, `only`, `exclude`. Reads `supported_parameters` from `/models` and matches on `tools`/`tool_choice`
40
+
-**ZDR (Zero Data Retention) support** — two new valves: `ZDR_MODELS_ONLY` (catalog filter — fetches `/endpoints/zdr` and hides models without a ZDR-capable endpoint) and `ZDR_ENFORCE` (request-side — adds `provider.zdr=true` so OpenRouter rejects the call if no ZDR endpoint is available)
41
+
-**`HTTP_REFERER_OVERRIDE` valve** (env: `OPENROUTER_HTTP_REFERER`) — explicit override for the `HTTP-Referer` app-attribution header. Empty falls back to `WEBUI_URL` env or `http://localhost:3000`
42
+
-**`_load_zdr_model_ids()`**, **`_parse_variant_specs()`**, **`_expand_variant_models()`**, **`_resolve_referer()`**, **`_is_anthropic_model()`** — new instance methods on `Pipe`
43
+
44
+
### Changed
45
+
46
+
-**Breaking:**`FREE_ONLY` (boolean) replaced by **`FREE_MODEL_FILTER`** (env: `OPENROUTER_FREE_MODEL_FILTER`, default `all`). Options: `all`, `only`, `exclude`. Setups using `FREE_ONLY=true` should switch to `FREE_MODEL_FILTER=only`; setups using `FREE_ONLY=false` need no change
47
+
-**Reasoning payload shape:** when both `REASONING_EFFORT` and `REASONING_SUMMARY_MODE` are set, both fields are merged into the same `reasoning` object instead of overwriting
48
+
- Model-list cache fingerprint now also includes `FREE_MODEL_FILTER`, `TOOL_CALLING_FILTER`, `ZDR_MODELS_ONLY`, and `MODEL_VARIANTS` so toggling any of them invalidates the 5-minute cache
49
+
-`_build_headers()` accepts an optional `model_id` kwarg so it can decide whether to inject Anthropic-specific beta headers
50
+
51
+
## [1.4.0] — 2026-05-07
52
+
53
+
### Added
54
+
55
+
-**`OUTPUT_MODALITIES` valve** (env: `OPENROUTER_OUTPUT_MODALITIES`, default `all`) — controls which model output modalities are fetched from OpenRouter's `/models` endpoint. Accepts `text`, `image`, `audio`, `embeddings`, `all`, or a comma-separated combination
56
+
-**Full-catalog model listing** — TTS (e.g. `openai/gpt-4o-mini-tts-*`), audio-output, image-generation, and embedding models now appear in the Open WebUI model selector by default
57
+
-**Auto-discovered provider icons** — for providers not in the hardcoded fast-path dict, the pipe now lazy-loads OpenRouter's frontend provider registry (`/api/frontend/all-providers`) and resolves the icon from there. Adds icon coverage for ~20 additional model authors (xAI, Inflection, NVIDIA, Arcee, Morph, Cerebras, etc.) including gstatic favicons for providers without an OpenRouter-hosted logo. Slug normalization handles `x-ai` ↔ `xai` style mismatches
58
+
-**`_load_provider_registry()`** and **`_get_provider_icon()`** methods on `Pipe` — layered icon resolution: hardcoded dict → registry exact slug → registry hyphen-stripped slug. Network failures are silent (best-effort fallback)
59
+
60
+
### Changed
61
+
62
+
- The `/models` request now passes `output_modalities=all` by default, so the catalog is no longer silently restricted to text-output models. Set `OUTPUT_MODALITIES = text` to restore the previous chat-only behaviour
63
+
- Model-list cache fingerprint now includes `OUTPUT_MODALITIES`, so toggling the valve correctly invalidates the cached list
64
+
-`_is_owui_managed_icon()` now also recognises `https://t0.gstatic.com/faviconV2` URLs as pipe-managed, so registry-sourced gstatic favicons remain overwriteable when OpenRouter updates its provider mapping
Access **340+ AI models** through OpenRouter directly inside Open WebUI — with provider routing,
8
-
reasoning tokens, streaming, fallbacks, and cache control out of the box.
7
+
Access the **full OpenRouter catalog** — chat, TTS, audio, image-generation, and embedding models —
8
+
directly inside Open WebUI, with provider routing, reasoning tokens, streaming, fallbacks, and
9
+
cache control out of the box.
9
10
10
11
## Feature gallery
11
12
@@ -50,15 +51,21 @@ reasoning tokens, streaming, fallbacks, and cache control out of the box.
50
51
51
52
## Features
52
53
53
-
-**Manifold pipe** — exposes all OpenRouter models as native Open WebUI models in the model selector.
54
+
-**Manifold pipe** — exposes the full OpenRouter catalog (chat, TTS, audio, image, embeddings) as native Open WebUI models in the model selector. Configurable via `OUTPUT_MODALITIES` and `MODEL_CATEGORY`.
55
+
-**Web search plugin** — attach OpenRouter's `web` plugin to any model with domain allow/deny lists, custom search prompt, and result-count limits.
56
+
-**Variant routing** — surface virtual `:nitro`/`:exacto`/`:thinking`/`:online`/`:free`/`:extended` model entries that route to OpenRouter's specialized profiles.
-**Generation auditability** — optional generation ID footer maps each response to OpenRouter's `/generation?id=` activity API.
59
+
-**Cached-input savings** — surface cached vs. non-cached prompt tokens in the cost footer (Anthropic prompt caching, OpenAI implicit caching, Gemini context caching).
60
+
-**Deprecation visibility** — models with an `expiration_date` are tagged with ⚠ in the selector (or hidden via `HIDE_DEPRECATED_MODELS`).
54
61
-**Provider routing** — sort by `price`, `throughput`, or `latency`; prefer or exclude specific providers; enforce `require_parameters`.
55
62
-**Reasoning tokens** — `<think>` blocks streamed in real time with configurable effort (`low`, `medium`, `high`).
56
63
-**Streaming** — full SSE streaming with mid-stream error handling and automatic `<think>` closure on error.
57
64
-**Model fallbacks** — automatic failover to one or more backup models via `FALLBACK_MODELS`.
58
65
-**Middle-out compression** — fits long prompts within context windows (`transforms: ["middle-out"]`).
59
66
-**Cache control** — Anthropic-style `cache_control` injection on the longest message chunk.
60
67
-**Citations** — `[n]` references from web-search-enabled models are converted to markdown links.
61
-
-**Provider icons** — 13 provider logos synced directly into Open WebUI's model database.
68
+
-**Provider icons** — 13 hardcoded fast-path logos plus auto-discovered icons for ~20 more providers (xAI, Inflection, NVIDIA, Arcee, Morph, Cerebras, …) lazy-loaded from OpenRouter's provider registry, all synced directly into Open WebUI's model database.
62
69
-**Retry logic** — exponential backoff with jitter on timeout and connection errors.
63
70
-**FREE_ONLY mode** — filter to show only free-tier models (`:free` suffix or `0/0` pricing).
64
71
-**Pre-flight validation** — invalid API keys are caught at model-fetch time, not after sending a message.
@@ -145,7 +152,10 @@ Every valve accepts an environment variable fallback. The table below lists both
|`OUTPUT_MODALITIES`|`OPENROUTER_OUTPUT_MODALITIES`|`all`| Output modalities to fetch from `/models`. `all` (default) lists every model. Restrict with `text`, `image`, `audio`, `embeddings`, or a comma list (e.g. `text,audio`) |
|`HIDE_DEPRECATED_MODELS`|`OPENROUTER_HIDE_DEPRECATED_MODELS`|`false`| Hide models with a non-null `expiration_date`. When False, deprecated models are tagged `⚠ {name} (deprecated)`|
173
+
|`ZDR_MODELS_ONLY`|`OPENROUTER_ZDR_MODELS_ONLY`|`false`| Catalog-side: hide models without a ZDR endpoint (reads `/endpoints/zdr`) |
158
174
159
175
### Provider Routing
160
176
@@ -163,16 +179,30 @@ Every valve accepts an environment variable fallback. The table below lists both
|`PROVIDER_ONLY`|`OPENROUTER_PROVIDER_ONLY`|`""`| Provider allowlist (comma-separated). Merged with account-wide settings |
183
+
|`PROVIDER_QUANTIZATIONS`|`OPENROUTER_PROVIDER_QUANTIZATIONS`|`""`| Allowed quantizations (comma-separated, e.g. `bf16,fp8`) |
184
+
|`PROVIDER_ALLOW_FALLBACKS`|`OPENROUTER_PROVIDER_ALLOW_FALLBACKS`|`true`| When False, OpenRouter fails fast on the primary/ordered provider instead of falling back |
185
+
|`PROVIDER_MAX_PRICE_PROMPT`|`OPENROUTER_PROVIDER_MAX_PRICE_PROMPT`|`""`| Maximum prompt price (USD per 1M tokens) |
186
+
|`PROVIDER_MAX_PRICE_COMPLETION`|`OPENROUTER_PROVIDER_MAX_PRICE_COMPLETION`|`""`| Maximum completion price (USD per 1M tokens) |
187
+
|`SERVICE_TIER`|`OPENROUTER_SERVICE_TIER`|`""`| OpenAI-style service tier: `auto`, `default`, `flex`, `priority`, `scale`|
166
188
|`REQUIRE_PARAMETERS`|`OPENROUTER_REQUIRE_PARAMETERS`|`false`| Only use providers that support all request parameters |
167
189
|`DATA_COLLECTION`|`OPENROUTER_DATA_COLLECTION`|`allow`| Data policy: `allow` or `deny`|
190
+
|`ZDR_ENFORCE`|`OPENROUTER_ZDR_ENFORCE`|`false`| Send `provider.zdr=true` so OpenRouter routes only to ZDR endpoints (request fails if none available) |
168
191
169
192
### Advanced
170
193
171
194
| Valve | Env Var | Default | Description |
172
195
| --- | --- | --- | --- |
173
196
|`FALLBACK_MODELS`|`OPENROUTER_FALLBACK_MODELS`|`""`| Fallback model IDs (comma-separated) |
174
197
|`ENABLE_MIDDLE_OUT`|`OPENROUTER_ENABLE_MIDDLE_OUT`|`false`| Middle-out compression for long prompts |
198
+
|`ENABLE_WEB_SEARCH`|`OPENROUTER_ENABLE_WEB_SEARCH`|`false`| Attach OpenRouter's `web` plugin so any model can ground answers in fresh web results |
199
+
|`WEB_SEARCH_MAX_RESULTS`|`OPENROUTER_WEB_SEARCH_MAX_RESULTS`|`5`| Max search results passed to the model (1-20) |
200
+
|`WEB_SEARCH_PROMPT`|`OPENROUTER_WEB_SEARCH_PROMPT`|`""`| Optional custom search prompt forwarded to the search engine |
|`ENABLE_CACHE_CONTROL`|`OPENROUTER_ENABLE_CACHE_CONTROL`|`false`| Inject Anthropic `cache_control` on the longest message |
204
+
|`ANTHROPIC_PROMPT_CACHE_TTL`|`OPENROUTER_ANTHROPIC_PROMPT_CACHE_TTL`|`5m`| TTL for the Anthropic ephemeral cache breakpoint: `5m` or `1h`|
205
+
|`SHOW_GENERATION_ID`|`OPENROUTER_SHOW_GENERATION_ID`|`false`| Append the OpenRouter generation ID to each response (for `GET /generation?id=` lookups) |
176
206
|`SYNC_PROVIDER_ICONS`|`OPENROUTER_SYNC_ICONS`|`true`| Sync provider icons into Open WebUI's model database |
177
207
178
208
### Network
@@ -181,6 +211,7 @@ Every valve accepts an environment variable fallback. The table below lists both
181
211
| --- | --- | --- | --- |
182
212
|`REQUEST_TIMEOUT`|`OPENROUTER_REQUEST_TIMEOUT`|`90`| HTTP timeout in seconds |
183
213
|`MAX_RETRIES`| — |`2`| Auto-retry count on transient errors |
214
+
|`HTTP_REFERER_OVERRIDE`|`OPENROUTER_HTTP_REFERER`|`""`| Override the `HTTP-Referer` header sent to OpenRouter (must include scheme). Empty falls back to `WEBUI_URL`|
184
215
185
216
## Architecture
186
217
@@ -320,6 +351,15 @@ A: `FALLBACK_MODELS` adds extra model IDs to the `models` array in the OpenRoute
320
351
primary model fails, OpenRouter automatically tries the next one. Non-streaming responses include
321
352
a "Responded by: model-id" attribution when a fallback handled the request.
322
353
354
+
**Q: I selected a TTS / embeddings / image-generation model and got an error — why?**
355
+
356
+
A: The pipe routes every request through OpenRouter's `/chat/completions` endpoint. Models that
357
+
only expose a non-chat endpoint (e.g. pure TTS models served via `/audio/speech`) return an
358
+
"endpoint not supported" error from OpenRouter. The pipe surfaces that error verbatim. Chat
359
+
completion models that *output* audio or images (e.g. `openai/gpt-audio`) work normally — their
360
+
audio transcript and generated images are rendered inline. To hide non-chat models from the
361
+
selector entirely, set `OUTPUT_MODALITIES = text`.
362
+
323
363
## License
324
364
325
365
This project is licensed under the **MIT License** — see the [LICENSE](LICENSE) file for details.
Copy file name to clipboardExpand all lines: function.json
+2-2Lines changed: 2 additions & 2 deletions
Original file line number
Diff line number
Diff line change
@@ -3,13 +3,13 @@
3
3
"name": "OpenRouter Pipe",
4
4
"type": "manifold",
5
5
"meta": {
6
-
"description": "Access 340+ AI models through OpenRouter directly inside Open WebUI. Features provider routing, reasoning tokens with <think> tags, full SSE streaming, model fallbacks, middle-out compression, Anthropic cache control, citations, 13 provider icons, and configurable retry logic.",
6
+
"description": "The definitive OpenRouter integration for Open WebUI. Full catalog (chat/TTS/audio/image/embeddings), variant routing (:nitro/:exacto/:thinking/:online/:free/:extended), web search plugin with domain filters, server-side category filter, deprecation warnings, extended reasoning (minimal→xhigh + max_tokens + summary), Anthropic interleaved thinking + cache TTL, ZDR enforcement, tool/free-tier filters, provider preferences (only/quantizations/max_price/allow_fallbacks), service tier routing (auto/flex/priority/scale), generation-ID auditability, cached-input cost breakdown, model fallbacks, middle-out compression, citations, auto-discovered provider icons.",
0 commit comments