diff --git a/CHANGELOG.md b/CHANGELOG.md
index d0c3db8..6fe0eef 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,62 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [1.6.0] — 2026-05-08
+
+### Added
+
+- **Web search plugin** — five new valves (`ENABLE_WEB_SEARCH`, `WEB_SEARCH_MAX_RESULTS`, `WEB_SEARCH_PROMPT`, `WEB_SEARCH_INCLUDE_DOMAINS`, `WEB_SEARCH_EXCLUDE_DOMAINS`) attach OpenRouter's `web` plugin to every request so any model can ground answers in fresh web results, with domain allow/deny lists and a custom search prompt
+- **`MODEL_CATEGORY` valve** — server-side `?category=...` filter on `/models` (e.g. `programming`, `roleplay`, `marketing`, `science`, `legal`, `finance`, `health`, `academia`)
+- **Deprecation handling** — models with a non-null `expiration_date` are tagged `⚠ {name} (deprecated)` in the selector. New `HIDE_DEPRECATED_MODELS` valve removes them entirely
+- **`REASONING_MAX_TOKENS` valve** — hard cap on reasoning tokens per response (sent as `reasoning.max_tokens`) for budget control on deep-thinking models
+- **Provider preferences extras** — `PROVIDER_ONLY` (allowlist), `PROVIDER_QUANTIZATIONS` (e.g. `bf16,fp8`), `PROVIDER_ALLOW_FALLBACKS`, `PROVIDER_MAX_PRICE_PROMPT`, `PROVIDER_MAX_PRICE_COMPLETION`. Translates to `provider.only/quantizations/allow_fallbacks/max_price` per the OpenRouter SDK schema
+- **`SERVICE_TIER` valve** — OpenAI-style tier hint (`auto`/`default`/`flex`/`priority`/`scale`) forwarded to compatible providers
+- **`SHOW_GENERATION_ID` valve** — captures the `id` field from chat-completion responses (works in both streaming and non-streaming modes) and appends `*Generation ID: gen-…*` so users can later call `GET /api/v1/generation?id={id}` for audit trails and per-request usage details
+- **Cached prompt-token cost breakdown** — when the provider reports `prompt_tokens_details.cached_tokens` (Anthropic prompt caching, OpenAI implicit caching, Gemini context caching), the `SHOW_COST_INFO` footer splits out cached vs. non-cached prompt tokens so users can see the savings (Anthropic caches save up to 90% on input cost)
+- New helpers: **`_build_web_search_plugin()`** on `Pipe` and the module-level function **`_format_generation_id()`**
+
+### Changed
+
+- Model-list cache fingerprint now also includes `MODEL_CATEGORY` and `HIDE_DEPRECATED_MODELS` so toggling either invalidates the cached list
+- `pipes()` now sends `params={"output_modalities": ..., "category": ...}` when a category is set
+- `_prepare_payload()` now emits `service_tier`, `provider.only`, `provider.quantizations`, `provider.allow_fallbacks=false`, `provider.max_price.{prompt,completion}`, `reasoning.max_tokens`, and a `web` entry in `plugins` (without overwriting any user-supplied plugins)
+
+## [1.5.0] — 2026-05-07
+
+### Added
+
+- **Variant model routing** — new `MODEL_VARIANTS` valve (env: `OPENROUTER_MODEL_VARIANTS`). Comma-separated `base_id:variant` entries surface as virtual catalog rows that inherit the base model's display name and provider icon while OpenRouter routes the suffixed ID via its variant logic. Recognised tags: `free`, `thinking`, `online`, `nitro`, `exacto`, `extended`. Example: `MODEL_VARIANTS=openai/gpt-4o:nitro,anthropic/claude-3.5-sonnet:thinking`
+- **Reasoning effort: `minimal` and `xhigh`** — extends `REASONING_EFFORT` with two new levels for fastest/maximum-depth thinking on supporting models
+- **`REASONING_SUMMARY_MODE` valve** (env: `OPENROUTER_REASONING_SUMMARY_MODE`, default `disabled`) — requests a `reasoning.summary` block from supporting models. Options: `auto`, `concise`, `detailed`, `disabled`
+- **Anthropic interleaved thinking** — new `ENABLE_ANTHROPIC_INTERLEAVED_THINKING` valve (default on, env: `OPENROUTER_ANTHROPIC_INTERLEAVED_THINKING`). When the selected model is `anthropic/...`, automatically injects the `anthropic-beta: interleaved-thinking-2025-05-14` header so Claude interleaves reasoning with tool use
+- **`ANTHROPIC_PROMPT_CACHE_TTL` valve** (env: `OPENROUTER_ANTHROPIC_PROMPT_CACHE_TTL`, default `5m`) — extends `ENABLE_CACHE_CONTROL` so the ephemeral cache breakpoint can be set to either `5m` (default) or `1h` for longer cache lifetimes between turns
+- **`TOOL_CALLING_FILTER` valve** (env: `OPENROUTER_TOOL_CALLING_FILTER`, default `all`) — catalog filter for tool-capable models. Options: `all`, `only`, `exclude`. Reads `supported_parameters` from `/models` and matches on `tools`/`tool_choice`
+- **ZDR (Zero Data Retention) support** — two new valves: `ZDR_MODELS_ONLY` (catalog filter — fetches `/endpoints/zdr` and hides models without a ZDR-capable endpoint) and `ZDR_ENFORCE` (request-side — adds `provider.zdr=true` so OpenRouter rejects the call if no ZDR endpoint is available)
+- **`HTTP_REFERER_OVERRIDE` valve** (env: `OPENROUTER_HTTP_REFERER`) — explicit override for the `HTTP-Referer` app-attribution header. Empty falls back to `WEBUI_URL` env or `http://localhost:3000`
+- **`_load_zdr_model_ids()`**, **`_parse_variant_specs()`**, **`_expand_variant_models()`**, **`_resolve_referer()`**, **`_is_anthropic_model()`** — new instance methods on `Pipe`
+
+### Changed
+
+- **Breaking:** `FREE_ONLY` (boolean) replaced by **`FREE_MODEL_FILTER`** (env: `OPENROUTER_FREE_MODEL_FILTER`, default `all`). Options: `all`, `only`, `exclude`. Setups using `FREE_ONLY=true` should switch to `FREE_MODEL_FILTER=only`; setups using `FREE_ONLY=false` need no change
+- **Reasoning payload shape:** when both `REASONING_EFFORT` and `REASONING_SUMMARY_MODE` are set, both fields are merged into the same `reasoning` object instead of overwriting
+- Model-list cache fingerprint now also includes `FREE_MODEL_FILTER`, `TOOL_CALLING_FILTER`, `ZDR_MODELS_ONLY`, and `MODEL_VARIANTS` so toggling any of them invalidates the 5-minute cache
+- `_build_headers()` accepts an optional `model_id` kwarg so it can decide whether to inject Anthropic-specific beta headers
+
+## [1.4.0] — 2026-05-07
+
+### Added
+
+- **`OUTPUT_MODALITIES` valve** (env: `OPENROUTER_OUTPUT_MODALITIES`, default `all`) — controls which model output modalities are fetched from OpenRouter's `/models` endpoint. Accepts `text`, `image`, `audio`, `embeddings`, `all`, or a comma-separated combination
+- **Full-catalog model listing** — TTS (e.g. `openai/gpt-4o-mini-tts-*`), audio-output, image-generation, and embedding models now appear in the Open WebUI model selector by default
+- **Auto-discovered provider icons** — for providers not in the hardcoded fast-path dict, the pipe now lazy-loads OpenRouter's frontend provider registry (`/api/frontend/all-providers`) and resolves the icon from there. Adds icon coverage for ~20 additional model authors (xAI, Inflection, NVIDIA, Arcee, Morph, Cerebras, etc.) including gstatic favicons for providers without an OpenRouter-hosted logo. Slug normalization handles `x-ai` ↔ `xai` style mismatches
+- **`_load_provider_registry()`** and **`_get_provider_icon()`** methods on `Pipe` — layered icon resolution: hardcoded dict → registry exact slug → registry hyphen-stripped slug. Network failures are silent (best-effort fallback)
+
+### Changed
+
+- The `/models` request now passes `output_modalities=all` by default, so the catalog is no longer silently restricted to text-output models. Set `OUTPUT_MODALITIES = text` to restore the previous chat-only behaviour
+- Model-list cache fingerprint now includes `OUTPUT_MODALITIES`, so toggling the valve correctly invalidates the cached list
+- `_is_owui_managed_icon()` now also recognises `https://t0.gstatic.com/faviconV2` URLs as pipe-managed, so registry-sourced gstatic favicons remain overwritable when OpenRouter updates its provider mapping
+
 ## [1.3.0] — 2026-05-07
 
 ### Added
diff --git a/README.md b/README.md
index 3f8f542..bc13378 100644
--- a/README.md
+++ b/README.md
@@ -4,8 +4,9 @@
 [![Python](https://img.shields.io/badge/Python-%E2%89%A53.10-blue)](https://www.python.org/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 
-Access **340+ AI models** through OpenRouter directly inside Open WebUI — with provider routing,
-reasoning tokens, streaming, fallbacks, and cache control out of the box.
+Access the **full OpenRouter catalog** — chat, TTS, audio, image-generation, and embedding models —
+directly inside Open WebUI, with provider routing, reasoning tokens, streaming, fallbacks, and
+cache control out of the box.
 
 ## Feature gallery
 
@@ -50,7 +51,13 @@ reasoning tokens, streaming, fallbacks, and cache control out of the box.
 
 ## Features
 
-- **Manifold pipe** — exposes all OpenRouter models as native Open WebUI models in the model selector.
+- **Manifold pipe** — exposes the full OpenRouter catalog (chat, TTS, audio, image, embeddings) as native Open WebUI models in the model selector. Configurable via `OUTPUT_MODALITIES` and `MODEL_CATEGORY`.
+- **Web search plugin** — attach OpenRouter's `web` plugin to any model with domain allow/deny lists, custom search prompt, and result-count limits.
+- **Variant routing** — surface virtual `:nitro`/`:exacto`/`:thinking`/`:online`/`:free`/`:extended` model entries that route to OpenRouter's specialized profiles.
+- **Service tier hint** — forward OpenAI-style `flex`/`priority`/`scale` tiers to compatible providers.
+- **Generation auditability** — optional generation ID footer maps each response to OpenRouter's `/generation?id=` activity API.
+- **Cached-input savings** — surface cached vs. non-cached prompt tokens in the cost footer (Anthropic prompt caching, OpenAI implicit caching, Gemini context caching).
+- **Deprecation visibility** — models with an `expiration_date` are tagged with ⚠ in the selector (or hidden via `HIDE_DEPRECATED_MODELS`).
 - **Provider routing** — sort by `price`, `throughput`, or `latency`; prefer or exclude specific providers; enforce `require_parameters`.
 - **Reasoning tokens** — `<think>` blocks streamed in real time with configurable effort (`low`, `medium`, `high`).
 - **Streaming** — full SSE streaming with mid-stream error handling and automatic `<think>` closure on error.
@@ -58,7 +65,7 @@ reasoning tokens, streaming, fallbacks, and cache control out of the box.
 - **Middle-out compression** — fits long prompts within context windows (`transforms: ["middle-out"]`).
 - **Cache control** — Anthropic-style `cache_control` injection on the longest message chunk.
 - **Citations** — `[n]` references from web-search-enabled models are converted to markdown links.
-- **Provider icons** — 13 provider logos synced directly into Open WebUI's model database.
+- **Provider icons** — 13 hardcoded fast-path logos plus auto-discovered icons for ~20 more providers (xAI, Inflection, NVIDIA, Arcee, Morph, Cerebras, …) lazy-loaded from OpenRouter's provider registry, all synced directly into Open WebUI's model database.
 - **Retry logic** — exponential backoff with jitter on timeout and connection errors.
 - **FREE_ONLY mode** — filter to show only free-tier models (`:free` suffix or `0/0` pricing).
 - **Pre-flight validation** — invalid API keys are caught at model-fetch time, not after sending a message.
@@ -145,7 +152,10 @@ Every valve accepts an environment variable fallback. The table below lists both
 | Valve | Env Var | Default | Description |
 | --- | --- | --- | --- |
 | `INCLUDE_REASONING` | `OPENROUTER_INCLUDE_REASONING` | `true` | Request reasoning tokens (`<think>` blocks) |
-| `REASONING_EFFORT` | `OPENROUTER_REASONING_EFFORT` | `""` | Effort level: `low`, `medium`, `high`, or empty |
+| `REASONING_EFFORT` | `OPENROUTER_REASONING_EFFORT` | `""` | Effort level: `minimal`, `low`, `medium`, `high`, `xhigh`, or empty |
+| `REASONING_SUMMARY_MODE` | `OPENROUTER_REASONING_SUMMARY_MODE` | `disabled` | Reasoning-summary verbosity: `auto`, `concise`, `detailed`, `disabled` |
+| `REASONING_MAX_TOKENS` | `OPENROUTER_REASONING_MAX_TOKENS` | `0` | Hard cap on reasoning tokens per response (0 disables the cap) |
+| `ENABLE_ANTHROPIC_INTERLEAVED_THINKING` | `OPENROUTER_ANTHROPIC_INTERLEAVED_THINKING` | `true` | Auto-inject `anthropic-beta: interleaved-thinking-2025-05-14` for `anthropic/*` models |
 
 ### Display & Filtering
 
@@ -154,7 +164,13 @@ Every valve accepts an environment variable fallback. The table below lists both
 | `MODEL_PREFIX` | — | `None` | Custom prefix for model names (e.g. `🔥 `) |
 | `MODEL_PROVIDERS` | `OPENROUTER_MODEL_PROVIDERS` | `ALL` | Provider filter (e.g. `openai,anthropic`). `ALL` means no filter |
 | `INVERT_PROVIDER_LIST` | `OPENROUTER_INVERT_PROVIDER_LIST` | `false` | Treat `MODEL_PROVIDERS` as an exclusion list |
-| `FREE_ONLY` | `OPENROUTER_FREE_ONLY` | `false` | Show only free-tier models |
+| `FREE_MODEL_FILTER` | `OPENROUTER_FREE_MODEL_FILTER` | `all` | Free-tier filter: `all` / `only` / `exclude` |
+| `TOOL_CALLING_FILTER` | `OPENROUTER_TOOL_CALLING_FILTER` | `all` | Tool-capable filter (reads `supported_parameters`): `all` / `only` / `exclude` |
+| `OUTPUT_MODALITIES` | `OPENROUTER_OUTPUT_MODALITIES` | `all` | Output modalities to fetch from `/models`. `all` (default) lists every model. Restrict with `text`, `image`, `audio`, `embeddings`, or a comma list (e.g. `text,audio`) |
+| `MODEL_VARIANTS` | `OPENROUTER_MODEL_VARIANTS` | `""` | Comma-separated `base_id:tag` entries that surface virtual variant models (e.g. `openai/gpt-4o:nitro`). Tags: `free`, `thinking`, `online`, `nitro`, `exacto`, `extended` |
+| `MODEL_CATEGORY` | `OPENROUTER_MODEL_CATEGORY` | `""` | Server-side category filter (`?category=`). Common values: `programming`, `roleplay`, `marketing`, `science`, `legal`, `finance`, `health`, `academia` |
+| `HIDE_DEPRECATED_MODELS` | `OPENROUTER_HIDE_DEPRECATED_MODELS` | `false` | Hide models with a non-null `expiration_date`. When `false`, deprecated models stay listed and are tagged `⚠ {name} (deprecated)` |
+| `ZDR_MODELS_ONLY` | `OPENROUTER_ZDR_MODELS_ONLY` | `false` | Catalog-side: hide models without a ZDR endpoint (reads `/endpoints/zdr`) |
 
 ### Provider Routing
 
@@ -163,8 +179,15 @@ Every valve accepts an environment variable fallback. The table below lists both
 | `PROVIDER_SORT` | `OPENROUTER_PROVIDER_SORT` | `""` | Sort: `price`, `throughput`, `latency` |
 | `PROVIDER_ORDER` | `OPENROUTER_PROVIDER_ORDER` | `""` | Preferred providers (comma-separated) |
 | `PROVIDER_IGNORE` | `OPENROUTER_PROVIDER_IGNORE` | `""` | Excluded providers (comma-separated) |
+| `PROVIDER_ONLY` | `OPENROUTER_PROVIDER_ONLY` | `""` | Provider allowlist (comma-separated). Merged with account-wide settings |
+| `PROVIDER_QUANTIZATIONS` | `OPENROUTER_PROVIDER_QUANTIZATIONS` | `""` | Allowed quantizations (comma-separated, e.g. `bf16,fp8`) |
+| `PROVIDER_ALLOW_FALLBACKS` | `OPENROUTER_PROVIDER_ALLOW_FALLBACKS` | `true` | When `false`, OpenRouter fails fast on the primary/ordered provider instead of falling back |
+| `PROVIDER_MAX_PRICE_PROMPT` | `OPENROUTER_PROVIDER_MAX_PRICE_PROMPT` | `""` | Maximum prompt price (USD per 1M tokens) |
+| `PROVIDER_MAX_PRICE_COMPLETION` | `OPENROUTER_PROVIDER_MAX_PRICE_COMPLETION` | `""` | Maximum completion price (USD per 1M tokens) |
+| `SERVICE_TIER` | `OPENROUTER_SERVICE_TIER` | `""` | OpenAI-style service tier: `auto`, `default`, `flex`, `priority`, `scale` |
 | `REQUIRE_PARAMETERS` | `OPENROUTER_REQUIRE_PARAMETERS` | `false` | Only use providers that support all request parameters |
 | `DATA_COLLECTION` | `OPENROUTER_DATA_COLLECTION` | `allow` | Data policy: `allow` or `deny` |
+| `ZDR_ENFORCE` | `OPENROUTER_ZDR_ENFORCE` | `false` | Send `provider.zdr=true` so OpenRouter routes only to ZDR endpoints (request fails if none available) |
 
 ### Advanced
 
@@ -172,7 +195,14 @@ Every valve accepts an environment variable fallback. The table below lists both
 | --- | --- | --- | --- |
 | `FALLBACK_MODELS` | `OPENROUTER_FALLBACK_MODELS` | `""` | Fallback model IDs (comma-separated) |
 | `ENABLE_MIDDLE_OUT` | `OPENROUTER_ENABLE_MIDDLE_OUT` | `false` | Middle-out compression for long prompts |
+| `ENABLE_WEB_SEARCH` | `OPENROUTER_ENABLE_WEB_SEARCH` | `false` | Attach OpenRouter's `web` plugin so any model can ground answers in fresh web results |
+| `WEB_SEARCH_MAX_RESULTS` | `OPENROUTER_WEB_SEARCH_MAX_RESULTS` | `5` | Max search results passed to the model (1-20) |
+| `WEB_SEARCH_PROMPT` | `OPENROUTER_WEB_SEARCH_PROMPT` | `""` | Optional custom search prompt forwarded to the search engine |
+| `WEB_SEARCH_INCLUDE_DOMAINS` | `OPENROUTER_WEB_SEARCH_INCLUDE_DOMAINS` | `""` | Domain allowlist (supports wildcards & paths) |
+| `WEB_SEARCH_EXCLUDE_DOMAINS` | `OPENROUTER_WEB_SEARCH_EXCLUDE_DOMAINS` | `""` | Domain denylist |
 | `ENABLE_CACHE_CONTROL` | `OPENROUTER_ENABLE_CACHE_CONTROL` | `false` | Inject Anthropic `cache_control` on the longest message |
+| `ANTHROPIC_PROMPT_CACHE_TTL` | `OPENROUTER_ANTHROPIC_PROMPT_CACHE_TTL` | `5m` | TTL for the Anthropic ephemeral cache breakpoint: `5m` or `1h` |
+| `SHOW_GENERATION_ID` | `OPENROUTER_SHOW_GENERATION_ID` | `false` | Append the OpenRouter generation ID to each response (for `GET /generation?id=` lookups) |
 | `SYNC_PROVIDER_ICONS` | `OPENROUTER_SYNC_ICONS` | `true` | Sync provider icons into Open WebUI's model database |
 
 ### Network
@@ -181,6 +211,7 @@ Every valve accepts an environment variable fallback. The table below lists both
 | --- | --- | --- | --- |
 | `REQUEST_TIMEOUT` | `OPENROUTER_REQUEST_TIMEOUT` | `90` | HTTP timeout in seconds |
 | `MAX_RETRIES` | — | `2` | Auto-retry count on transient errors |
+| `HTTP_REFERER_OVERRIDE` | `OPENROUTER_HTTP_REFERER` | `""` | Override the `HTTP-Referer` header sent to OpenRouter (must include scheme). Empty falls back to the `WEBUI_URL` env var, then `http://localhost:3000` |
 
 ## Architecture
 
@@ -320,6 +351,15 @@ A: `FALLBACK_MODELS` adds extra model IDs to the `models` array in the OpenRoute
 primary model fails, OpenRouter automatically tries the next one. Non-streaming responses include
 a "Responded by: model-id" attribution when a fallback handled the request.
 
+**Q: I selected a TTS / embeddings / image-generation model and got an error — why?**
+
+A: The pipe routes every request through OpenRouter's `/chat/completions` endpoint. Models that
+only expose a non-chat endpoint (e.g. pure TTS models served via `/audio/speech`) return an
+"endpoint not supported" error from OpenRouter. The pipe surfaces that error verbatim. Chat
+completion models that *output* audio or images (e.g. `openai/gpt-audio`) work normally — their
+audio transcript and generated images are rendered inline. To hide non-chat models from the
+selector entirely, set `OUTPUT_MODALITIES = text`.
+
 ## License
 
 This project is licensed under the **MIT License** — see the [LICENSE](LICENSE) file for details.
diff --git a/function.json b/function.json
index c132bb7..f708d95 100644
--- a/function.json
+++ b/function.json
@@ -3,13 +3,13 @@
   "name": "OpenRouter Pipe",
   "type": "manifold",
   "meta": {
-    "description": "Access 340+ AI models through OpenRouter directly inside Open WebUI. Features provider routing, reasoning tokens with <think> tags, full SSE streaming, model fallbacks, middle-out compression, Anthropic cache control, citations, 13 provider icons, and configurable retry logic.",
+    "description": "The definitive OpenRouter integration for Open WebUI. Full catalog (chat/TTS/audio/image/embeddings), variant routing (:nitro/:exacto/:thinking/:online/:free/:extended), web search plugin with domain filters, server-side category filter, deprecation warnings, extended reasoning (minimal→xhigh + max_tokens + summary), Anthropic interleaved thinking + cache TTL, ZDR enforcement, tool/free-tier filters, provider preferences (only/quantizations/max_price/allow_fallbacks), service tier routing (auto/flex/priority/scale), generation-ID auditability, cached-input cost breakdown, model fallbacks, middle-out compression, citations, auto-discovered provider icons.",
     "manifest": {
       "title": "OpenRouter Pipe",
       "author": "Sena Labs",
       "author_url": "https://github.com/sena-labs",
       "funding_url": "https://github.com/sponsors/sena-labs",
-      "version": "1.3.0",
+      "version": "1.6.0",
       "license": "MIT",
       "required_open_webui_version": "0.4.0",
       "requirements": ["requests>=2.20", "pydantic>=2.0"]
diff --git a/integration_test.py b/integration_test.py
index 8fdd156..51b5201 100644
--- a/integration_test.py
+++ b/integration_test.py
@@ -169,13 +169,13 @@ def _check_chat_available() -> bool:
 )
 
 # ══════════════════════════════════════════════════════════════════════════════
-# 3. FREE_ONLY filter
+# 3. FREE_MODEL_FILTER='only'
 # ══════════════════════════════════════════════════════════════════════════════
 
-_section("3. FREE_ONLY filter")
+_section("3. FREE_MODEL_FILTER='only'")
 
 pipe_free = Pipe()
-pipe_free.valves = Pipe.Valves(OPENROUTER_API_KEY=API_KEY, FREE_ONLY=True)
+pipe_free.valves = Pipe.Valves(OPENROUTER_API_KEY=API_KEY, FREE_MODEL_FILTER="only")
 free_models = pipe_free.pipes()
 _assert(len(free_models) > 0, f"free models: {len(free_models)}")
 _assert(
diff --git a/openrouter_pipe.py b/openrouter_pipe.py
index 8d4f3c8..8281ee0 100644
--- a/openrouter_pipe.py
+++ b/openrouter_pipe.py
@@ -3,12 +3,12 @@
 author: Sena Labs
 author_url: https://github.com/sena-labs
 funding_url: https://github.com/sponsors/sena-labs
-version: 1.3.0
+version: 1.6.0
 license: MIT
 icon_url: data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIj48ZGVmcz48bGluZWFyR3JhZGllbnQgaWQ9ImJnIiB4MT0iMCUiIHkxPSIwJSIgeDI9IjEwMCUiIHkyPSIxMDAlIj48c3RvcCBvZmZzZXQ9IjAlIiBzdG9wLWNvbG9yPSIjNmQyOGQ5Ii8+PHN0b3Agb2Zmc2V0PSIxMDAlIiBzdG9wLWNvbG9yPSIjYTc4YmZhIi8+PC9saW5lYXJHcmFkaWVudD48L2RlZnM+PHJlY3Qgd2lkdGg9IjEwMCIgaGVpZ2h0PSIxMDAiIHJ4PSIyMCIgZmlsbD0idXJsKCNiZykiLz48cGF0aCBkPSJNMjAgNTAgQzIwIDMwLCA0MCAzMCwgNTAgMzAgTDUwIDIyIEw2OCA0MCBMNTAgNTggTDUwIDUwIEM0MCA1MCwgMzUgNDUsIDMwIDUwIEMyNSA1NSwgMjAgNzAsIDIwIDUwIFoiIGZpbGw9IndoaXRlIiBvcGFjaXR5PSIwLjk1Ii8+PGNpcmNsZSBjeD0iNzgiIGN5PSIzMCIgcj0iNyIgZmlsbD0id2hpdGUiIG9wYWNpdHk9IjAuOCIvPjxjaXJjbGUgY3g9IjgyIiBjeT0iNTAiIHI9IjciIGZpbGw9IndoaXRlIiBvcGFjaXR5PSIwLjk1Ii8+PGNpcmNsZSBjeD0iNzgiIGN5PSI3MCIgcj0iNyIgZmlsbD0id2hpdGUiIG9wYWNpdHk9IjAuOCIvPjxsaW5lIHgxPSI2OCIgeTE9IjQwIiB4Mj0iNzYiIHkyPSIzMiIgc3Ryb2tlPSJ3aGl0ZSIgc3Ryb2tlLXdpZHRoPSIyIiBvcGFjaXR5PSIwLjUiLz48bGluZSB4MT0iNjgiIHkxPSI0MCIgeDI9Ijc2IiB5Mj0iNTAiIHN0cm9rZT0id2hpdGUiIHN0cm9rZS13aWR0aD0iMiIgb3BhY2l0eT0iMC41Ii8+PGxpbmUgeDE9IjY4IiB5MT0iNDAiIHgyPSI3NiIgeTI9IjY4IiBzdHJva2U9IndoaXRlIiBzdHJva2Utd2lkdGg9IjIiIG9wYWNpdHk9IjAuNSIvPjwvc3ZnPg==
 required_open_webui_version: 0.4.0
 requirements: requests>=2.20, pydantic>=2.0
-description: Access 340+ AI models through OpenRouter directly inside Open WebUI. Features provider routing, reasoning tokens with <think> tags, full SSE streaming, model fallbacks, middle-out compression, Anthropic cache control, citations, 13 provider icons, and configurable retry logic.
+description: The definitive OpenRouter integration for Open WebUI. Full catalog (chat/TTS/audio/image/embeddings), variant routing (:nitro/:exacto/:thinking/:online/:free/:extended), web search plugin with domain filters, server-side category filter, deprecation warnings, extended reasoning (minimal→xhigh + max_tokens + summary), Anthropic interleaved thinking + cache TTL, ZDR enforcement, tool/free-tier filters, provider preferences (only/quantizations/max_price/allow_fallbacks), service tier routing (auto/flex/priority/scale), generation-ID auditability, cached-input cost breakdown, model fallbacks, middle-out compression, citations, auto-discovered provider icons.
 """
 
 import copy
@@ -35,13 +35,31 @@
 # API path constants
 _API_PATH_MODELS = "/models"
 _API_PATH_CHAT = "/chat/completions"
+_API_PATH_ZDR_ENDPOINTS = "/endpoints/zdr"
+
+# Beta header for Claude's interleaved-thinking + tool-use mode.
+# https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
+_ANTHROPIC_INTERLEAVED_THINKING_BETA = "interleaved-thinking-2025-05-14"
+
+# OpenRouter variant suffixes that route to specialized providers/profiles.
+# https://openrouter.ai/docs/features/preset-routing
+_RECOGNISED_VARIANT_TAGS = frozenset(
+    {"free", "thinking", "online", "nitro", "exacto", "extended"}
+)
 
 # Cache TTL for model list (seconds)
 _MODELS_CACHE_TTL = 300.0  # 5 minutes
 
+# OpenRouter's frontend provider registry — gives us icon URLs for ~70 providers
+# (hosted SVG/PNG when available, gstatic favicons otherwise). Used as a
+# dynamic fallback when a model's author isn't in _PROVIDER_ICONS.
+_PROVIDER_REGISTRY_URL = "https://openrouter.ai/api/frontend/all-providers"
+
 # Provider icons — synced into the Open WebUI Models database by
 # _sync_model_icons() so the frontend can serve them via
 # /models/model/profile/image.  Disable with SYNC_PROVIDER_ICONS = False.
+# Hardcoded fast path for top model authors; everything else is auto-discovered
+# via _load_provider_registry().
 # URLs verified against https://openrouter.ai/images/icons/ (May 2025).
 _PROVIDER_ICONS = {
     "openai": "https://openrouter.ai/images/icons/OpenAI.svg",
@@ -70,8 +88,12 @@ def _is_owui_managed_icon(url: str) -> bool:
 
     data: URLs are the pipe's own SVG icon that OWUI assigns as default to all
     manifold child models.  openrouter.ai/images/models/ and
-    openrouter.ai/images/icons/ are the provider icon paths we write (the
-    former was the old path, now superseded by the latter).  Any other URL is
+    openrouter.ai/images/icons/ are the OpenRouter-hosted provider icons we
+    write (the former was the old path, superseded by the latter).
+    t0.gstatic.com/faviconV2 URLs are the gstatic favicons returned by
+    OpenRouter's provider registry for providers without a hosted icon — we
+    write those too as part of icon auto-discovery, so they must remain
+    overwritable when OpenRouter updates its mapping.  Any other URL is
     assumed to be a user-set custom icon and must not be overwritten.
     """
     return (
@@ -79,6 +101,7 @@ def _is_owui_managed_icon(url: str) -> bool:
         or url.startswith("data:")
         or url.startswith("https://openrouter.ai/images/models/")
         or url.startswith("https://openrouter.ai/images/icons/")
+        or url.startswith("https://t0.gstatic.com/faviconV2")
     )
 
 
@@ -126,7 +149,11 @@ def _format_citation_list(citations: Optional[List[str]]) -> str:
 
 
 def _format_cost_info(usage: dict, currency: str = "USD") -> str:
-    """Format token usage and cost from an OpenRouter usage dict."""
+    """Format token usage and cost from an OpenRouter usage dict.
+
+    When the provider reports cached prompt tokens (discounted by up to ~90%,
+    e.g. with Anthropic prompt caching), the breakdown shows the savings.
+    """
     if not usage:
         return ""
     prompt = usage.get("prompt_tokens", 0)
@@ -134,7 +161,23 @@ def _format_cost_info(usage: dict, currency: str = "USD") -> str:
     total = usage.get("total_tokens", 0) or (prompt + completion)
     cost = usage.get("cost")
 
-    token_str = f"{prompt:,} prompt + {completion:,} completion = {total:,} total"
+    # Cached prompt tokens — emitted by Anthropic prompt caching, OpenAI
+    # implicit caching, and Gemini context caching. Shape varies per provider.
+    cached_tokens = 0
+    details = usage.get("prompt_tokens_details") or {}
+    if isinstance(details, dict):
+        cached_tokens = details.get("cached_tokens") or 0
+    if not cached_tokens:
+        cached_tokens = usage.get("cache_read_input_tokens") or 0
+
+    if cached_tokens:
+        non_cached = max(prompt - int(cached_tokens), 0)
+        token_str = (
+            f"{non_cached:,} prompt + {int(cached_tokens):,} cached + "
+            f"{completion:,} completion = {total:,} total"
+        )
+    else:
+        token_str = f"{prompt:,} prompt + {completion:,} completion = {total:,} total"
     parts = [f"**Tokens:** {token_str}"]
 
     if cost is not None:
@@ -156,6 +199,17 @@ def _format_cost_info(usage: dict, currency: str = "USD") -> str:
     return f"\n\n---\n*{' · '.join(parts)}*"
 
 
+def _format_generation_id(generation_id: Optional[str]) -> str:
+    """Format the OpenRouter generation ID footer.
+
+    Users can pass the ID to ``GET /api/v1/generation?id={id}`` to retrieve
+    detailed usage and routing info for any past request.
+    """
+    if not generation_id:
+        return ""
+    return f"\n\n---\n*Generation ID: `{generation_id}`*"
+
+
 def _format_image_output(images: list) -> str:
     """Format OpenRouter image output objects as markdown image tags.
 
@@ -189,23 +243,69 @@ class Valves(BaseModel):
         )
         REASONING_EFFORT: str = Field(
             default=os.getenv("OPENROUTER_REASONING_EFFORT", ""),
-            description="Controls reasoning depth. Works independently of Include Reasoning",
+            description=(
+                "Controls reasoning depth. Works independently of Include Reasoning. "
+                "'minimal' favors fastest output, 'xhigh' requests maximum depth on "
+                "supporting models."
+            ),
             json_schema_extra={
                 "input": {
                     "type": "select",
                     "options": [
                         {"value": "", "label": "Disabled"},
+                        {"value": "minimal", "label": "Minimal"},
                         {"value": "low", "label": "Low"},
                         {"value": "medium", "label": "Medium"},
                         {"value": "high", "label": "High"},
+                        {"value": "xhigh", "label": "Extra High"},
+                    ],
+                }
+            },
+        )
+        REASONING_SUMMARY_MODE: str = Field(
+            default=os.getenv("OPENROUTER_REASONING_SUMMARY_MODE", "disabled"),
+            description=(
+                "Reasoning summary verbosity sent as `reasoning.summary` in the "
+                "request payload. 'disabled' (default) skips the field entirely; "
+                "otherwise supporting models emit an auto/concise/detailed summary "
+                "block alongside their reasoning trace."
+            ),
+            json_schema_extra={
+                "input": {
+                    "type": "select",
+                    "options": [
+                        {"value": "disabled", "label": "Disabled"},
+                        {"value": "auto", "label": "Auto"},
+                        {"value": "concise", "label": "Concise"},
+                        {"value": "detailed", "label": "Detailed"},
                     ],
                 }
             },
         )
+        REASONING_MAX_TOKENS: int = Field(
+            default=int(os.getenv("OPENROUTER_REASONING_MAX_TOKENS") or "0"),
+            ge=0,
+            description=(
+                "Hard cap on reasoning tokens per response (sent as "
+                "`reasoning.max_tokens`). 0 (default) leaves the cap to the "
+                "provider. Useful for budget control on deep-thinking models."
+            ),
+        )
         INCLUDE_REASONING: bool = Field(
             default=os.getenv("OPENROUTER_INCLUDE_REASONING", "true").lower() == "true",
             description="Show model reasoning in <think> blocks. Can be used with or without Reasoning Effort",
         )
+        ENABLE_ANTHROPIC_INTERLEAVED_THINKING: bool = Field(
+            default=os.getenv(
+                "OPENROUTER_ANTHROPIC_INTERLEAVED_THINKING", "true"
+            ).lower()
+            == "true",
+            description=(
+                "When True and the selected model is `anthropic/...`, send the "
+                "`anthropic-beta: interleaved-thinking-2025-05-14` header so Claude "
+                "interleaves reasoning with tool use. No effect on other providers."
+            ),
+        )
         MODEL_PREFIX: Optional[str] = Field(
             default=None, description="Prefix shown before model names (include trailing space if needed, e.g. 'OR: ')"
         )
@@ -218,9 +318,80 @@ class Valves(BaseModel):
             == "true",
             description="When true the provider list becomes an exclusion list",
         )
-        FREE_ONLY: bool = Field(
-            default=os.getenv("OPENROUTER_FREE_ONLY", "false").lower() == "true",
-            description="Show only free-tier models (by suffix :free or zero pricing)",
+        FREE_MODEL_FILTER: str = Field(
+            default=os.getenv("OPENROUTER_FREE_MODEL_FILTER", "all"),
+            description=(
+                "Filter the catalog by free-tier status (':free' suffix or zero "
+                "prompt+completion pricing). 'all' = no filter (default), "
+                "'only' = keep just free models, 'exclude' = hide free models."
+            ),
+            json_schema_extra={
+                "input": {
+                    "type": "select",
+                    "options": [
+                        {"value": "all", "label": "All"},
+                        {"value": "only", "label": "Only free"},
+                        {"value": "exclude", "label": "Exclude free"},
+                    ],
+                }
+            },
+        )
+        TOOL_CALLING_FILTER: str = Field(
+            default=os.getenv("OPENROUTER_TOOL_CALLING_FILTER", "all"),
+            description=(
+                "Filter the catalog by tool-calling capability "
+                "(`supported_parameters` containing `tools` or `tool_choice`). "
+                "'all' (default) keeps everything, 'only' restricts to tool-capable "
+                "models, 'exclude' hides them."
+            ),
+            json_schema_extra={
+                "input": {
+                    "type": "select",
+                    "options": [
+                        {"value": "all", "label": "All"},
+                        {"value": "only", "label": "Only tool-capable"},
+                        {"value": "exclude", "label": "Exclude tool-capable"},
+                    ],
+                }
+            },
+        )
+        MODEL_VARIANTS: str = Field(
+            default=os.getenv("OPENROUTER_MODEL_VARIANTS", ""),
+            description=(
+                "Comma-separated `base_id:variant` entries to expose as virtual "
+                "models that inherit the base model's metadata (name, icon). "
+                "Example: 'openai/gpt-4o:nitro, anthropic/claude-3.5-sonnet:thinking'. "
+                "Recognised tags: free, thinking, online, nitro, exacto, extended. "
+                "OpenRouter routes the suffixed ID specially "
+                "(see https://openrouter.ai/docs/features/preset-routing)."
+            ),
+        )
+        MODEL_CATEGORY: str = Field(
+            default=os.getenv("OPENROUTER_MODEL_CATEGORY", ""),
+            description=(
+                "Server-side category filter for `/models` (passed as "
+                "`?category=...`). Empty disables. Common values: "
+                "programming, roleplay, marketing, marketing/seo, technology, "
+                "science, translation, legal, finance, health, trivia, academia."
+            ),
+        )
+        HIDE_DEPRECATED_MODELS: bool = Field(
+            default=os.getenv("OPENROUTER_HIDE_DEPRECATED_MODELS", "false").lower()
+            == "true",
+            description=(
+                "Hide models with a non-null `expiration_date`. When False "
+                "(default), deprecated models stay visible but are tagged with "
+                "a ⚠ prefix in the display name."
+            ),
+        )
+        OUTPUT_MODALITIES: str = Field(
+            default=os.getenv("OPENROUTER_OUTPUT_MODALITIES", "all"),
+            description=(
+                "Output modalities to fetch from OpenRouter's /models endpoint. "
+                "'all' (default) lists every model — chat, TTS, audio, image, and embeddings. "
+                "Use 'text' for chat-only, or a comma list e.g. 'text,audio'. "
+                "Valid tokens: text, image, audio, embeddings, all."
+            ),
         )
         PROVIDER_SORT: str = Field(
             default=os.getenv("OPENROUTER_PROVIDER_SORT", ""),
@@ -245,6 +416,67 @@ class Valves(BaseModel):
             default=os.getenv("OPENROUTER_PROVIDER_IGNORE", ""),
             description="Excluded providers, comma-separated",
         )
+        PROVIDER_ONLY: str = Field(
+            default=os.getenv("OPENROUTER_PROVIDER_ONLY", ""),
+            description=(
+                "Allowlist of provider slugs to use (comma-separated). When "
+                "set, OpenRouter routes only to these providers. Merged with "
+                "your account-wide allowlist."
+            ),
+        )
+        PROVIDER_QUANTIZATIONS: str = Field(
+            default=os.getenv("OPENROUTER_PROVIDER_QUANTIZATIONS", ""),
+            description=(
+                "Comma-separated quantization filters (e.g. 'bf16,fp8'). Only "
+                "endpoints serving the model at one of these precisions will "
+                "be used. Common values: bf16, fp16, fp8, int8, int4."
+            ),
+        )
+        PROVIDER_ALLOW_FALLBACKS: bool = Field(
+            default=os.getenv("OPENROUTER_PROVIDER_ALLOW_FALLBACKS", "true").lower()
+            == "true",
+            description=(
+                "When True (default), OpenRouter falls back to alternate "
+                "providers if the primary one (or those in PROVIDER_ORDER) is "
+                "unavailable. Set False to fail fast on the primary provider."
+            ),
+        )
+        PROVIDER_MAX_PRICE_PROMPT: str = Field(
+            default=os.getenv("OPENROUTER_PROVIDER_MAX_PRICE_PROMPT", ""),
+            description=(
+                "Maximum prompt price (USD per 1M tokens) you accept for this "
+                "request, e.g. '3.0'. Empty disables. Sent as "
+                "`provider.max_price.prompt`."
+            ),
+        )
+        PROVIDER_MAX_PRICE_COMPLETION: str = Field(
+            default=os.getenv("OPENROUTER_PROVIDER_MAX_PRICE_COMPLETION", ""),
+            description=(
+                "Maximum completion price (USD per 1M tokens) you accept for "
+                "this request, e.g. '15.0'. Empty disables. Sent as "
+                "`provider.max_price.completion`."
+            ),
+        )
+        SERVICE_TIER: str = Field(
+            default=os.getenv("OPENROUTER_SERVICE_TIER", ""),
+            description=(
+                "OpenAI-style service tier hint forwarded to compatible "
+                "providers. Empty (default) leaves the choice to the provider."
+            ),
+            json_schema_extra={
+                "input": {
+                    "type": "select",
+                    "options": [
+                        {"value": "", "label": "Unset (provider decides)"},
+                        {"value": "auto", "label": "Auto"},
+                        {"value": "default", "label": "Default tier"},
+                        {"value": "flex", "label": "Flex (cheaper, slower)"},
+                        {"value": "priority", "label": "Priority (faster)"},
+                        {"value": "scale", "label": "Scale"},
+                    ],
+                }
+            },
+        )
         REQUIRE_PARAMETERS: bool = Field(
             default=os.getenv("OPENROUTER_REQUIRE_PARAMETERS", "false").lower()
             == "true",
@@ -272,11 +504,89 @@ class Valves(BaseModel):
             == "true",
             description="Automatically compress long conversations that exceed the model's context window by summarizing middle messages",
         )
+        ENABLE_WEB_SEARCH: bool = Field(
+            default=os.getenv("OPENROUTER_ENABLE_WEB_SEARCH", "false").lower()
+            == "true",
+            description=(
+                "Attach OpenRouter's `web` plugin to every request so the "
+                "model can ground answers in fresh web results. Overlaps with "
+                "the `:online` variant tag (provider-side), so pick one. "
+                "OpenRouter charges per search call separately from tokens."
+            ),
+        )
+        WEB_SEARCH_MAX_RESULTS: int = Field(
+            default=int(os.getenv("OPENROUTER_WEB_SEARCH_MAX_RESULTS", "5")),
+            ge=1,
+            le=20,
+            description="Maximum number of search results returned to the model when ENABLE_WEB_SEARCH is on.",
+        )
+        WEB_SEARCH_PROMPT: str = Field(
+            default=os.getenv("OPENROUTER_WEB_SEARCH_PROMPT", ""),
+            description=(
+                "Optional custom search prompt forwarded to the search engine "
+                "(`plugins[].search_prompt`). Empty uses OpenRouter's default."
+            ),
+        )
+        WEB_SEARCH_INCLUDE_DOMAINS: str = Field(
+            default=os.getenv("OPENROUTER_WEB_SEARCH_INCLUDE_DOMAINS", ""),
+            description=(
+                "Comma-separated domain allowlist for web search. Wildcards "
+                "and path filters supported (e.g. '*.substack.com, "
+                "openai.com/blog')."
+            ),
+        )
+        WEB_SEARCH_EXCLUDE_DOMAINS: str = Field(
+            default=os.getenv("OPENROUTER_WEB_SEARCH_EXCLUDE_DOMAINS", ""),
+            description="Comma-separated domain denylist for web search (same format as include list).",
+        )
         ENABLE_CACHE_CONTROL: bool = Field(
             default=os.getenv("OPENROUTER_ENABLE_CACHE_CONTROL", "false").lower()
             == "true",
             description="Enable prompt caching for Anthropic models (reduces cost on repeated long prompts). No effect on other providers",
         )
+        ANTHROPIC_PROMPT_CACHE_TTL: str = Field(
+            default=os.getenv("OPENROUTER_ANTHROPIC_PROMPT_CACHE_TTL", "5m"),
+            description=(
+                "TTL for the Anthropic ephemeral cache breakpoint when "
+                "ENABLE_CACHE_CONTROL is on. '5m' (default) keeps the standard "
+                "short-lived cache; '1h' costs more on cache writes but persists "
+                "longer between turns."
+            ),
+            json_schema_extra={
+                "input": {
+                    "type": "select",
+                    "options": [
+                        {"value": "5m", "label": "5 minutes"},
+                        {"value": "1h", "label": "1 hour"},
+                    ],
+                }
+            },
+        )
+        ZDR_ENFORCE: bool = Field(
+            default=os.getenv("OPENROUTER_ZDR_ENFORCE", "false").lower() == "true",
+            description=(
+                "When True, every chat request includes `provider.zdr=true` so "
+                "OpenRouter rejects the call unless a Zero Data Retention "
+                "endpoint is available for the chosen model."
+            ),
+        )
+        ZDR_MODELS_ONLY: bool = Field(
+            default=os.getenv("OPENROUTER_ZDR_MODELS_ONLY", "false").lower() == "true",
+            description=(
+                "Catalog-side filter: when True, fetch OpenRouter's "
+                "`/endpoints/zdr` list and hide models without any ZDR-capable "
+                "endpoint. Pairs well with ZDR_ENFORCE for end-to-end privacy "
+                "guarantees."
+            ),
+        )
+        HTTP_REFERER_OVERRIDE: str = Field(
+            default=os.getenv("OPENROUTER_HTTP_REFERER", ""),
+            description=(
+                "Override the `HTTP-Referer` header sent to OpenRouter for app "
+                "attribution (must be a full URL with scheme). Empty falls back "
+                "to WEBUI_URL or http://localhost:3000."
+            ),
+        )
         SYNC_PROVIDER_ICONS: bool = Field(
             default=os.getenv("OPENROUTER_SYNC_ICONS", "true").lower() == "true",
             description="Automatically sync provider icons into Open WebUI's model database so they appear in the UI",
@@ -293,6 +603,15 @@ class Valves(BaseModel):
             default=False,
             description="Append token usage and cost to each response",
         )
+        SHOW_GENERATION_ID: bool = Field(
+            default=os.getenv("OPENROUTER_SHOW_GENERATION_ID", "false").lower()
+            == "true",
+            description=(
+                "Append the OpenRouter generation ID to each response so it "
+                "can be looked up later via `GET /generation?id=...` for "
+                "audit trails and per-request usage details."
+            ),
+        )
         COST_CURRENCY: str = Field(
             default=os.getenv("OPENROUTER_COST_CURRENCY", "USD"),
             description="Currency label shown in cost display (display only; OpenRouter bills in USD)",
@@ -332,6 +651,12 @@ def __init__(self) -> None:
         self._models_cache_key: str = ""
         # Track which model IDs already have icons synced (avoids repeated DB writes)
         self._icons_synced: set = set()
+        # Lazy-loaded mirror of OpenRouter's provider registry (slug → icon URL).
+        # None = not attempted; {} = attempted but failed/empty (do not retry).
+        self._provider_registry: Optional[dict] = None
+        # Lazy-loaded set of model IDs that have at least one ZDR endpoint.
+        # None = not attempted; frozenset() = attempted but failed/empty.
+        self._zdr_model_ids: Optional[frozenset] = None
         # Cache function_id once: OWUI sets __module__ to "function_{id}" at load time
         _fm = type(self).__module__ or ""
         self._function_id: Optional[str] = (
@@ -367,9 +692,12 @@ def _build_cache_key(self) -> str:
             else ""
         )
         return (
-            f"{api_key_hash}|{self.valves.FREE_ONLY}|"
+            f"{api_key_hash}|{self.valves.FREE_MODEL_FILTER}|"
             f"{self.valves.MODEL_PROVIDERS}|{self.valves.INVERT_PROVIDER_LIST}|"
-            f"{self.valves.MODEL_PREFIX}"
+            f"{self.valves.MODEL_PREFIX}|{self.valves.OUTPUT_MODALITIES}|"
+            f"{self.valves.TOOL_CALLING_FILTER}|{self.valves.ZDR_MODELS_ONLY}|"
+            f"{self.valves.MODEL_VARIANTS}|{self.valves.MODEL_CATEGORY}|"
+            f"{self.valves.HIDE_DEPRECATED_MODELS}"
         )
 
     def _models_cache_valid(self) -> bool:
@@ -395,10 +723,18 @@ def pipes(self) -> List[dict]:
             return self._models_cache
 
         headers = self._build_headers(include_content_type=False)
+        modalities = (self.valves.OUTPUT_MODALITIES or "all").strip() or "all"
+        params: dict = {"output_modalities": modalities}
+        category = (self.valves.MODEL_CATEGORY or "").strip()
+        if category:
+            params["category"] = category
         response = None
         try:
             response = self._session.get(
-                self.models_url, headers=headers, timeout=self.valves.REQUEST_TIMEOUT
+                self.models_url,
+                headers=headers,
+                params=params,
+                timeout=self.valves.REQUEST_TIMEOUT,
             )
             # Detect auth errors from the models endpoint itself
             # 502 from Clerk usually means the key format is invalid
@@ -438,6 +774,12 @@ def pipes(self) -> List[dict]:
 
         provider_filter = self._parse_provider_filter()
         prefix = self.valves.MODEL_PREFIX or ""
+        free_filter = (self.valves.FREE_MODEL_FILTER or "all").strip().lower()
+        tool_filter = (self.valves.TOOL_CALLING_FILTER or "all").strip().lower()
+        zdr_only = self.valves.ZDR_MODELS_ONLY
+        zdr_capable_ids: Optional[frozenset] = (
+            self._load_zdr_model_ids() if zdr_only else None
+        )
         models: List[dict] = []
 
         for model in data:
@@ -445,7 +787,7 @@ def pipes(self) -> List[dict]:
             if not model_id:
                 continue
 
-            if self.valves.FREE_ONLY:
+            if free_filter in ("only", "exclude"):
                 is_free = ":free" in model_id.lower()
                 if not is_free:
                     pricing = model.get("pricing") or {}
@@ -456,9 +798,37 @@ def pipes(self) -> List[dict]:
                         )
                     except (ValueError, TypeError):
                         is_free = False
-                if not is_free:
+                if free_filter == "only" and not is_free:
+                    continue
+                if free_filter == "exclude" and is_free:
+                    continue
+
+            if tool_filter in ("only", "exclude"):
+                supported = model.get("supported_parameters") or []
+                tool_capable = any(
+                    p in supported for p in ("tools", "tool_choice")
+                )
+                if tool_filter == "only" and not tool_capable:
+                    continue
+                if tool_filter == "exclude" and tool_capable:
                     continue
 
+            if zdr_only and zdr_capable_ids is not None:
+                # OpenRouter's /endpoints/zdr returns base IDs (no '~' alias prefix
+                # and no ':variant' suffix). Strip both before comparing.
+                base_id = model_id.lstrip("~").split(":", 1)[0]
+                if base_id not in zdr_capable_ids:
+                    continue
+
+            # Deprecation handling: a non-null `expiration_date` means
+            # OpenRouter has scheduled the model for removal. Hide the entry
+            # entirely when the operator opts in; otherwise keep it but tag
+            # the display name so users notice before relying on it.
+            expiration = model.get("expiration_date")
+            is_deprecated = expiration is not None and str(expiration).strip() != ""
+            if is_deprecated and self.valves.HIDE_DEPRECATED_MODELS:
+                continue
+
             # Split model_id once for provider extraction.
             # Strip leading '~' (OpenRouter "latest" aliases like ~anthropic/claude-haiku-latest)
             # so they match the same provider filter as their base provider.
@@ -471,6 +841,8 @@ def pipes(self) -> List[dict]:
                     continue
 
             model_name = model.get("name", model_id)
+            if is_deprecated:
+                model_name = f"⚠ {model_name} (deprecated)"
 
             model_dict = {
                 "id": model_id,
@@ -479,9 +851,18 @@ def pipes(self) -> List[dict]:
 
             models.append(model_dict)
 
+        # Append virtual variant entries (e.g. openai/gpt-4o:nitro). Variants
+        # inherit the base model's display name; only the suffix and a tag
+        # label change — the icon-sync step writes the same provider icon.
+        models = self._expand_variant_models(models, prefix)
+
         if not models:
-            if self.valves.FREE_ONLY:
-                error_text = "No free models available. Disable FREE_ONLY to see paid models."
+            if free_filter == "only":
+                error_text = "No free models available. Set FREE_MODEL_FILTER to 'all' to see paid models."
+            elif tool_filter == "only":
+                error_text = "No tool-capable models available. Set TOOL_CALLING_FILTER to 'all' to broaden the catalog."
+            elif zdr_only:
+                error_text = "No ZDR-capable models available. Disable ZDR_MODELS_ONLY or check your OpenRouter privacy settings."
             elif provider_filter:
                 providers_str = ", ".join(sorted(provider_filter))
                 error_text = f"No models match providers: {providers_str}. Check MODEL_PROVIDERS setting."
@@ -554,7 +935,7 @@ async def pipe(
             )
 
         payload = self._prepare_payload(body)
-        headers = self._build_headers()
+        headers = self._build_headers(model_id=payload.get("model"))
         stream = body.get("stream", False)
 
         if stream:
@@ -630,7 +1011,7 @@ def _sync_model_icons(self, models: List[dict]) -> None:
             # ~anthropic/claude-haiku-latest) resolve to the correct icon.
             parts = model_id.split("/", 1)
             provider_key = parts[0].lstrip("~").lower() if len(parts) > 1 else ""
-            icon_url = _PROVIDER_ICONS.get(provider_key)
+            icon_url = self._get_provider_icon(provider_key)
             # Build the prefixed ID that Open WebUI uses in the frontend
             db_model_id = f"{function_id}.{model_id}"
 
@@ -730,9 +1111,71 @@ def _sync_model_icons(self, models: List[dict]) -> None:
 
     @staticmethod
     def get_provider_icon(provider: str) -> Optional[str]:
-        """Return icon URL for the given provider."""
+        """Return hardcoded icon URL for the given provider (fast path only).
+
+        Does not consult the dynamic OpenRouter provider registry — for that,
+        use ``_get_provider_icon`` on a Pipe instance.
+        """
         return _PROVIDER_ICONS.get(provider.lower())
 
+    def _load_provider_registry(self) -> dict:
+        """Lazy-load OpenRouter's provider registry, cache for the pipe lifetime.
+
+        Returns ``{slug: icon_url}`` (with each slug also indexed under its
+        hyphen-stripped variant so e.g. ``x-ai`` resolves to the registry's
+        ``xai`` entry). Network failures are logged but non-fatal: an empty dict
+        is cached and the pipe falls back to the hardcoded ``_PROVIDER_ICONS``.
+        """
+        if self._provider_registry is not None:
+            return self._provider_registry
+
+        registry: dict = {}
+        try:
+            resp = self._session.get(
+                _PROVIDER_REGISTRY_URL,
+                timeout=min(self.valves.REQUEST_TIMEOUT, 15),
+            )
+            try:
+                if resp.status_code == 200:
+                    data = resp.json().get("data") or []
+                    for entry in data:
+                        slug = (entry or {}).get("slug") or ""
+                        icon = ((entry or {}).get("icon") or {}).get("url") or ""
+                        if not slug or not icon:
+                            continue
+                        if icon.startswith("/"):
+                            icon = f"https://openrouter.ai{icon}"
+                        if not _is_safe_url(icon):
+                            continue
+                        registry[slug] = icon
+                        # Also index by hyphen-stripped slug — model-author IDs
+                        # like ``x-ai`` map to provider slug ``xai``.
+                        compact = slug.replace("-", "")
+                        if compact and compact != slug:
+                            registry.setdefault(compact, icon)
+            finally:
+                resp.close()
+        except Exception as exc:  # pragma: no cover
+            print(f"[OpenRouter Pipe] Provider registry fetch failed: {exc}")
+
+        self._provider_registry = registry
+        return registry
+
+    def _get_provider_icon(self, provider_key: str) -> Optional[str]:
+        """Resolve a provider icon URL using the layered fallback chain.
+
+        Order: hardcoded ``_PROVIDER_ICONS`` → registry exact match →
+        registry hyphen-stripped match. Returns ``None`` if no source has it.
+        """
+        if not provider_key:
+            return None
+        key = provider_key.lower()
+        icon = _PROVIDER_ICONS.get(key)
+        if icon:
+            return icon
+        registry = self._load_provider_registry()
+        return registry.get(key) or registry.get(key.replace("-", "")) or None
+
     def _parse_provider_filter(self) -> Optional[set]:
         """Parse MODEL_PROVIDERS valve into a set of lowercase provider names."""
         val = (self.valves.MODEL_PROVIDERS or "").strip()
@@ -747,6 +1190,142 @@ def _parse_csv(value: str) -> List[str]:
             return []
         return [item.strip() for item in value.split(",") if item.strip()]
 
+    def _load_zdr_model_ids(self) -> frozenset:
+        """Lazy-load OpenRouter's ZDR-capable model IDs and cache for the pipe lifetime.
+
+        Returns the cached set on subsequent calls (including the empty-set
+        sentinel returned on network failure, so we don't retry on every
+        ``pipes()`` call). The endpoint returns a list of model IDs that have
+        at least one Zero Data Retention provider endpoint.
+        """
+        if self._zdr_model_ids is not None:
+            return self._zdr_model_ids
+
+        ids: set = set()
+        try:
+            resp = self._session.get(
+                f"{self._base}{_API_PATH_ZDR_ENDPOINTS}",
+                headers=self._build_headers(include_content_type=False),
+                timeout=min(self.valves.REQUEST_TIMEOUT, 30),
+            )
+            try:
+                if resp.status_code == 200:
+                    payload = resp.json() or {}
+                    raw = payload.get("data") or payload.get("models") or []
+                    for entry in raw:
+                        if isinstance(entry, str):
+                            ids.add(entry)
+                        elif isinstance(entry, dict):
+                            mid = entry.get("id") or entry.get("model")
+                            if isinstance(mid, str) and mid:
+                                ids.add(mid)
+            finally:
+                resp.close()
+        except Exception as exc:  # pragma: no cover
+            print(f"[OpenRouter Pipe] ZDR endpoint fetch failed: {exc}")
+
+        self._zdr_model_ids = frozenset(ids)
+        return self._zdr_model_ids
+
+    def _parse_variant_specs(self) -> List[tuple]:
+        """Parse MODEL_VARIANTS into ``(base_id, variant_tag)`` pairs.
+
+        Tags are validated against ``_RECOGNISED_VARIANT_TAGS`` so we don't
+        accidentally fabricate IDs OpenRouter wouldn't honour. Specs without
+        a colon and unknown tags are skipped with a console note.
+        """
+        raw = self.valves.MODEL_VARIANTS or ""
+        out: List[tuple] = []
+        for spec in self._parse_csv(raw):
+            if ":" not in spec:
+                print(f"[OpenRouter Pipe] Skipping malformed variant spec '{spec}' (expected base_id:variant_tag)")
+                continue
+            base_id, _, tag = spec.rpartition(":")
+            base_id = base_id.strip()
+            tag = tag.strip().lower()
+            if not base_id or not tag:
+                continue
+            if tag not in _RECOGNISED_VARIANT_TAGS:
+                print(
+                    f"[OpenRouter Pipe] Skipping unknown variant tag ':{tag}' "
+                    f"(supported: {', '.join(sorted(_RECOGNISED_VARIANT_TAGS))})"
+                )
+                continue
+            out.append((base_id, tag))
+        return out
+
+    def _build_web_search_plugin(self) -> Optional[dict]:
+        """Assemble the OpenRouter `web` plugin spec from valve settings.
+
+        Returns ``None`` when the feature is disabled. Output mirrors the
+        WebSearchPlugin schema from the official SDK
+        (id/enabled/max_results/search_prompt/include_domains/exclude_domains).
+        """
+        if not self.valves.ENABLE_WEB_SEARCH:
+            return None
+        plugin: dict = {"id": "web"}
+        max_results = self.valves.WEB_SEARCH_MAX_RESULTS
+        if max_results:
+            plugin["max_results"] = int(max_results)
+        prompt = (self.valves.WEB_SEARCH_PROMPT or "").strip()
+        if prompt:
+            plugin["search_prompt"] = prompt
+        include = self._parse_csv(self.valves.WEB_SEARCH_INCLUDE_DOMAINS)
+        if include:
+            plugin["include_domains"] = include
+        exclude = self._parse_csv(self.valves.WEB_SEARCH_EXCLUDE_DOMAINS)
+        if exclude:
+            plugin["exclude_domains"] = exclude
+        return plugin
+
+    def _expand_variant_models(self, models: List[dict], prefix: str) -> List[dict]:
+        """Append virtual variant entries to the catalog.
+
+        Each ``base_id:variant`` entry inherits the base model's display name
+        (with the tag appended) and reuses the same provider icon. The ID
+        carries the variant suffix so OpenRouter routes the request accordingly.
+        Variants whose base model isn't in the catalog (filtered out, or
+        unknown to OpenRouter) are skipped with a console note.
+        """
+        specs = self._parse_variant_specs()
+        if not specs:
+            return models
+
+        prefix_str = prefix or ""
+        # Index catalog entries by ID so each variant can look up its base model.
+        by_id: dict = {}
+        for entry in models:
+            mid = entry.get("id")
+            if isinstance(mid, str):
+                by_id[mid] = entry
+
+        seen_variant_ids = {entry.get("id") for entry in models}
+        appended: List[dict] = []
+        for base_id, tag in specs:
+            base_entry = by_id.get(base_id)
+            if base_entry is None:
+                print(
+                    f"[OpenRouter Pipe] Variant base not in catalog: "
+                    f"{base_id} (skipping :{tag})"
+                )
+                continue
+            variant_id = f"{base_id}:{tag}"
+            if variant_id in seen_variant_ids:
+                continue
+            base_name = base_entry.get("name", base_id)
+            # If the user set a prefix it's already in base_name; we only need
+            # to suffix the tag label.
+            tag_label = tag.capitalize()
+            appended.append(
+                {
+                    "id": variant_id,
+                    "name": f"{base_name} {tag_label}",
+                }
+            )
+            seen_variant_ids.add(variant_id)
+
+        return models + appended
+
     def _prepare_payload(self, body: dict) -> dict:
         """Sanitize OWUI internals and inject provider routing, reasoning, and fallbacks."""
         payload = copy.deepcopy(body)
@@ -769,8 +1348,21 @@ def _prepare_payload(self, body: dict) -> dict:
             payload["include_reasoning"] = True
 
         effort = self.valves.REASONING_EFFORT.strip().lower()
-        if effort in ("low", "medium", "high"):
-            payload["reasoning"] = {"effort": effort}
+        summary = self.valves.REASONING_SUMMARY_MODE.strip().lower()
+        reasoning_cfg: dict = {}
+        if effort in ("minimal", "low", "medium", "high", "xhigh"):
+            reasoning_cfg["effort"] = effort
+        if summary in ("auto", "concise", "detailed"):
+            reasoning_cfg["summary"] = summary
+        if self.valves.REASONING_MAX_TOKENS > 0:
+            reasoning_cfg["max_tokens"] = int(self.valves.REASONING_MAX_TOKENS)
+        if reasoning_cfg:
+            payload["reasoning"] = reasoning_cfg
+
+        # --- Service tier ---
+        tier = (self.valves.SERVICE_TIER or "").strip().lower()
+        if tier in ("auto", "default", "flex", "priority", "scale"):
+            payload["service_tier"] = tier
 
         # --- Provider routing ---
         provider: dict = {}
@@ -787,6 +1379,29 @@ def _prepare_payload(self, body: dict) -> dict:
         if ignore:
             provider["ignore"] = ignore
 
+        only = self._parse_csv(self.valves.PROVIDER_ONLY)
+        if only:
+            provider["only"] = only
+
+        quantizations = self._parse_csv(self.valves.PROVIDER_QUANTIZATIONS)
+        if quantizations:
+            provider["quantizations"] = [q.lower() for q in quantizations]
+
+        # `allow_fallbacks` defaults to true on OpenRouter, so only emit the
+        # field when the operator opted out.
+        if not self.valves.PROVIDER_ALLOW_FALLBACKS:
+            provider["allow_fallbacks"] = False
+
+        max_price: dict = {}
+        prompt_cap = (self.valves.PROVIDER_MAX_PRICE_PROMPT or "").strip()
+        if prompt_cap:
+            max_price["prompt"] = prompt_cap
+        completion_cap = (self.valves.PROVIDER_MAX_PRICE_COMPLETION or "").strip()
+        if completion_cap:
+            max_price["completion"] = completion_cap
+        if max_price:
+            provider["max_price"] = max_price
+
         if self.valves.REQUIRE_PARAMETERS:
             provider["require_parameters"] = True
 
@@ -794,6 +1409,12 @@ def _prepare_payload(self, body: dict) -> dict:
         if dc == "deny":
             provider["data_collection"] = "deny"
 
+        # ZDR enforcement: forces OpenRouter to route only to Zero Data
+        # Retention endpoints; the call fails fast if none exist for the
+        # selected model.
+        if self.valves.ZDR_ENFORCE:
+            provider["zdr"] = True
+
         if provider:
             payload["provider"] = provider
 
@@ -813,6 +1434,23 @@ def _prepare_payload(self, body: dict) -> dict:
         if self.valves.ENABLE_MIDDLE_OUT:
             payload["transforms"] = ["middle-out"]
 
+        # --- Web search plugin ---
+        # Append (don't overwrite) so the user can stack additional plugins
+        # via the request body. Skip silently if a `web` plugin is already
+        # present: the entry already in the payload takes precedence.
+        web_plugin = self._build_web_search_plugin()
+        if web_plugin is not None:
+            existing_plugins = payload.get("plugins")
+            if not isinstance(existing_plugins, list):
+                existing_plugins = []
+            already_has_web = any(
+                isinstance(p, dict) and p.get("id") == "web"
+                for p in existing_plugins
+            )
+            if not already_has_web:
+                existing_plugins.append(web_plugin)
+                payload["plugins"] = existing_plugins
+
         # --- Cache control (Anthropic) ---
         if self.valves.ENABLE_CACHE_CONTROL:
             self._inject_cache_control(payload)
@@ -824,8 +1462,13 @@ def _inject_cache_control(self, payload: dict) -> None:
 
         Applies to the first matching role (system, then user) with list-type
         content. Only one chunk is tagged ('first match wins') to avoid
-        excessive cache entries.
+        excessive cache entries. The TTL valve (5m/1h) is propagated into the
+        breakpoint so longer-lived caches are honoured by Anthropic.
         """
+        ttl = (self.valves.ANTHROPIC_PROMPT_CACHE_TTL or "").strip().lower()
+        cache_payload: dict = {"type": "ephemeral"}
+        if ttl in ("5m", "1h"):
+            cache_payload["ttl"] = ttl
         try:
             messages = payload.get("messages", [])
             for role in ("system", "user"):
@@ -841,20 +1484,62 @@ def _inject_cache_control(self, payload: dict) -> None:
                         if length > longest_len:
                             longest_idx, longest_len = idx, length
                     if longest_idx >= 0:
-                        content[longest_idx]["cache_control"] = {"type": "ephemeral"}
+                        content[longest_idx]["cache_control"] = dict(cache_payload)
                         return
         except Exception as exc:  # pragma: no cover
             print(f"[OpenRouter Pipe] cache_control not applied: {exc}")
 
-    def _build_headers(self, include_content_type: bool = True) -> dict:
-        """Build HTTP headers for OpenRouter API requests."""
+    @staticmethod
+    def _is_anthropic_model(model_id: str) -> bool:
+        """Return True if the (possibly '~'-aliased) model ID starts with 'anthropic/'."""
+        if not isinstance(model_id, str):
+            return False
+        # Strip leading '~' (latest aliases) before the prefix check.
+        return model_id.lstrip("~").lower().startswith("anthropic/")
+
+    def _resolve_referer(self) -> str:
+        """Pick the HTTP-Referer header sent to OpenRouter.
+
+        Order: explicit valve override → cached WEBUI_URL env → default.
+        An override is used only if it starts with http:// or https://;
+        anything else is ignored so a misconfigured valve never breaks requests.
+        """
+        override = (self.valves.HTTP_REFERER_OVERRIDE or "").strip()
+        if override.startswith(("http://", "https://")):
+            return override
+        return self._referer
+
+    def _build_headers(
+        self,
+        include_content_type: bool = True,
+        *,
+        model_id: Optional[str] = None,
+    ) -> dict:
+        """Build HTTP headers for OpenRouter API requests.
+
+        ``model_id`` is the (post-clean) ID about to be invoked; passing it
+        lets us inject provider-specific beta headers (e.g. Anthropic's
+        interleaved-thinking) only when relevant.
+        """
         headers = {
             "Authorization": f"Bearer {self.valves.OPENROUTER_API_KEY}",
-            "HTTP-Referer": self._referer,
+            "HTTP-Referer": self._resolve_referer(),
             "X-Title": self._title,
         }
         if include_content_type:
             headers["Content-Type"] = "application/json"
+
+        if (
+            model_id
+            and self.valves.ENABLE_ANTHROPIC_INTERLEAVED_THINKING
+            and self._is_anthropic_model(model_id)
+        ):
+            existing = headers.get("anthropic-beta", "")
+            features = [p.strip() for p in existing.split(",") if p.strip()]
+            if _ANTHROPIC_INTERLEAVED_THINKING_BETA not in features:
+                features.append(_ANTHROPIC_INTERLEAVED_THINKING_BETA)
+            headers["anthropic-beta"] = ",".join(features)
+
         return headers
 
     def _non_stream_response(self, headers: dict, payload: dict) -> str:
@@ -919,6 +1604,11 @@ def _non_stream_response(self, headers: dict, payload: dict) -> str:
                 if cost_info:
                     final_parts.append(cost_info)
 
+            if self.valves.SHOW_GENERATION_ID:
+                gen_footer = _format_generation_id(res.get("id"))
+                if gen_footer:
+                    final_parts.append(gen_footer)
+
             return "".join(final_parts)
         except requests.exceptions.Timeout:
             return f"OpenRouter Error: Request timed out after {self.valves.REQUEST_TIMEOUT}s. Try increasing REQUEST_TIMEOUT or retry."
@@ -937,6 +1627,7 @@ def _stream_response(
         in_think = False
         latest_citations: List[str] = []
         latest_usage: dict = {}
+        latest_generation_id: Optional[str] = None
 
         def _close_think_tag():
             nonlocal in_think
@@ -972,6 +1663,11 @@ def _close_think_tag():
                     yield f"\n\nOpenRouter Error: {msg}"
                     return
 
+                # Generation ID arrives on the first chunk and stays stable.
+                gen_id = chunk.get("id")
+                if gen_id and not latest_generation_id:
+                    latest_generation_id = gen_id
+
                 usage_data = chunk.get("usage")
                 if usage_data:
                     latest_usage = usage_data
@@ -1016,6 +1712,11 @@ def _close_think_tag():
                 cost_info = _format_cost_info(latest_usage, self.valves.COST_CURRENCY)
                 if cost_info:
                     yield cost_info
+
+            if self.valves.SHOW_GENERATION_ID:
+                gen_footer = _format_generation_id(latest_generation_id)
+                if gen_footer:
+                    yield gen_footer
         except requests.exceptions.Timeout:
             close_tag = _close_think_tag()
             if close_tag:
diff --git a/test_pipe.py b/test_pipe.py
index bcd2319..2f0385f 100644
--- a/test_pipe.py
+++ b/test_pipe.py
@@ -116,11 +116,12 @@ def _section(title: str):
     "OPENROUTER_API_KEY", "OPENROUTER_BASE_URL",
     "OPENROUTER_REASONING_EFFORT", "OPENROUTER_INCLUDE_REASONING",
     "OPENROUTER_MODEL_PROVIDERS", "OPENROUTER_INVERT_PROVIDER_LIST",
-    "OPENROUTER_FREE_ONLY", "OPENROUTER_PROVIDER_SORT",
+    "OPENROUTER_FREE_MODEL_FILTER", "OPENROUTER_PROVIDER_SORT",
     "OPENROUTER_PROVIDER_ORDER", "OPENROUTER_PROVIDER_IGNORE",
     "OPENROUTER_REQUIRE_PARAMETERS", "OPENROUTER_DATA_COLLECTION",
     "OPENROUTER_FALLBACK_MODELS", "OPENROUTER_ENABLE_MIDDLE_OUT",
     "OPENROUTER_ENABLE_CACHE_CONTROL", "OPENROUTER_REQUEST_TIMEOUT",
+    "OPENROUTER_OUTPUT_MODALITIES",
 ]:
     _env_backup[k] = os.environ.pop(k, None)
 
@@ -134,7 +135,15 @@ def _section(title: str):
 _assert(v.INCLUDE_REASONING is True, "include_reasoning True by default")
 _assert(v.MODEL_PREFIX is None, "prefix None by default")
 _assert(v.MODEL_PROVIDERS == "ALL", "MODEL_PROVIDERS default is ALL")
-_assert(v.FREE_ONLY is False, "FREE_ONLY false")
+_assert(v.FREE_MODEL_FILTER == "all", "FREE_MODEL_FILTER default is 'all'")
+_assert(v.TOOL_CALLING_FILTER == "all", "TOOL_CALLING_FILTER default is 'all'")
+_assert(v.MODEL_VARIANTS == "", "MODEL_VARIANTS default empty")
+_assert(v.ZDR_MODELS_ONLY is False, "ZDR_MODELS_ONLY default False")
+_assert(v.ZDR_ENFORCE is False, "ZDR_ENFORCE default False")
+_assert(v.REASONING_SUMMARY_MODE == "disabled", "REASONING_SUMMARY_MODE default 'disabled'")
+_assert(v.ENABLE_ANTHROPIC_INTERLEAVED_THINKING is True, "interleaved thinking default True")
+_assert(v.ANTHROPIC_PROMPT_CACHE_TTL == "5m", "ANTHROPIC_PROMPT_CACHE_TTL default '5m'")
+_assert(v.HTTP_REFERER_OVERRIDE == "", "HTTP_REFERER_OVERRIDE default empty")
 _assert(v.PROVIDER_SORT == "", "PROVIDER_SORT empty")
 _assert(v.PROVIDER_ORDER == "", "PROVIDER_ORDER empty")
 _assert(v.PROVIDER_IGNORE == "", "PROVIDER_IGNORE empty")
@@ -147,6 +156,7 @@ def _section(title: str):
 _assert(v.MAX_RETRIES == 2, "MAX_RETRIES 2")
 _assert(v.SHOW_COST_INFO is False, "SHOW_COST_INFO false by default")
 _assert(v.COST_CURRENCY == "USD", "COST_CURRENCY USD by default")
+_assert(v.OUTPUT_MODALITIES == "all", "OUTPUT_MODALITIES default 'all' (full catalog)")
 
 try:
     Pipe.Valves(REQUEST_TIMEOUT=-1)
@@ -260,6 +270,74 @@ def _section(title: str):
 payload4 = pipe2._prepare_payload(body4)
 _assert(payload4["model"] == "openai/gpt-4o", "model without dot left unchanged")
 
+# ── 5d. Extended REASONING_EFFORT levels (minimal, xhigh) ──
+_pipe5d = Pipe()
+_pipe5d.valves = Pipe.Valves(OPENROUTER_API_KEY="k", REASONING_EFFORT="minimal")
+_p5d = _pipe5d._prepare_payload({"model": "openai/o1", "messages": []})
+_assert(_p5d.get("reasoning") == {"effort": "minimal"}, "REASONING_EFFORT='minimal' sent verbatim")
+
+_pipe5d.valves = Pipe.Valves(OPENROUTER_API_KEY="k", REASONING_EFFORT="xhigh")
+_p5d = _pipe5d._prepare_payload({"model": "openai/o1", "messages": []})
+_assert(_p5d.get("reasoning") == {"effort": "xhigh"}, "REASONING_EFFORT='xhigh' sent verbatim")
+
+# Empty/garbage effort drops the key
+_pipe5d.valves = Pipe.Valves(OPENROUTER_API_KEY="k", REASONING_EFFORT="")
+_p5d = _pipe5d._prepare_payload({"model": "openai/o1", "messages": []})
+_assert("reasoning" not in _p5d, "empty REASONING_EFFORT: no reasoning field")
+_pipe5d.valves = Pipe.Valves(OPENROUTER_API_KEY="k", REASONING_EFFORT="bogus")
+_p5d = _pipe5d._prepare_payload({"model": "openai/o1", "messages": []})
+_assert("reasoning" not in _p5d, "garbage REASONING_EFFORT: silently dropped")
+
+# ── 5e. REASONING_SUMMARY_MODE merged into reasoning object ──
+_pipe5e = Pipe()
+_pipe5e.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k",
+    REASONING_EFFORT="high",
+    REASONING_SUMMARY_MODE="detailed",
+)
+_p5e = _pipe5e._prepare_payload({"model": "openai/o1", "messages": []})
+_assert(
+    _p5e.get("reasoning") == {"effort": "high", "summary": "detailed"},
+    "effort + summary merged into one reasoning object",
+)
+# Summary alone (no effort)
+_pipe5e.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k", REASONING_EFFORT="", REASONING_SUMMARY_MODE="auto"
+)
+_p5e = _pipe5e._prepare_payload({"model": "openai/o1", "messages": []})
+_assert(_p5e.get("reasoning") == {"summary": "auto"}, "summary-only reasoning object")
+# Summary "disabled" with no effort: reasoning key is omitted
+_pipe5e.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k", REASONING_EFFORT="", REASONING_SUMMARY_MODE="disabled"
+)
+_p5e = _pipe5e._prepare_payload({"model": "openai/o1", "messages": []})
+_assert("reasoning" not in _p5e, "summary='disabled' + no effort: reasoning key dropped")
+
+# ── 5f. ZDR_ENFORCE injects provider.zdr=true ──
+_pipe5f = Pipe()
+_pipe5f.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ZDR_ENFORCE=True)
+_p5f = _pipe5f._prepare_payload({"model": "openai/gpt-4o", "messages": []})
+_assert(_p5f.get("provider", {}).get("zdr") is True, "ZDR_ENFORCE=True: provider.zdr=true injected")
+
+_pipe5f.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ZDR_ENFORCE=False)
+_p5f = _pipe5f._prepare_payload({"model": "openai/gpt-4o", "messages": []})
+_assert(
+    "provider" not in _p5f or "zdr" not in _p5f.get("provider", {}),
+    "ZDR_ENFORCE=False: no provider.zdr field",
+)
+
+# ZDR_ENFORCE plays nice with other provider fields
+_pipe5f.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k",
+    ZDR_ENFORCE=True,
+    PROVIDER_SORT="price",
+    DATA_COLLECTION="deny",
+)
+_p5f = _pipe5f._prepare_payload({"model": "openai/gpt-4o", "messages": []})
+_assert(_p5f["provider"]["zdr"] is True, "ZDR_ENFORCE coexists with sort")
+_assert(_p5f["provider"]["sort"] == "price", "ZDR_ENFORCE: sort preserved")
+_assert(_p5f["provider"]["data_collection"] == "deny", "ZDR_ENFORCE: data_collection preserved")
+
 # ── 6. _build_headers ────────────────────────────────────────────────────────
 
 _section("6. _build_headers()")
@@ -278,6 +356,56 @@ def _section(title: str):
 _assert("Content-Type" not in headers_no_ct, "Content-Type omitted")
 _assert("Authorization" in headers_no_ct, "auth still present")
 
+# 6b. ENABLE_ANTHROPIC_INTERLEAVED_THINKING injects beta header for anthropic models only
+pipe = Pipe()
+pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ENABLE_ANTHROPIC_INTERLEAVED_THINKING=True)
+_h_anth = pipe._build_headers(model_id="anthropic/claude-3.5-sonnet")
+_assert(
+    _h_anth.get("anthropic-beta") == "interleaved-thinking-2025-05-14",
+    "anthropic model: interleaved-thinking beta header injected",
+)
+_h_oai = pipe._build_headers(model_id="openai/gpt-4o")
+_assert(
+    "anthropic-beta" not in _h_oai,
+    "non-anthropic model: no interleaved-thinking header",
+)
+# Tilde latest-alias still picks up the header
+_h_alias = pipe._build_headers(model_id="~anthropic/claude-haiku-latest")
+_assert(
+    _h_alias.get("anthropic-beta") == "interleaved-thinking-2025-05-14",
+    "tilde anthropic alias: interleaved-thinking header injected",
+)
+# When the valve is off, no header even on Claude
+pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ENABLE_ANTHROPIC_INTERLEAVED_THINKING=False)
+_h_off = pipe._build_headers(model_id="anthropic/claude-3.5-sonnet")
+_assert(
+    "anthropic-beta" not in _h_off,
+    "valve off: no interleaved-thinking header even for Claude",
+)
+
+# 6c. HTTP_REFERER_OVERRIDE: explicit override > env fallback > default
+pipe = Pipe()
+pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="k")
+_default_ref = pipe._build_headers()["HTTP-Referer"]
+_assert(
+    _default_ref.startswith(("http://", "https://")),
+    "HTTP-Referer falls back to a valid scheme URL when no override set",
+)
+pipe.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k",
+    HTTP_REFERER_OVERRIDE="https://my-corp.example.com/owui",
+)
+_assert(
+    pipe._build_headers()["HTTP-Referer"] == "https://my-corp.example.com/owui",
+    "HTTP_REFERER_OVERRIDE: full URL respected",
+)
+# Bogus override (no scheme) → silently falls back
+pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="k", HTTP_REFERER_OVERRIDE="not-a-url")
+_assert(
+    pipe._build_headers()["HTTP-Referer"] != "not-a-url",
+    "HTTP_REFERER_OVERRIDE: schemeless value silently ignored",
+)
+
 # ── 7. _get_provider_icon ────────────────────────────────────────────────────
 
 _section("7. get_provider_icon()")
@@ -361,14 +489,37 @@ def _section(title: str):
 }
 pipe._inject_cache_control(payload_cc)
 _assert(
-    payload_cc["messages"][0]["content"][1].get("cache_control") == {"type": "ephemeral"},
-    "cache_control applied to longest text chunk",
+    payload_cc["messages"][0]["content"][1].get("cache_control")
+    == {"type": "ephemeral", "ttl": "5m"},
+    "cache_control applied to longest text chunk (default 5m TTL)",
 )
 _assert(
     "cache_control" not in payload_cc["messages"][0]["content"][0],
     "cache_control NOT on shorter chunk",
 )
 
+# Cache TTL valve switches the breakpoint to 1h
+_pipe_ttl_1h = Pipe()
+_pipe_ttl_1h.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k",
+    ENABLE_CACHE_CONTROL=True,
+    ANTHROPIC_PROMPT_CACHE_TTL="1h",
+)
+payload_ttl = {
+    "messages": [
+        {
+            "role": "system",
+            "content": [{"type": "text", "text": "long system prompt"}],
+        }
+    ]
+}
+_pipe_ttl_1h._inject_cache_control(payload_ttl)
+_assert(
+    payload_ttl["messages"][0]["content"][0].get("cache_control")
+    == {"type": "ephemeral", "ttl": "1h"},
+    "ANTHROPIC_PROMPT_CACHE_TTL='1h' propagated into breakpoint",
+)
+
 # No list content → no crash
 payload_cc2 = {"messages": [{"role": "system", "content": "plain string"}]}
 pipe._inject_cache_control(payload_cc2)  # Should not raise
@@ -864,7 +1015,7 @@ async def _test_pipe_stream() -> str:
 _assert("info" not in models[0], "pipes: info key removed (dead code)")
 
 # 15b. FREE_ONLY filter
-pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", FREE_ONLY=True)
+pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", FREE_MODEL_FILTER="only")
 
 # Mock data with pricing info: one :free suffix, one free-by-pricing, one paid
 mock_models_pricing = {
@@ -1045,7 +1196,7 @@ async def _test_pipe_stream() -> str:
 _assert("No models found" in models[0]["name"], "pipes empty data: correct message")
 
 # 15m. FREE_ONLY + all paid models → "No free models available"
-pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", FREE_ONLY=True)
+pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", FREE_MODEL_FILTER="only")
 pipe._models_cache = None
 _mock_all_paid = {
     "data": [
@@ -1118,15 +1269,20 @@ async def _test_pipe_stream() -> str:
     "API key: input type is password",
 )
 
-# 16b. REASONING_EFFORT uses select with 4 options
+# 16b. REASONING_EFFORT uses select with 6 options (disabled, minimal, low, medium, high, xhigh)
 re_field = Pipe.Valves.model_fields["REASONING_EFFORT"]
 _assert(
     re_field.json_schema_extra is not None,
     "REASONING_EFFORT: json_schema_extra present",
 )
 re_options = re_field.json_schema_extra.get("input", {}).get("options", [])
-_assert(len(re_options) == 4, "REASONING_EFFORT: 4 options (disabled, low, medium, high)")
+_assert(
+    len(re_options) == 6,
+    "REASONING_EFFORT: 6 options (disabled, minimal, low, medium, high, xhigh)",
+)
 re_values = [o["value"] for o in re_options]
+_assert("minimal" in re_values, "REASONING_EFFORT: minimal option present")
+_assert("xhigh" in re_values, "REASONING_EFFORT: xhigh option present")
 _assert("" in re_values and "high" in re_values, "REASONING_EFFORT: contains empty and high")
 
 # 16c. PROVIDER_SORT uses select with 4 options
@@ -1222,7 +1378,7 @@ def _counting_get(*args, **kwargs):
 _assert(_call_count == 1, "cache hit: API called only once for two pipes() calls")
 
 # 19b. Changing a valve invalidates cache
-_pipe_cache.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", FREE_ONLY=True)
+_pipe_cache.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", FREE_MODEL_FILTER="only")
 _call_count = 0
 with patch.object(_pipe_cache._session, "get", side_effect=_counting_get):
     _pipe_cache.pipes()  # should miss cache (valve changed)
@@ -1243,6 +1399,300 @@ def _counting_get(*args, **kwargs):
     _pipe_cache.pipes()
 _assert(_call_count == 1, "cache expired: API called after TTL")
 
+# ── 19d. OUTPUT_MODALITIES query param ──────────────────────────────────────
+_section("19d. OUTPUT_MODALITIES query param on /models")
+
+_mock_modalities_resp = MagicMock()
+_mock_modalities_resp.status_code = 200
+_mock_modalities_resp.json.return_value = {"data": [
+    {"id": "openai/gpt-4o", "name": "GPT-4o"},
+    {"id": "openai/gpt-4o-mini-tts-2025-12-15", "name": "GPT-4o Mini TTS"},
+]}
+_mock_modalities_resp.raise_for_status = MagicMock()
+
+_captured_kwargs = {}
+
+def _capture_get(*args, **kwargs):
+    _captured_kwargs.clear()
+    _captured_kwargs.update(kwargs)
+    return _mock_modalities_resp
+
+# Default valve → params should request 'all'
+_pipe_mod = Pipe()
+_pipe_mod.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key")
+_pipe_mod._models_cache = None
+with patch.object(_pipe_mod._session, "get", side_effect=_capture_get):
+    _models = _pipe_mod.pipes()
+_assert(
+    _captured_kwargs.get("params") == {"output_modalities": "all"},
+    "default OUTPUT_MODALITIES sends params={'output_modalities':'all'}",
+)
+_tts_ids = {m["id"] for m in _models}
+_assert(
+    "openai/gpt-4o-mini-tts-2025-12-15" in _tts_ids,
+    "TTS model surfaced in pipes() output when API returns it",
+)
+
+# Custom valve value → forwarded verbatim
+_pipe_mod = Pipe()
+_pipe_mod.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", OUTPUT_MODALITIES="text,audio")
+_pipe_mod._models_cache = None
+with patch.object(_pipe_mod._session, "get", side_effect=_capture_get):
+    _pipe_mod.pipes()
+_assert(
+    _captured_kwargs.get("params") == {"output_modalities": "text,audio"},
+    "custom OUTPUT_MODALITIES forwarded as params value",
+)
+
+# Empty/whitespace valve → falls back to 'all'
+_pipe_mod = Pipe()
+_pipe_mod.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", OUTPUT_MODALITIES="   ")
+_pipe_mod._models_cache = None
+with patch.object(_pipe_mod._session, "get", side_effect=_capture_get):
+    _pipe_mod.pipes()
+_assert(
+    _captured_kwargs.get("params") == {"output_modalities": "all"},
+    "blank OUTPUT_MODALITIES falls back to 'all'",
+)
+
+# 19e. Cache key includes OUTPUT_MODALITIES — toggling invalidates
+_pipe_mod_cache = Pipe()
+_pipe_mod_cache.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", OUTPUT_MODALITIES="all")
+_key_all = _pipe_mod_cache._build_cache_key()
+_pipe_mod_cache.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", OUTPUT_MODALITIES="text")
+_key_text = _pipe_mod_cache._build_cache_key()
+_assert(_key_all != _key_text, "_build_cache_key differs for different OUTPUT_MODALITIES")
+
+# Behavioral: pipes() refetches after OUTPUT_MODALITIES changes
+_pipe_mod_cache = Pipe()
+_pipe_mod_cache.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", OUTPUT_MODALITIES="all")
+_pipe_mod_cache._models_cache = None
+
+_modalities_call_count = 0
+
+def _counting_modalities_get(*args, **kwargs):
+    global _modalities_call_count
+    _modalities_call_count += 1
+    return _mock_modalities_resp
+
+with patch.object(_pipe_mod_cache._session, "get", side_effect=_counting_modalities_get):
+    _pipe_mod_cache.pipes()  # populates cache
+    _pipe_mod_cache.pipes()  # cache hit
+_assert(_modalities_call_count == 1, "OUTPUT_MODALITIES cache hit: 1 API call across 2 pipes() invocations")
+
+_pipe_mod_cache.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", OUTPUT_MODALITIES="text")
+_modalities_call_count = 0
+with patch.object(_pipe_mod_cache._session, "get", side_effect=_counting_modalities_get):
+    _pipe_mod_cache.pipes()
+_assert(
+    _modalities_call_count == 1,
+    "OUTPUT_MODALITIES change invalidates cache: API refetched",
+)
+
+# ── 19f. FREE_MODEL_FILTER trinary (all/only/exclude) ────────────────────────
+_section("19f. FREE_MODEL_FILTER trinary")
+
+_mock_pricing = {
+    "data": [
+        {"id": "openai/gpt-4o", "name": "GPT-4o", "pricing": {"prompt": "5", "completion": "15"}},
+        {"id": "google/gemini-2.0-flash-exp:free", "name": "Gemini 2.0 Flash (Free)", "pricing": {"prompt": "0", "completion": "0"}},
+        {"id": "google/gemma-3-1b-it", "name": "Gemma 3 1B", "pricing": {"prompt": "0", "completion": "0"}},
+    ]
+}
+_mock_pricing_resp = MagicMock()
+_mock_pricing_resp.status_code = 200
+_mock_pricing_resp.json.return_value = _mock_pricing
+_mock_pricing_resp.raise_for_status = MagicMock()
+
+# 'all' = no filter
+_pipe_ff = Pipe()
+_pipe_ff.valves = Pipe.Valves(OPENROUTER_API_KEY="k", FREE_MODEL_FILTER="all")
+_pipe_ff._models_cache = None
+with patch.object(_pipe_ff._session, "get", return_value=_mock_pricing_resp):
+    _all_models = _pipe_ff.pipes()
+_assert(len(_all_models) == 3, "FREE_MODEL_FILTER='all': all 3 models pass through")
+
+# 'exclude' hides free models
+_pipe_ff = Pipe()
+_pipe_ff.valves = Pipe.Valves(OPENROUTER_API_KEY="k", FREE_MODEL_FILTER="exclude")
+_pipe_ff._models_cache = None
+with patch.object(_pipe_ff._session, "get", return_value=_mock_pricing_resp):
+    _paid = _pipe_ff.pipes()
+_paid_ids = {m["id"] for m in _paid}
+_assert("openai/gpt-4o" in _paid_ids, "FREE_MODEL_FILTER='exclude': paid model kept")
+_assert(not any(i.endswith(":free") for i in _paid_ids), "FREE_MODEL_FILTER='exclude': :free suffix excluded")
+_assert("google/gemma-3-1b-it" not in _paid_ids, "FREE_MODEL_FILTER='exclude': zero-pricing excluded")
+
+# ── 19g. TOOL_CALLING_FILTER ────────────────────────────────────────────────
+_section("19g. TOOL_CALLING_FILTER")
+
+_mock_tools = {
+    "data": [
+        {"id": "openai/gpt-4o", "name": "GPT-4o", "supported_parameters": ["tools", "tool_choice", "temperature"]},
+        {"id": "openai/o1-mini", "name": "o1-mini", "supported_parameters": ["temperature"]},
+        {"id": "openai/gpt-3.5-turbo", "name": "GPT-3.5", "supported_parameters": ["tool_choice"]},
+    ]
+}
+_mock_tools_resp = MagicMock()
+_mock_tools_resp.status_code = 200
+_mock_tools_resp.json.return_value = _mock_tools
+_mock_tools_resp.raise_for_status = MagicMock()
+
+_pipe_tc = Pipe()
+_pipe_tc.valves = Pipe.Valves(OPENROUTER_API_KEY="k", TOOL_CALLING_FILTER="only")
+_pipe_tc._models_cache = None
+with patch.object(_pipe_tc._session, "get", return_value=_mock_tools_resp):
+    _tc_models = _pipe_tc.pipes()
+_tc_ids = {m["id"] for m in _tc_models}
+_assert("openai/gpt-4o" in _tc_ids, "TOOL_CALLING_FILTER='only': model with 'tools' kept")
+_assert("openai/gpt-3.5-turbo" in _tc_ids, "TOOL_CALLING_FILTER='only': model with 'tool_choice' kept")
+_assert("openai/o1-mini" not in _tc_ids, "TOOL_CALLING_FILTER='only': non-tool model dropped")
+
+_pipe_tc = Pipe()
+_pipe_tc.valves = Pipe.Valves(OPENROUTER_API_KEY="k", TOOL_CALLING_FILTER="exclude")
+_pipe_tc._models_cache = None
+with patch.object(_pipe_tc._session, "get", return_value=_mock_tools_resp):
+    _tc_excl = _pipe_tc.pipes()
+_tc_excl_ids = {m["id"] for m in _tc_excl}
+_assert(_tc_excl_ids == {"openai/o1-mini"}, "TOOL_CALLING_FILTER='exclude': only non-tool model kept")
+
+# ── 19h. MODEL_VARIANTS expansion ───────────────────────────────────────────
+_section("19h. MODEL_VARIANTS expansion")
+
+_mock_var = {
+    "data": [
+        {"id": "openai/gpt-4o", "name": "GPT-4o"},
+        {"id": "anthropic/claude-3.5-sonnet", "name": "Claude 3.5 Sonnet"},
+    ]
+}
+_mock_var_resp = MagicMock()
+_mock_var_resp.status_code = 200
+_mock_var_resp.json.return_value = _mock_var
+_mock_var_resp.raise_for_status = MagicMock()
+
+_pipe_var = Pipe()
+_pipe_var.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k",
+    MODEL_VARIANTS="openai/gpt-4o:nitro,anthropic/claude-3.5-sonnet:thinking,openai/gpt-4o:exacto",
+)
+_pipe_var._models_cache = None
+with patch.object(_pipe_var._session, "get", return_value=_mock_var_resp):
+    _var_models = _pipe_var.pipes()
+_var_ids = {m["id"] for m in _var_models}
+_assert("openai/gpt-4o" in _var_ids, "MODEL_VARIANTS: base model preserved")
+_assert("openai/gpt-4o:nitro" in _var_ids, "MODEL_VARIANTS: :nitro variant added")
+_assert("openai/gpt-4o:exacto" in _var_ids, "MODEL_VARIANTS: :exacto variant added")
+_assert("anthropic/claude-3.5-sonnet:thinking" in _var_ids, "MODEL_VARIANTS: :thinking variant added")
+_nitro_entry = next(m for m in _var_models if m["id"] == "openai/gpt-4o:nitro")
+_assert("Nitro" in _nitro_entry["name"], "MODEL_VARIANTS: tag label appended to display name")
+_assert("GPT-4o" in _nitro_entry["name"], "MODEL_VARIANTS: base name retained")
+
+# Variant whose base isn't in the catalog → silently skipped
+_pipe_var = Pipe()
+_pipe_var.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k",
+    MODEL_VARIANTS="missing/provider-model:nitro,openai/gpt-4o:nitro",
+)
+_pipe_var._models_cache = None
+with patch.object(_pipe_var._session, "get", return_value=_mock_var_resp):
+    _var_models = _pipe_var.pipes()
+_var_ids = {m["id"] for m in _var_models}
+_assert("missing/provider-model:nitro" not in _var_ids, "MODEL_VARIANTS: missing base skipped")
+_assert("openai/gpt-4o:nitro" in _var_ids, "MODEL_VARIANTS: valid variant still added")
+
+# Unrecognised tag → skipped
+_pipe_var = Pipe()
+_pipe_var.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k", MODEL_VARIANTS="openai/gpt-4o:bogus"
+)
+_pipe_var._models_cache = None
+with patch.object(_pipe_var._session, "get", return_value=_mock_var_resp):
+    _var_models = _pipe_var.pipes()
+_assert(
+    not any(m["id"] == "openai/gpt-4o:bogus" for m in _var_models),
+    "MODEL_VARIANTS: unrecognised tag silently dropped",
+)
+
+# Empty MODEL_VARIANTS → no expansion
+_pipe_var = Pipe()
+_pipe_var.valves = Pipe.Valves(OPENROUTER_API_KEY="k", MODEL_VARIANTS="")
+_pipe_var._models_cache = None
+with patch.object(_pipe_var._session, "get", return_value=_mock_var_resp):
+    _var_models = _pipe_var.pipes()
+_assert(len(_var_models) == 2, "MODEL_VARIANTS empty: no virtual entries added")
+
+# ── 19i. ZDR_MODELS_ONLY filter + _load_zdr_model_ids ───────────────────────
+_section("19i. ZDR_MODELS_ONLY filter")
+
+_mock_zdr_resp = MagicMock()
+_mock_zdr_resp.status_code = 200
+_mock_zdr_resp.json.return_value = {
+    "data": ["openai/gpt-4o", "anthropic/claude-3.5-sonnet"]
+}
+_mock_zdr_resp.raise_for_status = MagicMock()
+
+_mock_models_zdr = {
+    "data": [
+        {"id": "openai/gpt-4o", "name": "GPT-4o"},
+        {"id": "anthropic/claude-3.5-sonnet", "name": "Claude"},
+        {"id": "google/gemini-2.0-flash-exp", "name": "Gemini"},
+    ]
+}
+_mock_models_zdr_resp = MagicMock()
+_mock_models_zdr_resp.status_code = 200
+_mock_models_zdr_resp.json.return_value = _mock_models_zdr
+_mock_models_zdr_resp.raise_for_status = MagicMock()
+
+
+def _zdr_router(url, *args, **kwargs):
+    if "/endpoints/zdr" in url:
+        return _mock_zdr_resp
+    return _mock_models_zdr_resp
+
+
+_pipe_zdr = Pipe()
+_pipe_zdr.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ZDR_MODELS_ONLY=True)
+_pipe_zdr._models_cache = None
+with patch.object(_pipe_zdr._session, "get", side_effect=_zdr_router):
+    _zdr_models = _pipe_zdr.pipes()
+_zdr_ids = {m["id"] for m in _zdr_models}
+_assert(_zdr_ids == {"openai/gpt-4o", "anthropic/claude-3.5-sonnet"},
+        "ZDR_MODELS_ONLY: catalog narrowed to ZDR-capable IDs")
+
+# Loader caches: no second HTTP call when called twice
+_pipe_zdr2 = Pipe()
+_pipe_zdr2.valves = Pipe.Valves(OPENROUTER_API_KEY="k")
+_zdr_call_count = 0
+
+
+def _counting_zdr_router(url, *args, **kwargs):
+    global _zdr_call_count
+    if "/endpoints/zdr" in url:
+        _zdr_call_count += 1
+    return _mock_zdr_resp if "/endpoints/zdr" in url else _mock_models_zdr_resp
+
+
+with patch.object(_pipe_zdr2._session, "get", side_effect=_counting_zdr_router):
+    _ = _pipe_zdr2._load_zdr_model_ids()
+    _ = _pipe_zdr2._load_zdr_model_ids()
+_assert(_zdr_call_count == 1, "_load_zdr_model_ids: cached after first call")
+
+# ── 19j. _build_cache_key includes new filters ──────────────────────────────
+_section("19j. cache key includes FREE_MODEL_FILTER / TOOL_CALLING_FILTER / ZDR_MODELS_ONLY / MODEL_VARIANTS")
+
+_keys = []
+for _overrides in [
+    {},
+    {"FREE_MODEL_FILTER": "only"},
+    {"TOOL_CALLING_FILTER": "exclude"},
+    {"ZDR_MODELS_ONLY": True},
+    {"MODEL_VARIANTS": "openai/gpt-4o:nitro"},
+]:
+    _p = Pipe()
+    _p.valves = Pipe.Valves(OPENROUTER_API_KEY="k", **_overrides)
+    _keys.append(_p._build_cache_key())
+_assert(len(set(_keys)) == len(_keys), "cache key fingerprint differs per new-filter valve")
+
 # ── 20. Base URL validator ───────────────────────────────────────────────────
 
 _section("20. Base URL validator")
@@ -1381,7 +1831,7 @@ async def _test_pipe_no_msgs_key():
 
 # 24b. FREE_ONLY with :free suffix
 _pipe_free = Pipe()
-_pipe_free.valves = Pipe.Valves(OPENROUTER_API_KEY="k", FREE_ONLY=True)
+_pipe_free.valves = Pipe.Valves(OPENROUTER_API_KEY="k", FREE_MODEL_FILTER="only")
 _pipe_free._models_cache = None
 _mock_free_resp = MagicMock()
 _mock_free_resp.status_code = 200
@@ -1598,6 +2048,133 @@ async def _test_pipe_no_msgs_key():
 _assert(_is_owui("https://openrouter.ai/images/icons/Anthropic.svg"), "_is_owui_managed_icon: icons path anthropic → True")
 _assert(not _is_owui("https://custom-icon.example.com/icon.png"), "_is_owui_managed_icon: external URL → False")
 _assert(not _is_owui("https://cdn.openai.com/logo.png"), "_is_owui_managed_icon: other https URL → False")
+_assert(
+    _is_owui("https://t0.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&url=https://x.ai/&size=256"),
+    "_is_owui_managed_icon: gstatic faviconV2 URL → True (registry-sourced, overwriteable)",
+)
+
+# ── 25j. _load_provider_registry + _get_provider_icon ────────────────────────
+_section("25j. provider registry auto-discovery")
+
+# Mock the OpenRouter frontend providers payload
+_registry_payload = {
+    "data": [
+        {"slug": "openai", "name": "OpenAI", "icon": {"url": "/images/icons/OpenAI.svg"}},
+        {"slug": "xai", "name": "xAI", "icon": {"url": "https://t0.gstatic.com/faviconV2?url=https://x.ai/&size=256"}},
+        {"slug": "arcee-ai", "name": "Arcee AI", "icon": {"url": "https://t0.gstatic.com/faviconV2?url=https://www.arcee.ai/&size=256"}},
+        {"slug": "broken", "name": "Broken", "icon": {"url": ""}},  # empty icon — must be skipped
+        {"slug": "unsafe", "name": "Unsafe", "icon": {"url": "javascript:alert(1)"}},  # unsafe — must be skipped
+        {"slug": "noicon", "name": "NoIcon"},  # no icon key at all
+    ]
+}
+_mock_reg_resp = MagicMock()
+_mock_reg_resp.status_code = 200
+_mock_reg_resp.json.return_value = _registry_payload
+
+_pipe_reg = Pipe()
+_pipe_reg.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key")
+
+_reg_call_count = 0
+def _counting_reg_get(url, *args, **kwargs):
+    global _reg_call_count
+    if "all-providers" in url:
+        _reg_call_count += 1
+        return _mock_reg_resp
+    return _mock_reg_resp  # fall-through is fine for this test
+
+with patch.object(_pipe_reg._session, "get", side_effect=_counting_reg_get):
+    _r1 = _pipe_reg._load_provider_registry()
+    _r2 = _pipe_reg._load_provider_registry()  # cached, no second fetch
+
+_assert(_reg_call_count == 1, "registry: HTTP fetched exactly once (caching)")
+_assert(_r1 is _r2, "registry: cached object is the same instance on subsequent calls")
+_assert(
+    _r1.get("openai") == "https://openrouter.ai/images/icons/OpenAI.svg",
+    "registry: relative /images/icons/ URL resolved against openrouter.ai",
+)
+_assert(
+    _r1.get("xai", "").startswith("https://t0.gstatic.com/faviconV2"),
+    "registry: gstatic favicon URL kept verbatim",
+)
+_assert(
+    _r1.get("arcee-ai") == _r1.get("arceeai"),
+    "registry: hyphen-stripped slug also indexed (arcee-ai → arceeai)",
+)
+_assert("broken" not in _r1, "registry: empty icon URL skipped")
+_assert("unsafe" not in _r1, "registry: unsafe (non-http) icon URL skipped")
+_assert("noicon" not in _r1, "registry: entry without icon key skipped")
+
+# 25k. _get_provider_icon layered lookup
+_pipe_lookup = Pipe()
+_pipe_lookup.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key")
+with patch.object(_pipe_lookup._session, "get", side_effect=_counting_reg_get):
+    # Hardcoded fast path — registry never consulted
+    _icon_openai = _pipe_lookup._get_provider_icon("openai")
+    _assert(
+        _icon_openai == "https://openrouter.ai/images/icons/OpenAI.svg",
+        "_get_provider_icon: hardcoded dict hit returns OpenAI icon",
+    )
+
+    # Slug not in dict but in registry (exact)
+    _icon_arcee = _pipe_lookup._get_provider_icon("arcee-ai")
+    _assert(
+        _icon_arcee and _icon_arcee.startswith("https://t0.gstatic.com/faviconV2"),
+        "_get_provider_icon: registry exact-slug hit (arcee-ai)",
+    )
+
+    # Hyphen-strip normalization: x-ai (model author) → xai (registry slug)
+    _icon_xai = _pipe_lookup._get_provider_icon("x-ai")
+    _assert(
+        _icon_xai and _icon_xai.startswith("https://t0.gstatic.com/faviconV2"),
+        "_get_provider_icon: hyphen-strip normalization (x-ai → xai)",
+    )
+
+    # Truly unknown provider returns None (registry has no entry)
+    _icon_missing = _pipe_lookup._get_provider_icon("totally-unknown-provider")
+    _assert(
+        _icon_missing is None,
+        "_get_provider_icon: unknown provider returns None",
+    )
+
+    # Empty/None provider key
+    _assert(_pipe_lookup._get_provider_icon("") is None, "_get_provider_icon: empty key → None")
+
+# 25l. Registry network failure → cached empty dict, no retry, hardcoded dict still works
+_pipe_fail = Pipe()
+_pipe_fail.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key")
+
+_fail_call_count = 0
+def _failing_reg_get(*args, **kwargs):
+    global _fail_call_count
+    _fail_call_count += 1
+    raise Exception("simulated network failure")
+
+with patch.object(_pipe_fail._session, "get", side_effect=_failing_reg_get):
+    _r_fail = _pipe_fail._load_provider_registry()
+    _r_fail_2 = _pipe_fail._load_provider_registry()
+_assert(_r_fail == {}, "registry: network failure → empty dict")
+_assert(_fail_call_count == 1, "registry: failure does not retry (cached empty)")
+
+# Hardcoded dict still works after registry failure
+_assert(
+    _pipe_fail._get_provider_icon("openai") == "https://openrouter.ai/images/icons/OpenAI.svg",
+    "_get_provider_icon: hardcoded dict still resolves after registry failure",
+)
+_assert(
+    _pipe_fail._get_provider_icon("x-ai") is None,
+    "_get_provider_icon: x-ai falls back to None when registry failed",
+)
+
+# 25m. Registry HTTP non-200 → empty dict
+_mock_reg_403 = MagicMock()
+_mock_reg_403.status_code = 403
+_mock_reg_403.json.return_value = {"data": []}
+
+_pipe_403 = Pipe()
+_pipe_403.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key")
+with patch.object(_pipe_403._session, "get", return_value=_mock_reg_403):
+    _r_403 = _pipe_403._load_provider_registry()
+_assert(_r_403 == {}, "registry: HTTP 403 → empty dict (no parse, no retry)")
 
 # ── 26. _stream_response() edge cases ────────────────────────────────────────
 
@@ -1716,7 +2293,7 @@ async def _test_pipe_no_msgs_key():
         {"id": "some/model", "name": "Model", "pricing": {"prompt": "not-a-number", "completion": "0"}},
     ]
 }
-pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", FREE_ONLY=True)
+pipe.valves = Pipe.Valves(OPENROUTER_API_KEY="test-key", FREE_MODEL_FILTER="only")
 pipe._models_cache = None
 with patch.object(pipe._session, "get", return_value=_mock_invalid_price):
     models = pipe.pipes()
@@ -1777,7 +2354,8 @@ async def _test_pipe_no_msgs_key():
     "cache_control: image_url chunk skipped in mixed content",
 )
 _assert(
-    payload_mixed_img["messages"][0]["content"][1].get("cache_control") == {"type": "ephemeral"},
+    payload_mixed_img["messages"][0]["content"][1].get("cache_control")
+    == {"type": "ephemeral", "ttl": "5m"},
     "cache_control: text chunk in mixed content gets cache_control",
 )
 
@@ -1795,10 +2373,282 @@ async def _test_pipe_no_msgs_key():
 }
 pipe._inject_cache_control(payload_user_list)
 _assert(
-    payload_user_list["messages"][0]["content"][1].get("cache_control") == {"type": "ephemeral"},
+    payload_user_list["messages"][0]["content"][1].get("cache_control")
+    == {"type": "ephemeral", "ttl": "5m"},
     "cache_control: user role list content gets cache_control when no system role",
 )
 
+# ── 28d. v1.6.0 — Web search plugin builder ─────────────────────────────────
+_section("28d. v1.6.0 web search plugin")
+
+_pipe_ws = Pipe()
+_pipe_ws.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ENABLE_WEB_SEARCH=False)
+_assert(_pipe_ws._build_web_search_plugin() is None, "web search disabled → None")
+
+_pipe_ws.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k",
+    ENABLE_WEB_SEARCH=True,
+    WEB_SEARCH_MAX_RESULTS=8,
+    WEB_SEARCH_PROMPT="Find authoritative sources",
+    WEB_SEARCH_INCLUDE_DOMAINS="*.gov, *.edu",
+    WEB_SEARCH_EXCLUDE_DOMAINS="reddit.com",
+)
+_plugin = _pipe_ws._build_web_search_plugin()
+_assert(_plugin and _plugin["id"] == "web", "web plugin id is 'web'")
+_assert(_plugin["max_results"] == 8, "max_results forwarded")
+_assert(_plugin["search_prompt"] == "Find authoritative sources", "custom search_prompt")
+_assert(_plugin["include_domains"] == ["*.gov", "*.edu"], "include_domains parsed")
+_assert(_plugin["exclude_domains"] == ["reddit.com"], "exclude_domains parsed")
+
+# Payload integration: appended to existing user plugins, never duplicated
+_pipe_ws = Pipe()
+_pipe_ws.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ENABLE_WEB_SEARCH=True)
+_p_ws = _pipe_ws._prepare_payload({"model": "openai/gpt-4o", "messages": []})
+_assert(any(p.get("id") == "web" for p in _p_ws.get("plugins", [])),
+        "ENABLE_WEB_SEARCH: plugins[] contains web entry")
+
+# User plugins preserved alongside web
+_pipe_ws.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ENABLE_WEB_SEARCH=True)
+_p_ws = _pipe_ws._prepare_payload({
+    "model": "openai/gpt-4o",
+    "messages": [],
+    "plugins": [{"id": "file-parser"}],
+})
+_p_ids = [p.get("id") for p in _p_ws.get("plugins", [])]
+_assert("file-parser" in _p_ids and "web" in _p_ids, "user plugins coexist with auto web plugin")
+
+# Existing user-supplied web plugin wins
+_pipe_ws.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ENABLE_WEB_SEARCH=True, WEB_SEARCH_MAX_RESULTS=20)
+_p_ws = _pipe_ws._prepare_payload({
+    "model": "openai/gpt-4o",
+    "messages": [],
+    "plugins": [{"id": "web", "max_results": 3}],
+})
+_assert(
+    sum(1 for p in _p_ws["plugins"] if p.get("id") == "web") == 1,
+    "user-supplied web plugin not duplicated by valve injection",
+)
+_assert(
+    _p_ws["plugins"][0].get("max_results") == 3,
+    "user-supplied web plugin keeps its own max_results",
+)
+
+# Web search disabled → no plugin emitted at all
+_pipe_ws.valves = Pipe.Valves(OPENROUTER_API_KEY="k", ENABLE_WEB_SEARCH=False)
+_p_ws = _pipe_ws._prepare_payload({"model": "openai/gpt-4o", "messages": []})
+_assert("plugins" not in _p_ws, "web search disabled: no plugins key added")
+
+# ── 28e. v1.6.0 — REASONING_MAX_TOKENS ──────────────────────────────────────
+_section("28e. v1.6.0 reasoning max_tokens")
+
+_pipe_rmt = Pipe()
+_pipe_rmt.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k", REASONING_EFFORT="high", REASONING_MAX_TOKENS=2048
+)
+_p_rmt = _pipe_rmt._prepare_payload({"model": "openai/o1", "messages": []})
+_assert(
+    _p_rmt.get("reasoning") == {"effort": "high", "max_tokens": 2048},
+    "reasoning.max_tokens emitted alongside effort",
+)
+
+_pipe_rmt.valves = Pipe.Valves(OPENROUTER_API_KEY="k", REASONING_MAX_TOKENS=0)
+_p_rmt = _pipe_rmt._prepare_payload({"model": "openai/o1", "messages": []})
+_assert("reasoning" not in _p_rmt, "max_tokens=0 + no effort: reasoning key omitted")
+
+# ── 28f. v1.6.0 — Provider extras (only/quantizations/allow_fallbacks/max_price) ──
+_section("28f. v1.6.0 provider preferences extras")
+
+_pipe_pp = Pipe()
+_pipe_pp.valves = Pipe.Valves(
+    OPENROUTER_API_KEY="k",
+    PROVIDER_ONLY="anthropic, openai",
+    PROVIDER_QUANTIZATIONS="bf16, fp8",
+    PROVIDER_ALLOW_FALLBACKS=False,
+    PROVIDER_MAX_PRICE_PROMPT="3.0",
+    PROVIDER_MAX_PRICE_COMPLETION="15.0",
+)
+_p_pp = _pipe_pp._prepare_payload({"model": "openai/gpt-4o", "messages": []})
+_p_provider = _p_pp.get("provider", {})
+_assert(_p_provider.get("only") == ["anthropic", "openai"], "provider.only forwarded")
+_assert(_p_provider.get("quantizations") == ["bf16", "fp8"], "provider.quantizations parsed into list")
+_assert(_p_provider.get("allow_fallbacks") is False, "provider.allow_fallbacks=False emitted only when opted out")
+_assert(
+    _p_provider.get("max_price") == {"prompt": "3.0", "completion": "15.0"},
+    "provider.max_price merged",
+)
+
+# Defaults: allow_fallbacks=true is implicit (omit field)
+_pipe_pp.valves = Pipe.Valves(OPENROUTER_API_KEY="k", PROVIDER_ALLOW_FALLBACKS=True)
+_p_pp = _pipe_pp._prepare_payload({"model": "openai/gpt-4o", "messages": []})
+_assert(
+    "provider" not in _p_pp or "allow_fallbacks" not in _p_pp.get("provider", {}),
+    "PROVIDER_ALLOW_FALLBACKS=True (default): field omitted",
+)
+
+# ── 28g. v1.6.0 — SERVICE_TIER ──────────────────────────────────────────────
+_section("28g. v1.6.0 service tier")
+
+for tier in ("auto", "default", "flex", "priority", "scale"):
+    _pipe_st = Pipe()
+    _pipe_st.valves = Pipe.Valves(OPENROUTER_API_KEY="k", SERVICE_TIER=tier)
+    _p_st = _pipe_st._prepare_payload({"model": "openai/gpt-4o", "messages": []})
+    _assert(_p_st.get("service_tier") == tier, f"SERVICE_TIER='{tier}' forwarded")
+
+# Bogus value silently dropped
+_pipe_st = Pipe()
+_pipe_st.valves = Pipe.Valves(OPENROUTER_API_KEY="k", SERVICE_TIER="bogus")
+_p_st = _pipe_st._prepare_payload({"model": "openai/gpt-4o", "messages": []})
+_assert("service_tier" not in _p_st, "garbage SERVICE_TIER silently ignored")
+
+# ── 28h. v1.6.0 — Cached prompt-token cost breakdown ────────────────────────
+_section("28h. v1.6.0 cached prompt token reporting")
+
+_format_cost_info = mod._format_cost_info
+
+# OpenAI / Anthropic shape: prompt_tokens_details.cached_tokens
+_cost_with_cache = _format_cost_info({
+    "prompt_tokens": 1000,
+    "completion_tokens": 200,
+    "total_tokens": 1200,
+    "prompt_tokens_details": {"cached_tokens": 800},
+    "cost": 0.0030,
+}, "USD")
+_assert("800 cached" in _cost_with_cache, "cached tokens shown in token line")
+_assert("200 prompt" in _cost_with_cache, "non-cached prompt tokens shown (1000-800=200)")
+
+# Alternate shape: cache_read_input_tokens (some Anthropic surfaces)
+_cost_alt = _format_cost_info({
+    "prompt_tokens": 500,
+    "completion_tokens": 100,
+    "cache_read_input_tokens": 400,
+}, "USD")
+_assert("400 cached" in _cost_alt, "cache_read_input_tokens recognised")
+
+# No cache info → original format preserved
+_cost_plain = _format_cost_info({
+    "prompt_tokens": 100, "completion_tokens": 50, "total_tokens": 150
+}, "USD")
+_assert("cached" not in _cost_plain, "no cache field: footer unchanged")
+
+# ── 28i. v1.6.0 — Generation ID footer ──────────────────────────────────────
+_section("28i. v1.6.0 generation id footer")
+
+_format_gen = mod._format_generation_id
+_assert(_format_gen(None) == "", "None → empty string")
+_assert(_format_gen("") == "", "empty → empty string")
+_gen_out = _format_gen("gen-abc123")
+_assert("gen-abc123" in _gen_out, "generation id appears in footer")
+_assert("`gen-abc123`" in _gen_out, "generation id wrapped in backticks for click-to-copy")
+
+# Non-stream response surfaces the id when SHOW_GENERATION_ID=True
+_pipe_gen = Pipe()
+_pipe_gen.valves = Pipe.Valves(OPENROUTER_API_KEY="k", SHOW_GENERATION_ID=True)
+_mock_gen_resp = MagicMock()
+_mock_gen_resp.json.return_value = {
+    "id": "gen-zzz111",
+    "model": "openai/gpt-4o",
+    "choices": [{"message": {"content": "hi", "role": "assistant"}}],
+}
+with patch.object(_pipe_gen, "_retryable_request", return_value=_mock_gen_resp):
+    _out = _pipe_gen._non_stream_response({}, {"model": "openai/gpt-4o"})
+_assert("gen-zzz111" in _out, "non-stream: generation id rendered when SHOW_GENERATION_ID=True")
+
+# Toggled off → no footer
+_pipe_gen.valves = Pipe.Valves(OPENROUTER_API_KEY="k", SHOW_GENERATION_ID=False)
+_mock_gen_resp.json.return_value = {
+    "id": "gen-zzz111",
+    "model": "openai/gpt-4o",
+    "choices": [{"message": {"content": "hi", "role": "assistant"}}],
+}
+with patch.object(_pipe_gen, "_retryable_request", return_value=_mock_gen_resp):
+    _out = _pipe_gen._non_stream_response({}, {"model": "openai/gpt-4o"})
+_assert("gen-zzz111" not in _out, "SHOW_GENERATION_ID=False: footer suppressed")
+
+# ── 28j. v1.6.0 — MODEL_CATEGORY query param ────────────────────────────────
+_section("28j. v1.6.0 MODEL_CATEGORY")
+
+_mock_cat_resp = MagicMock()
+_mock_cat_resp.status_code = 200
+_mock_cat_resp.json.return_value = {"data": [{"id": "openai/gpt-4o", "name": "GPT-4o"}]}
+_mock_cat_resp.raise_for_status = MagicMock()
+
+_captured_params = {}
+
+def _capture_cat(*args, **kwargs):
+    _captured_params.clear()
+    _captured_params.update(kwargs)
+    return _mock_cat_resp
+
+_pipe_cat = Pipe()
+_pipe_cat.valves = Pipe.Valves(OPENROUTER_API_KEY="k", MODEL_CATEGORY="programming")
+_pipe_cat._models_cache = None
+with patch.object(_pipe_cat._session, "get", side_effect=_capture_cat):
+    _pipe_cat.pipes()
+_assert(
+    _captured_params.get("params", {}).get("category") == "programming",
+    "MODEL_CATEGORY: '?category=programming' forwarded to /models",
+)
+
+# Empty category → no category param sent
+_pipe_cat = Pipe()
+_pipe_cat.valves = Pipe.Valves(OPENROUTER_API_KEY="k", MODEL_CATEGORY="")
+_pipe_cat._models_cache = None
+with patch.object(_pipe_cat._session, "get", side_effect=_capture_cat):
+    _pipe_cat.pipes()
+_assert(
+    "category" not in _captured_params.get("params", {}),
+    "empty MODEL_CATEGORY: no category param sent",
+)
+
+# ── 28k. v1.6.0 — Deprecated model tagging ──────────────────────────────────
+_section("28k. v1.6.0 deprecated model handling")
+
+_mock_deprec = {
+    "data": [
+        {"id": "openai/gpt-3.5-turbo", "name": "GPT-3.5", "expiration_date": "2026-09-01"},
+        {"id": "openai/gpt-4o", "name": "GPT-4o"},
+    ]
+}
+_mock_deprec_resp = MagicMock()
+_mock_deprec_resp.status_code = 200
+_mock_deprec_resp.json.return_value = _mock_deprec
+_mock_deprec_resp.raise_for_status = MagicMock()
+
+# Default: deprecated kept and tagged
+_pipe_dep = Pipe()
+_pipe_dep.valves = Pipe.Valves(OPENROUTER_API_KEY="k")
+_pipe_dep._models_cache = None
+with patch.object(_pipe_dep._session, "get", return_value=_mock_deprec_resp):
+    _dep_models = _pipe_dep.pipes()
+_dep_by_id = {m["id"]: m["name"] for m in _dep_models}
+_assert("openai/gpt-3.5-turbo" in _dep_by_id, "deprecated model still listed by default")
+_assert("⚠" in _dep_by_id["openai/gpt-3.5-turbo"], "deprecated model tagged with ⚠ marker")
+_assert("(deprecated)" in _dep_by_id["openai/gpt-3.5-turbo"], "deprecated label appended to name")
+_assert("⚠" not in _dep_by_id["openai/gpt-4o"], "live model untouched")
+
+# HIDE_DEPRECATED_MODELS=True drops them
+_pipe_dep = Pipe()
+_pipe_dep.valves = Pipe.Valves(OPENROUTER_API_KEY="k", HIDE_DEPRECATED_MODELS=True)
+_pipe_dep._models_cache = None
+with patch.object(_pipe_dep._session, "get", return_value=_mock_deprec_resp):
+    _dep_models = _pipe_dep.pipes()
+_dep_ids = {m["id"] for m in _dep_models}
+_assert(_dep_ids == {"openai/gpt-4o"}, "HIDE_DEPRECATED_MODELS=True: deprecated rows removed")
+
+# ── 28l. v1.6.0 — Cache-key invalidates on new filter valves ────────────────
+_section("28l. v1.6.0 cache key includes MODEL_CATEGORY / HIDE_DEPRECATED_MODELS")
+
+_keys_v16 = []
+for v in [
+    {},
+    {"MODEL_CATEGORY": "programming"},
+    {"HIDE_DEPRECATED_MODELS": True},
+]:
+    _p = Pipe()
+    _p.valves = Pipe.Valves(OPENROUTER_API_KEY="k", **v)
+    _keys_v16.append(_p._build_cache_key())
+_assert(len(set(_keys_v16)) == len(_keys_v16), "cache key differs per new v1.6 filter valve")
+
 # ── 29. _non_stream_response() edge cases ────────────────────────────────────
 
 _section("29. _non_stream_response() edge cases")