docs/en/latest/plugins/ai-cache.md
Lines changed: 2 additions & 1 deletion
@@ -48,7 +48,7 @@ The `ai-cache` Plugin must be used with the [`ai-proxy`](./ai-proxy.md) or [`ai-
By default the cache is isolated per route, so two routes never serve each other's entries even when they see the same protocol, model and messages. Set `cache_key.share_across_routes` to `true` to share one cache space across routes.
- The cache key uses the **requested** model, not the model a route may rewrite to server-side (`ai-proxy` `options.model` or `ai-proxy-multi` instance selection). When sharing across routes, isolate routes that rewrite to different upstream models with separate Redis instances or with `cache_key.include_vars`.
+ Even with `cache_key.share_across_routes` enabled, responses from different upstream models or providers are kept in separate cache entries, so one model's response is never served for another.
:::
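
The scoping rules in the note above can be pictured with a toy model. This is a minimal sketch and not the Plugin's actual key derivation: the function `cache_key`, its parameters, the JSON-plus-SHA-256 encoding, and the placeholder route, provider and model names are all assumptions made for illustration. It only shows the behavior the text describes: entries are scoped per route unless sharing is enabled, and the upstream provider and model always stay part of the key.

```python
import hashlib
import json


def cache_key(route_id, provider, model, messages,
              share_across_routes=False, include_vars=None):
    """Hypothetical key builder; names and encoding are illustrative only."""
    scope = {
        # Dropping the route ID is what lets several routes share entries.
        "route": None if share_across_routes else route_id,
        # Provider and model always stay in the key, so one model's
        # response is never returned for another model.
        "provider": provider,
        "model": model,
        # Extra variable values (for example a tenant header) narrow the scope.
        "vars": sorted((include_vars or {}).items()),
        "messages": messages,
    }
    payload = json.dumps(scope, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

In this model, two routes given identical messages produce different keys by default and the same key once `share_across_routes` is true, while changing the model or any `include_vars` value always yields a different key.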
@@ -62,6 +62,7 @@ The cache key uses the **requested** model, not the model a route may rewrite to
| cache_key.include_vars | array[string] | False | [] | | NGINX variables added to the cache scope (for example `["http_x_tenant"]`), isolating entries by their values. |
| max_cache_body_size | integer | False | 1048576 | >= 0 | Maximum response body size, in bytes, to cache. Larger responses are not cached. |
| cache_headers | boolean | False | true | | If true, add the `X-AI-Cache-Status` response header (and `X-AI-Cache-Age`, the entry age in seconds, on a hit). |
+ | fail_mode | string | False | `"skip"` | `skip`, `warn`, `error` | Behavior when the request is not a recognized AI request that this Plugin can cache (for example, a request that did not pass through `ai-proxy` or `ai-proxy-multi`). `skip`: let the request pass through uncached; `warn`: pass through uncached and log a warning; `error`: reject the request. |
| bypass_on | array[object] | False | | | Rules that skip the cache entirely (no lookup, no write-back) when any rule matches. |
| bypass_on[].header | string | True | | | Request header name to match. |
| bypass_on[].equals | string | True | | | Bypass when the request header's value exactly equals this string. |
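
The `fail_mode` and `bypass_on` rows above describe two small decisions, which the sketch below models in Python. It is an illustration under stated assumptions rather than the Plugin's implementation: the function names `should_bypass` and `handle_unrecognized`, the returned action strings, the logger name, and the `x-no-cache` header in the usage note are hypothetical. Only the field names `header` and `equals` and the three `fail_mode` values come from the table.

```python
import logging

logger = logging.getLogger("ai-cache-sketch")  # hypothetical logger name


def should_bypass(request_headers, bypass_on):
    """Return True when any rule's header exactly equals its `equals` value."""
    # HTTP header names are case-insensitive, so compare them lowercased.
    normalized = {name.lower(): value for name, value in request_headers.items()}
    return any(
        normalized.get(rule["header"].lower()) == rule["equals"]
        for rule in bypass_on
    )


def handle_unrecognized(fail_mode="skip"):
    """Choose an action for a request that is not a recognized AI request."""
    if fail_mode == "error":
        return "reject"
    if fail_mode == "warn":
        logger.warning("request is not cacheable; passing through uncached")
        return "pass_uncached"
    if fail_mode == "skip":
        return "pass_uncached"
    raise ValueError(f"unknown fail_mode: {fail_mode!r}")
```

For example, with the rule `{"header": "x-no-cache", "equals": "1"}`, a request carrying `X-No-Cache: 1` would skip the cache entirely, while `X-No-Cache: true` would not, because the match is an exact string comparison.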