Skip to content

Commit 87132e5

Browse files
authored
feat(plugin): add ModelRouter before auth with single-slot routing targets (router-for-me#3865)
* feat(plugin): add ModelRouter before auth with single-slot routing targets ## Motivation Plugins that need to change execution based on the **original inbound request** (protocol format, raw body, headers, query, stream flag, metadata, etc.) often resorted to virtual/trampoline models or routing inside interceptors. This commit adds **ModelRouter**: a pluggable layer **before** model-to-provider resolution and AuthManager credential selection, so plugins can declare who executes a request without spoofing the client model name. This is a **new capability**, not a bugfix on the existing chain. With no ModelRouter plugins loaded, behavior matches upstream. ## Pipeline placement - `execute`, `stream`, and `count` (and image paths via AuthManager) call `applyModelRouter()` before building `coreexecutor.Request`. - Routing runs **before** the request interceptor (before auth), so routers see the client’s original context. After a plugin executor is chosen, the existing **after-auth interceptor → response/stream interceptor** chain still applies. - Internal `ExecuteModel` / `ExecuteModelStream` (host callbacks) support `SkipRouterPluginID` so nested calls do not re-enter the same router. ## Routing API (single slot, mutually exclusive) `ModelRouteResponse` uses **one target slot** to avoid ambiguity when both `TargetExecutorPluginID` and `TargetProvider` were set and the host ignored one: | Field | Meaning | |-------|---------| | `Handled` | `false`: this router declines; try the next router or default path | | `TargetKind` | `self` \| `executor` \| `provider` (pick one) | | `Target` | `self`/`executor`: plugin ID; `provider`: built-in provider key | | `TargetModel` | Optional on `provider` only; empty keeps client `RequestedModel` | | `Reason` | Optional diagnostic text | - **self**: the router plugin’s own executor (`Target` normalized to the router’s plugin ID). - **executor**: another plugin’s executor; host pre-checks with `executorPluginReady()` (executor declared and provider identifier resolvable) to avoid handled routes that 500 at execution. - **provider**: skip registry model resolution; fixed built-in AuthManager path; optional `TargetModel` for execution model only—**does not** change outward requested-model metadata. Routers run in **descending plugin priority** (tie-break: ascending plugin ID). Panic, error, invalid target, or unavailable executor/provider → log and **fall through to the next router**; if none handle, use the original provider+auth flow. ## Context exposed to routers `ModelRouteRequest` includes: - `SourceFormat`, `RequestedModel`, `Stream` - `Headers`, `Query`, `Body` (defensive copies) - `Metadata` (best-effort read-only context snapshot) - `AvailableProviders`: built-in provider keys with at least one **non-disabled** auth (`AuthManager.AvailableProviders()`). **Does not** reflect per-model cooldown or transient unavailability—treat as an optimistic snapshot. Adds `AuthManager.HasProviderAuth()` and `AvailableProviders()`, excluding `Disabled` and `StatusDisabled` auths consistently with credential selection. ## Host and RPC - Go plugins: `pluginapi.ModelRouter` + `RouteModel()`. - RPC plugins: `pluginabi.MethodModelRoute` (`model.route`), capability flag `model_router`. - `pluginhost.Host` implements `RouteModel` / `RouteModelExcept`; handlers use `SetModelRouterHost` or a `PluginHost` type assertion; **direct executor** paths use `ExecutePluginExecutor*` / `CountPluginExecutor`. - No bundled example ModelRouter plugin; capability is active only when a third-party plugin declares `model_router` and loads. ## Plugin RPC schema (policy A, upstream-aligned) - `pluginabi.SchemaVersion` stays **1**: capability additions (`model_router`, `model.route`) do not bump the number; increment only on breaking RPC JSON changes. - Host sends `schema_version` at register; reject only if the plugin declares a **higher** version than the host. - No unpublished “ModelRouter requires schema ≥ 3” gate (v3 single-slot API was never public). - Existing plugins and examples without `model_router` (`schema_version: 1`) need no changes. - RPC ModelRouter: `schema_version: 1` + `model_router: true` + implement `model.route`. ## Path consistency within this commit - Provider routes reuse image-only model checks (e.g. `gpt-image-2`) on the normalized model, same as the default AuthManager path. - `count` aligned with execute/stream: `SkipRouterPluginID`, query/headers injection, interceptor skip semantics. - Handlers: `modelRoutersEnabled` treats hosts without `HasModelRouters` as disabled (same as before ModelRouter existed); `pluginhost.Host` implements the detector. - API docs: `ModelRouter` explicitly includes built-in **provider** targets (in addition to plugin executors and the router’s own executor). ## Testing go test ./internal/pluginhost ./sdk/api/handlers ./sdk/pluginapi ./sdk/pluginabi ./sdk/cliproxy/auth go build -o test-output ./cmd/server && rm test-output go test ./... * fix(handlers): address ModelRouter review feedback - Use modelExecutionQuery for plugin executor and AuthManager paths so inbound URL query matches router/header behavior - Guard queryFromContext when gin Request.URL is nil - Read plugin executor stream chunks via nextStreamChunk to exit on cancel - Drop redundant clonePluginMetadata on capability record meta Tests cover query propagation, stream cancel, and nil URL safety. * feat(plugin): add Claude web search router example Add a Claude Code web_search ModelRouter example that can route matching Claude requests through Antigravity, Codex, xAI, or Tavily. The plugin includes executor orchestration, backend fallback/penalty handling, Tavily API key support, Claude-compatible response assembly, stream forwarding, and focused unit coverage for detection, fallback routing, model resolution, penalties, stream forwarding, and Tavily behavior. Verification: go test -count=1 ./... in examples/plugin/claude-web-search-router/go; go build -buildmode=c-shared for the plugin; go build ./cmd/server; live local CPA curl coverage for plugin load, four explicit routes, fallback, and Codex spark routing. * fix(pluginhost): validate executor routes before fallback * fix(pluginhost): skip oauth-only executor routes
1 parent 9f940f1 commit 87132e5

44 files changed

Lines changed: 5118 additions & 36 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

examples/plugin/Makefile

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
EXAMPLES := simple model auth frontend-auth executor protocol-format request-translator request-normalizer response-translator response-normalizer thinking usage cli management-api host-callback host-callback-auth-files host-model-callback
1+
EXAMPLES := simple model auth frontend-auth executor protocol-format request-translator request-normalizer response-translator response-normalizer thinking usage cli management-api host-callback host-callback-auth-files host-model-callback claude-web-search-router
22
LANGUAGES := go c rust
33
BIN_DIR := $(CURDIR)/bin
44
BUILD_DIR := $(BIN_DIR)/build

examples/plugin/README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@ This directory contains standard dynamic library plugin examples for the CLIProx
1616
- `request-normalizer/`: request normalization capability only.
1717
- `codex-service-tier/`: Go-only request normalizer that sets Codex `gpt-5.5` requests to the priority service tier when enabled.
1818
- `scheduler/`: Go-only scheduler that can select a configured auth ID, delegate to a built-in scheduler, or deny picks.
19+
- `claude-web-search-router/`: ModelRouter + executor for Claude Code built-in `web_search` (antigravity / codex / xai / Tavily). See `claude-web-search-router/README.md`.
1920
- `response-translator/`: response translation capability only.
2021
- `response-normalizer/`: response normalization capability only.
2122
- `thinking/`: thinking applier capability only.
Lines changed: 175 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,175 @@
1+
# Claude Code Web Search Router (ModelRouter example)
2+
3+
This plugin demonstrates **ModelRouter** on Claude Code built-in `web_search` requests (see `temp/1.json` in the repo root for a captured request/response).
4+
5+
## What it detects
6+
7+
- Inbound protocol `claude` / `anthropic`
8+
- `tools[]` with `type` `web_search_20250305` or `web_search_20260209`
9+
- Optional Claude Code heuristics: system text like “web search tool use”, or user text
10+
`Perform a web search for the query: …`
11+
12+
## Routes (`route` config)
13+
14+
| Value | Behavior |
15+
| ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
16+
| `fallback` (**default**) | Plugin **executor** runs **antigravity → codex → xai → tavily** (built-ins via `host.model.*`, Tavily in-plugin). On **429/503/502**, tries the next backend in the same request. Backends that fail often are **deprioritized on later requests** (in-memory penalty; no extra config). |
17+
| `antigravity_google` / `codex_web_search` / `xai_web_search` / `tavily` | Same orchestration for that backend’s chain member(s): execution retry + penalty apply when multiple backends are eligible. |
18+
| `default_provider` | `default_provider` + optional `default_provider_model` via built-in AuthManager (not orchestrated). |
19+
Routing for `fallback` requires at least one runnable backend (providers in `AvailableProviders` where needed, resolvable antigravity model, or `tavily_api_keys`).
20+
21+
### xAI web search notes (aligned with upstream docs)
22+
23+
- **Model**: xAI documents `grok-4.3` for server-side `web_search`. This example sets `TargetModel` to **`grok-4.3`** when `xai_model` is empty (do not forward `claude-sonnet-4-6` to xAI).
24+
- **Request shape**: Responses API `input` + `tools[]` with `"type": "web_search"`. Optional `filters.allowed_domains` / `filters.excluded_domains` (max 5 each, mutually exclusive).
25+
- **Claude mapping today**: `internal/translator/codex/claude` copies Claude `allowed_domains``filters.allowed_domains`. Claude `blocked_domains` is **not** mapped to `excluded_domains` yet.
26+
- **Executor**: `xai_executor` normalizes tools (drops unsupported `external_web_access` if present) and posts to `/responses`.
27+
- **Response**: Citations / server tool metadata come back through OpenAI Responses SSE and are converted toward Claude `server_tool_use` / `web_search_tool_result` where the response translator supports it.
28+
29+
## Configuration
30+
31+
Plugin config lives under `plugins.configs.claude-web-search-router` (key must match the plugin name). Load the shared library via `plugins.path`.
32+
33+
### Recommended: fallback chain (default)
34+
35+
Tries **antigravity → codex → xai → tavily**; configure `tavily_api_keys` so the last step can succeed when built-in providers are missing or unavailable.
36+
37+
```yaml
38+
plugins:
39+
path:
40+
- /absolute/path/to/examples/plugin/bin/claude-web-search-router-go.dylib
41+
configs:
42+
claude-web-search-router:
43+
enabled: true
44+
priority: 20
45+
route: fallback
46+
antigravity_model: "" # empty: registry lookup, then first supports_web_search
47+
codex_model: "gpt-5.4-mini"
48+
xai_model: "grok-4.3"
49+
tavily_api_keys:
50+
- "tvly-xxxxxxxx"
51+
# - "tvly-yyyyyyyy" # optional: round-robin
52+
require_web_search_only: true
53+
```
54+
55+
Omit `route` to use the same default (`fallback`).
56+
57+
### Minimal fallback (Tavily as last resort only)
58+
59+
```yaml
60+
plugins:
61+
configs:
62+
claude-web-search-router:
63+
enabled: true
64+
priority: 20
65+
route: fallback
66+
tavily_api_keys:
67+
- "tvly-xxxxxxxx"
68+
require_web_search_only: true
69+
```
70+
71+
### Single backend (no fallback)
72+
73+
**Antigravity only:**
74+
75+
```yaml
76+
plugins:
77+
configs:
78+
claude-web-search-router:
79+
enabled: true
80+
priority: 20
81+
route: antigravity_google
82+
antigravity_model: "gemini-3.1-flash-lite"
83+
require_web_search_only: true
84+
```
85+
86+
**Codex only:**
87+
88+
```yaml
89+
plugins:
90+
configs:
91+
claude-web-search-router:
92+
enabled: true
93+
priority: 20
94+
route: codex_web_search
95+
codex_model: "gpt-5.4-mini"
96+
require_web_search_only: true
97+
```
98+
99+
**xAI only:**
100+
101+
```yaml
102+
plugins:
103+
configs:
104+
claude-web-search-router:
105+
enabled: true
106+
priority: 20
107+
route: xai_web_search
108+
xai_model: "grok-4.3"
109+
require_web_search_only: true
110+
```
111+
112+
**Tavily only (plugin executor):**
113+
114+
```yaml
115+
plugins:
116+
configs:
117+
claude-web-search-router:
118+
enabled: true
119+
priority: 20
120+
route: tavily
121+
tavily_api_keys:
122+
- "tvly-xxxxxxxx"
123+
require_web_search_only: true
124+
```
125+
126+
**Built-in provider via `default_provider`:**
127+
128+
```yaml
129+
plugins:
130+
configs:
131+
claude-web-search-router:
132+
enabled: true
133+
priority: 20
134+
route: default_provider
135+
default_provider: claude
136+
default_provider_model: ""
137+
require_web_search_only: true
138+
```
139+
140+
### Disable or relax detection
141+
142+
```yaml
143+
plugins:
144+
configs:
145+
claude-web-search-router:
146+
enabled: false # plugin declines; host may use default Claude path
147+
148+
# Or keep enabled but allow mixed tool lists:
149+
claude-web-search-router:
150+
enabled: true
151+
route: fallback
152+
require_web_search_only: false
153+
```
154+
155+
### Config field reference
156+
157+
| Field | Description |
158+
| ----- | ----------- |
159+
| `enabled` | `false` → `Handled: false` for all web_search matches |
160+
| `priority` | Host plugin order for ModelRouter (higher runs earlier; see main repo plugins docs) |
161+
| `route` | `fallback` (default), `antigravity_google`, `codex_web_search`, `xai_web_search`, `tavily`, `default_provider` |
162+
| `antigravity_model` | Antigravity execution model; never the client Claude model name |
163+
| `codex_model` | Codex model; empty → `gpt-5.4-mini` |
164+
| `xai_model` | xAI model; empty → `grok-4.3` |
165+
| `default_provider` / `default_provider_model` | Used when `route=default_provider` |
166+
| `tavily_api_keys` | Required for `route=tavily` or fallback last step |
167+
| `require_web_search_only` | `true` matches Claude Code–style exclusive `web_search` tools |
168+
169+
## Build
170+
171+
```bash
172+
make -C examples/plugin bin/claude-web-search-router-go.dylib
173+
```
174+
175+
Use `.so` on Linux and `.dll` on Windows. Point `plugins.path` at the built artifact.
Lines changed: 173 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,173 @@
1+
package main
2+
3+
import (
4+
"encoding/json"
5+
"fmt"
6+
"strings"
7+
"time"
8+
)
9+
10+
type claudeStreamBuilder struct {
11+
model string
12+
messageID string
13+
toolUseID string
14+
index int
15+
inputTokens int
16+
}
17+
18+
func newClaudeStreamBuilder(model string) *claudeStreamBuilder {
19+
model = strings.TrimSpace(model)
20+
if model == "" {
21+
model = "claude-sonnet-4-6"
22+
}
23+
now := time.Now().UnixNano()
24+
return &claudeStreamBuilder{
25+
model: model,
26+
messageID: fmt.Sprintf("msg_%x", now),
27+
toolUseID: fmt.Sprintf("srvtoolu_%d", now),
28+
inputTokens: 85,
29+
}
30+
}
31+
32+
func (b *claudeStreamBuilder) buildStreamWithQuery(query string, hits []claudeWebSearchHit, answer string) []byte {
33+
var chunks []string
34+
chunks = append(chunks, b.event("message_start", map[string]any{
35+
"type": "message_start",
36+
"message": map[string]any{
37+
"id": b.messageID, "type": "message", "role": "assistant", "content": []any{},
38+
"model": b.model, "stop_reason": nil, "stop_sequence": nil,
39+
"usage": map[string]any{"input_tokens": b.inputTokens, "output_tokens": 0},
40+
},
41+
}))
42+
chunks = append(chunks, b.blockStart(b.index, map[string]any{
43+
"type": "server_tool_use", "id": b.toolUseID, "name": "web_search", "input": map[string]any{},
44+
}))
45+
partial, _ := json.Marshal(map[string]string{"query": query})
46+
chunks = append(chunks, b.event("content_block_delta", map[string]any{
47+
"type": "content_block_delta", "index": b.index,
48+
"delta": map[string]any{"type": "input_json_delta", "partial_json": string(partial)},
49+
}))
50+
chunks = append(chunks, b.event("content_block_stop", map[string]any{"type": "content_block_stop", "index": b.index}))
51+
b.index++
52+
53+
resultContent := webSearchResultBlocks(hits)
54+
chunks = append(chunks, b.blockStart(b.index, map[string]any{
55+
"type": "web_search_tool_result", "tool_use_id": b.toolUseID, "content": resultContent,
56+
}))
57+
chunks = append(chunks, b.event("content_block_stop", map[string]any{"type": "content_block_stop", "index": b.index}))
58+
b.index++
59+
60+
text := composeAnswerText(answer, hits)
61+
outputTokens := estimateTokens(text)
62+
chunks = append(chunks, b.blockStart(b.index, map[string]any{"type": "text", "text": ""}))
63+
chunks = append(chunks, b.event("content_block_delta", map[string]any{
64+
"type": "content_block_delta", "index": b.index,
65+
"delta": map[string]any{"type": "text_delta", "text": text},
66+
}))
67+
chunks = append(chunks, b.event("content_block_stop", map[string]any{"type": "content_block_stop", "index": b.index}))
68+
69+
chunks = append(chunks, b.event("message_delta", map[string]any{
70+
"type": "message_delta",
71+
"delta": map[string]any{"stop_reason": "end_turn", "stop_sequence": nil},
72+
"usage": map[string]any{
73+
"input_tokens": b.inputTokens, "output_tokens": outputTokens,
74+
"server_tool_use": map[string]any{"web_search_requests": 1},
75+
},
76+
}))
77+
chunks = append(chunks, b.event("message_stop", map[string]any{"type": "message_stop"}))
78+
return []byte(strings.Join(chunks, ""))
79+
}
80+
81+
func (b *claudeStreamBuilder) buildMessageJSON(query string, hits []claudeWebSearchHit, answer string) []byte {
82+
text := composeAnswerText(answer, hits)
83+
content := []map[string]any{
84+
{"type": "server_tool_use", "id": b.toolUseID, "name": "web_search", "input": map[string]string{"query": query}},
85+
{"type": "web_search_tool_result", "tool_use_id": b.toolUseID, "content": webSearchResultBlocks(hits)},
86+
{"type": "text", "text": text},
87+
}
88+
out := map[string]any{
89+
"id": b.messageID, "type": "message", "role": "assistant", "model": b.model,
90+
"content": content, "stop_reason": "end_turn", "stop_sequence": nil,
91+
"usage": map[string]any{
92+
"input_tokens": b.inputTokens, "output_tokens": estimateTokens(text),
93+
"server_tool_use": map[string]any{"web_search_requests": 1},
94+
},
95+
}
96+
raw, _ := json.Marshal(out)
97+
return raw
98+
}
99+
100+
func webSearchResultBlocks(hits []claudeWebSearchHit) []map[string]any {
101+
resultContent := make([]map[string]any, 0, len(hits))
102+
for _, hit := range hits {
103+
title := hit.Title
104+
if title == "" {
105+
title = hostFromURL(hit.URL)
106+
}
107+
resultContent = append(resultContent, map[string]any{
108+
"type": "web_search_result", "title": title, "url": hit.URL, "page_age": nil,
109+
})
110+
}
111+
return resultContent
112+
}
113+
114+
func (b *claudeStreamBuilder) event(eventType string, data map[string]any) string {
115+
raw, _ := json.Marshal(data)
116+
return fmt.Sprintf("event: %s\ndata: %s\n\n", eventType, string(raw))
117+
}
118+
119+
func (b *claudeStreamBuilder) blockStart(index int, block map[string]any) string {
120+
return b.event("content_block_start", map[string]any{
121+
"type": "content_block_start", "index": index, "content_block": block,
122+
})
123+
}
124+
125+
func composeAnswerText(answer string, hits []claudeWebSearchHit) string {
126+
if strings.TrimSpace(answer) != "" {
127+
return answer
128+
}
129+
if len(hits) == 0 {
130+
return "No web search results were returned."
131+
}
132+
var buf strings.Builder
133+
for i, hit := range hits {
134+
if i > 0 {
135+
buf.WriteString("\n\n")
136+
}
137+
if hit.Title != "" {
138+
buf.WriteString(hit.Title)
139+
buf.WriteString("\n")
140+
}
141+
if hit.URL != "" {
142+
buf.WriteString(hit.URL)
143+
buf.WriteString("\n")
144+
}
145+
if hit.Snippet != "" {
146+
buf.WriteString(hit.Snippet)
147+
}
148+
}
149+
return buf.String()
150+
}
151+
152+
func hostFromURL(raw string) string {
153+
raw = strings.TrimSpace(raw)
154+
if raw == "" {
155+
return ""
156+
}
157+
withoutScheme := raw
158+
if idx := strings.Index(raw, "://"); idx >= 0 {
159+
withoutScheme = raw[idx+3:]
160+
}
161+
if slash := strings.Index(withoutScheme, "/"); slash >= 0 {
162+
return withoutScheme[:slash]
163+
}
164+
return withoutScheme
165+
}
166+
167+
func estimateTokens(text string) int {
168+
n := len([]rune(text)) / 4
169+
if n < 1 {
170+
return 1
171+
}
172+
return n
173+
}
Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
package main
2+
3+
import "testing"
4+
5+
func TestConfigurePreservesDefaultBooleansWhenConfigIsPartial(t *testing.T) {
6+
raw := mustJSON(t, lifecycleRequest{ConfigYAML: []byte("route: codex_web_search\n")})
7+
8+
if errConfigure := configure(raw); errConfigure != nil {
9+
t.Fatalf("configure() error = %v", errConfigure)
10+
}
11+
12+
cfg := loadedConfig()
13+
if !cfg.Enabled {
14+
t.Fatal("Enabled = false, want default true")
15+
}
16+
if !cfg.RequireWebSearchOnly {
17+
t.Fatal("RequireWebSearchOnly = false, want default true")
18+
}
19+
if cfg.Route != string(backendCodexWebSearch) {
20+
t.Fatalf("Route = %q, want codex_web_search", cfg.Route)
21+
}
22+
}

0 commit comments

Comments
 (0)