Commit e7eb887

feat: model_override adapter with parameterized adapter config (#438)
## Why

Some LLM providers don't respect reasoning-related fields in the request body. Instead, they expose reasoning variants as **separate model names** — e.g. `deepseek-r1` (with reasoning) vs `deepseek-r1-fast` (without). When a client sends `reasoning: { effort: "none" }` or `enable_thinking: false`, the provider ignores it because the client addressed the "thinking" model, not the "fast" one.

Plexus currently has no way to bridge this gap — the router picks a target model, and that's what gets sent to the provider. There was no mechanism to say "if the client is asking for no reasoning, switch to the fast variant instead."

## What This Enables

**Conditional model rewriting based on request content.** You can now configure rules that say:

> "When the target model is `deepseek-r1` AND any of these conditions are true (reasoning disabled, thinking turned off, budget set to zero, etc.), rewrite the model name to `deepseek-r1-fast` before sending it to the provider."

This means:

- Clients can keep using a single model alias (e.g. `glm-5.1`) and let Plexus automatically pick the right provider variant based on what features they're requesting
- Providers that use separate model names for reasoning/non-reasoning variants are now first-class citizens in Plexus — no manual model switching required
- The rewrite is transparent to the client — they still see the canonical model name in billing/usage tracking

## Bonus Fix

The OpenAI chat→chat transformer was silently dropping any fields it didn't explicitly know about (e.g. `enable_thinking`, `chat_template_kwargs`, `thinking_budget`, `budget_tokens`, `thinking`). This meant even if you configured conditions for those fields, the adapter could never match them because the fields were gone by the time it saw the payload.

Now, same-API-type transformations (chat→chat) preserve all unknown fields from the original request body, overlaying only the fields that need explicit transformation. This makes the adapter useful for any non-standard field a provider might use.

## How to Use It

Configure the `model_override` adapter on a **per-model** basis in the provider settings (via the Plexus UI or management API):

1. Open the provider in the UI, expand the model entry (e.g. `zai-org/GLM-5.1-FP8`)
2. Under **Model Adapters**, enable **Model Override**
3. Add rules specifying:
   - **Match model** — the provider model name to match against (e.g. `zai-org/GLM-5.1-FP8`)
   - **Rewrite to** — the model name to send instead (e.g. `glm-5.1-fast`)
   - **Conditions** — dotted-path fields and values, any match triggers the rewrite (OR semantics)

Example conditions for detecting "no reasoning" requests:

| Field path | Value |
|---|---|
| `reasoning.enabled` | `false` |
| `reasoning.effort` | `"none"` |
| `enable_thinking` | `false` |
| `thinking.type` | `"disabled"` |
| `chat_template_kwargs.enable_thinking` | `false` |
| `thinking_budget` | `0` |
| `budget_tokens` | `0` |

If `value` is left empty, the condition matches on field presence alone (any value).

## Implementation Notes

- **Adapter config** upgraded from bare strings to uniform `{ name, options }` entries. Legacy format is normalized lazily on read — no DB migration needed, rows self-heal on next save
- **Adapter interface** gains optional `options` parameter on all hooks; existing adapters are backward-compatible
- **Frontend** includes an inline rule editor for model_override conditions in the per-model adapter section
- 1344 backend tests passing, frontend build + typecheck + lint clean
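The field-preservation behaviour described under "Bonus Fix" can be pictured with a minimal sketch. This is not the Plexus transformer itself: `Body`, `transformKnownFields`, and `transformChatToChat` are hypothetical names introduced only for illustration, and the set of "known" fields is an assumption.

```typescript
// Hypothetical shape: a request body is treated as an arbitrary JSON object.
type Body = Record<string, unknown>;

// Assumed stand-in for the fields a chat→chat transformer handles explicitly.
function transformKnownFields(body: Body): Body {
  const out: Body = {};
  if (typeof body["model"] === "string") out["model"] = body["model"];
  if (Array.isArray(body["messages"])) out["messages"] = body["messages"];
  return out;
}

// Keep every original field, then overlay the explicitly transformed ones.
function transformChatToChat(original: Body): Body {
  return { ...original, ...transformKnownFields(original) };
}

const outbound = transformChatToChat({
  model: "deepseek-r1",
  messages: [{ role: "user", content: "hi" }],
  enable_thinking: false, // non-standard field that survives the transformation
});
```

Because the original body is spread first, any field the transformer does not know about is still present when an adapter later inspects the payload.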
1 parent 989140d commit e7eb887

17 files changed

Lines changed: 1105 additions & 115 deletions

docs/CONFIGURATION.md

Lines changed: 61 additions & 0 deletions
@@ -140,6 +140,7 @@ Adapters can be set at **provider level** (applied to every model under the prov
 |---------|-------------|
 | `reasoning_content` | Renames `reasoning` / `thinking.content` → `reasoning_content` on outbound assistant messages for providers that use Fireworks/DeepSeek field naming (e.g. Fireworks DeepSeek-R1). Fixes *"Extra inputs are not permitted, field: messages[N].reasoning"* errors. |
 | `suppress_developer_role` | Rewrites the `developer` role to `system` on outbound messages for providers that do not support the newer OpenAI `developer` role. |
+| `model_override` | Conditionally rewrites the provider model name based on request payload fields. Used for providers that expose reasoning variants as separate model names rather than respecting reasoning-related fields in the request body. See [Model Override Adapter](#model-override-adapter) below. |

 **Example — provider-level:**
 ```json
@@ -164,6 +165,66 @@ PUT /v0/management/providers/fireworks

 Adapters are applied in order on outbound (preDispatch) and in reverse on inbound (postDispatch). Pass-through optimisation is automatically disabled when any adapter is active.

+### Model Override Adapter
+
+The `model_override` adapter conditionally rewrites the provider model name based on the values or presence of fields in the request payload. This is useful for providers that expose reasoning variants as **separate model names** (e.g. `model-name` with reasoning, `model-name-fast` without) rather than respecting reasoning-related fields in the request body.
+
+**How it works:**
+
+When the resolved provider model matches a rule's `model` field AND **any** of the rule's conditions are satisfied (OR semantics), the model name is rewritten to `rewriteTo`. Conditions use dotted paths into the request payload.
+
+**Configuration:**
+
+The `model_override` adapter is configured at **model level** only (not provider level). It accepts a `rules` array in its options:
+
+```json
+{
+  "models": {
+    "zai-org/GLM-5.1-FP8": {
+      "adapter": [
+        {
+          "name": "model_override",
+          "options": {
+            "rules": [
+              {
+                "model": "zai-org/GLM-5.1-FP8",
+                "rewriteTo": "glm-5.1-fast",
+                "conditions": [
+                  { "field": "enable_thinking", "value": false },
+                  { "field": "reasoning.enabled", "value": false },
+                  { "field": "reasoning.effort", "value": "none" },
+                  { "field": "budget_tokens", "value": 0 }
+                ]
+              }
+            ]
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+**Rule fields:**
+
+| Field | Description |
+|-------|-------------|
+| `model` | The provider model name to match against (must match the resolved target model) |
+| `rewriteTo` | The model name to send to the provider instead |
+| `conditions` | Array of conditions; **any** match triggers the rewrite |
+
+**Condition fields:**
+
+| Field | Description |
+|-------|-------------|
+| `field` | Dotted path into the request payload (e.g. `reasoning.enabled`, `chat_template_kwargs.enable_thinking`) |
+| `value` | Value to match (strict equality). If omitted, the condition matches when the field is present (any value) |
+
+**Notes:**
+- The adapter operates on the **transformed provider payload**, so fields must survive API transformation to be matchable. For chat-to-chat requests, all fields from the original request body are preserved (including non-standard fields like `enable_thinking`, `thinking_budget`, etc.).
+- Multiple rules are evaluated in order; only the first matching rule applies.
+- The rewrite is transparent to the client — billing and usage tracking still reference the original canonical model name.
+
 ### Provider Quota Checkers

 Quota checkers monitor upstream provider rate limits and prevent routing to exhausted providers.
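The rule evaluation documented in the Model Override Adapter section (dotted paths, OR semantics across conditions, strict equality, presence matching when `value` is omitted, first matching rule wins) might look roughly like the sketch below. It is an illustration under stated assumptions, not the adapter's actual source: `getPath`, `conditionMatches`, and `applyModelOverride` are hypothetical names, and treating an explicitly `undefined` `value` the same as an omitted one is an assumption.

```typescript
interface Condition { field: string; value?: unknown }
interface Rule { model: string; rewriteTo: string; conditions: Condition[] }

// Walk a dotted path such as "reasoning.effort" and report whether it exists.
function getPath(payload: unknown, path: string): { found: boolean; value?: unknown } {
  let current: unknown = payload;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object" || !(key in (current as object))) {
      return { found: false };
    }
    current = (current as Record<string, unknown>)[key];
  }
  return { found: true, value: current };
}

function conditionMatches(payload: unknown, cond: Condition): boolean {
  const hit = getPath(payload, cond.field);
  if (!hit.found) return false;
  // Omitted value: presence alone is enough. Otherwise compare with strict equality.
  return cond.value === undefined ? true : hit.value === cond.value;
}

function applyModelOverride(
  payload: Record<string, unknown>,
  rules: Rule[],
): Record<string, unknown> {
  for (const rule of rules) {
    if (payload["model"] !== rule.model) continue;
    // OR semantics: any single satisfied condition triggers the rewrite.
    if (rule.conditions.some((c) => conditionMatches(payload, c))) {
      return { ...payload, model: rule.rewriteTo }; // first matching rule wins
    }
  }
  return payload;
}

const rules: Rule[] = [
  {
    model: "zai-org/GLM-5.1-FP8",
    rewriteTo: "glm-5.1-fast",
    conditions: [
      { field: "enable_thinking", value: false },
      { field: "reasoning.effort", value: "none" },
    ],
  },
];

const rewritten = applyModelOverride(
  { model: "zai-org/GLM-5.1-FP8", reasoning: { effort: "none" } },
  rules,
);
const untouched = applyModelOverride(
  { model: "zai-org/GLM-5.1-FP8", reasoning: { effort: "high" } },
  rules,
);
```

In this sketch the first call matches on `reasoning.effort` and has its model rewritten, while the second matches no condition and is returned unchanged.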

docs/openapi/components/schemas/ProviderConfig.yaml

Lines changed: 40 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -109,11 +109,25 @@ properties:
109109
- type: string
110110
- type: array
111111
items:
112-
type: string
112+
oneOf:
113+
- type: string
114+
- type: object
115+
required:
116+
- name
117+
properties:
118+
name:
119+
type: string
120+
description: Adapter name.
121+
options:
122+
type: object
123+
additionalProperties: true
124+
description: Adapter-specific options.
113125
description: >
114126
Adapter name(s) applied to this specific model. Appended after
115-
any provider-level adapters. See the provider-level `adapter`
116-
field for available values.
127+
any provider-level adapters. Accepts bare strings or
128+
`{ name, options }` objects for parameterized adapters
129+
(e.g. `model_override` with rules). See the provider-level
130+
`adapter` field for available values.
117131
description: >
118132
Map of model ID → per-model configuration. Allows per-model pricing
119133
overrides and access aliases.
@@ -176,13 +190,28 @@ properties:
176190
- type: string
177191
- type: array
178192
items:
179-
type: string
193+
oneOf:
194+
- type: string
195+
- type: object
196+
required:
197+
- name
198+
properties:
199+
name:
200+
type: string
201+
description: Adapter name.
202+
options:
203+
type: object
204+
additionalProperties: true
205+
description: Adapter-specific options.
180206
description: >
181207
Adapter name(s) applied to every model under this provider.
182208
Adapters rewrite outbound request payloads (preDispatch) and inbound
183209
provider responses (postDispatch) to fix provider-specific field-name
184210
incompatibilities. Applied before model-level adapters.
185211
212+
Adapters can be specified as bare strings (backward compatible) or as
213+
objects with `{ name, options }` for adapters that require configuration.
214+
186215
Available adapters:
187216
188217
- `reasoning_content` — Renames `reasoning`/`thinking.content` →
@@ -192,6 +221,13 @@ properties:
192221
- `suppress_developer_role` — Rewrites the `developer` role to `system`
193222
for providers that do not support the OpenAI `developer` role.
194223
224+
- `model_override` — Conditionally rewrites the provider model name
225+
based on request payload fields. Accepts a `rules` array in options
226+
where each rule specifies a `model` to match, a `rewriteTo` model name,
227+
and an array of `conditions` (OR semantics) using dotted field paths.
228+
Used for providers that expose reasoning variants as separate model
229+
names rather than respecting reasoning-related request fields.
230+
195231
Pass-through optimisation is automatically disabled when any adapter
196232
is active.
197233
timeoutMs:

packages/backend/src/__tests__/adapters/adapter-resolver.test.ts

Lines changed: 68 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -3,10 +3,7 @@ import { resolveAdapters } from '../../services/adapter-resolver';
33
import type { RouteResult } from '../../services/router';
44

55
// Minimal RouteResult factory
6-
function makeRoute(
7-
providerAdapter?: string | string[],
8-
modelAdapter?: string | string[]
9-
): RouteResult {
6+
function makeRoute(providerAdapter?: any[], modelAdapter?: any[]): RouteResult {
107
return {
118
provider: 'test-provider',
129
model: 'test-model',
@@ -29,48 +26,86 @@ describe('resolveAdapters', () => {
2926
expect(resolveAdapters(route)).toHaveLength(0);
3027
});
3128

32-
it('resolves a provider-level string adapter', () => {
33-
const route = makeRoute('reasoning_content');
34-
const adapters = resolveAdapters(route);
35-
expect(adapters).toHaveLength(1);
36-
expect(adapters[0]!.name).toBe('reasoning_content');
29+
it('resolves a provider-level adapter entry', () => {
30+
const route = makeRoute([{ name: 'reasoning_content', options: {} }]);
31+
const resolved = resolveAdapters(route);
32+
expect(resolved).toHaveLength(1);
33+
expect(resolved[0]!.adapter.name).toBe('reasoning_content');
34+
expect(resolved[0]!.options).toEqual({});
3735
});
3836

39-
it('resolves a provider-level array adapter', () => {
40-
const route = makeRoute(['reasoning_content', 'suppress_developer_role']);
41-
const adapters = resolveAdapters(route);
42-
expect(adapters.map((a) => a.name)).toEqual(['reasoning_content', 'suppress_developer_role']);
37+
it('resolves a model-level adapter entry', () => {
38+
const route = makeRoute(undefined, [{ name: 'suppress_developer_role', options: {} }]);
39+
const resolved = resolveAdapters(route);
40+
expect(resolved).toHaveLength(1);
41+
expect(resolved[0]!.adapter.name).toBe('suppress_developer_role');
42+
expect(resolved[0]!.options).toEqual({});
4343
});
4444

45-
it('resolves a model-level string adapter', () => {
46-
const route = makeRoute(undefined, 'suppress_developer_role');
47-
const adapters = resolveAdapters(route);
48-
expect(adapters).toHaveLength(1);
49-
expect(adapters[0]!.name).toBe('suppress_developer_role');
45+
it('merges provider-level then model-level adapters in order', () => {
46+
const route = makeRoute(
47+
[{ name: 'reasoning_content', options: {} }],
48+
[{ name: 'suppress_developer_role', options: {} }]
49+
);
50+
const resolved = resolveAdapters(route);
51+
expect(resolved.map((r) => r.adapter.name)).toEqual([
52+
'reasoning_content',
53+
'suppress_developer_role',
54+
]);
5055
});
5156

52-
it('merges provider-level then model-level adapters in order', () => {
53-
const route = makeRoute('reasoning_content', 'suppress_developer_role');
54-
const adapters = resolveAdapters(route);
55-
expect(adapters.map((a) => a.name)).toEqual(['reasoning_content', 'suppress_developer_role']);
57+
it('passes options through from config', () => {
58+
const rules = [
59+
{
60+
model: 'deepseek-r1',
61+
rewriteTo: 'deepseek-r1-fast',
62+
conditions: [{ field: 'reasoning.enabled', value: false }],
63+
},
64+
];
65+
const route = makeRoute([{ name: 'model_override', options: { rules } }]);
66+
const resolved = resolveAdapters(route);
67+
expect(resolved).toHaveLength(1);
68+
expect(resolved[0]!.adapter.name).toBe('model_override');
69+
expect(resolved[0]!.options).toEqual({ rules });
5670
});
5771

5872
it('skips and warns on unknown adapter names (does not throw)', () => {
59-
const route = makeRoute('nonexistent_adapter');
60-
// Should not throw; unknown names are skipped
61-
const adapters = resolveAdapters(route);
62-
expect(adapters).toHaveLength(0);
73+
const route = makeRoute([{ name: 'nonexistent_adapter', options: {} }]);
74+
const resolved = resolveAdapters(route);
75+
expect(resolved).toHaveLength(0);
6376
});
6477

6578
it('handles mixed valid and invalid adapter names', () => {
66-
const route = makeRoute(['reasoning_content', 'bogus'], 'suppress_developer_role');
67-
const adapters = resolveAdapters(route);
68-
expect(adapters.map((a) => a.name)).toEqual(['reasoning_content', 'suppress_developer_role']);
79+
const route = makeRoute(
80+
[
81+
{ name: 'reasoning_content', options: {} },
82+
{ name: 'bogus', options: {} },
83+
],
84+
[{ name: 'suppress_developer_role', options: {} }]
85+
);
86+
const resolved = resolveAdapters(route);
87+
expect(resolved.map((r) => r.adapter.name)).toEqual([
88+
'reasoning_content',
89+
'suppress_developer_role',
90+
]);
91+
});
92+
93+
it('handles multiple provider-level adapter entries', () => {
94+
const route = makeRoute([
95+
{ name: 'reasoning_content', options: {} },
96+
{ name: 'suppress_developer_role', options: {} },
97+
]);
98+
const resolved = resolveAdapters(route);
99+
expect(resolved.map((r) => r.adapter.name)).toEqual([
100+
'reasoning_content',
101+
'suppress_developer_role',
102+
]);
69103
});
70104

71-
it('handles model-level array adapters', () => {
72-
const route = makeRoute(undefined, ['reasoning_content', 'suppress_developer_role']);
73-
const adapters = resolveAdapters(route);
74-
expect(adapters.map((a) => a.name)).toEqual(['reasoning_content', 'suppress_developer_role']);
105+
it('resolves model_override adapter', () => {
106+
const route = makeRoute([{ name: 'model_override', options: { rules: [] } }]);
107+
const resolved = resolveAdapters(route);
108+
expect(resolved).toHaveLength(1);
109+
expect(resolved[0]!.adapter.name).toBe('model_override');
75110
});
76111
});

packages/backend/src/config.ts

Lines changed: 55 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -61,7 +61,61 @@ const PricingSchema = z.discriminatedUnion('source', [
6161
}),
6262
]);
6363

64-
const AdapterConfigSchema = z.union([z.string(), z.array(z.string())]).optional();
64+
// ─── Adapter Config ─────────────────────────────────────────────────
65+
// Adapters are configured as an array of { name, options } entries.
66+
// Legacy bare-string forms are normalized at read time in config-repository.
67+
68+
const ModelOverrideConditionSchema = z.object({
69+
/** JSON dotted path into the payload (e.g. "reasoning.enabled", "reasoning.effort"). */
70+
field: z.string().min(1),
71+
/** If omitted, matches when the field is present (any value). If set, matches when value equals this. */
72+
value: z.any().optional(),
73+
});
74+
75+
const ModelOverrideRuleSchema = z.object({
76+
/** The model name in the payload to match against (e.g. "deepseek-r1"). */
77+
model: z.string().min(1),
78+
/** The model name to rewrite to when conditions match (e.g. "deepseek-r1-fast"). */
79+
rewriteTo: z.string().min(1),
80+
/** Conditions — ANY match triggers the rewrite (OR semantics). */
81+
conditions: z.array(ModelOverrideConditionSchema).min(1),
82+
});
83+
84+
const ModelOverrideOptionsSchema = z.object({
85+
rules: z.array(ModelOverrideRuleSchema).min(1),
86+
});
87+
88+
const AdapterEntrySchema = z.object({
89+
name: z.string().min(1),
90+
options: z.record(z.string(), z.any()).default({}),
91+
});
92+
93+
/**
94+
* Accepts both the legacy format (string | string[]) and the new
95+
* uniform format ({ name, options }[]) and normalizes everything
96+
* to AdapterEntry[]. This ensures backward compatibility with
97+
* existing YAML configs while enforcing the canonical shape at
98+
* validation time.
99+
*/
100+
const AdapterConfigSchema = z.preprocess((val) => {
101+
if (val === undefined || val === null) return undefined;
102+
// Already an array (or single entry) — normalize each element
103+
const arr = Array.isArray(val) ? val : [val];
104+
return arr.map((entry: any) => {
105+
if (typeof entry === 'string') {
106+
return { name: entry, options: {} };
107+
}
108+
if (entry && typeof entry === 'object' && 'name' in entry) {
109+
return { name: entry.name, options: entry.options ?? {} };
110+
}
111+
return entry; // Let Zod produce a clear validation error
112+
});
113+
}, z.array(AdapterEntrySchema).optional());
114+
115+
export type ModelOverrideCondition = z.infer<typeof ModelOverrideConditionSchema>;
116+
export type ModelOverrideRule = z.infer<typeof ModelOverrideRuleSchema>;
117+
export type ModelOverrideOptions = z.infer<typeof ModelOverrideOptionsSchema>;
118+
export type AdapterEntry = z.infer<typeof AdapterEntrySchema>;
65119

66120
const ModelProviderConfigSchema = z.object({
67121
pricing: PricingSchema.default({
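To make the legacy-to-uniform normalization in `AdapterConfigSchema` easier to follow, here is a standalone sketch of the same preprocessing step with Zod left out, so it runs without dependencies. `normalizeAdapterConfig` is a hypothetical name; it mirrors the `z.preprocess` callback from the diff and performs no validation, which in the real schema is left to `z.array(AdapterEntrySchema)`.

```typescript
// Mirrors the preprocess callback: legacy strings become { name, options } entries.
function normalizeAdapterConfig(val: unknown): unknown[] | undefined {
  if (val === undefined || val === null) return undefined;
  // A single entry (string or object) is wrapped into a one-element array.
  const arr: unknown[] = Array.isArray(val) ? val : [val];
  return arr.map((entry) => {
    if (typeof entry === "string") {
      return { name: entry, options: {} };
    }
    if (typeof entry === "object" && entry !== null && "name" in entry) {
      const e = entry as { name: unknown; options?: unknown };
      return { name: e.name, options: e.options ?? {} };
    }
    return entry; // left untouched so a later validation step can reject it
  });
}

// Legacy bare string.
const fromString = normalizeAdapterConfig("reasoning_content");
// Legacy string array mixed with a new-style entry that has no options.
const fromMixed = normalizeAdapterConfig(["suppress_developer_role", { name: "model_override" }]);
// Absent config stays absent.
const fromNothing = normalizeAdapterConfig(undefined);
```

Under these assumptions every accepted input shape ends up as an array of `{ name, options }` objects, which is why downstream code such as the resolver tests can rely on `options` always being defined.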
