feat(ai-proxy): add provider-aware max_tokens override with priority control #13251
Baoyuantop merged 12 commits into apache:master
Pull request overview
Adds a new override.request_body configuration to ai-proxy and ai-proxy-multi to allow protocol-specific, deep-merged request body overrides after protocol conversion (and after options), enabling operators to set nested/provider-specific parameters reliably.
Changes:
- Introduces a deep-merge helper and applies per-target-protocol request body patches in the provider request build path.
- Extends ai-proxy schemas to validate override.request_body keys against the registered protocol set (dynamic via ai-protocols.names()).
- Adds a dedicated test suite covering schema validation, deep-merge semantics, precedence, multi-instance behavior, and converter (client→target protocol) behavior; updates EN/ZH docs.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| apisix/plugins/ai-proxy/merge.lua | Adds deep-merge helper used for request body overrides. |
| apisix/plugins/ai-proxy/schema.lua | Adds override.request_body schema and reuses a shared override_schema. |
| apisix/plugins/ai-proxy/base.lua | Passes override.request_body into provider build options (extra_opts). |
| apisix/plugins/ai-providers/base.lua | Applies protocol-keyed deep-merge into outgoing request body after options. |
| apisix/plugins/ai-protocols/init.lua | Exposes names() to enumerate registered protocol names. |
| t/plugin/ai-proxy-request-body-override.t | Adds coverage for schema rejection, merge semantics, precedence, converter path, and backward compatibility. |
| docs/en/latest/plugins/ai-proxy.md | Documents override.request_body for ai-proxy. |
| docs/en/latest/plugins/ai-proxy-multi.md | Documents instances.override.request_body for ai-proxy-multi. |
| docs/zh/latest/plugins/ai-proxy.md | Documents override.request_body for ai-proxy (ZH). |
| docs/zh/latest/plugins/ai-proxy-multi.md | Documents instances.override.request_body for ai-proxy-multi (ZH). |
Adds override.request_body to ai-proxy and ai-proxy-multi, letting
operators set arbitrary nested fields on the outgoing request body,
keyed by target protocol.
The existing options field can only overwrite top-level fields and
is protocol-agnostic, so it cannot express protocol-specific params
like max_tokens vs max_output_tokens vs generationConfig.maxOutputTokens.
request_body is keyed by target protocol (openai-chat, openai-responses,
openai-embeddings, anthropic-messages) because converters only do
structural format conversion, not per-parameter semantic normalization.
The override is applied after converter + options, deep-merged into
the body: objects recursive, scalars/arrays replace wholesale.
Examples:
    override:
      request_body:
        openai-chat: { max_tokens: 500 }
        openai-responses: { max_output_tokens: 500 }
        anthropic-messages: { max_tokens: 500, stop_sequences: ['Human:'] }
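The merge rule described in this commit message (objects merge recursively, scalars and arrays replace wholesale, patch selected by target protocol) can be illustrated with a minimal sketch. It is written in Python for brevity and is not the plugin's Lua code; the function names `deep_merge` and `apply_request_body_override` and the sample request bodies are assumptions made for illustration.

```python
from copy import deepcopy


def deep_merge(target, patch):
    """Merge patch into target in place: dicts recurse, everything else replaces."""
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are objects: descend and merge key by key.
            deep_merge(target[key], value)
        else:
            # Scalars and lists (arrays) replace the existing value wholesale.
            target[key] = deepcopy(value)
    return target


def apply_request_body_override(body, request_body_override, target_protocol):
    """Pick the patch keyed by the target protocol and merge it into body."""
    patch = request_body_override.get(target_protocol)
    if patch:
        deep_merge(body, patch)
    return body


# Hypothetical override config mirroring the example above.
override = {
    "openai-chat": {"max_tokens": 500},
    "anthropic-messages": {"max_tokens": 500, "stop_sequences": ["Human:"]},
}

# Hypothetical converted request body for an anthropic-messages target.
body = {
    "model": "example-model",
    "stop_sequences": ["User:", "Assistant:"],
    "metadata": {"user_id": "u1"},
}
apply_request_body_override(body, override, "anthropic-messages")

# Nested objects are merged rather than replaced.
nested = deep_merge(
    {"generationConfig": {"temperature": 0.1}},
    {"generationConfig": {"maxOutputTokens": 500}},
)
```

After the call, `body` carries `max_tokens` set to 500, its `stop_sequences` list is replaced rather than concatenated, and `metadata` is left as it was because no patch key touches it.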
6258162 to 0210be4
…redundant backward-compat case
…sed together in practice)
Add override.request_body_force_override (boolean, default false) to control whether override values or client request body fields take priority:
- false (default): client fields win, override fills missing fields only
- true: override values forcefully overwrite client fields
Instead of swapping deep_merge arguments and deepcopy-ing the patch table, pass a force boolean directly into deep_merge. This way we always iterate over the (smaller) patch table and decide at the leaf level whether patch or target wins, avoiding the overhead of deepcopy entirely.
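A rough sketch of the approach this commit describes, iterating over the patch and deciding at each leaf who wins, might look like the following. It is Python rather than the plugin's Lua, and the exact signature and sample values are assumptions for illustration only.

```python
def deep_merge(target, patch, force):
    """Merge patch into target in place.

    force=False: existing leaves in target (client fields) are kept and the
                 patch only fills in keys that are missing.
    force=True:  patch leaves overwrite whatever target holds.
    """
    for key, value in patch.items():
        existing = target.get(key)
        if isinstance(value, dict) and isinstance(existing, dict):
            # Both sides are objects: recurse so the decision is made per leaf.
            deep_merge(existing, value, force)
        elif force or key not in target:
            # No copy is made; target now references the patch value directly.
            target[key] = value
    return target


# Hypothetical client body and operator patch.
patch = {"max_tokens": 500, "temperature": 0.2, "metadata": {"a": 9, "b": 2}}

client_wins = deep_merge(
    {"max_tokens": 100, "metadata": {"a": 1}}, patch, force=False
)
override_wins = deep_merge(
    {"max_tokens": 100, "metadata": {"a": 1}}, patch, force=True
)
```

Because values are assigned by reference rather than copied, this sketch assumes the patch table is treated as read-only after the merge.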
membphis left a comment
I think this plugin design is bad; it is not easy to control or understand:
    request_body:
      openai-chat: { max_tokens: 500, temperature: 0.2 }
      openai-responses: { max_output_tokens: 500 }
The better one:
request_body: { max_tokens: 500, temperature: 0.2 }
APISIX intelligently sets the corresponding fields based on the different upstream protocols.
We can include the mappings for which fields APISIX has built-in in the documentation to inform users.
(Using OpenAI's field naming conventions is recommended.)
…rovider capability hooks
Replace the per-protocol keyed request_body override design with a simpler
flat schema where users set max_tokens and APISIX automatically maps it to
the correct field name via rewrite_request_body hooks in provider capabilities.
Config changes from:
request_body: { "openai-chat": { max_tokens: 500 } }
to:
request_body: { max_tokens: 500 }
Each provider's capability entry now has a rewrite_request_body(body, override, force)
hook that sets the provider-native field:
- OpenAI: max_completion_tokens
- OpenAI Responses: max_output_tokens
- Gemini/Vertex-AI: max_completion_tokens
- DeepSeek/Anthropic/OpenRouter/AIMLAPI/Azure/Compatible: max_tokens
Removed:
- merge.lua (deep merge no longer needed)
- protocols.names() (no longer needed by schema)
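The hook design in this commit message can be sketched as a small factory that binds each provider to its native field name. This is a Python illustration under stated assumptions, not the plugin's Lua code: the provider keys, the `CAPABILITIES` table, and the factory name `make_rewrite_request_body` are hypothetical, and the field names are taken from the list above.

```python
# Provider -> native field, following the list in the commit message.
# The provider keys themselves are illustrative, not APISIX identifiers.
NATIVE_MAX_TOKENS_FIELD = {
    "openai": "max_completion_tokens",
    "openai-responses": "max_output_tokens",
    "gemini": "max_completion_tokens",
    "deepseek": "max_tokens",
    "anthropic": "max_tokens",
}


def make_rewrite_request_body(native_field):
    """Build a rewrite_request_body(body, override, force) hook for one provider."""
    def rewrite_request_body(body, override, force):
        value = override.get("max_tokens")
        if value is None:
            return body  # nothing configured, leave the body alone
        if force or native_field not in body:
            # Either the operator forces the value or the client did not set it.
            body[native_field] = value
        return body
    return rewrite_request_body


# Hypothetical capability table: one entry per provider, each with its hook.
CAPABILITIES = {
    name: {"rewrite_request_body": make_rewrite_request_body(field)}
    for name, field in NATIVE_MAX_TOKENS_FIELD.items()
}

override = {"max_tokens": 500}
openai_hook = CAPABILITIES["openai"]["rewrite_request_body"]

filled = openai_hook({"model": "example-model"}, override, False)
kept = openai_hook({"max_completion_tokens": 100}, override, False)
forced = openai_hook({"max_completion_tokens": 100}, override, True)
```

With this shape, the user-facing config stays flat (`max_tokens` only) and the provider-specific naming lives in one place per provider.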
Rename local helper functions in providers with direct (non-factory) implementations for consistency with the capability field name.
aad6ac8 to 45e5df4
45e5df4 to 963d1e9
CI failures are all unrelated to this PR:
The …
Description
Add override.request_body.max_tokens to ai-proxy and ai-proxy-multi plugins. Users set a single max_tokens value and APISIX automatically maps it to the correct field name for each provider via rewrite_request_body hooks in provider capabilities.
Config example
{
  "override": {
    "request_body": { "max_tokens": 500 },
    "request_body_force_override": false
  }
}
Provider field mapping
max_completion_tokens, max_output_tokens, max_tokens, max_output_tokens, max_tokens, max_tokens, max_completion_tokens, max_tokens, max_tokens, max_tokens, max_completion_tokens
Priority control
- request_body_force_override: false (default): client request fields take priority, override only fills missing fields
- request_body_force_override: true: override forcefully overwrites client fields
Design
Each provider capability entry has an optional rewrite_request_body(body, override, force) hook function. This is called in build_request() after protocol conversion and prepare_outgoing_request(). The hook directly mutates the request body, making it easy to extend for nested objects or future fields.
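The call order described in the Design section can be sketched roughly as follows. This is a Python illustration, not the actual Lua build_request(): the parameters `convert` and `prepare_outgoing_request` are stand-in callables, and the sample hook and capability table are assumptions.

```python
def build_request(client_body, plugin_conf, capability, convert, prepare_outgoing_request):
    """Sketch of the ordering: convert, prepare, then the optional hook."""
    body = convert(client_body)              # protocol conversion
    body = prepare_outgoing_request(body)    # provider-specific preparation

    override_conf = plugin_conf.get("override", {})
    request_body = override_conf.get("request_body")
    hook = capability.get("rewrite_request_body")  # optional per provider
    if request_body and hook:
        force = override_conf.get("request_body_force_override", False)
        hook(body, request_body, force)      # mutates body in place
    return body


def responses_hook(body, override, force):
    """Hypothetical hook for a provider whose native field is max_output_tokens."""
    if "max_tokens" in override and (force or "max_output_tokens" not in body):
        body["max_output_tokens"] = override["max_tokens"]


# Config taken from the example in the description.
conf = {
    "override": {
        "request_body": {"max_tokens": 500},
        "request_body_force_override": False,
    }
}

identity = lambda b: dict(b)  # stand-in for conversion and preparation steps

with_hook = build_request(
    {"input": "hi"}, conf, {"rewrite_request_body": responses_hook}, identity, identity
)
without_hook = build_request({"input": "hi"}, conf, {}, identity, identity)
```

A provider without the hook simply passes the body through unchanged, which is what makes the hook optional.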