feat(config): allow context length for OpenAI-compatible models #3794
EricLi404 wants to merge 8 commits into
This pull request targeted The base branch has been automatically changed to |
Code Review
This pull request introduces a ContextLength configuration option to override the advertised model context window for OpenAI compatibility models. This field is integrated into the model hashing mechanism, configuration mapping, and verified with unit tests. Feedback suggests validating and clamping negative ContextLength values to zero to prevent downstream issues.
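The clamping suggested in this review could look roughly like the sketch below. The helper name normalizeContextLength is hypothetical and is not taken from the PR; the only behavior assumed from the review is that negative values are clamped to zero, with zero meaning "no override".

```go
package main

// normalizeContextLength is a hypothetical helper (not the PR's actual code).
// It clamps a negative configured value to 0, which is treated as "unset",
// and returns non-negative values unchanged.
func normalizeContextLength(configured int) int {
	if configured < 0 {
		return 0
	}
	return configured
}
```

Calling a helper like this once, at the point where configuration is turned into model metadata, would keep downstream code from ever seeing a negative context window.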
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 329e758e01
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Match the management API convention where all config keys are kebab-case (base-url, api-key-entries). Using snake_case here would silently ignore 'context-length' in JSON payloads.
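The "silently ignore" behavior described in this commit message can be reproduced with Go's standard encoding/json package. The sketch below is illustrative only: the struct names, the name field, and the payload are hypothetical, and only the key context-length comes from the discussion.

```go
package main

import "encoding/json"

// kebabModel declares the kebab-case key used by the management API convention.
type kebabModel struct {
	Name          string `json:"name"`
	ContextLength int    `json:"context-length,omitempty"`
}

// snakeModel declares a snake_case key instead (the mismatch being warned about).
type snakeModel struct {
	Name          string `json:"name"`
	ContextLength int    `json:"context_length,omitempty"`
}

// decodeBoth decodes the same payload into both struct shapes and returns the
// context length each one ended up with.
func decodeBoth(payload []byte) (kebab int, snake int, err error) {
	var k kebabModel
	if err := json.Unmarshal(payload, &k); err != nil {
		return 0, 0, err
	}
	var s snakeModel
	if err := json.Unmarshal(payload, &s); err != nil {
		return 0, 0, err
	}
	return k.ContextLength, s.ContextLength, nil
}
```

For a payload such as {"name":"m","context-length":1000000}, the kebab-case struct receives 1000000 while the snake_case struct keeps its zero value. json.Unmarshal reports no error in either case, because unknown keys are ignored by default.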
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 610dc37785
/gemini Please review the latest head commit b1427c7. The PR now exposes context-length in both the Codex client catalog and the standard OpenAI-compatible /v1/models response, with tests covering both paths.
Code Review
This pull request introduces support for overriding the advertised model context window (context-length) in OpenAI compatibility configurations, updating the configuration schema, hashing logic, API response filtering, and model building processes, along with adding corresponding unit tests. Feedback highlights a backward compatibility issue and an overwrite bug in applyCodexClientContextWindowOverride and applyCodexClientModelMetadata where user-configured overrides could be ignored or template defaults could be unintentionally overwritten.
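One way to read this feedback is as a precedence rule: a positive configured value should win, and an unset value should leave any template default untouched. The sketch below assumes exactly that. The helper name applyContextWindowOverride and the map-based entry are hypothetical; this is not the signature of applyCodexClientContextWindowOverride in the PR.

```go
package main

// applyContextWindowOverride is a hypothetical sketch of the precedence rule.
// A positive configured value replaces whatever context_window the entry has;
// a zero or negative value leaves the existing (template) value in place.
func applyContextWindowOverride(entry map[string]any, configured int) {
	if configured <= 0 {
		return // no override configured: keep the template default
	}
	entry["context_window"] = configured
}
```

Guarding on configured > 0 before writing covers both failure modes named in the review: the override is always applied when present, and it never clobbers a default when absent.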
84182e4 to 95116e0
/gemini Please review the latest head commit 95116e0. The PR history was rewritten only to normalize commit author/committer identity; the code diff is unchanged from the prior head.
Code Review
This pull request introduces support for overriding the advertised context window (context length) for OpenAI compatibility models. It updates configuration structures, hashing logic, model metadata builders, and API handlers to support and expose this new property, backed by comprehensive unit tests. The review feedback suggests two key improvements: first, to only include context_length in the /v1/models response if it is strictly greater than zero to avoid misleading client applications; second, to simplify the signature of applyCodexClientContextWindowOverride by removing the unused id parameter.
Codex Review: Didn't find any major issues. You're on a roll.
Only advertise context_length in the standard models response when the parsed value is positive. Also removes the unused Codex context helper parameter and adds coverage for omitted zero values.
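A minimal sketch of the "only advertise when positive" rule from this commit message follows. The function name buildModelEntry and the id and object fields are assumptions made for illustration; only the context_length key and the positive-value condition come from the text.

```go
package main

// buildModelEntry is a hypothetical builder for one /v1/models entry.
// context_length is added only when the value is strictly positive, so
// clients never see a misleading "context_length": 0.
func buildModelEntry(id string, contextLength int) map[string]any {
	entry := map[string]any{
		"id":     id,
		"object": "model",
	}
	if contextLength > 0 {
		entry["context_length"] = contextLength
	}
	return entry
}
```

Omitting the key entirely, rather than emitting zero, lets a client distinguish "no information" from "a context window of zero tokens".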
/gemini review
Code Review
This pull request introduces a new context-length configuration option for OpenAI compatibility models, allowing users to override the advertised context window. The changes propagate this setting through the configuration, hashing, and API response handlers, and include clamping logic for negative values. Comprehensive unit tests have been added to verify the new behavior. There are no review comments, and I have no additional feedback to provide.
Summary
Add optional context-length configuration to openai-compatibility.models[], allowing proxy operators to override the advertised context window for OpenAI-compatible models.
Motivation
Some large-context models (e.g. DeepSeek V4 Pro, MiniMax-M3 1M, Mimo V2.5 Pro) support up to 1,000,000 tokens of context, but their upstream /v1/models endpoint may not advertise this, or a proxy may want to explicitly set it regardless of upstream metadata.
Without this field, Codex and other clients that rely on /v1/models to determine model capabilities see an incorrect or default context_window, which can cause:
Changes
- internal/config/config.go: add ContextLength int field to OpenAICompatibilityModel with yaml tag context-length
- sdk/cliproxy/service.go: propagate configured context length into registered model metadata; clamp negative values to 0
- internal/watcher/diff/model_hash.go: include context length in model hash so hot reloads detect metadata-only changes
- sdk/cliproxy/openai_compat_models_test.go: unit tests for context length propagation and negative value clamping
- internal/watcher/diff/model_hash_test.go: unit test for hash sensitivity to context length changes
Configuration Example
Backward Compatibility
context-length is optional and defaults to 0 (omitted from JSON output). Existing configurations without this field behave identically to before.
Test Plan
- GOWORK=off go test ./internal/watcher/diff ./sdk/cliproxy -run 'TestComputeOpenAICompatModelsHash|TestBuildOpenAICompatibilityConfigModelsIncludesContextLength' -count=1
- GOWORK=off go build -o /tmp/cliproxyapi-pr-build ./cmd/server
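The hash change listed under Changes (including context length so that hot reloads notice metadata-only edits) can be sketched as follows. This is not the code in internal/watcher/diff/model_hash.go: the compatModel type, its Name and Alias fields, and the hashModels function are hypothetical, and the use of SHA-256 is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// compatModel is a hypothetical stand-in for a configured model entry.
type compatModel struct {
	Name          string
	Alias         string
	ContextLength int
}

// hashModels returns a digest that changes whenever any model's name, alias,
// or context length changes. Entries are sorted first so that reordering the
// configuration alone does not change the digest.
func hashModels(models []compatModel) string {
	keys := make([]string, 0, len(models))
	for _, m := range models {
		// %q quotes the strings, so a "|" inside a name cannot be confused
		// with the field separator.
		keys = append(keys, fmt.Sprintf("%q|%q|%d", m.Name, m.Alias, m.ContextLength))
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{'\n'})
	}
	return hex.EncodeToString(h.Sum(nil))
}
```

If context length were left out of the key string, changing only context-length in the configuration would produce the same digest, and a watcher comparing digests would skip the reload.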