
feat(config): allow context length for OpenAI-compatible models #3794

Open
EricLi404 wants to merge 8 commits into router-for-me:dev from EricLi404:ericli404/openai-compat-context-length

Conversation

@EricLi404 commented Jun 10, 2026

Summary

Add optional context-length configuration to openai-compatibility.models[], allowing proxy operators to override the advertised context window for OpenAI-compatible models.

Motivation

Some large-context models (e.g. DeepSeek V4 Pro, MiniMax-M3 1M, Mimo V2.5 Pro) support up to 1,000,000 tokens of context, but their upstream /v1/models endpoint may not advertise this — or a proxy may want to explicitly set it regardless of upstream metadata.

Without this field, Codex and other clients that rely on /v1/models to determine model capabilities see an incorrect or default context_window, which can cause:

  • Truncation warnings at thresholds far below the model's actual capacity
  • Unnecessary context compaction when the model could handle more
  • Misleading model selection where users avoid models that appear to have smaller windows

Changes

  • internal/config/config.go: add ContextLength int field to OpenAICompatibilityModel with yaml tag context-length
  • sdk/cliproxy/service.go: propagate configured context length into registered model metadata; clamp negative values to 0
  • internal/watcher/diff/model_hash.go: include context length in model hash so hot reloads detect metadata-only changes
  • sdk/cliproxy/openai_compat_models_test.go: unit tests for context length propagation and negative value clamping
  • internal/watcher/diff/model_hash_test.go: unit test for hash sensitivity to context length changes
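
The clamping and hashing changes listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `compatModel`, `clampContextLength`, and `modelsHash` are hypothetical names, and the order-independent SHA-256 digest is an assumption about how such a hash might be built.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// compatModel is a hypothetical stand-in for OpenAICompatibilityModel;
// only the fields needed for this sketch are included.
type compatModel struct {
	Name          string
	ContextLength int
}

// clampContextLength treats negative values as "not configured" (0).
func clampContextLength(n int) int {
	if n < 0 {
		return 0
	}
	return n
}

// modelsHash returns an order-independent digest of the model list that
// covers both the name and the clamped context length, so a change to
// the context length alone produces a different hash.
func modelsHash(models []compatModel) string {
	keys := make([]string, 0, len(models))
	for _, m := range models {
		keys = append(keys, fmt.Sprintf("%s|%d", m.Name, clampContextLength(m.ContextLength)))
	}
	sort.Strings(keys)
	sum := sha256.Sum256([]byte(strings.Join(keys, "\n")))
	return hex.EncodeToString(sum[:])
}

func main() {
	before := []compatModel{{Name: "deepseek-v4-pro"}}
	after := []compatModel{{Name: "deepseek-v4-pro", ContextLength: 1000000}}
	// A metadata-only change yields a different digest.
	fmt.Println(modelsHash(before) != modelsHash(after))
}
```

Hashing the clamped value rather than the raw one means a negative setting and an omitted setting are treated as the same configuration, which matches the clamping described for service.go.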

Configuration Example

openai-compatibility:
  - name: "deepseek"
    base-url: "https://api.deepseek.com/v1"
    api-key-entries:
      - api-key: "sk-..."
    models:
      - name: "deepseek-v4-pro"
        context-length: 1000000  # advertise 1M context window
      - name: "deepseek-chat"
        # omit context-length to use upstream default

Backward Compatibility

context-length is optional and defaults to 0, in which case it is omitted from JSON output. Existing configurations without this field behave exactly as before.
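
As an illustration of the zero-value behaviour, the sketch below uses Go's standard `omitempty` JSON option. `modelEntry` and its tags are assumptions made for the example; the actual struct in internal/config/config.go has more fields and may be tagged differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelEntry is a hypothetical, trimmed-down model entry.
// The kebab-case key and omitempty option are assumed for illustration.
type modelEntry struct {
	Name          string `json:"name"`
	ContextLength int    `json:"context-length,omitempty"`
}

// encode marshals an entry to a JSON string, returning "" on error.
func encode(m modelEntry) string {
	b, err := json.Marshal(m)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Zero value: the key is dropped from the output entirely.
	fmt.Println(encode(modelEntry{Name: "deepseek-chat"}))
	// Positive value: the key is emitted.
	fmt.Println(encode(modelEntry{Name: "deepseek-v4-pro", ContextLength: 1000000}))
}
```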

Test Plan

  • GOWORK=off go test ./internal/watcher/diff ./sdk/cliproxy -run 'TestComputeOpenAICompatModelsHash|TestBuildOpenAICompatibilityConfigModelsIncludesContextLength' -count=1
  • GOWORK=off go build -o /tmp/cliproxyapi-pr-build ./cmd/server

@github-actions (bot) changed the base branch from main to dev on June 10, 2026, 07:18
@github-actions

This pull request targeted main.

The base branch has been automatically changed to dev.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a ContextLength configuration option to override the advertised model context window for OpenAI compatibility models. This field is integrated into the model hashing mechanism, configuration mapping, and verified with unit tests. Feedback suggests validating and clamping negative ContextLength values to zero to prevent downstream issues.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread sdk/cliproxy/service.go

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 329e758e01

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread internal/config/config.go Outdated
Match the management API convention where all config keys are
kebab-case (base-url, api-key-entries). Using snake_case here
would silently ignore 'context-length' in JSON payloads.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 610dc37785

Comment thread sdk/cliproxy/service.go
@EricLi404

/gemini Please review the latest head commit b1427c7. The PR now exposes context-length in both the Codex client catalog and the standard OpenAI-compatible /v1/models response, with tests covering both paths.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for overriding the advertised model context window (context-length) in OpenAI compatibility configurations, updating the configuration schema, hashing logic, API response filtering, and model building processes, along with adding corresponding unit tests. Feedback highlights a backward compatibility issue and an overwrite bug in applyCodexClientContextWindowOverride and applyCodexClientModelMetadata where user-configured overrides could be ignored or template defaults could be unintentionally overwritten.

Comment thread sdk/api/handlers/openai/codex_client_models.go Outdated
@EricLi404 force-pushed the ericli404/openai-compat-context-length branch from 84182e4 to 95116e0 on June 11, 2026, 14:36
@EricLi404

/gemini Please review the latest head commit 95116e0. The PR history was rewritten only to normalize commit author/committer identity; the code diff is unchanged from the prior head.

@EricLi404

@codex review latest head commit 95116e0. The PR history was rewritten only to normalize commit author/committer identity; the code diff is unchanged from the prior head.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for overriding the advertised context window (context length) for OpenAI compatibility models. It updates configuration structures, hashing logic, model metadata builders, and API handlers to support and expose this new property, backed by comprehensive unit tests. The review feedback suggests two key improvements: first, to only include context_length in the /v1/models response if it is strictly greater than zero to avoid misleading client applications; second, to simplify the signature of applyCodexClientContextWindowOverride by removing the unused id parameter.

Comment thread sdk/api/handlers/openai/openai_handlers.go Outdated
Comment thread sdk/api/handlers/openai/codex_client_models.go Outdated
Comment thread sdk/api/handlers/openai/codex_client_models.go Outdated
@chatgpt-codex-connector

Codex Review: Didn't find any major issues. You're on a roll.

Only advertise context_length in the standard models response when the parsed value is positive. Also removes the unused Codex context helper parameter and adds coverage for omitted zero values.
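
A rough sketch of the "only when positive" rule described in this commit message. `modelResponse` is a hypothetical helper, not the handler in openai_handlers.go, and the `id` and `object` fields are assumed from the usual OpenAI-style model listing.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelResponse builds one hypothetical /v1/models entry. The
// context_length key is added only when the value is strictly positive,
// so clients never see a misleading zero.
func modelResponse(id string, contextLength int) map[string]any {
	entry := map[string]any{"id": id, "object": "model"}
	if contextLength > 0 {
		entry["context_length"] = contextLength
	}
	return entry
}

func main() {
	for _, n := range []int{0, 1000000} {
		b, err := json.Marshal(modelResponse("deepseek-v4-pro", n))
		if err != nil {
			fmt.Println("marshal error:", err)
			continue
		}
		fmt.Println(string(b))
	}
}
```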
@EricLi404

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new context-length configuration option for OpenAI compatibility models, allowing users to override the advertised context window. The changes propagate this setting through the configuration, hashing, and API response handlers, and include clamping logic for negative values. Comprehensive unit tests have been added to verify the new behavior. There are no review comments, and I have no additional feedback to provide.

1 participant