Adds an optional schedulingStrategy setting (hybrid | sequential). In
sequential / drain-first mode the runtime proxy routes all new requests to one
active account until it is fully exhausted (rate-limited / cooling down /
circuit-open), then advances to the next available account; earlier accounts
reclaim the active slot once their quota window recovers, staggering recovery
across the pool. Default stays hybrid, so existing behavior is unchanged.
- accounts.ts: getCurrentOrNextForFamilySequential sticky selector (per
ModelFamily), policy-aware so it never anchors on a blocked account
- runtime-rotation-proxy.ts: chooseAccount branches on strategy; sequential
skips per-session affinity, manual pin still wins; shared linear-scan
fallback does not advance the drain-first primary
- config/schemas: schedulingStrategy enum (default hybrid), getSchedulingStrategy
accessor, CODEX_AUTH_SCHEDULING_STRATEGY env override, config-explain parity
- docs: configuration, settings reference, config-field inventory
- tests: selector stickiness/advance/wrap/recovery, per-family isolation,
cooldown/circuit-open/disabled paths, policy-block guard, affinity-override
and pin-precedence, config accessor
Closes #509
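The drain-first selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in accounts.ts: the `Account` shape, `isAvailable`, and the `SequentialSelector` class are hypothetical names, and the real selector (getCurrentOrNextForFamilySequential) may track state differently.

```typescript
// Hypothetical account model for illustration; the real fields are not shown in this diff.
type ModelFamily = string;

interface Account {
  index: number;
  enabled: boolean;
  policyBlocked: boolean;   // assumed flag for the "policy-aware" guard
  circuitOpen: boolean;
  rateLimitedUntil: number; // epoch ms; 0 when not rate-limited
  coolingDownUntil: number; // epoch ms; 0 when not cooling down
}

// An account is usable only if none of the exhaustion or block conditions apply.
function isAvailable(a: Account, now: number): boolean {
  return (
    a.enabled &&
    !a.policyBlocked &&
    !a.circuitOpen &&
    a.rateLimitedUntil <= now &&
    a.coolingDownUntil <= now
  );
}

class SequentialSelector {
  // One sticky "active" index per model family.
  private active = new Map<ModelFamily, number>();

  pick(accounts: Account[], family: ModelFamily, now: number): Account | null {
    const n = accounts.length;
    if (n === 0) return null;
    const start = this.active.get(family) ?? 0;
    // Scan from the active slot and wrap around the pool, so an earlier
    // account that has recovered is picked up once the current one drains.
    for (let step = 0; step < n; step++) {
      const i = (start + step) % n;
      const candidate = accounts[i];
      if (candidate && isAvailable(candidate, now)) {
        this.active.set(family, i); // never anchors on an unavailable account
        return candidate;
      }
    }
    return null; // every account is exhausted or blocked
  }
}
```

In this sketch the active index only moves when the current account stops being available, which is what keeps requests on one account until it drains. A manual pin would be checked before calling `pick`, since the commit message states that a pin still wins.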
|`CODEX_AUTH_MIN_ROTATION_INTERVAL_MS=<ms>`| Minimum time between global account switches (default `60000`). The proxy biases selection toward the last-served account within this window to reduce the rate at which different OAuth tokens appear from the same IP. Set to `0` to disable. |
+|`CODEX_AUTH_SCHEDULING_STRATEGY=hybrid/sequential`| Account scheduling strategy (default `hybrid`). `sequential` (drain-first) keeps one active account until it is fully exhausted before advancing to the next; see [Sequential / drain-first scheduling](#sequential--drain-first-scheduling). |
|`CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS=<ms>`| Cooldown applied to an account when the upstream or token-refresh endpoint explicitly revokes its OAuth token (default `300000`, 5 minutes). Raise this if accounts continue to be re-invalidated after re-login. |
---
@@ -117,6 +118,15 @@ The proxy preserves request bodies and streaming responses, replaces outbound au
- **Token-invalidation detection**: when the upstream or the token-refresh endpoint returns an explicit OAuth revocation message, the proxy returns the error directly to the client instead of rotating to the next account. The affected account receives a 5-minute cooldown (`tokenInvalidationCooldownMs`, default `300000`) instead of the generic 30-second auth-failure cooldown. Configure via `CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS`.
- **Rotation-rate throttle**: the proxy biases account selection toward the last-served account for a configurable window (default 60 seconds, `minRotationIntervalMs`). Accounts that are rate-limited or cooling down are still rotated around. Configure via `CODEX_AUTH_MIN_ROTATION_INTERVAL_MS` or set to `0` to disable.
+### Sequential / drain-first scheduling
+
+`schedulingStrategy` controls how the proxy picks an account for each request:
+
+- `hybrid` (default) spreads load across all available accounts using a weighted health/token/freshness score. All accounts tend to consume quota at a similar pace.
+- `sequential` (drain-first) routes every new request to one active account and only advances to the next available account once the current one is fully exhausted (rate-limited, cooling down, or circuit-open). Because the scan wraps the pool, an earlier account that has recovered its quota window is reclaimed as soon as the current account drains. This staggers quota recovery across accounts for longer uninterrupted sessions.
+
+In `sequential` mode a manual pin (`codex-multi-auth switch <index>`) still takes precedence and is never overridden. Sequential mode intentionally ignores per-session affinity: once the active account changes, all subsequent requests follow the new active account regardless of which account originally handled a conversation. Enable it with `schedulingStrategy: "sequential"` in settings or `CODEX_AUTH_SCHEDULING_STRATEGY=sequential` for a per-process trial.
+
Microsoft/Outlook SSO accounts may be more sensitive to proxy-mediated token use. If an Outlook-linked account is invalidated on every first request through the proxy but works normally on ChatGPT web, the root cause is likely IP or device binding on the Microsoft side. Raising `CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS` and re-logging in the affected account typically resolves the cascade. If the problem persists, consider excluding the Microsoft account from the rotation pool via `codex-multi-auth switch`.
For `codex app` launches that go through the wrapper, the wrapper automatically starts a small internal helper so rotation can keep working if the desktop app launcher detaches. The helper stores only local runtime status, uses the same per-session proxy client key as the CLI path, and exits after an idle timeout.
docs/development/CONFIG_FIELDS.md
4 lines changed: 4 additions & 0 deletions
@@ -65,6 +65,7 @@ Used only for host plugin mode through the host runtime config file.
| Key | Default |
| --- | --- |
+|`schedulingStrategy`|`hybrid`|
|`retryAllAccountsRateLimited`|`true`|
|`retryAllAccountsMaxWaitMs`|`0`|
|`retryAllAccountsMaxRetries`|`Infinity`|
@@ -73,6 +74,8 @@ Used only for host plugin mode through the host runtime config file.
|`fallbackToGpt52OnUnsupportedGpt53`|`true`|
|`unsupportedCodexFallbackChain`|`{}`|
+`schedulingStrategy` selects how the runtime proxy picks an account per request. `hybrid` (default) keeps the weighted health/token/freshness selection that spreads load across all available accounts. `sequential` (drain-first) sticks to one active account and only advances to the next available account once the current one is fully exhausted (rate-limited / cooling down / circuit-open); earlier accounts become eligible again as soon as their quota window recovers, staggering recovery across the pool. A manual pin still overrides this, and sequential mode intentionally ignores per-session affinity so all new requests follow the single active account. Overridable per-process via `CODEX_AUTH_SCHEDULING_STRATEGY`.
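The override order described here (environment variable over config, with `hybrid` as the default) could look roughly like the sketch below. The names `parseStrategy` and `resolveSchedulingStrategy` are hypothetical, and the real getSchedulingStrategy accessor is not shown in this diff. Trimming and lowercasing the value, and falling back to the config when the override is invalid, are assumptions rather than documented behavior.

```typescript
type SchedulingStrategy = "hybrid" | "sequential";

const DEFAULT_STRATEGY: SchedulingStrategy = "hybrid";

// Returns a valid strategy, or undefined for anything unrecognized.
// Assumption: values are matched case-insensitively after trimming.
function parseStrategy(value: unknown): SchedulingStrategy | undefined {
  if (typeof value !== "string") return undefined;
  const v = value.trim().toLowerCase();
  return v === "hybrid" || v === "sequential" ? v : undefined;
}

// Hypothetical resolver: a valid env override wins, then the config value,
// then the default. The env map is passed in so the function stays pure.
function resolveSchedulingStrategy(
  config: { schedulingStrategy?: unknown },
  env: Record<string, string | undefined>,
): SchedulingStrategy {
  return (
    parseStrategy(env["CODEX_AUTH_SCHEDULING_STRATEGY"]) ??
    parseStrategy(config.schedulingStrategy) ??
    DEFAULT_STRATEGY
  );
}
```

Passing the environment as a parameter keeps the sketch easy to test; a caller in Node would typically pass `process.env`.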
docs/reference/settings.md
2 lines changed: 2 additions & 0 deletions
@@ -143,6 +143,7 @@ Named backup behavior:
| Key | Default | Effect |
| --- | --- | --- |
|`codexRuntimeRotationProxy`|`true`| Enable the default-on localhost Responses proxy for forwarded official Codex CLI/app sessions |
+|`schedulingStrategy`|`hybrid`| Account scheduling: `hybrid` spreads load across all available accounts; `sequential` (drain-first) keeps one active account until it is fully exhausted, then advances to the next |
|`preemptiveQuotaEnabled`|`true`| Defer requests before remaining quota is critically low |