A **credential pool** lets one phantom identity the agent sees be backed by **N real OAuth credentials**. The agent always holds a single pool-scoped phantom pair (`SLUICE_PHANTOM:<pool>.access` / `SLUICE_PHANTOM:<pool>.refresh`); sluice maps it to the *currently active member's* real tokens at injection time and persists refreshed tokens back to the member that issued them. Primary use case: two OpenAI Codex OAuth accounts behind one agent so quota exhaustion on one account transparently rolls onto the other. Pool members must be `oauth` credentials — `static` members are rejected. `cred remove` errors on a credential that is a live pool member. **One credential belongs to at most one pool**: proxy attribution (`PoolResolver.PoolForMember`) maps a member back to a single pool, so a credential shared across pools would persist/audit a token response against the wrong pool's phantom and leave the agent with an unreplaceable phantom. `pool create` rejects a member that is already in another pool (enforced inside the same transaction as the member insert).
+
A **credential pool** lets one phantom identity the agent sees be backed by **N real OAuth credentials**. The agent always holds a single pool-scoped phantom pair, byte-stable across member switches: the **access** phantom is a synthetic pool-stable JWT (HS256, `sub: sluice-pool:<pool>`, `iss: sluice-phantom`, fixed far-future `exp`, built by `poolStablePhantomAccess`) — byte-identical for a given pool regardless of which member is active; the **refresh** phantom is the static string `SLUICE_PHANTOM:<pool>.refresh` (from `oauthPhantomRefresh`'s request-side strip path). Sluice maps the pair to the *currently active member's* real tokens at injection time and persists refreshed tokens back to the member that issued them. Primary use case: two OpenAI Codex OAuth accounts behind one agent so quota exhaustion on one account transparently rolls onto the other. Pool members must be `oauth` credentials — `static` members are rejected. `cred remove` errors on a credential that is a live pool member. **One credential belongs to at most one pool**: proxy attribution (`PoolResolver.PoolForMember`) maps a member back to a single pool, so a credential shared across pools would persist/audit a token response against the wrong pool's phantom and leave the agent with an unreplaceable phantom. `pool create` rejects a member that is already in another pool (enforced inside the same transaction as the member insert).
**CLI:**
@@ -235,13 +235,13 @@ Auto-failover on 429/401 is the primary mechanism; `pool rotate` is an operator
-**Single chokepoint (I2):** every `binding.Credential` / `OAuthIndex.Has` / `extractInjectableSecret` / persist consumer on the HTTP/HTTPS OAuth path routes through `PoolResolver.ResolveActive` (`resolveInjectionTarget` for pass-1 header + pass-2 phantom swap; `resolveOAuthResponseAttribution` for the response/persist path). `idx.Has` is always called with the resolved member name, never the pool. Plain (non-pool) credentials pass through `ResolveActive` unchanged. SSH/mail/QUIC are non-OAuth and out of scope.
-**Active-member selection:** healthy or expired-cooldown members first, by configured position; if all members are in cooldown, the soonest-recovering member is returned with a WARNING (degrade, never hard-fail). Recovery is lazy — evaluated in `ResolveActive`, no scheduler.
-**R1 refresh-token attribution / fail-closed:** when pass-2 swaps `SLUICE_PHANTOM:<pool>.refresh`, sluice records `realRefreshToken → member` in a short-TTL map. On the token-endpoint response it recovers the member by that real refresh token and persists to that member (`persistAddonOAuthTokens(member, ...)`, singleflight key `"persist:"+member`). The join key is the real **refresh** token sluice injected — never the access token, the client connection, or `OAuthIndex.Match` (two pooled members share `auth.openai.com`'s token URL and collide there). If the member is unrecoverable: WARNING + skip the vault write, never guess. Rotating refresh tokens are single-use, so a mis-attributed write would brick both accounts — fail-closed is mandatory.
-
-**R3 pool-stable phantom JWT:** Codex access tokens are JWTs and the per-real-token `resignJWT` would emit a *different* phantom after every cross-member refresh, breaking the "agent never notices" guarantee. Pooled OAuth `oauthPhantomAccess`/`resignJWT` instead build the phantom JWT from a deterministic synthetic payload keyed on the **pool name** (stable `sub`/`iss`, far-future `exp`), HMAC'd with the existing fixed key — byte-identical across member switches while still a structurally valid JWT. Static-form fallback (`SLUICE_PHANTOM:<pool>.access`) is documented for the case where the agent is verified to treat the access token as opaque.
+
- **R3 pool-stable phantom JWT:** Codex access tokens are JWTs and the per-real-token `resignJWT` would emit a *different* phantom after every cross-member refresh, breaking the "agent never notices" guarantee. The dedicated `poolStablePhantomAccess` (in `internal/proxy/oauth_response.go`) instead builds the phantom JWT from a deterministic synthetic payload keyed on the **pool name** (`sub: sluice-pool:<pool>`, `iss: sluice-phantom`, fixed far-future `exp`, no `iat`), HMAC-SHA256'd with the existing fixed key — byte-identical across member switches while still a structurally valid JWT. The pool name is JSON-marshaled (never concatenated) so a name with quotes/control chars cannot inject claims. Static-form fallback (`SLUICE_PHANTOM:<pool>.access`) is emitted only on the unreachable `json.Marshal` failure of the fixed struct (and is documented as the equivalent for an agent verified to treat the access token as opaque). The **refresh** phantom is unaffected — it stays the static `SLUICE_PHANTOM:<pool>.refresh`.
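The pool-stable construction described in R3 can be sketched roughly as follows. This is a minimal illustration, not the code in `internal/proxy/oauth_response.go`: the function name `stablePhantomAccess`, the `phantomClaims` struct, the `decodeClaims` helper, the key value, and the concrete `exp` timestamp are all assumptions made for the example. It shows the two properties the bullet relies on: the output depends only on the pool name, and the claims are JSON-marshaled so a hostile pool name cannot add claims.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// phantomKey stands in for the existing fixed HMAC key (placeholder value).
var phantomKey = []byte("example-fixed-key")

// farFutureExp is an assumed fixed expiry (2100-01-01T00:00:00Z).
const farFutureExp int64 = 4102444800

// phantomClaims is a fixed struct, so field order and output bytes are
// deterministic. There is deliberately no iat claim.
type phantomClaims struct {
	Iss string `json:"iss"`
	Sub string `json:"sub"`
	Exp int64  `json:"exp"`
}

// stablePhantomAccess builds an HS256 JWT whose bytes depend only on pool.
func stablePhantomAccess(pool string) string {
	enc := base64.RawURLEncoding
	header := enc.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))
	payload, err := json.Marshal(phantomClaims{
		Iss: "sluice-phantom",
		Sub: "sluice-pool:" + pool,
		Exp: farFutureExp,
	})
	if err != nil {
		// Static-form fallback; not expected to trigger for a fixed struct.
		return "SLUICE_PHANTOM:" + pool + ".access"
	}
	signingInput := header + "." + enc.EncodeToString(payload)
	mac := hmac.New(sha256.New, phantomKey)
	mac.Write([]byte(signingInput))
	return signingInput + "." + enc.EncodeToString(mac.Sum(nil))
}

// decodeClaims parses the payload segment back into claims (no signature check).
func decodeClaims(token string) (phantomClaims, error) {
	var c phantomClaims
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return c, fmt.Errorf("want 3 JWT segments, got %d", len(parts))
	}
	raw, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return c, err
	}
	err = json.Unmarshal(raw, &c)
	return c, err
}

func main() {
	a := stablePhantomAccess("codex")
	b := stablePhantomAccess("codex")
	fmt.Println("byte-stable:", a == b)

	// A pool name containing quotes stays inside the sub claim.
	c, err := decodeClaims(stablePhantomAccess(`x","iss":"evil`))
	fmt.Println(c.Iss, err)
}
```

Because nothing member-specific enters the payload, a cross-member refresh produces the same phantom bytes, which is the property the "agent never notices" guarantee depends on.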
**Phase 2 — auto-failover on 429 / 401:**
-**Classification** (`classifyFailover` in `internal/proxy/pool_failover.go`, called from `SluiceAddon.Response` for pooled destinations): `429` or `403 + insufficient_quota` → rate-limited; `401` or token-body `invalid_grant` / `invalid_token` → auth-failure; `5xx` / other → no-op. The token-endpoint body is only trusted when the request URL matched the OAuth index.
-**Pool attribution for the response** (`poolForResponse`): a response is attributed to a pool either (a) when the flow's CONNECT host has a pooled binding (the API-host 429/403 path), **or** (b) when the request URL matches the OAuth token-URL index for a credential that is a pool member (the token-endpoint 401 / `invalid_grant` path). Case (b) is essential: an OAuth refresh hits the credential's token-URL host (e.g. `auth.openai.com`), which has no pool binding — only the API host (e.g. `api.openai.com`) does — so without the token-URL index match the token-endpoint classification would be dead code for the Codex deployment. `idx.Match` is strict 1:1 token_url→credential, so case (b) cools the exact member whose refresh token was injected.
-
-**Synchronous in-memory failover (I1):** health is updated in-process *before* the response returns — `MarkCooldown` takes the resolver write lock, `ResolveActive` the read lock — so the active-member switch never waits on the 2s data-version watcher (which only reconciles). A detached `onFailover` callback also writes `SetCredentialHealth(member, 'cooldown', now+ttl, reason)` for durability. Cooldown TTLs: `vault.RateLimitCooldown` = 60s, `vault.AuthFailCooldown` = 300s. No in-flight retry — the next request uses the new member.
+
- **Synchronous in-memory failover (I1):** health is updated in-process *before* the response returns — `MarkCooldown` takes the resolver write lock, `ResolveActive` the read lock — so the active-member switch never waits on the 2s data-version watcher (which only reconciles). A detached `onFailover` callback also writes `SetCredentialHealth(member, 'cooldown', now+ttl, reason)` for durability. Cooldown TTLs: `vault.RateLimitCooldown` = 60s, `vault.AuthFailCooldown` = 300s. **Cooldown extension is monotonic on both layers:** a member parked for an auth failure (300s) that subsequently trips a rate-limit (60s) keeps the LATER expiry — `MarkCooldown` (in-memory) and `SetCredentialHealth`'s `cooldown` upsert (durable, via a `CASE`/comparison against the stored future `cooldown_until`) both keep `max(existing-future, new)` so a known-bad credential is never made eligible early. Only the extend path is monotonic: an explicit clear (zero/past `until` in `MarkCooldown`) and any transition to `status='healthy'` still shorten/clear (recovery intact), and lazy expiry still wins over an already-expired stored cooldown. No in-flight retry — the next request uses the new member.
-**Reload does not resurrect a cooled member:** because the durable `SetCredentialHealth` write is detached and best-effort, any reload (SIGHUP or the 2s data-version watcher firing on *any* unrelated DB write) rebuilds the resolver from store rows alone via `NewPoolResolver`. `Server.StorePool` therefore calls `PoolResolver.MergeLiveCooldowns(prev)` to carry forward still-active in-memory cooldowns from the resolver being replaced before the atomic swap. The merge is monotonic (a live cooldown is never shortened/erased by an unrelated reload) and drops cooldowns for credentials no longer in any pool.
-**Audit:** a `cred_failover` event (Verdict `failover`, Credential = the cooled-down member) with `Reason = "<pool>:<from>-><to>:<429|403|401|invalid_grant>"`, emitted synchronously in `handlePoolFailover`.
-**Telegram:** a best-effort non-blocking notice "pool <name> failed over <a> -> <b> (<reason>)" (plain text — `TelegramChannel.Notify` sends with no parse mode); the store write and every broker channel `Notify` are detached into their own goroutine so the response path never blocks.
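The in-memory side of the behaviour described in these bullets (position-ordered selection, lazy recovery, degrade-instead-of-fail, and monotonic cooldown extension) can be sketched as below. This is an illustration under assumptions, not the real `PoolResolver`: the `resolver` type, its lower-case methods, and the `scenario` helper are hypothetical, and the durable `SetCredentialHealth` write, the audit event, and the Telegram notice are left out.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// resolver is a hypothetical stand-in for the pool resolver's health state.
type resolver struct {
	mu       sync.RWMutex
	members  []string             // configured order
	cooldown map[string]time.Time // member -> cooldown expiry
}

func newResolver(members ...string) *resolver {
	return &resolver{members: members, cooldown: map[string]time.Time{}}
}

// markCooldown keeps max(existing-future, new). A zero or past until is an
// explicit clear, so recovery still works.
func (r *resolver) markCooldown(member string, until, now time.Time) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if !until.After(now) {
		delete(r.cooldown, member)
		return
	}
	if cur, ok := r.cooldown[member]; ok && cur.After(now) && cur.After(until) {
		return // existing expiry is later; never shorten it
	}
	r.cooldown[member] = until
}

// resolveActive returns the first member (by position) that is healthy or
// whose cooldown has expired. If all are cooling down it returns the
// soonest-recovering member with degraded=true.
func (r *resolver) resolveActive(now time.Time) (member string, degraded bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	var soonest string
	var soonestAt time.Time
	for _, m := range r.members {
		until, ok := r.cooldown[m]
		if !ok || !until.After(now) {
			return m, false // lazy expiry: evaluated here, no scheduler
		}
		if soonest == "" || until.Before(soonestAt) {
			soonest, soonestAt = m, until
		}
	}
	return soonest, true
}

// scenario replays: auth failure on credA, a later rate limit on credA,
// then a rate limit on credB, using the TTLs quoted above (300s and 60s).
func scenario() (first string, keptLater bool, fallback string, degraded bool, recovered string) {
	t0 := time.Unix(0, 0)
	r := newResolver("credA", "credB")

	r.markCooldown("credA", t0.Add(300*time.Second), t0) // auth failure
	first, _ = r.resolveActive(t0)

	r.markCooldown("credA", t0.Add(60*time.Second), t0) // later rate limit
	keptLater = r.cooldown["credA"].Equal(t0.Add(300 * time.Second))

	r.markCooldown("credB", t0.Add(60*time.Second), t0)
	fallback, degraded = r.resolveActive(t0)

	recovered, _ = r.resolveActive(t0.Add(61 * time.Second))
	return
}

func main() {
	fmt.Println(scenario())
}
```

Passing `now` explicitly keeps the sketch deterministic; a production resolver would presumably read the clock itself.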
A credential pool lets a single phantom identity the agent sees be backed by **N real OAuth credentials**, with sluice auto-failing-over to the next member when the upstream rejects the active one. Primary use case: two OpenAI Codex OAuth accounts driven by one agent, so quota exhaustion on one account transparently rolls onto the other. The agent always holds one pool-scoped phantom pair (`SLUICE_PHANTOM:<pool>.access` / `.refresh`); sluice maps it to the currently active member's real token at injection time and persists refreshed tokens back to the member that issued them.
+
A credential pool lets a single phantom identity the agent sees be backed by **N real OAuth credentials**, with sluice auto-failing-over to the next member when the upstream rejects the active one. Primary use case: two OpenAI Codex OAuth accounts driven by one agent, so quota exhaustion on one account transparently rolls onto the other. The agent always holds one pool-scoped phantom pair, byte-stable across member switches: the **access** phantom is a synthetic pool-stable JWT (HS256, `sub: sluice-pool:<name>`, `iss: sluice-phantom`, far-future `exp`) that is byte-identical for a given pool regardless of which member is active, so a cross-member failover never changes the access token the agent holds; the **refresh** phantom is the static string `SLUICE_PHANTOM:<pool>.refresh`. Sluice maps the pair to the currently active member's real token at injection time and persists refreshed tokens back to the member that issued them.
```bash
sluice pool create <name> --members credA,credB[,credC] [--strategy failover]