` of a pooled member must either cascade-remove the member row or mark it missing. Decision: `cred remove` errors if the credential is a live pool member ("remove it from pool `` first"); document in CLI help. (No silent dangling rows.)
+3. **CLI** `cmd/sluice/cred.go` new `pool` subtree: `pool create --members a,b[,c] [--strategy failover]`, `pool list`, `pool status ` (member order + health + active), `pool rotate ` (manual override), `pool remove `.
+4. **Namespace**: pool names and credential names share one namespace. `pool create` rejects a name that collides with an existing credential; `cred add` rejects a name colliding with an existing pool. Bind a pool via `sluice binding add --destination ` (pool name stored verbatim in `bindings.credential`).
+
+Phase 0 exit: pools definable/inspectable; `reloadAll` loads pool + health tables into a new in-memory `PoolResolver` (atomic-pointer-swapped, parallel to `StoreResolver`), but injection does not consult it.
+
+### Phase 1 — Phantom indirection (pool phantom → active member)
+
+Active member changes only via `pool rotate` in this phase.
+
+1. **Single chokepoint for pool→member expansion** `internal/vault/pool.go` (new):
+ - `PoolResolver.IsPool(name) bool`; `ResolveActive(name) (member string, ok bool)`. If `name` is a pool, `ResolveActive` returns the first member, in `position` order, whose health is `healthy` or whose `cooldown_until <= now`; if every member is in cooldown, it returns the soonest-recovering member and logs a WARNING. If `name` is a plain credential, it returns the name unchanged.
+ - **Mandatory task: enumerate and route every `binding.Credential` / `OAuthIndex.Has` / `extractInjectableSecret` / `findAdder`/persist consumer through `ResolveActive` at one chokepoint** (grep `binding.Credential`, `\.Has(`, `extractInjectableSecret`). Do **not** scatter `IsPool` checks across pass-1/pass-2 only — that was the original gap.
+2. **Injection** `internal/proxy/addon.go`: pass-1 header and pass-2 phantom swap call the chokepoint so the *real* value injected is the active member's, while the agent's pool-scoped phantom string is what gets matched/replaced.
+3. **Per-request member tag — precise join key** (resolves Risk R1):
+ - When pass-2 swaps the agent's `SLUICE_PHANTOM:.refresh` to a real refresh token in an outbound token-endpoint request, sluice **records `realRefreshToken → member`** in a short-TTL map (the refresh token value is sluice's own injected bytes, unique per member, and is the field actually present in an RFC-6749 refresh-grant body — *not* the access token, which a refresh POST need not carry). `connState` keyed by `ClientConn.Id` is insufficient (one client conn multiplexes both members' h2 streams), so the join key is the real refresh-token value, not the connection.
+ - On the token-endpoint **response**, the handler recovers `member` from that map by the real refresh token sluice sent in the matching request. Persist refreshed tokens to *that member* (`persistAddonOAuthTokens(member,...)`, singleflight `"persist:"+member`).
+ - **Fail-closed (mandatory enumerated task + unit test):** if the member cannot be recovered, do **not** guess and do **not** fall back to `OAuthIndex.Match` for pooled token URLs — log a WARNING and skip the vault write so the next refresh retries. Dedicated unit test: two members, same token URL, assert a B-refresh never overwrites A's vault entry, and a missing tag results in zero writes.
+4. **Pool-stable phantom** (resolves Risk R3): for pooled OAuth creds, `oauthPhantomAccess`/`resignJWT` produce a JWT from a deterministic synthetic payload keyed on the **pool name** (not the member's real token), so it is byte-identical across member switches. Enumerated unit test asserts byte-identity across a switch. Document the static-form fallback and the reason it is not the default.
+5. `cmd/sluice/main.go:reloadAll` builds & swaps `PoolResolver` + health snapshot alongside the existing swaps.
+
+Phase 1 exit: `pool rotate` flips the backing account; agent's phantom unchanged byte-for-byte; refreshes attributed correctly; fail-closed proven by test.
+
+### Phase 2 — Auto-failover on 429 / 401
+
+1. **Failure classification** in `SluiceAddon.Response` for pooled destinations:
+ - `429`, or `403` with body error `insufficient_quota`/quota-exhaustion → rate-limited.
+ - `401`, or token-endpoint body `invalid_grant`/`invalid_token` → auth-failure.
+ - `5xx` and everything else → no-op (upstream-side; failing over would thrash both accounts — documented choice).
+2. **Prompt failover (resolves Important I1):** on classification, update the in-memory `PoolResolver` health **synchronously before the response returns** (atomic-pointer swap or dedicated mutex on the health map — call out the locking discipline), so the *very next* request injects the new active member. Also write `SetCredentialHealth(member, 'cooldown', now+ttl, reason)` to the store for durability; the 2s data-version watcher then merely reconciles. Do **not** rely on the 2s watcher for the active-member change — that lag was an error amplifier.
+ - Cooldown TTLs as named consts in `internal/vault/pool.go`: rate-limit 60s, auth-fail 300s (a broken refresh token will not self-heal quickly). Lazy recovery: `ResolveActive` treats expired cooldown as eligible — no scheduler.
+3. **Audit**: emit `cred_failover` with `Reason = ":->:<429|403|401|invalid_grant>"`.
+4. **Telegram notify** (best-effort; never blocks injection): one-line "pool `` failed over ``→`` ()".
+5. **No in-flight retry** of the triggering request in Phase 2 (it returns its error; the next request uses the new member). Transparent retry is out of scope (needs body buffering; unsafe for non-idempotent calls).
+
+Phase 2 exit: e2e proves A 429 → next request uses B → B's refresh persists to B → phantom byte-unchanged.
+
+## Out of scope / future work
+
+Transparent in-flight retry; round-robin/weighted strategies (`strategy` is reserved, and only `failover` is supported); active health probes / half-open; multi-agent pools with independent active pointers.
+
+## Risks / decisions
+
+- **R1 (critical, resolved in Phase 1.3):** refresh-token mis-attribution corrupts both accounts (rotating refresh tokens are single-use; filing B's new token under A invalidates A and bricks B's old one). Join key is the real **refresh** token sluice injected, not the access token, not the client connection. Fail-closed, never guess. Dedicated collision unit test mandatory.
+- **R3 (critical, resolved in Phase 1.4):** `resignJWT` is per-real-token, so the agent's phantom would change on every cross-member refresh, breaking the headline guarantee that the phantom stays byte-identical. Resolved via pool-keyed synthetic JWT; byte-identity unit test mandatory; static-form fallback documented.
+- **I1 (important, resolved in Phase 2.2):** the 2s data-version watcher must not gate the active-member switch — synchronous in-memory health update on `Response`, store write only reconciles.
+- **I2 (important, resolved in Phase 1.1):** all `binding.Credential`/`OAuthIndex.Has`/`extractInjectableSecret`/`findAdder` consumers routed through one `ResolveActive` chokepoint, not just the two injection passes.
+- Namespace collision resolved by mutual-exclusion at create time (Phase 0.4). Orphan pool members resolved by blocking `cred remove` of a live member (Phase 0.2).
+- Alternative rejected for this use case: scheduled `sluice cred update` rotation — cannot react to a 429 in real time and races the async OAuth vault writer.