Skip to content

Commit 8d36a74

Browse files
committed
fix: 4 bugs found during deep stress test
[MEDIUM] lib/storage/flagged-storage.ts: normalizeFlaggedStorage was silently dropping workspaces, currentWorkspaceIndex, accessToken, and expiresAt from flagged accounts, defeating the Zod schema's .passthrough() intent and causing multi-workspace accounts to permanently lose their workspace list after a flag/restore round-trip. [MEDIUM] lib/codex-manager/login-menu-data.ts: refreshQuotaCacheForMenu had a dead catch block because loadQuotaCache() never throws — it returns empty maps on any read failure. When the persisted cache came back empty (e.g. transient lock), only this run's probed entries were saved, wiping all other accounts' still-valid quota data. Fixed by explicitly detecting the empty-load case and falling back to nextCache (the full in-memory clone). [HIGH] lib/runtime-rotation-proxy.ts: persistRuntimeActiveAccount advanced the sequential drain-first pointer via markSwitchedLocked in the legacy routing branch even when schedulingStrategy was 'sequential', defeating the #509 invariant. Added the same schedulingStrategy !== 'sequential' guard that the enabled branch already had. [MEDIUM] lib/runtime/rotation-token-refresh.ts: ensureFreshAccessToken could return an expired access token when commitRefreshedAuthOnce returned null (account missing from storage after persist), because updatedAccount fell back to the original account which still carried the old token. Fixed to use refreshResult.access (the just-issued token) in the null-commit case, while preserving the committed account's stored token on the success path (required for the dedup-commit case where two concurrent callers share one commit).
1 parent e453111 commit 8d36a74

4 files changed

Lines changed: 66 additions & 16 deletions

File tree

lib/codex-manager/login-menu-data.ts

Lines changed: 23 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -219,22 +219,34 @@ export async function refreshQuotaCacheForMenu(
219219
// entries meanwhile, and writing the stale clone back whole-file would
220220
// silently discard them (last write wins). Re-apply this run's results
221221
// onto the freshest persisted cache instead.
222+
//
223+
// loadQuotaCache() is documented to never throw — on any read failure it
224+
// returns empty maps. We also guard the empty-maps case explicitly: if the
225+
// persisted cache came back empty (read failure / empty file), fall back to
226+
// nextCache so non-probed entries are not wiped. The try/catch handles any
227+
// mocked or future implementation that does throw.
222228
let cacheToSave = nextCache;
223229
try {
224230
const persisted = await loadQuotaCache();
225-
for (const { account, snapshot } of appliedSnapshots) {
226-
updateQuotaCacheForAccount(
227-
persisted,
228-
account,
229-
snapshot,
230-
storage.accounts,
231-
emailFallbackState,
232-
);
231+
const persistedHasData =
232+
Object.keys(persisted.byAccountId).length > 0 ||
233+
Object.keys(persisted.byEmail).length > 0;
234+
if (persistedHasData) {
235+
for (const { account, snapshot } of appliedSnapshots) {
236+
updateQuotaCacheForAccount(
237+
persisted,
238+
account,
239+
snapshot,
240+
storage.accounts,
241+
emailFallbackState,
242+
);
243+
}
244+
cacheToSave = persisted;
233245
}
234-
cacheToSave = persisted;
246+
// else: persisted came back empty; keep cacheToSave = nextCache.
235247
} catch {
236-
// Fall back to the snapshot clone; saving slightly stale data beats
237-
// dropping this run's probe results.
248+
// loadQuotaCache threw (unexpected); fall back to the snapshot clone
249+
// so non-probed entries survive.
238250
}
239251
try {
240252
await saveQuotaCache(cacheToSave);

lib/runtime-rotation-proxy.ts

Lines changed: 12 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -293,6 +293,7 @@ async function persistRuntimeActiveAccount(
293293
account: ManagedAccount,
294294
family: ModelFamily,
295295
isPinned: boolean,
296+
schedulingStrategy: string,
296297
): Promise<void> {
297298
if (isPinned) {
298299
// When the user has manually pinned an account, the proxy MUST NOT
@@ -319,7 +320,16 @@ async function persistRuntimeActiveAccount(
319320
// `saveToDiskDebounced` + CLI sync still run in both modes: they snapshot the
320321
// current in-memory state / mirror the CLI selection and do not advance the
321322
// in-memory rotation cursor.
322-
if (accountManager.getRoutingMutexMode() !== "enabled") {
323+
//
324+
// Sequential scheduling fix (#509): in "sequential" mode a within-request
325+
// fallback to a different account must NOT advance the drain-first primary
326+
// pointer — only a true primary-exhaustion in chooseAccount may do that.
327+
// Apply the same guard here as the in-loop re-commit (lines 939-958) so
328+
// the legacy branch cannot silently clobber the sequential drain order.
329+
if (
330+
accountManager.getRoutingMutexMode() !== "enabled" &&
331+
schedulingStrategy !== "sequential"
332+
) {
323333
await accountManager.markSwitchedLocked(account, "rotation", family);
324334
}
325335
accountManager.saveToDiskDebounced();
@@ -1349,6 +1359,7 @@ async function handleRequestInner(
13491359
refreshed.account,
13501360
context.family,
13511361
isPinned && refreshed.account.index === pinnedIndex,
1362+
state.schedulingStrategy,
13521363
);
13531364

13541365
const forwarded = await forwardStreamingResponse(

lib/runtime/rotation-token-refresh.ts

Lines changed: 16 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -118,15 +118,27 @@ export async function ensureFreshAccessToken(params: {
118118
expires: refreshResult.expires,
119119
};
120120
try {
121-
const updatedAccount = (await commitRefreshedAuthOnce(
121+
const updatedAccount = await commitRefreshedAuthOnce(
122122
accountManager,
123123
account,
124124
auth,
125-
)) ?? account;
125+
);
126+
// If commit succeeded, use the stored access token from the committed
127+
// account — this also handles the dedup case where two concurrent callers
128+
// share one commit and both end up with the same committed token.
129+
// If commit returned null (account missing from storage after persist),
130+
// fall back to the original account object but always use refreshResult.access:
131+
// the original account may carry an expired token, and using that would
132+
// cause a downstream 401 and incorrectly trigger invalidation-cooldown logic.
133+
const resolvedAccount = updatedAccount ?? account;
134+
const accessToken =
135+
updatedAccount !== null
136+
? (updatedAccount.access ?? refreshResult.access)
137+
: refreshResult.access;
126138
return {
127139
ok: true,
128-
accessToken: updatedAccount.access ?? refreshResult.access,
129-
account: updatedAccount,
140+
accessToken,
141+
account: resolvedAccount,
130142
};
131143
} catch {
132144
accountManager.recordFailure(account, family, model);

lib/storage/flagged-storage.ts

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -102,6 +102,14 @@ export function normalizeFlaggedStorage(
102102
: undefined,
103103
email:
104104
typeof rawAccount.email === "string" ? rawAccount.email : undefined,
105+
accessToken:
106+
typeof rawAccount.accessToken === "string"
107+
? rawAccount.accessToken
108+
: undefined,
109+
expiresAt:
110+
typeof rawAccount.expiresAt === "number"
111+
? rawAccount.expiresAt
112+
: undefined,
105113
enabled:
106114
typeof rawAccount.enabled === "boolean"
107115
? rawAccount.enabled
@@ -113,6 +121,13 @@ export function normalizeFlaggedStorage(
113121
? rawAccount.coolingDownUntil
114122
: undefined,
115123
cooldownReason,
124+
workspaces: Array.isArray(rawAccount.workspaces)
125+
? (rawAccount.workspaces as FlaggedAccountMetadataV1["workspaces"])
126+
: undefined,
127+
currentWorkspaceIndex:
128+
typeof rawAccount.currentWorkspaceIndex === "number"
129+
? rawAccount.currentWorkspaceIndex
130+
: undefined,
116131
flaggedAt,
117132
flaggedReason:
118133
typeof rawAccount.flaggedReason === "string"

0 commit comments

Comments
 (0)