fix: resolve post-merge integration issues across 34 audit PRs

ndycode · ndycode · commit e453111025d2 · 2026-06-11T00:23:22.000+08:00
- fix: restore ensureFirstRunSetup wiring in codex-manager.ts (lost during audit-29 extraction)
- fix: update ui/copy.js -> ui/ui-copy.js imports in extracted login modules (audit-17 rename)
- fix: convert ci.yml to LF line endings so extractJobBlock regex works in tests
- fix: update documentation test to expect config.schema.json gitattributes entry (audit-16)
- fix: update codex-manager-cli.test.ts quota cache mock to use mockImplementation
- fix: add options param and waitFor to stale-refresh test assertions (audit-32)
- fix: merge fs-retry.test.ts conflict: combine shouldRetryFileOperation suite from PR #516 with comprehensive withRetry/withRetrySync suite from audit-07
diff --git a/.bugsweep-index.md b/.bugsweep-index.md
@@ -0,0 +1,118 @@
+# Bug Sweep: index.ts / lib/cli.ts / lib/index.ts
+
+Scope: HIGH-confidence correctness defects only. `lib/index.ts` is pure re-exports (clean).
+
+---
+
+## 1. `index.ts:3163-3214` — "Set as current account" is silently ignored in the plugin login flow
+
+**Severity: HIGH | Confidence: HIGH**
+
+The `promptLoginMode` menu in `lib/cli.ts` can return a manage action carrying `switchAccountIndex` (from the "set as current" / "set-current-account" menu choices):
+
+```ts
+// lib/cli.ts:237-241 and 257-260
+if (accountAction === "set-current") {
+    const index = resolveAccountSourceIndex(action.account);
+    if (index >= 0) return { mode: "manage", switchAccountIndex: index };
+    ...
+}
+...
+case "set-current-account": {
+    const index = resolveAccountSourceIndex(action.account);
+    if (index >= 0) return { mode: "manage", switchAccountIndex: index };
+    ...
+}
+```
+
+But the `mode === "manage"` handler in `index.ts` only inspects `deleteAccountIndex`, `toggleAccountIndex`, and `refreshAccountIndex`:
+
+```ts
+// index.ts:3163-3214
+if (menuResult.mode === "manage") {
+    if (typeof menuResult.deleteAccountIndex === "number") { ... continue; }
+    if (typeof menuResult.toggleAccountIndex === "number") { ... continue; }
+    if (typeof menuResult.refreshAccountIndex === "number") {
+        refreshAccountIndex = menuResult.refreshAccountIndex;
+        startFresh = false;
+        break;
+    }
+    continue;   // <-- switchAccountIndex falls through to here and is dropped
+}
+```
+
+There is **no** `switchAccountIndex` branch. (The only consumer of `switchAccountIndex` is `lib/codex-manager.ts:2713`, a different, standalone CLI path — not this plugin login loop.)
+
+**Triggering input -> actual vs expected:** User runs `codex-multi-auth login`, opens an account's detail view, and chooses "Set as current". Expected: that account becomes the active account (`storage.activeIndex` / `activeIndexByFamily` updated and persisted). Actual: the manage block matches none of the three handled indexes, hits the trailing `continue`, and the menu simply re-renders. The active account is never changed and nothing is persisted — the action is a no-op.
+
+**Suggested fix:** Add a handler in the manage block, e.g.:
+```ts
+if (typeof menuResult.switchAccountIndex === "number") {
+    const target = workingStorage.accounts[menuResult.switchAccountIndex];
+    if (target) {
+        const now = Date.now();
+        target.lastUsed = now;
+        target.lastSwitchReason = "rotation";
+        workingStorage.activeIndex = menuResult.switchAccountIndex;
+        workingStorage.activeIndexByFamily = workingStorage.activeIndexByFamily ?? {};
+        for (const family of MODEL_FAMILIES) {
+            workingStorage.activeIndexByFamily[family] = menuResult.switchAccountIndex;
+        }
+        await saveAccounts(workingStorage);
+        invalidateRuntimeAccountManagerCache(accountManagerCacheInvalidationDeps);
+    }
+    continue;
+}
+```
+
+---
+
+## 2. `index.ts:1203-1419` — `attempted` set and `accountCount` go stale after mid-loop `removeAccount`, causing account-traversal desync
+
+**Severity: MEDIUM | Confidence: MEDIUM**
+
+The account-attempt loop captures the pool size once per outer iteration and tracks attempts by numeric index in a set created at the same point:
+
+```ts
+// index.ts:1204-1205
+const accountCount = accountManager.getAccountCount();
+const attempted = new Set<number>();
+...
+accountAttemptLoop: while (attempted.size < Math.max(1, accountCount)) {
+    ...
+    attempted.add(account.index);
+```
+
+Inside this loop, on repeated auth-refresh failure the account is removed, which **reindexes** every remaining account (`acc.index = index` after splice — see `lib/accounts.ts:1540-1590`):
+
+```ts
+// index.ts:1392-1404
+if (authFailurePolicy.removeAccount) {
+    const removedIndex = account.index;
+    sessionAffinityStore?.forgetAccount(removedIndex);
+    accountManager.removeAccount(account);   // shifts indices of all higher accounts down by 1
+    sessionAffinityStore?.reindexAfterRemoval(removedIndex);
+    ...
+    continue;   // back to accountAttemptLoop with stale `attempted` + `accountCount`
+}
+```
+
+**Wrong behavior:** After a removal, `attempted` still holds pre-removal indices. Every account that sat above the removed slot shifts down by one (old index N becomes N-1), so the values in `attempted` no longer identify the same accounts. An already-tried account can be selected again under its new index, which is not yet in `attempted`, while an untried account can inherit an index that is already in the set and be skipped. `accountCount` is also one larger than the real pool, so the `attempted.size < accountCount` guard can permit extra or duplicate attempts. The session-affinity store is reindexed, but the local `attempted`/`accountCount` traversal state is not. Net effect: after any in-loop account removal, the rotation can re-hit a known-bad account, skip a fresh one, or miscount the remaining candidates.
+
+**Suggested fix:** After `removeAccount`, re-derive traversal state — clear/rebuild `attempted` (it is keyed by a now-invalid index space) and recompute `accountCount` from `accountManager.getAccountCount()` before continuing, or track attempts by a stable identity key (accountId/refreshToken) instead of numeric index.
+
+---
+
+## Notes / inspected-but-not-reported
+
+- `index.ts:2807` `if (response.status >= 500)` inside the empty-response retry block is effectively dead (this branch is only reached after `response.ok` was true), but it is not a correctness defect with wrong output — omitted.
+- `index.ts:3438-3454` add-account loop uses local `accounts.length` (which also grows on duplicate logins) for slot/limit guards; behavior is benign (duplicates merge), so not reported as a high-confidence bug.
+- `lib/cli.ts` arg/index parsing (`resolveAccountSourceIndex`, `promptAccountSelection` bounds, `Number.parseInt` guards) reviewed — correct.
+- `index.ts` `codex-switch` / `codex-remove` index math and `activeIndex` reindex-after-removal (4405-4427) reviewed — correct.
+
+---
+
+## Count by severity
+- HIGH: 1
+- MEDIUM: 1
+- LOW: 0
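
Reviewer sketch for finding 2 in the sweep above: one way to read the "stable identity key" suggestion is to key `attempted` by an account identifier rather than by positional index. Everything below is hypothetical and for illustration only. `PoolAccount`, `AccountPool`, and `attemptAll` are stand-ins, not the real types from `lib/accounts.ts`, and the sketch assumes each account carries an `accountId` that survives reindexing.

```typescript
// Hypothetical shapes for illustration; not the real account types.
interface PoolAccount {
  accountId: string; // assumed stable across reindexing
  index: number; // positional, rewritten after every removal
}

class AccountPool {
  private accounts: PoolAccount[];

  constructor(ids: string[]) {
    this.accounts = ids.map((accountId, index) => ({ accountId, index }));
  }

  getAccountCount(): number {
    return this.accounts.length;
  }

  list(): readonly PoolAccount[] {
    return this.accounts;
  }

  // Removes the account and reindexes the survivors, as the finding describes.
  removeAccount(target: PoolAccount): void {
    this.accounts = this.accounts.filter((a) => a.accountId !== target.accountId);
    this.accounts.forEach((a, index) => {
      a.index = index;
    });
  }
}

// Tries each account at most once, even when a failure removes an account mid-loop.
function attemptAll(
  pool: AccountPool,
  shouldRemove: (account: PoolAccount) => boolean,
): string[] {
  const attempted = new Set<string>(); // keyed by identity, not by index
  const order: string[] = [];
  while (true) {
    // Re-read the live pool each pass instead of trusting a captured count.
    const next = pool.list().find((a) => !attempted.has(a.accountId));
    if (!next) break;
    attempted.add(next.accountId);
    order.push(next.accountId);
    if (shouldRemove(next)) pool.removeAccount(next);
  }
  return order;
}

const pool = new AccountPool(["a", "b", "c", "d"]);
const order = attemptAll(pool, (account) => account.accountId === "b");
```

Because the termination check re-reads the live pool, no separate `accountCount` snapshot is needed and a removal cannot make a tried account look fresh. The real loop selects accounts through its own rotation logic, so this only illustrates the bookkeeping, not the selection policy.
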
diff --git a/.bugsweep-transform.md b/.bugsweep-transform.md
@@ -0,0 +1,129 @@
+# Bug Sweep: request transform / response handler / adapters
+
+Scope reviewed:
+- lib/request/request-transformer.ts
+- lib/request/response-handler.ts
+- lib/request/fetch-helpers.ts
+- lib/request/helpers/{tool-utils,input-utils,model-map}.ts
+- lib/oc-chatgpt-import-adapter.ts
+- lib/oc-chatgpt-orchestrator.ts
+- lib/oc-chatgpt-target-detection.ts
+- lib/prompts/codex.ts
+
+---
+
+## Finding 1 — `trimInputForFastSession` discards the leading developer/system context it deliberately preserved
+
+**File:** lib/request/request-transformer.ts:646-650
+
+**Buggy code:**
+```ts
+const trimmed = input.filter((_item, index) => keepIndexes.has(index));
+if (trimmed.length === 0) return input;
+if (input.length <= maxItems && excludedHeadIndexes.size === 0) return input;
+if (trimmed.length <= safeMax) return trimmed;
+return trimmed.slice(trimmed.length - safeMax);
+```
+
+**Why it's wrong:**
+Earlier in the function the head loop (lines 623-639) deliberately adds the first one or two
+short `developer`/`system` items to `keepIndexes`, and the tail loop (lines 641-644) adds the
+last `safeMax` items (`safeMax = Math.max(8, Math.floor(maxItems))`). Because `trimmed` is built
+via `input.filter(...has(index))`, the preserved head items appear FIRST in `trimmed`.
+
+When the conversation is longer than `safeMax`, the head indices (0/1) are distinct from the tail
+range (`input.length - safeMax .. end`), so `trimmed.length` becomes `safeMax + 1` or `safeMax + 2`.
+The final `trimmed.slice(trimmed.length - safeMax)` then keeps only the last `safeMax` entries —
+which are exactly the tail items — and slices the leading head items off the front.
+
+The function's own docstring states: "Keeps a small leading developer/system context plus the most
+recent items." The slice undoes that for any history longer than `safeMax`.
+
+**Triggering input -> actual vs expected:**
+- Input: 50 items, `maxItems = 30` (`safeMax = 30`), items 0 and 1 = short `developer`/`system` instructions,
+  `preferLatestUserOnly = false` (non-trivial turn, the common multi-turn fast-session case).
+- keepIndexes = {0, 1, 20..49} -> trimmed.length = 32 -> `trimmed.slice(2)` = items[20..49].
+- Actual: the leading developer/system instruction (items 0/1) is dropped.
+- Expected: leading developer/system context retained alongside the most recent items.
+
+Reachable in production: `resolveFastSessionInputTrimPlan` only sets `preferLatestUserOnly=true`
+for trivial single-line turns (which return early before this slice). Non-trivial fast-session
+turns hit this path with `preferLatestUserOnly=false`.
+
+**Severity:** MEDIUM (degraded prompt context in fast-session mode; non-host project/developer
+instructions are stripped from the request).
+**Confidence:** HIGH (deterministic; the code adds the head indices then unconditionally removes them).
+
+**Suggested fix:** Reserve room for the head when slicing, e.g. partition `trimmed` into head vs
+tail and cap only the tail, or compute the slice as
+`[...headItems, ...tailItems.slice(tailItems.length - (safeMax - headItems.length))]`, so the
+preserved leading context is never sliced away.
+
+---
+
+## Finding 2 — `response.output_text.done` (and reasoning-summary `.done`) read text via `getStringField`, which can wipe accumulated deltas on an empty/whitespace final payload
+
+**File:** lib/request/response-handler.ts:630-639 (and 670-677 for reasoning summary)
+
+**Buggy code:**
+```ts
+if (data.type === "response.output_text.done") {
+	setOutputTextValue(
+		state,
+		outputIndex,
+		getNumberField(eventRecord, "content_index"),
+		getStringField(eventRecord, "text"),   // <-- trimmed/non-empty gate
+		eventRecord.phase,
+	);
+	return;
+}
+```
+`getStringField` returns `null` when the value is empty or whitespace-only
+(`value.trim().length > 0 ? value : null`). The file's own doc comment (lines 54-59) explicitly
+warns: "For textual payloads where whitespace is meaningful, use a field-specific accessor such as
+`getDeltaField` instead of reusing this helper." The `.delta` handlers correctly use `getDeltaField`,
+but the `.done` handlers use `getStringField`.
+
+**Why it's wrong / wrong behavior:**
+`setOutputTextValue(..., null, ...)` deletes the accumulated key:
+```ts
+if (!text) {
+	state.outputText.delete(key);   // line 262-265
+	setPhaseTextSegment(state, phase, key, null);
+	return;
+}
+```
+So if deltas accumulated content (e.g. "Hello") and a terminal `response.output_text.done` arrives
+with an empty or whitespace-only `text`, the accumulated text for that `output:content` key is
+deleted instead of finalized, dropping it from the synthesized final response.
+
+**Triggering input -> actual vs expected:**
+- Stream: `output_text.delta` "Hello world", then `output_text.done` with `text: ""` (or `"   "`).
+- Actual: accumulated "Hello world" is deleted; final JSON loses that content part's text.
+- Expected: the accumulated delta text is preserved as the final value.
+
+**Severity:** LOW (the OpenAI Responses API normally carries the full text on `.done`; an empty/
+whitespace `.done` is the edge that triggers loss).
+**Confidence:** MEDIUM (depends on upstream emitting an empty terminal text event).
+
+**Suggested fix:** Use `getDeltaField` (length>0 only, no trim) for the `.done` text fields, and/or
+guard `setOutputTextValue` so an empty `.done` does not delete already-accumulated delta text.
+
+---
+
+## Notes / examined and considered correct
+
+- `getModelConfig` variant parsing, `coerceReasoningEffort` fallback tables, and `resolveInclude`
+  (always re-adds `reasoning.encrypted_content`) behave as documented.
+- Model-family mapping in model-map.ts (codex aliases -> gpt-5.3-codex; gpt-5.4/5.5 -> gpt-5.2
+  prompt family) is internally consistent with `MODEL_PROFILES`.
+- `mergeRecord` / `applyAccumulatedOutputText` / `appendPhaseTextSegment` ordering and the
+  delta-append fast path are correct.
+- `filterInput` stripIds gating (stripped only when not background mode) is correct.
+- Import-adapter dedup precedence, `remapActiveIndex`, and `matchDestination` index handling are
+  correct; `previewOcChatgptImportMerge` index alignment between `merged.accounts` and
+  `destinationAccounts` is sound.
+- Orchestrator atomic write (temp+rename, 0o600/0o700, retry) and target-detection scope/ambiguity
+  logic are correct.
+- `convertSseToJson` pre-append size cap, malformed-chunk handling, and `readWithTimeout` cleanup
+  are correct.
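
Reviewer sketch for finding 1 in the transform sweep above: a minimal version of the "reserve room for the head" fix, assuming the kept items have already been partitioned into head and tail. `capPreservingHead` is a hypothetical helper, not a function in `lib/request/request-transformer.ts`. Unlike the inline expression in the suggested fix, it clamps the slice start at zero so a tail shorter than the remaining budget is kept whole rather than being cut by a negative start index.

```typescript
// Hypothetical helper; names mirror the finding, not the real module.
function capPreservingHead<T>(headItems: T[], tailItems: T[], safeMax: number): T[] {
  // Budget left for the tail once the head is reserved (never negative).
  const tailBudget = Math.max(0, safeMax - headItems.length);
  // Clamp so a short tail is kept whole instead of producing a negative start.
  const start = Math.max(0, tailItems.length - tailBudget);
  return [...headItems, ...tailItems.slice(start)];
}

// The finding's example: 50 items, safeMax = 30, two preserved head items.
const input = Array.from({ length: 50 }, (_unused, i) => i);
const head = input.slice(0, 2); // items 0 and 1: leading developer/system context
const tail = input.slice(20); // items 20..49: the last safeMax items
const capped = capPreservingHead(head, tail, 30);
// capped holds items 0, 1, then 22..49 (30 entries in total)
```

This variant keeps the total at `safeMax` by giving up the two oldest tail items. The other option named in the finding, capping only the tail, would instead return `safeMax + headItems.length` entries; which trade-off is preferable depends on how strictly `maxItems` is meant to bound the request.
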
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -1,4 +1,4 @@
-name: CI
+﻿name: CI
 
 on:
   push:
diff --git a/lib/codex-manager.ts b/lib/codex-manager.ts
@@ -28,6 +28,7 @@ import {
 } from "./codex-manager/quota-cache-helpers.js";
 import { runAccountCommand } from "./codex-manager/commands/account.js";
 import { ACCOUNT_MANAGER_COMMANDS } from "./codex-manager/account-manager-commands.js";
+import { ensureFirstRunSetup } from "./runtime/first-run.js";
 import { runBudgetCommand } from "./codex-manager/commands/budget.js";
 import { runBridgeCommand } from "./codex-manager/commands/bridge.js";
 import { runCheckCommand } from "./codex-manager/commands/check.js";
@@ -651,6 +652,13 @@ const CLI_COMMAND_HANDLERS: ReadonlyMap<string, CliCommandHandler> = new Map<
 ]);
 
 export async function runCodexMultiAuthCli(rawArgs: string[]): Promise<number> {
+	// Lazy install setup (audit roadmap §4.5.4): app detection, Codex app bind,
+	// and launcher routing moved out of npm postinstall to the first CLI run.
+	// ensureFirstRunSetup never throws; the catch is belt-and-braces so no
+	// command can ever fail because of first-run housekeeping.
+	await ensureFirstRunSetup({
+		notify: (message) => console.error(`codex-multi-auth: ${message}`),
+	}).catch(() => undefined);
 	const startupDisplaySettings = await loadDashboardDisplaySettings();
 	applyUiThemeFromDashboardSettings(startupDisplaySettings);
 
diff --git a/test/codex-manager-cli.test.ts b/test/codex-manager-cli.test.ts
@@ -740,10 +740,14 @@ describe("codex manager cli commands", () => {
 			primary: {},
 			secondary: {},
 		});
-		quotaCacheMocks.loadQuotaCache.mockResolvedValue({
+		// Fresh object per call, like the real loadQuotaCache (a fresh disk read
+		// each time): refreshQuotaCacheForMenu rebases onto and mutates the
+		// loaded cache before saving, and a shared singleton would leak that
+		// mutation into every later load in the same test.
+		quotaCacheMocks.loadQuotaCache.mockImplementation(async () => ({
 			byAccountId: {},
 			byEmail: {},
-		});
+		}));
 		storageMocks.loadFlaggedAccounts.mockResolvedValue({
 			version: 1,
 			accounts: [],
@@ -9063,7 +9067,7 @@ describe("codex manager cli commands", () => {
 
 		let promptCallCount = 0;
 		promptLoginModeMock
-			.mockImplementationOnce(async (accounts) => {
+			.mockImplementationOnce(async (accounts, options) => {
 				promptCallCount += 1;
 				expect(promptCallCount).toBe(1);
 				expect(
@@ -9076,6 +9080,14 @@ describe("codex manager cli commands", () => {
 				queueMicrotask(() => {
 					releaseFirstRefresh.resolve();
 				});
+				// Wait for the first refresh to fully settle (statusMessage clears in
+				// the same .finally that releases the pending slot) so the second
+				// menu pass deterministically starts its own refresh; the refresh
+				// chain gained an await (the save-time cache rebase), so returning
+				// immediately could reach the pass-2 guard while pass 1 is pending.
+				await vi.waitFor(() => {
+					expect(options?.statusMessage?.()).toBeUndefined();
+				});
 				return { mode: "manage", deleteAccountIndex: 99 };
 			})
 			.mockImplementationOnce(async (accounts, options) => {
@@ -9248,13 +9260,18 @@ describe("codex manager cli commands", () => {
 
 		let promptCallCount = 0;
 		promptLoginModeMock
-			.mockImplementationOnce(async () => {
+			.mockImplementationOnce(async (_accounts, options) => {
 				promptCallCount += 1;
 				expect(promptCallCount).toBe(1);
 
 				queueMicrotask(() => {
 					releaseFirstRefresh.resolve();
 				});
+				// See the stale-refresh test above: settle pass 1's refresh before
+				// returning so pass 2's auto-fetch guard sees a free slot.
+				await vi.waitFor(() => {
+					expect(options?.statusMessage?.()).toBeUndefined();
+				});
 				return { mode: "manage", deleteAccountIndex: 99 };
 			})
 			.mockImplementationOnce(async (_accounts, options) => {
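
Reviewer sketch for the `loadQuotaCache` mock change in this file: `mockResolvedValue(obj)` hands every caller the same object, while `mockImplementation(async () => ({ ... }))` builds a new one per call. The snippet below reproduces that difference without vitest. `returnsShared`, `returnsFresh`, and `loadMutateLoad` are hypothetical stand-ins, and the stubs are synchronous for brevity, which does not change the aliasing behaviour being shown.

```typescript
interface QuotaCache {
  byAccountId: Record<string, number>;
  byEmail: Record<string, number>;
}

// Stand-in for mockResolvedValue(obj): every call returns the same object.
function returnsShared<T>(value: T): () => T {
  return () => value;
}

// Stand-in for mockImplementation(() => ({ ... })): every call builds a new object.
function returnsFresh<T>(factory: () => T): () => T {
  return () => factory();
}

// Loads the cache, mutates it the way a rebase-before-save would, then loads
// again and reports how many keys the second load sees.
function loadMutateLoad(load: () => QuotaCache): number {
  const first = load();
  first.byAccountId["acct-1"] = 42;
  const second = load();
  return Object.keys(second.byAccountId).length;
}

const leakedKeys = loadMutateLoad(
  returnsShared<QuotaCache>({ byAccountId: {}, byEmail: {} }),
);
const cleanKeys = loadMutateLoad(
  returnsFresh<QuotaCache>(() => ({ byAccountId: {}, byEmail: {} })),
);
// leakedKeys is 1 (the mutation leaked into the second load); cleanKeys is 0
```
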
diff --git a/test/documentation.test.ts b/test/documentation.test.ts
@@ -697,6 +697,7 @@ describe("Documentation Integrity", () => {
 			"*.mjs -linguist-detectable",
 			"*.sh -linguist-detectable",
 			"*.html -linguist-detectable",
+			"config/schema/config.schema.json text eol=lf",
 		]);
 	});
 
