Commit 1317fe9

fix(security,resilience): pre-release hardening from deep stress test (#503)
* fix(security,resilience): pre-release hardening from deep stress test

  Found by a whole-tree adversarial audit + runtime fuzz of the merged main (both #499 and #500). The tree was already well-hardened; these are the residual MEDIUM/LOW gaps, all verified.

  Egress hardening (never forward inbound client credentials upstream):
  - runtime-rotation-proxy.ts + local-bridge.ts: strip inbound `cookie` and `proxy-authorization` on the outbound request, alongside the existing authorization/x-api-key stripping. A client cookie would otherwise ride upstream with the managed OAuth Bearer.
  - runtime-rotation-proxy.ts readErrorBody: the outbound fetch's abort timer is cleared once headers arrive, so a stalled ERROR body (429/5xx path) could hang the handler forever: the success path is per-chunk stall-bounded, the error path was not. Read it via a reader with a per-read idle timeout + a 1MB cap, cancelling the stream on stall/overflow.

  Windows fs resilience (close the transient-lock gaps the fuzz reproduced):
  - storage.ts: the secret-bearing WAL write and temp write had NO retry, and the temp cleanup was single-shot, so a transient EBUSY/EPERM could fail a valid save or strand a token-bearing *.tmp. Wrap all three in withFileOperationRetry. Also align copyFileWithRetry / renameFileWithRetry to the shared shouldRetryFileOperation taxonomy (adds ENOTEMPTY/EACCES/EAGAIN). The primary commit rename keeps its existing EPERM/EBUSY contract unchanged.
  - quota-cache.ts: widen RETRYABLE_FS_CODES to the shared set.

  Least-privilege:
  - runtime/runtime-observability.ts: persist the snapshot (account id/label) with mode 0o600 and the dir 0o700, matching the other sensitive writers (no-op on win32, owner-only on POSIX).

  Tests: cookie strip on both egress paths; existing storage WAL/temp/backup retry coverage still green (rename contract unchanged). Also cleaned a stale eslint-disable + unused import in the merged mcodex statusline test.

  typecheck + lint clean; full suite 4246 passed / 1 skipped.

* chore(release): cut v2.1.13-beta.3 + bump vitest to ^4.1.8

  - Bump vitest / @vitest/ui / @vitest/coverage-v8 to ^4.1.8 (dev-only), clearing GHSA-5xrq-8626-4rwp so `npm run audit:ci` exits 0. audit:prod was already clean (nothing test-related ships); this unblocks the CI release gate.
  - Version → 2.1.13-beta.3.
  - Add docs/releases/v2.1.13-beta.3.md covering the Phase 1 audit (#499), the mcodex launcher + statusline (#500), and the pre-release hardening (#503). Update the docs portal + root README prerelease pointers (documentation.test.ts parity).

  typecheck + lint + audit:ci clean; full suite 4246 passed / 1 skipped on vitest 4.1.8; documentation parity 25/25.

* fix: address #503 review (dir-mode + retry/strip coverage + version pins)

  - runtime/runtime-observability.ts: re-assert dir 0o700 via fs.chmod after mkdir. mkdir's mode is a no-op on a pre-existing multi-auth dir, so the upgrade path kept old (possibly permissive) perms. POSIX-only, best-effort.
  - Bump .codex-plugin/plugin.json + AGENTS.md to 2.1.13-beta.3 (plugin-manifest test asserts manifest version == package version; the beta.3 bump left them stale).

  Tests:
  - local-bridge + runtime-rotation-proxy: assert inbound `proxy-authorization` (alongside cookie) is stripped on the outbound request, so a refactor dropping the delete is caught.
  - quota-cache: read-retry (EACCES) + rename-retry (ENOTEMPTY) + last-write-wins queue-ordering regressions, pinning the widened retry taxonomy.
  - runtime-observability-dir-mode (new, POSIX-only): pre-create CODEX_MULTI_AUTH_DIR world-writable, persist a snapshot, assert the dir becomes owner-only; this proves the chmod-after-mkdir hardening on the upgrade path.
  - runtime-observability test fs-mock now exposes chmod.

  typecheck + lint + audit:ci clean; full suite 4249 passed / 2 skipped.

* fix: address #503 round-2 review (chmod error handling + win32/strip coverage)

  - runtime/runtime-observability.ts: stop swallowing every chmod failure. Only ENOENT (dir removed by a concurrent process) is ignored; any other failure is surfaced rather than silently leaving a world-readable dir holding account ids/labels.
  - runtime-observability.test.ts: add a win32 case asserting the snapshot persists WITHOUT calling chmod (POSIX-only branch), and reset the chmod mock per test.
  - local-bridge.test.ts: extend the auth-enabled runtime-proxy path test to assert cookie + proxy-authorization are stripped too (parity with the no-key path).

  typecheck + lint clean; full suite 4250 passed / 2 skipped.

* test(observability): assert snapshot temp file is written 0o600

  Addresses the #503 round-3 review: the win32 persistence test now also asserts writeFile received mode 0o600, so an owner-only-perms regression on the snapshot file fails the suite.
1 parent b9b1693 commit 1317fe9

18 files changed

Lines changed: 784 additions & 269 deletions

.codex-plugin/plugin.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "codex-multi-auth",
-  "version": "2.1.13-beta.2",
+  "version": "2.1.13-beta.3",
   "description": "Install and operate codex-multi-auth for the official @openai/codex CLI with multi-account OAuth rotation, switching, health checks, and recovery tools.",
   "interface": {
     "composerIcon": "./assets/codex-multi-auth-icon.svg"

AGENTS.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 Generated: 2026-04-25
 Commit: a87e005
 Branch: main
-Package version: 2.1.13-beta.2
+Package version: 2.1.13-beta.3
 
 ## OVERVIEW

README.md

Lines changed: 1 addition & 1 deletion
@@ -383,7 +383,7 @@ codex-multi-auth doctor --json
 
 ## Release Notes
 
-- Current prerelease: [docs/releases/v2.1.13-beta.2.md](docs/releases/v2.1.13-beta.2.md) — install via `npm i -g codex-multi-auth@beta`
+- Current prerelease: [docs/releases/v2.1.13-beta.3.md](docs/releases/v2.1.13-beta.3.md) — install via `npm i -g codex-multi-auth@beta`
 - Current stable: [docs/releases/v2.1.12.md](docs/releases/v2.1.12.md)
 - Previous stable: [docs/releases/v2.1.11.md](docs/releases/v2.1.11.md)
 - Earlier stable: [docs/releases/v2.1.10.md](docs/releases/v2.1.10.md)

docs/README.md

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ Public documentation for the `codex-multi-auth` Codex CLI multi-account OAuth ma
 
 | Document | Focus |
 | --- | --- |
-| [releases/v2.1.13-beta.2.md](releases/v2.1.13-beta.2.md) | Current prerelease notes (install via `npm i -g codex-multi-auth@beta`) |
+| [releases/v2.1.13-beta.3.md](releases/v2.1.13-beta.3.md) | Current prerelease notes (install via `npm i -g codex-multi-auth@beta`) |
 | [releases/v2.1.12.md](releases/v2.1.12.md) | Current stable release notes |
 | [releases/v2.1.11.md](releases/v2.1.11.md) | Prior stable release notes |
 | [releases/v2.1.10.md](releases/v2.1.10.md) | Earlier stable release notes |

docs/releases/v2.1.13-beta.3.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
+# v2.1.13-beta.3
+
+Beta prerelease that lands the Phase 1 correctness/security audit (#499), the new
+`mcodex` launcher + cached statusline (#500), and a round of pre-release
+hardening surfaced by a whole-tree stress test (#503). It carries forward the
+cascade OAuth token-invalidation fix from v2.1.13-beta.2, the multi-workspace
+support from v2.1.13-beta.1, and the pinned-account 503 diagnostic from
+v2.1.13-beta.0.
+
+This is a **prerelease**. Stable `v2.1.13` will land once the issue #486 root
+cause is identified and patched.
+
+## Install
+
+```bash
+npm i -g codex-multi-auth@beta
+```
+
+## mcodex launcher (#500)
+
+A lightweight launcher and cached multi-auth status display for Codex sessions.
+
+- `mcodex` — launch Codex with a cached status line printed before startup
+  (model, reasoning effort, cwd, active account, quota usage, plan, cache age).
+- `mcodex --tmux` — launch inside a tmux session with mouse scrollback.
+- `mcodex --tmux --live-accounts` — add a live `codex-multi-auth list` monitor pane.
+- `mcodex --monitor` — monitor-only mode.
+
+### Status line
+
+- Reads local `quota-cache.json`, `runtime-observability.json`, and account
+  storage; never calls OpenAI on launch.
+- Refreshes quota data in the background only when the cache is stale (default
+  10 min, `CODEX_MULTI_AUTH_STATUS_QUOTA_REFRESH_INTERVAL_MS`), behind a lock so
+  concurrent launches don't double-refresh. Stale refresh locks recover after
+  10 min.
+- Resolves the **per-project** account pool when `perProjectAccounts` is enabled
+  and Codex CLI sync is off (mirrors the runtime's own account scoping), falling
+  back to the global pool otherwise. Quota/observability stay global.
+- Toggle with `CODEX_MULTI_AUTH_STATUSLINE=0`.
+
+### Hardening
+
+- `MCODEX_MONITOR_INTERVAL` / `MCODEX_TMUX_HISTORY_LIMIT` are validated as numeric
+  before being interpolated into `watch` / tmux commands (no shell injection).
+- `--monitor` / `--live-accounts` fail fast with a clear message when `watch`
+  isn't installed instead of spawning a broken pane.
+- The status path resolves `~` correctly on Windows (`path.sep`, not a hardcoded
+  `/`), and reads the account pool without blocking the event loop.
+
+## Phase 1 audit remediation (#499)
+
+Security and correctness hardening across the runtime, storage, and prompt layers.
+
+### Security
+
+- **Prompt cache integrity.** Cached Codex instructions are SHA-256 verified; a
+  tampered cache is discarded, and a legacy entry with no recorded digest is
+  treated as unverified — it is never fast-path served and never drives a
+  conditional `304` revalidation (which could otherwise launder un-vetted bytes).
+  A full fetch mints the first digest; old bytes are kept only as an offline
+  fallback.
+- **Path-traversal defense in recovery.** Stored message/part records are
+  validated before their ids are used to build filesystem paths; a
+  parseable-but-unsafe id (e.g. `../poison`) or a non-numeric `time.created` is
+  quarantined instead of escaping into a traversal read or a `NaN` sort.
+- **Loopback-only egress.** The runtime rotation proxy and local bridge bind
+  loopback-only with no opt-out, and never forward inbound client credentials
+  (`authorization`, `x-api-key`, `cookie`, `proxy-authorization`) upstream
+  alongside the managed token. IPv6 loopback (`::1` / `[::1]`) is normalized
+  consistently for both the socket bind and the emitted base URL.
+- **Token/email redaction.** Log, debug-bundle, and status sinks mask tokens and
+  emails; the debug bundle redacts the home prefix (case- and
+  separator-insensitive on Windows), strips credentials from config values, and
+  masks the account id.
+
+### Correctness & resilience
+
+- **No event-loop blocking.** Removed synchronous `Atomics.wait` sleeps from the
+  config load path and the logger's directory-creation retry; both now retry
+  without freezing the event loop.
+- **Bounded network reads.** Prompt and release-metadata fetches are bounded by
+  connect+body timeouts that actually cancel a stalled body.
+- **Windows filesystem resilience.** Account-store WAL/temp writes, temp cleanup,
+  backup copy/rename, quota-cache, flagged-store, and export operations retry the
+  shared transient-lock taxonomy (EBUSY/EPERM/ENOTEMPTY/EACCES/EAGAIN) so an
+  antivirus/indexer/concurrent-reader lock doesn't fail a valid operation or
+  strand a secret-bearing temp file.
+- **Atomic, self-healing account store.** Writes go through a checksummed WAL +
+  temp-and-rename; a torn write self-heals on the next read.
+
+### CLI
+
+- `codex-multi-auth status` / `list` gained `--json` for machine-readable output,
+  with a stable shape whether or not accounts are configured.
+
+## Pre-release hardening (#503)
+
+- Strip inbound `cookie` / `proxy-authorization` on both egress paths.
+- Bound the proxy's upstream error-body read (previously unbounded on 4xx/5xx).
+- Persist `runtime-observability.json` owner-only (`0o600` / dir `0o700`) on POSIX.
+- Bump `vitest` to `^4.1.8` (dev-only) to clear GHSA-5xrq-8626-4rwp.
+
+## Verification
+
+Full test suite (4,200+ tests) green; `npm run audit:ci` clean; typecheck and
+lint pass.
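
Note on the release notes' "Hardening" bullet for `MCODEX_MONITOR_INTERVAL` / `MCODEX_TMUX_HISTORY_LIMIT`: the launcher's actual validation code is not part of this commit, so the following is only a minimal sketch of the idea, assuming a digits-only rule and a reject-on-invalid policy. The helper name `parsePositiveIntEnv` is hypothetical.

```typescript
// Hypothetical helper (not from the repository): accept only plain decimal
// digits so the value can never carry shell metacharacters into a command.
function parsePositiveIntEnv(
  name: string,
  raw: string | undefined,
  fallback: number,
): number {
  if (raw === undefined || raw === "") return fallback;
  // Digits only: rejects "5; rm -rf ~", "$(id)", "1e3", "-1", " 5", and so on.
  if (!/^[0-9]+$/.test(raw)) {
    throw new Error(`${name} must be a positive integer, got ${JSON.stringify(raw)}`);
  }
  const value = Number(raw);
  if (!Number.isSafeInteger(value) || value <= 0) {
    throw new Error(`${name} is out of range: ${raw}`);
  }
  return value;
}

// Usage: only a validated number is ever interpolated into the command text.
const intervalSeconds = parsePositiveIntEnv("MCODEX_MONITOR_INTERVAL", "5", 2);
const watchCommand = `watch -n ${intervalSeconds} codex-multi-auth list`;
```

Validating against an allow-list pattern (digits only) rather than trying to escape dangerous characters keeps the rule easy to audit.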

lib/local-bridge.ts

Lines changed: 4 additions & 0 deletions
@@ -98,6 +98,10 @@ function forwardHeaders(headers: Headers, runtimeClientApiKey?: string): Headers
   // caller's local credential across the bridge boundary and could change which
   // auth the runtime proxy evaluates — strip it unconditionally.
   result.delete("x-api-key");
+  // Same contract: never carry an inbound Cookie / proxy-auth header upstream
+  // alongside the managed token.
+  result.delete("cookie");
+  result.delete("proxy-authorization");
   // runtime-proxy-03: present the runtime proxy's client token. We replace the
   // inbound client's Authorization (already validated by the bridge) rather than
   // forwarding it verbatim, so the bridge can authenticate to an auth-enabled
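
Note on the hunk above: the stripping contract can be illustrated in isolation. This is a simplified sketch, not the repository's `forwardHeaders` or `createOutboundHeaders`; the helper names and the token value are hypothetical, and it assumes the standard Fetch `Headers` class (global in Node 18+).

```typescript
// Header names taken from the commit; the helper names are hypothetical.
const INBOUND_CREDENTIAL_HEADERS = [
  "authorization",
  "x-api-key",
  "cookie",
  "proxy-authorization",
] as const;

function stripInboundCredentials(inbound: Headers): Headers {
  // Copy first so the caller's Headers object is left untouched.
  const outbound = new Headers(inbound);
  for (const name of INBOUND_CREDENTIAL_HEADERS) {
    outbound.delete(name); // Headers lookups are case-insensitive
  }
  return outbound;
}

function buildOutboundHeaders(inbound: Headers, managedToken: string): Headers {
  const outbound = stripInboundCredentials(inbound);
  // Only the managed credential is attached to the upstream request.
  outbound.set("authorization", `Bearer ${managedToken}`);
  return outbound;
}

// Usage: client-supplied credentials are dropped, other headers survive.
const exampleOutbound = buildOutboundHeaders(
  new Headers({
    Cookie: "session=abc",
    "Proxy-Authorization": "Basic Zm9v",
    Authorization: "Bearer client-token",
    Accept: "application/json",
  }),
  "managed-token",
);
```

Keeping the credential header names in one list makes it harder for a later refactor to drop a single `delete` call unnoticed, which is the regression the commit's new tests guard against.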

lib/quota-cache.ts

Lines changed: 10 additions & 1 deletion
@@ -32,7 +32,16 @@ interface QuotaCacheFile {
 
 const QUOTA_CACHE_PATH = join(getCodexMultiAuthDir(), "quota-cache.json");
 const QUOTA_CACHE_LABEL = basename(QUOTA_CACHE_PATH);
-const RETRYABLE_FS_CODES = new Set(["EBUSY", "EPERM"]);
+// Align with the shared FILE_RETRY_CODES taxonomy (lib/fs-retry.ts) so a
+// transient Windows lock (AV/indexer/concurrent reader) on the quota cache is
+// retried consistently with every other fs path, not just EBUSY/EPERM.
+const RETRYABLE_FS_CODES = new Set([
+  "EBUSY",
+  "EPERM",
+  "EAGAIN",
+  "ENOTEMPTY",
+  "EACCES",
+]);
 let quotaCacheWriteQueue: Promise<void> = Promise.resolve();
 
 function isRetryableFsError(error: unknown): boolean {
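
Note on the hunk above: the body of `isRetryableFsError` is cut off by the diff context, so the following is only a guess at the shape of such a predicate, written against the widened code set. The `Sketch` suffix marks both names as hypothetical.

```typescript
// Same five codes as the widened RETRYABLE_FS_CODES in this commit.
const RETRYABLE_FS_CODES_SKETCH = new Set([
  "EBUSY",
  "EPERM",
  "EAGAIN",
  "ENOTEMPTY",
  "EACCES",
]);

// Assumed shape: treat an error as retryable only if it carries a string
// `code` property that is in the transient-lock set.
function isRetryableFsErrorSketch(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const code = (error as { code?: unknown }).code;
  return typeof code === "string" && RETRYABLE_FS_CODES_SKETCH.has(code);
}

// Usage: a lock-style failure is retried, a missing file is not.
const lockError = Object.assign(new Error("file is locked"), { code: "EACCES" });
const missingError = Object.assign(new Error("no such file"), { code: "ENOENT" });
const retryLock = isRetryableFsErrorSketch(lockError);
const retryMissing = isRetryableFsErrorSketch(missingError);
```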

lib/runtime-rotation-proxy.ts

Lines changed: 63 additions & 6 deletions
@@ -268,6 +268,10 @@ function createOutboundHeaders(
   }
   headers.delete("host");
   headers.delete("x-api-key");
+  // Never forward inbound client credentials upstream: a Cookie / proxy-auth
+  // header would ride along with the managed OAuth Bearer to OpenAI.
+  headers.delete("cookie");
+  headers.delete("proxy-authorization");
   headers.set("authorization", `Bearer ${accessToken}`);
   headers.set(OPENAI_HEADERS.ACCOUNT_ID, accountId);
   headers.set(OPENAI_HEADERS.BETA, OPENAI_HEADER_VALUES.BETA_RESPONSES);
@@ -892,9 +896,62 @@ function parseRetryAfterBodyMs(bodyText: string, now: number): number | null {
   return null;
 }
 
-async function readErrorBody(response: Response): Promise<string> {
+async function readErrorBody(
+  response: Response,
+  timeoutMs: number,
+  maxBytes = 1024 * 1024,
+): Promise<string> {
+  // The outbound fetch's abort timer is cleared once headers arrive, so a
+  // stalled error body would otherwise hang this handler forever (the success
+  // path is per-chunk stall-bounded; the error path was not). Read the body via
+  // a reader, bound it by an idle timeout AND a size cap, and cancel the stream
+  // on timeout/overflow so the socket is released.
+  const body = response.body;
+  if (!body || typeof body.getReader !== "function") {
+    // Fallback for impls without a streamable body: race text() against a timer.
+    try {
+      return await withTimeout(
+        response.text(),
+        timeoutMs,
+        () => undefined,
+        "error body stalled",
+      );
+    } catch {
+      return "";
+    }
+  }
+  const reader = body.getReader();
+  const chunks: Uint8Array[] = [];
+  let total = 0;
+  try {
+    for (;;) {
+      let idleTimer: ReturnType<typeof setTimeout> | undefined;
+      const idle = new Promise<never>((_resolve, reject) => {
+        idleTimer = setTimeout(
+          () => reject(new Error("error body stalled")),
+          Math.max(1, timeoutMs),
+        );
+      });
+      let result: Awaited<ReturnType<typeof reader.read>>;
+      try {
+        result = await Promise.race([reader.read(), idle]);
+      } finally {
+        if (idleTimer) clearTimeout(idleTimer);
+      }
+      if (result.done) break;
+      if (result.value) {
+        total += result.value.byteLength;
+        if (total > maxBytes) break; // cap: enough for diagnostics, no OOM
+        chunks.push(result.value);
+      }
+    }
+  } catch {
+    // stalled or errored — fall through with whatever we have
+  } finally {
+    await reader.cancel().catch(() => undefined);
+  }
   try {
-    return await response.text();
+    return Buffer.concat(chunks).toString("utf8");
   } catch {
     return "";
   }
@@ -1734,7 +1791,7 @@ export async function startRuntimeRotationProxy(
     }
 
     if (upstream.status === HTTP_STATUS.TOO_MANY_REQUESTS) {
-      const bodyText = await readErrorBody(upstream);
+      const bodyText = await readErrorBody(upstream, streamStallTimeoutMs);
       const retryAfterMs =
         parseRetryAfterHeaderMs(upstream.headers, now()) ??
         parseRetryAfterBodyMs(bodyText, now()) ??
@@ -1759,7 +1816,7 @@ export async function startRuntimeRotationProxy(
     }
 
     if (upstream.status === 402 || upstream.status === HTTP_STATUS.FORBIDDEN) {
-      const bodyText = await readErrorBody(upstream);
+      const bodyText = await readErrorBody(upstream, streamStallTimeoutMs);
       const errorCode = extractErrorCodeFromBody(bodyText);
       if (isWorkspaceDisabledError(upstream.status, errorCode, bodyText)) {
         const accountWasEnabled =
@@ -1856,7 +1913,7 @@ export async function startRuntimeRotationProxy(
     }
 
     if (upstream.status === HTTP_STATUS.UNAUTHORIZED) {
-      const bodyText = await readErrorBody(upstream);
+      const bodyText = await readErrorBody(upstream, streamStallTimeoutMs);
       accountManager.refundToken(refreshed.account, context.family, context.model);
       accountManager.recordFailure(refreshed.account, context.family, context.model);
       if (isTokenInvalidationError(bodyText)) {
@@ -1902,7 +1959,7 @@ export async function startRuntimeRotationProxy(
     }
 
     if (upstream.status >= 500) {
-      await readErrorBody(upstream);
+      await readErrorBody(upstream, streamStallTimeoutMs);
       accountManager.refundToken(refreshed.account, context.family, context.model);
       accountManager.recordFailure(refreshed.account, context.family, context.model);
       accountManager.markAccountCoolingDown(
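
Note on the `readErrorBody` hunk above: the idle-timeout-plus-cap pattern can be exercised against a deliberately stalled stream. This is a standalone sketch under assumptions, not the repository code: `readBounded` is a hypothetical name, it takes a bare `ReadableStream` instead of a `Response`, and it decodes with `TextDecoder` rather than `Buffer`. It assumes a runtime with web streams (Node 18+).

```typescript
// Hypothetical standalone variant of the bounded read shown in the diff.
async function readBounded(
  stream: ReadableStream<Uint8Array>,
  idleMs: number,
  maxBytes: number,
): Promise<string> {
  const reader = stream.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  try {
    for (;;) {
      let timer: ReturnType<typeof setTimeout> | undefined;
      // A fresh timer per read: it bounds idle time, not total duration.
      const idle = new Promise<never>((_resolve, reject) => {
        timer = setTimeout(() => reject(new Error("body stalled")), Math.max(1, idleMs));
      });
      let result: Awaited<ReturnType<typeof reader.read>>;
      try {
        result = await Promise.race([reader.read(), idle]);
      } finally {
        clearTimeout(timer);
      }
      if (result.done) break;
      if (result.value) {
        total += result.value.byteLength;
        if (total > maxBytes) break; // the overflowing chunk is dropped whole
        chunks.push(result.value);
      }
    }
  } catch {
    // Stalled or errored: keep whatever arrived before the failure.
  } finally {
    // Cancelling releases the underlying source (the socket, for a fetch body).
    await reader.cancel().catch(() => undefined);
  }
  const decoder = new TextDecoder();
  let text = "";
  for (const chunk of chunks) text += decoder.decode(chunk, { stream: true });
  return text + decoder.decode();
}

// Usage: a stream that sends one chunk and then never closes.
function makeStalledStream(payload: string): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(new TextEncoder().encode(payload));
      // Intentionally no controller.close(): simulates a stalled upstream.
    },
  });
}
```

Calling `readBounded(makeStalledStream("partial"), 20, 1024)` resolves with the bytes received before the stall instead of hanging. Because the cap check runs before the chunk is stored, a single chunk larger than `maxBytes` yields an empty string.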

lib/runtime/runtime-observability.ts

Lines changed: 23 additions & 2 deletions
@@ -195,13 +195,34 @@ function ensureSnapshotState(): RuntimeObservabilitySnapshot {
 async function writeSnapshot(snapshot: RuntimeObservabilitySnapshot): Promise<void> {
   const dir = getCodexMultiAuthDir();
   const path = getSnapshotPath();
-  await fs.mkdir(dir, { recursive: true });
+  // The snapshot persists account identifiers (lastAccountId/label/index), so
+  // keep it owner-only on POSIX like the other sensitive writers (logger,
+  // local-client-tokens). mode is a no-op on win32 (ACL-based).
+  await fs.mkdir(dir, { recursive: true, mode: 0o700 });
+  // mkdir's mode only applies to a freshly-created dir; an upgrade path with a
+  // pre-existing multi-auth dir keeps its old (possibly permissive) perms, so
+  // re-assert 0o700 on POSIX. Only ENOENT is swallowed (the dir was removed by a
+  // concurrent process — the snapshot write below will recreate/fail as needed);
+  // any other chmod failure is surfaced rather than silently leaving a
+  // world-readable dir to hold account ids/labels.
+  if (process.platform !== "win32") {
+    try {
+      await fs.chmod(dir, 0o700);
+    } catch (error) {
+      if ((error as NodeJS.ErrnoException | undefined)?.code !== "ENOENT") {
+        throw error;
+      }
+    }
+  }
   let lastError: unknown = null;
   for (let attempt = 0; attempt < 3; attempt += 1) {
     const tempPath = `${path}.${process.pid}.${Date.now()}.${attempt}.tmp`;
     let moved = false;
     try {
-      await fs.writeFile(tempPath, JSON.stringify(snapshot, null, 2), "utf-8");
+      await fs.writeFile(tempPath, JSON.stringify(snapshot, null, 2), {
+        encoding: "utf-8",
+        mode: 0o600,
+      });
       await fs.rename(tempPath, path);
       moved = true;
       return;
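
Note on the hunk above: the reason for the extra `chmod` is that `mkdir`'s `mode` option has no effect on a directory that already exists. The following POSIX-only sketch demonstrates that on a throwaway temp directory; the function names and the temp-dir prefix are made up for illustration and are not part of the repository.

```typescript
import { promises as fs } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Return only the permission bits (e.g. 0o755) of a path.
async function permissionBits(target: string): Promise<number> {
  return (await fs.stat(target)).mode & 0o777;
}

// Hypothetical demo: show that mkdir({ mode }) leaves an existing dir alone
// and that an explicit chmod is what actually tightens it.
async function demoChmodAfterMkdir(): Promise<{ afterMkdir: number; afterChmod: number }> {
  const root = await fs.mkdtemp(join(tmpdir(), "dir-mode-demo-"));
  const dir = join(root, "multi-auth");
  try {
    await fs.mkdir(dir);
    // Simulate a permissive directory left behind by an older version.
    await fs.chmod(dir, 0o755);
    // Succeeds because the dir exists, but does not change its permissions.
    await fs.mkdir(dir, { recursive: true, mode: 0o700 });
    const afterMkdir = await permissionBits(dir);
    // The explicit chmod is what re-asserts owner-only access.
    await fs.chmod(dir, 0o700);
    const afterChmod = await permissionBits(dir);
    return { afterMkdir, afterChmod };
  } finally {
    await fs.rm(root, { recursive: true, force: true });
  }
}
```

On a POSIX system the first reading should still be `0o755` and the second `0o700`. On win32 the permission bits do not map to POSIX modes, which is why the commit skips the `chmod` branch there.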

lib/storage.ts

Lines changed: 19 additions & 11 deletions
@@ -2,6 +2,7 @@ import { existsSync, promises as fs, readFileSync } from "node:fs";
 import { basename, dirname, join } from "node:path";
 import { ACCOUNT_LIMITS } from "./constants.js";
 import { StorageError } from "./errors.js";
+import { shouldRetryFileOperation, withFileOperationRetry } from "./fs-retry.js";
 import { createLogger } from "./logger.js";
 import {
   exportNamedBackupFile,
@@ -273,8 +274,7 @@ async function copyFileWithRetry(
       return;
     }
     const canRetry =
-      (code === "EPERM" || code === "EBUSY") &&
-      attempt + 1 < BACKUP_COPY_MAX_ATTEMPTS;
+      shouldRetryFileOperation(error) && attempt + 1 < BACKUP_COPY_MAX_ATTEMPTS;
     if (canRetry) {
       await new Promise((resolve) =>
         setTimeout(resolve, BACKUP_COPY_BASE_DELAY_MS * 2 ** attempt),
@@ -295,10 +295,8 @@ async function renameFileWithRetry(
       await fs.rename(sourcePath, destinationPath);
       return;
     } catch (error) {
-      const code = (error as NodeJS.ErrnoException).code;
       const canRetry =
-        (code === "EPERM" || code === "EBUSY" || code === "EAGAIN") &&
-        attempt + 1 < BACKUP_COPY_MAX_ATTEMPTS;
+        shouldRetryFileOperation(error) && attempt + 1 < BACKUP_COPY_MAX_ATTEMPTS;
       if (!canRetry) {
         throw error;
       }
@@ -1834,13 +1832,20 @@ async function saveAccountsUnlocked(storage: AccountStorageV3): Promise<void> {
         checksum: computeSha256(content),
         content,
       };
-      await fs.writeFile(walPath, JSON.stringify(journalEntry), {
-        encoding: "utf-8",
-        mode: 0o600,
-      });
+      // Secret-bearing WAL write: retry transient Windows locks via the shared
+      // taxonomy so a momentary AV/indexer/concurrent-reader lock can't fail an
+      // otherwise-valid save (EBUSY/EPERM/EAGAIN/ENOTEMPTY/EACCES).
+      await withFileOperationRetry(() =>
+        fs.writeFile(walPath, JSON.stringify(journalEntry), {
+          encoding: "utf-8",
+          mode: 0o600,
+        }),
+      );
     },
     writeTemp: (tempPath: string, content: string) =>
-      fs.writeFile(tempPath, content, { encoding: "utf-8", mode: 0o600 }),
+      withFileOperationRetry(() =>
+        fs.writeFile(tempPath, content, { encoding: "utf-8", mode: 0o600 }),
+      ),
     statTemp: (tempPath: string) => fs.stat(tempPath),
     renameTempToPath: async (tempPath: string) => {
       let lastError: NodeJS.ErrnoException | null = null;
@@ -1875,8 +1880,11 @@ async function saveAccountsUnlocked(storage: AccountStorageV3): Promise<void> {
       }
     },
     cleanupTemp: async (tempPath: string) => {
+      // The temp file holds the full account store (refresh tokens, 0o600).
+      // Retry cleanup so a transient lock can't strand a secret-bearing *.tmp
+      // next to the destination; swallow a persistent failure (best effort).
       try {
-        await fs.unlink(tempPath);
+        await withFileOperationRetry(() => fs.unlink(tempPath));
       } catch {
         // Ignore cleanup failure.
       }
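
Note on the hunks above: `withFileOperationRetry` lives in `lib/fs-retry.ts`, which this commit does not show, so its real signature, attempt count, and delays are unknown here. The sketch below is a hypothetical reimplementation of the general pattern (retry only transient lock codes, with the same `base * 2 ** attempt` backoff the backup helpers use); `retryTransient` and its defaults are assumptions.

```typescript
// Transient-lock codes named in the commit message and code comments.
const TRANSIENT_FS_CODES = new Set(["EBUSY", "EPERM", "EAGAIN", "ENOTEMPTY", "EACCES"]);

// Hypothetical stand-in for a retry wrapper; attempt count and delay are guesses.
async function retryTransient<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 10,
): Promise<T> {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (error) {
      const code = (error as { code?: unknown } | null | undefined)?.code;
      const transient = typeof code === "string" && TRANSIENT_FS_CODES.has(code);
      // Non-transient errors and the final attempt propagate to the caller.
      if (!transient || attempt + 1 >= maxAttempts) throw error;
      // Exponential backoff: 10 ms, 20 ms, 40 ms, ... with the defaults.
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}

// Usage: an operation that hits a lock twice before succeeding.
let simulatedCalls = 0;
async function flakyWrite(): Promise<string> {
  simulatedCalls += 1;
  if (simulatedCalls < 3) {
    throw Object.assign(new Error("resource busy"), { code: "EBUSY" });
  }
  return "saved";
}
```

With this shape, `retryTransient(flakyWrite)` resolves on the third call, while an error such as `ENOENT` is rethrown immediately, so a genuinely missing path is never masked by retries.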
