fix(config): atomic JSON writes to prevent half-read races during startup#965
fix(config): atomic JSON writes to prevent half-read races during startup#965dongsheng123132 wants to merge 1 commit into
Conversation
`writeJsonFile` (used for openclaw.json, auth-profiles.json, models.json)
calls `fs.writeFile` directly. Several startup paths run concurrently
and read these files at the same time:
- sanitizeOpenClawConfig (this module)
- listAccounts / listLegacyProviders (provider registry)
- channel routes (channel.runtime)
- capability-monitor (gateway)
When a writer interleaves with a reader's `readFile + JSON.parse`, the
reader can observe a half-written file and crash with
"Unexpected end of JSON input". On a Windows portable install running
off a USB drive (slower flush latency), this is reproducible on every
gateway restart that happens to coincide with a sanitize or readback.
Fix: introduce `writeFileAtomic(filePath, data, encoding?)` in
`electron/utils/atomic-write.ts`:
1. Write to a unique tmp file `<file>.tmp.<pid>.<counter>` to avoid
concurrent writers stomping each other.
2. Rename tmp → real path. Same-disk rename is atomic on POSIX and
NTFS — readers always see the complete old or new file.
3. On Windows the rename can fail with EPERM/EBUSY/EACCES if the
target file is held by another process (especially common on USB
drives). Retry up to 5× with linear backoff, then fall back to a
direct in-place write so we never lose data.
Routed all `writeJsonFile` calls through the helper. Also routed the
one direct `writeFile` (modelsPath cleanup at line 914) through it so
the same atomicity guarantee covers per-agent model registry writes.
Tests: 5 unit tests cover the basic write, EPERM retry, fallback after
5 retries, concurrent writer non-collision via unique tmp paths, and
correct encoding pass-through.
`pnpm typecheck` clean. `pnpm exec vitest run tests/unit/atomic-write.test.ts`
5/5 pass.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9f28dbef04
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| await writeFile(tmpPath, data, encoding); | ||
|
|
||
| for (let attempt = 0; attempt < 5; attempt++) { | ||
| try { | ||
| await rename(tmpPath, filePath); |
There was a problem hiding this comment.
Preserve existing file mode when replacing config files
This atomic-write implementation creates a brand new temp file with default permissions and then renames it over the target, which resets the target's mode bits instead of preserving them. If openclaw.json/auth-profiles.json were previously locked down (for example 0600 from external tooling), a subsequent write here can silently widen access (commonly to 0644), exposing API keys/tokens to other local users. Please carry over the existing file mode (e.g., stat target and apply mode to temp before rename, or chmod after rename) when the destination already exists.
Useful? React with 👍 / 👎.
Summary
writeJsonFile (in electron/utils/openclaw-auth.ts, used for openclaw.json, auth-profiles.json, models.json) calls fs.writeFile directly. Several startup paths run concurrently and read these files at the same time: sanitizeOpenClawConfig, listAccounts / listLegacyProviders, channel.runtime, capability-monitor, etc. When a writer interleaves with a reader\u0027s readFile + JSON.parse, the reader can observe a half-written file and crash with ```Unexpected end of JSON input```.
On a Windows portable install running off a USB drive (slower flush latency), this is reproducible on every gateway restart that happens to coincide with a sanitize or readback. Customer log signature:
```
Failed to read OpenClaw config: SyntaxError: Unexpected end of JSON input
at JSON.parse ...
```
Fix
Introduce writeFileAtomic in a new helper electron/utils/atomic-write.ts:
Routed every writeJsonFile call through the helper, and also routed the one direct writeFile call (modelsPath cleanup) through it so the same atomicity guarantee covers per-agent model registry writes too.
Verification
Type of Change
Pre-Submission Checklist
Related
Companion to #964 (gateway port-deadlock) — both bugs surface mostly under Windows portable + USB workloads but apply across platforms. Discovered while running our downstream portable fork (dongsheng123132/clawx-portable@experimental).