fix(cli): close sandbox filesystem TOCTOU race#11591
Conversation
Code Review SummaryStatus: No Issues Found | Recommendation: Merge Files Reviewed (1 files)
Previous Review Summaries (7 snapshots, latest commit e69828e)Current summary above is authoritative. Previous snapshots are kept for context only. Previous review (commit e69828e)Status: 1 Issues Found | Recommendation: Address before merge Overview
Fix these issues in Kilo Cloud Issue Details (click to expand)WARNING
Files Reviewed (27 files)
Previous review (commit 61c7f97)Status: 2 Issues Found | Recommendation: Address before merge Overview
Fix these issues in Kilo Cloud Issue Details (click to expand)WARNING
Files Reviewed (25 files)
Previous review (commit 10b06b9)Status: 2 Issues Found | Recommendation: Address before merge Overview
Fix these issues in Kilo Cloud Issue Details (click to expand)WARNING
Files Reviewed (21 files)
Previous review (commit e4e8d43)Status: No Issues Found | Recommendation: Merge Files Reviewed (2 files)
Previous review (commit 06fac86)Status: 1 Issues Found | Recommendation: Address before merge Overview
Fix these issues in Kilo Cloud Issue Details (click to expand)WARNING
Files Reviewed (2 files)
Previous review (commit e232ca9)Status: No Issues Found | Recommendation: Merge Files Reviewed (3 files)
Previous review (commit 76a9d9b)Status: 2 Issues Found | Recommendation: Address before merge Overview
Fix these issues in Kilo Cloud Issue Details (click to expand)WARNING
SUGGESTION
Files Reviewed (15 files)
Reviewed by gpt-5.4-20260305 · Input: 53.5K · Output: 4.4K · Cached: 197.4K Review guidance: REVIEW.md from base branch |
|
I profiled the filesystem mutation path before and after this change in the same disposable macOS repository through an isolated VS Code extension instance. The repository had Tool durations come from persisted tool-part timestamps. Parallel envelope measures the interval from the earliest call start to the latest call completion.
I also measured the compiled one-shot worker directly over 10 runs:
The VS Code renderer profiles showed that frontend CPU was not the bottleneck. Most of the regression is process startup: each finite mutation launches the full compiled Kilo binary under These are single-run before/after tool measurements plus a 10-run direct worker sample, so they are directional rather than statistically complete. The result is still clear: the security boundary works, but the current full-binary one-shot worker has a material interactive performance cost that should be optimized before treating this design as final. |
|
Updated performance results after replacing full CLI initialization with the bundled This supersedes the performance conclusion in my earlier benchmark comment. These measurements use the final packaged
Final packaged-worker distributions:
Relative to the full-CLI worker, this is approximately:
The optimized design keeps the one-shot process boundary and call-local OS sandbox profile. The change is that the child now loads only a roughly 4 KB bundled mutation worker instead of initializing the complete CLI application graph. Ordered batching also lets directory creation, writes, mode changes, and multi-file ApplyPatch operations share one confined worker launch while the kernel still checks every filesystem syscall. Compared with the original unconfined implementation, the remaining measured median cost is approximately 116-121 ms for sequential tool operations and 46 ms across the six-write parallel envelope. Measurement limitation: the latest 20-sample numbers isolate the final packaged worker and batch protocol rather than including model latency or VS Code rendering. The earlier original/full-worker values are persisted real tool durations from the isolated VS Code benchmark. The real macOS confinement suite remains the functional security validation for the optimized path. |
PR #11591 rewrote build.ts during the sandbox refactor, which recreated the file with default 644 permissions and dropped the executable bit (100755 -> 100644). The same change stripped the bit from build-node.ts and publish.ts. The pre-release publish workflow invokes ./packages/opencode/script/build.ts directly, relying on its #!/usr/bin/env bun shebang. On Linux CI the non-executable mode fails with Permission denied before the build starts. Restore mode 100755 on all three scripts. Content is unchanged.
PR Kilo-Org#11591 rewrote build.ts during the sandbox refactor, which recreated the file with default 644 permissions and dropped the executable bit (100755 -> 100644). The same change stripped the bit from build-node.ts and publish.ts. The pre-release publish workflow invokes ./packages/opencode/script/build.ts directly, relying on its #!/usr/bin/env bun shebang. On Linux CI the non-executable mode fails with Permission denied before the build starts. Restore mode 100755 on all three scripts. Content is unchanged.
PR Kilo-Org#11591 rewrote build.ts during the sandbox refactor, which recreated the file with default 644 permissions and dropped the executable bit (100755 -> 100644). The same change stripped the bit from build-node.ts and publish.ts. The pre-release publish workflow invokes ./packages/opencode/script/build.ts directly, relying on its #!/usr/bin/env bun shebang. On Linux CI the non-executable mode fails with Permission denied before the build starts. Restore mode 100755 on all three scripts. Content is unchanged.
The macOS sandbox previously validated built-in filesystem mutations in TypeScript and then performed the mutation through the same pathname in the long-lived Kilo process. A concurrent process could replace a validated parent directory with a symlink after validation but before the filesystem syscall, redirecting
Write,Edit, orApplyPatchoutside the allowed project roots or into protected.gitstate. Shell children were already protected because Seatbelt evaluates their actual destination at syscall time, but built-in mutations did not have that kernel-enforced boundary.This closes the gap by routing every finite Effect filesystem mutation through a one-shot worker launched under the active, replaceable OS sandbox backend. The existing pathname checks remain useful for early diagnostics, but Seatbelt now decides whether the mutation is allowed when the worker performs the actual syscall. Additional
realpathorlstatchecks are deliberately not treated as a security boundary.Mitigation design
ensureDirandwriteWithDirsparent creation through the decorated filesystem rather than bypassing it with rawfs.mkdircalls.external_directoryauthorizes the tool request but does not add that directory to the sandbox profile, and.gitremains denied by name.Attack vectors considered
The security assessment covered both direct pathname attacks and attempts to avoid pathname checks entirely:
.git./tmpaliases,//privatealiases, case variants, and macOS/System/Volumes/Dataaliases.openat,mkdirat,renameat, andlinkatoperations, plus relative writes afterfchdirinto a protected directory.Regression boundary
The macOS integration harness invokes real
/usr/bin/sandbox-execbehavior. Its adversarial hook starts atomic parent swapping after userspace validation and before the confined mutation, so the assertion is not dependent on lucky timing. The harness was also exercised with a vulnerable direct mutation runner to confirm that the protected sentinel is modified without syscall-time confinement.The suite drives the real shared filesystem capability used by
Write,Edit, andApplyPatchagainst both external and.gittargets. It also covers allowed writes, safe in-project symlinks, shell confinement, permission approval and denial, background-process rejection, UTF-16 LE, Windows-1251, UTF-8 BOM, formatter synchronization, operation options, and temporary-resource cleanup. The real Seatbelt suite is wired only into the macOS CI lane; other platforms explicitly skip OS-specific cases while retaining portable fail-closed coverage.Tradeoffs
openhandles are intentionally unavailable until a descriptor-safe cross-process design exists.Fixes #11554
Parent epic: #11538