Commit 5d3f92f

Add sandbox-first redesign specification (#135)
1 parent 152ad6a commit 5d3f92f

1 file changed

Lines changed: 237 additions & 0 deletions

docs/sandbox-redesign.md

@@ -0,0 +1,237 @@
# Sandbox-First Redesign Specification

This document is a concrete, implementation-ready specification for rebuilding the execution engine around **sandbox-first primitives**. The goal is to replace the current “exec + ulimit” runtime with a sandbox runner based on Linux namespaces and cgroups, while preserving the existing controller/runner flow.

---

## Goals

### Primary
- **Hard isolation**: separate PID, mount, UTS, IPC, and user namespaces for each run.
- **Deterministic resource limits**: enforce CPU, memory, and PID limits using cgroups.
- **Minimal host exposure**: run with a scratch filesystem; no network by default.
- **API compatibility**: keep the existing CodeRunner + Controller flow intact.

### Non-goals (for the initial POC)
- Full container image support (e.g., OCI images).
- Multi-node scheduling.
- Network policy enforcement beyond “no network”.

---

## Proposed Architecture (Sandbox-First)

### 1) New Interfaces
Create a new sandbox runtime module that becomes the primary execution abstraction.

```go
// engine/sandbox/types.go
package sandbox

// SandboxPolicy describes the isolation and resource limits for one run.
type SandboxPolicy struct {
	CpuCores     int   // CPU limit in cores (applied as a cgroup quota or shares)
	MemoryBytes  int64 // memory limit in bytes
	PidsMax      int   // maximum number of processes
	TimeoutSec   int   // wall-clock timeout in seconds
	EnableNet    bool  // default: false
	ReadonlyRoot bool  // default: true
}

// SandboxInput is everything needed to start one command in the sandbox.
type SandboxInput struct {
	SourceFiles map[string][]byte // filename -> contents
	WorkDir     string
	Command     []string
}

// SandboxOutput is what the parent captures from the sandboxed command.
type SandboxOutput struct {
	Stdout   string
	Stderr   string
	ExitCode int
}

// SandboxRunner is the primary execution abstraction.
type SandboxRunner interface {
	Run(input SandboxInput, policy SandboxPolicy) (*SandboxOutput, error)
}
```

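One detail of `SandboxPolicy` is worth illustrating: Go's zero value for `ReadonlyRoot` is `false`, so the documented default of `true` has to come from a constructor rather than from an empty struct literal. The sketch below assumes two helpers that are not part of the specification, `DefaultPolicy` and `Validate`, and the numeric limits are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// SandboxPolicy mirrors the struct from types.go.
type SandboxPolicy struct {
	CpuCores     int
	MemoryBytes  int64
	PidsMax      int
	TimeoutSec   int
	EnableNet    bool
	ReadonlyRoot bool
}

// DefaultPolicy is a hypothetical helper. The limits are placeholder values;
// only EnableNet=false and ReadonlyRoot=true come from the specification.
func DefaultPolicy() SandboxPolicy {
	return SandboxPolicy{
		CpuCores:     1,
		MemoryBytes:  256 << 20, // 256 MiB
		PidsMax:      64,
		TimeoutSec:   10,
		EnableNet:    false,
		ReadonlyRoot: true,
	}
}

// Validate (also hypothetical) rejects policies that leave a limit unset,
// so a forgotten field never turns into "unlimited".
func (p SandboxPolicy) Validate() error {
	switch {
	case p.CpuCores <= 0:
		return errors.New("sandbox: CpuCores must be positive")
	case p.MemoryBytes <= 0:
		return errors.New("sandbox: MemoryBytes must be positive")
	case p.PidsMax <= 0:
		return errors.New("sandbox: PidsMax must be positive")
	case p.TimeoutSec <= 0:
		return errors.New("sandbox: TimeoutSec must be positive")
	}
	return nil
}

func main() {
	p := DefaultPolicy()
	fmt.Printf("%+v valid=%v\n", p, p.Validate() == nil)
}
```

Callers would start from `DefaultPolicy()` and override individual fields, which keeps the safe defaults in one place.
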
### 2) Linux Sandbox Implementation
Add a `linux` sandbox implementation using namespaces and cgroups, with seccomp added later (Phase 4).

```
engine/sandbox/
  linux/
    runner.go
    namespaces.go
    cgroups.go
    filesystem.go
    seccomp.go (Phase 4)
```

### 3) Integration Points
- **CodeRunner** builds compile/run commands and passes them to the sandbox runner.
- **Controller** remains the concurrency + scheduling layer.
- **RuntimeAgent** becomes a thin wrapper around `SandboxRunner`.

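A minimal sketch of what "thin wrapper" could mean for `RuntimeAgent`: it holds a runner and a policy and forwards each call. The `SafeRunCmd` signature shown here is a guess, since the existing method is not reproduced in this document, and `echoRunner` is a stand-in that performs no isolation at all.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Trimmed copies of the types from engine/sandbox/types.go.
type SandboxPolicy struct{ TimeoutSec int }

type SandboxInput struct {
	SourceFiles map[string][]byte
	WorkDir     string
	Command     []string
}

type SandboxOutput struct {
	Stdout, Stderr string
	ExitCode       int
}

type SandboxRunner interface {
	Run(input SandboxInput, policy SandboxPolicy) (*SandboxOutput, error)
}

// echoRunner is a placeholder runner: it only echoes the command back.
type echoRunner struct{}

func (echoRunner) Run(in SandboxInput, _ SandboxPolicy) (*SandboxOutput, error) {
	return &SandboxOutput{Stdout: strings.Join(in.Command, " ")}, nil
}

// RuntimeAgent forwards work to whichever SandboxRunner it was given.
type RuntimeAgent struct {
	Runner SandboxRunner
	Policy SandboxPolicy
}

// SafeRunCmd has a hypothetical signature; the real one may differ.
func (a RuntimeAgent) SafeRunCmd(workDir string, command ...string) (*SandboxOutput, error) {
	if len(command) == 0 {
		return nil, errors.New("runtime agent: empty command")
	}
	return a.Runner.Run(SandboxInput{WorkDir: workDir, Command: command}, a.Policy)
}

func main() {
	agent := RuntimeAgent{Runner: echoRunner{}, Policy: SandboxPolicy{TimeoutSec: 10}}
	out, err := agent.SafeRunCmd("/work", "python3", "main.py")
	fmt.Println(out.Stdout, err)
}
```
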
---

## Implementation Checklist (Actionable)

### Phase 0: Repo scaffolding
- [ ] Create `engine/sandbox` module with types and interfaces.
- [ ] Add a `linux` subpackage for the namespace + cgroup implementation.
- [ ] Add basic logging utilities (reusing existing `util/print`).

**Success criteria**
- `go test ./...` succeeds with the new module added.

---

### Phase 1: Linux namespace runner (no cgroups yet)
- [ ] Implement `unshare` / `clone` logic (PID + mount + UTS + IPC).
- [ ] Ensure the child process runs with an isolated namespace context.
- [ ] Mount a tmpfs or scratch directory as root; bind-mount language runtimes as needed.
- [ ] Disable network by default (unshare the network namespace, no interfaces).
- [ ] Route stdout/stderr to the parent for capture.

**Success criteria**
- A sandboxed command cannot see host processes (with a fresh `/proc` mounted in the sandbox, `ps` shows only the sandbox's own processes).
- Code running inside the sandbox cannot read `/etc/shadow` or any other host path that was not explicitly bind-mounted.
- No outbound network unless explicitly enabled.

**Testing criteria**
- ✅ `go test ./engine/sandbox/linux -run TestNamespaces`
- ✅ Integration test that runs `ls /` inside the sandbox and confirms a minimal FS.

---

### Phase 2: Cgroup enforcement
- [ ] Implement `cgroups.go` with CPU, memory, and pids cgroup setup (v2 preferred).
- [ ] Apply resource limits before running the child process.
- [ ] Ensure subprocess trees are restricted.

**Success criteria**
- A CPU-bound infinite loop is throttled by `cpu.max` and terminated by the wall-clock timeout (the CPU controller throttles; it does not kill).
- Memory exhaustion triggers an OOM kill inside the sandbox's cgroup, not on the host.
- A fork bomb fails once it hits the `pids.max` limit.

**Testing criteria**
- ✅ `go test ./engine/sandbox/linux -run TestCgroupLimits`
- ✅ Manual: run a fork bomb script and ensure it terminates without host impact.

---

### Phase 3: Wire into CodeRunner
- [ ] Replace `RuntimeAgent.SafeRunCmd` usage with `SandboxRunner.Run`.
- [ ] Convert `RunnerProps` into `SandboxInput`.
- [ ] Maintain the compile → run flow (the compile step is also sandboxed).

**Success criteria**
- Existing API calls still return stdout/stderr/errors correctly.
- All compile/run languages still work (Python, Node, Go, etc.).

**Testing criteria**
- ✅ `go test ./engine/coderunner/v2 -run TestRunner`
- ✅ End-to-end: CLI invocation executes Python and C++ in the sandbox.

---

### Phase 4: Optional (Security hardening)
- [ ] Add a seccomp allowlist for syscalls.
- [ ] Drop all Linux capabilities.
- [ ] Set `no_new_privs` (`PR_SET_NO_NEW_PRIVS`).

**Success criteria**
- Common languages still run with the restricted syscall profile.
- Obvious privileged syscalls fail inside the sandbox.

**Testing criteria**
- ✅ `go test ./engine/sandbox/linux -run TestSeccomp`

---

## Detailed Implementation Notes

### Namespaces
Use `clone`/`unshare` with:
- `CLONE_NEWPID`
- `CLONE_NEWNS`
- `CLONE_NEWUTS`
- `CLONE_NEWIPC`
- `CLONE_NEWNET`
- `CLONE_NEWUSER` (required by the hard-isolation goal; it also lets an unprivileged runner create the other namespaces)

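Set up the mount namespace with:
- Mark the inherited mount tree private (`mount("", "/", "", MS_REC|MS_PRIVATE, "")`) so sandbox mounts do not propagate back to the host.
- Mount a tmpfs on a staging directory and switch into it with `pivot_root`; calling `mount("tmpfs", "/", "tmpfs", 0, "")` directly over `/` is not enough on its own to change the root that the process resolves paths against.
- Bind-mount required runtime paths (`/usr/bin/python3`, `/lib`, `/lib64`, etc.) into the new root.
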
### Filesystem
- Create a per-job work directory (e.g., `/tmp/sandbox/<job-id>`).
- Bind-mount that directory as `/work` inside the sandbox.
- Optionally back the read-only root with overlayfs if a writable layer is needed.

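### Cgroups v2
- Create a cgroup per job under `/sys/fs/cgroup/sandbox/<job-id>`.
- Enable the `cpu`, `memory`, and `pids` controllers in the parent's `cgroup.subtree_control`, then set `memory.max`, `cpu.max`, and `pids.max`.
- Move the child PID into the cgroup by writing it to `cgroup.procs` before the child calls `execve`.
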
### Execution Model
- Parent sets up the sandbox environment.
- Child calls `execve` inside the namespaces and cgroup.
- Parent captures stdout/stderr and enforces the wall-clock timeout.

---

## Suggested Package Layout
```
engine/
  sandbox/
    types.go
    linux/
      runner.go
      namespaces.go
      filesystem.go
      cgroups.go
      seccomp.go (optional)
```

---

## Example POC User Flow
1. CodeRunner receives a request.
2. CodeRunner creates a `SandboxInput` and a `SandboxPolicy`.
3. SandboxRunner sets up namespaces + cgroups.
4. Sandbox executes the compile and run steps.
5. Output is returned to the API.

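Steps 2 to 5 can be sketched against the `SandboxRunner` interface with a test double in place of the Linux runner. Everything except the interface and struct shapes is an assumption here: `compileAndRun` and `scriptedRunner` are illustrative names, and the rule "skip the run step when compilation exits non-zero" is one reasonable reading of the compile → run flow, not something the specification states.

```go
package main

import (
	"errors"
	"fmt"
)

// Trimmed copies of the types from engine/sandbox/types.go.
type SandboxPolicy struct{ TimeoutSec int }

type SandboxInput struct {
	SourceFiles map[string][]byte
	WorkDir     string
	Command     []string
}

type SandboxOutput struct {
	Stdout, Stderr string
	ExitCode       int
}

type SandboxRunner interface {
	Run(input SandboxInput, policy SandboxPolicy) (*SandboxOutput, error)
}

// scriptedRunner is a test double keyed by the first word of the command.
type scriptedRunner map[string]SandboxOutput

func (s scriptedRunner) Run(in SandboxInput, _ SandboxPolicy) (*SandboxOutput, error) {
	if len(in.Command) == 0 {
		return nil, errors.New("empty command")
	}
	out, ok := s[in.Command[0]]
	if !ok {
		return nil, fmt.Errorf("unknown command %q", in.Command[0])
	}
	return &out, nil
}

// compileAndRun sandboxes the optional compile step, then the run step.
// A failed compile is returned as-is so the API can surface its stderr.
func compileAndRun(r SandboxRunner, p SandboxPolicy, files map[string][]byte, compile, run []string) (*SandboxOutput, error) {
	if len(compile) > 0 {
		out, err := r.Run(SandboxInput{SourceFiles: files, WorkDir: "/work", Command: compile}, p)
		if err != nil || out.ExitCode != 0 {
			return out, err
		}
	}
	return r.Run(SandboxInput{SourceFiles: files, WorkDir: "/work", Command: run}, p)
}

func main() {
	r := scriptedRunner{"g++": {}, "./a.out": {Stdout: "hello\n"}}
	out, err := compileAndRun(r, SandboxPolicy{TimeoutSec: 10}, nil,
		[]string{"g++", "main.cpp"}, []string{"./a.out"})
	fmt.Println(out, err)
}
```
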
---

## Acceptance Checklist (Final)
- [ ] `SandboxRunner` interface defined and used.
- [ ] Linux namespace runner implemented.
- [ ] Cgroup limits enforced with v2.
- [ ] CodeRunner integrated and working.
- [ ] Tests cover namespace isolation + cgroup enforcement.
- [ ] No network by default.

---

## Why this design fits your current repo
- Keeps your existing controller + runner architecture intact.
- Isolates the new sandbox logic into a dedicated module.
- Lets you iteratively upgrade security without rewriting everything.

---

## Next Steps (Recommended)
1. Implement Phase 0 + Phase 1.
2. Run minimal compile/run tests with Python + C++.
3. Add cgroup limits and validate with load tests.
4. Only then add seccomp and other hardening.

---

## Success Definition
The rewrite is successful if:
- User code runs in a hardened sandbox with **no host FS/network visibility**.
- Resource limits are enforced at the kernel level by cgroups, and isolation is enforced by namespaces.
- The public API behavior remains unchanged.