Commit 60e3892

pavank63 and claude committed
docs(proposal): add build isolation design for sandboxed builds
Add design proposal for --build-isolation flag that sandboxes PEP 517 build backend subprocesses using ephemeral Unix users and Linux namespaces. Includes security findings from proof-of-concept testing with build-attack-test package.

Signed-off-by: Pavan Kalyan Reddy Cherupally <pcherupa@redhat.com>
Co-Authored-By: Claude <claude@anthropic.com>
1 parent 1926e29 commit 60e3892

2 files changed

Lines changed: 301 additions & 0 deletions

docs/proposals/build-isolation.md

Lines changed: 300 additions & 0 deletions
@@ -0,0 +1,300 @@
# Build isolation for sandboxing build backends

- Author: Pavan Kalyan Reddy Cherupally
- Created: 2026-04-21
- Status: Open
- Issue: [#1019](https://github.com/python-wheel-build/fromager/issues/1019)

## What

A `--build-isolation` flag that sandboxes PEP 517 build backend
subprocesses (`build_sdist`, `build_wheel`) so they cannot read
credentials, access the network, or interfere with the host system.

## Why

Fromager executes upstream-controlled code (setup.py, build backends)
during wheel builds. A compromised or malicious package can:

- Read credential files like `$HOME/.netrc` and exfiltrate tokens
- Access sensitive environment variables (registry keys, API tokens)
- Reach the network to upload stolen data or download payloads
- Signal or inspect other processes via `/proc` or shared IPC
- Interfere with parallel builds through shared `/tmp`
- Leave persistent backdoors: `.pth` files that run on every Python
  startup, shell profile entries that run on every login, or
  background daemons that survive the build

The existing `--network-isolation` flag blocks network access but does
not protect against credential theft, process/IPC visibility, or
persistent backdoors.

Build isolation wraps each build backend invocation in a sandbox that
combines file-level credential protection with OS-level namespace
isolation. Only the PEP 517 hook calls are sandboxed; download,
installation, and upload steps run normally.

## Goals

- A `--build-isolation/--no-build-isolation` CLI flag (default off)
  that supersedes `--network-isolation` for build steps
- Credential protection: build processes cannot read `.netrc` or
  other root-owned credential files
- Network isolation: no routing in the build namespace
- Process isolation: build cannot see or signal other processes
- IPC isolation: separate shared memory, semaphores, message queues
- Persistence protection: build cannot drop `.pth` backdoors, modify
  shell profiles, or leave background daemons running after the build
- Environment scrubbing: downstream build systems can strip sensitive
  environment variables via `FROMAGER_SCRUB_ENV_VARS`
- Works in unprivileged containers (Podman/Docker) without
  `--privileged` or `--cap-add SYS_ADMIN`
- Minimal overhead (< 50 ms per build invocation)

## Non-goals

- **Mount namespace isolation.** Mounting tmpfs over `$HOME` or
  making `/usr` read-only was explored but abandoned. The
  `pyproject_hooks` library creates temporary files in `/tmp` for
  IPC between the parent process and the build backend
  (`input.json`/`output.json`). A mount namespace with a fresh
  `/tmp` hides these files and breaks the build. Bind-mounting the
  specific IPC directory is fragile and couples fromager to
  `pyproject_hooks` internals.
- **bubblewrap (bwrap).** bwrap provides stronger filesystem
  isolation but requires `CAP_SYS_ADMIN` or a privileged container,
  which is unavailable in the standard unprivileged Podman/Docker
  build environment.
- **Hardcoded list of sensitive environment variables.** Fromager is
  an upstream tool; the specific variables that are sensitive depend
  on the downstream build system. Scrubbing is controlled entirely
  by the deployer via `FROMAGER_SCRUB_ENV_VARS`.
- **macOS / Windows support.** Linux namespaces and `unshare` are
  Linux-only. The flag is unavailable on other platforms.

## How

### Isolation mechanism

Build isolation combines two complementary techniques:

#### 1. Ephemeral Unix user

Before each build invocation, the isolation script creates a
short-lived system user with `useradd` and removes it with `userdel`
on exit (via `trap EXIT`). The user has:

- No home directory (`-M -d /nonexistent`)
- No login shell (`-s /sbin/nologin`)
- A randomized name (`fmr_<random>`) to avoid collisions

This provides file-level credential protection: `.netrc` is owned by
`root:root` with mode `600`, so the ephemeral user cannot read it.
The overhead is approximately 10 ms for `useradd` and 10 ms for
`userdel`.

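The user lifecycle described above can be sketched in Python. This is only an illustration: the proposal's isolation script is a shell script, and the helper names `ephemeral_user_name`, `useradd_argv`, and `userdel_argv` are hypothetical. The `useradd` flags are the ones listed in this section; the 8-character random suffix is an assumption.

```python
import secrets
from typing import List


def ephemeral_user_name(prefix: str = "fmr_") -> str:
    """Return a randomized user name such as fmr_3f9a1c2b."""
    # 4 random bytes become 8 hex characters (suffix length is an assumption).
    return prefix + secrets.token_hex(4)


def useradd_argv(name: str) -> List[str]:
    """Build a useradd command for a user with no home and no login shell."""
    return ["useradd", "-M", "-d", "/nonexistent", "-s", "/sbin/nologin", name]


def userdel_argv(name: str) -> List[str]:
    """Build the matching cleanup command, to be run when the build exits."""
    return ["userdel", name]
```

A caller would run `useradd_argv(name)` before the build and `userdel_argv(name)` in a `finally` block, mirroring the script's `trap EXIT`.
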
#### 2. Linux namespaces via unshare

After dropping to the ephemeral user with `setpriv`, the script
enters new namespaces with `unshare`:

| Namespace | Flag | Purpose |
| --- | --- | --- |
| Network | `--net` | No routing; blocks all network access |
| PID | `--pid --fork` | Build sees only its own processes |
| IPC | `--ipc` | Isolated shared memory and semaphores |
| UTS | `--uts` | Separate hostname |

`--map-root-user` implies a new user namespace (`--user`) and maps
the ephemeral user to UID 0 inside it, giving it enough privilege to
bring up the loopback interface and set the hostname without
requiring real root.

#### Why setpriv instead of runuser

`runuser` calls `setgroups()`, which is denied inside user namespaces
(the kernel blocks it to prevent group membership escalation).
`setpriv --reuid --regid --clear-groups` avoids this call entirely.

#### Order of operations

```
useradd fmr_<random>               # create ephemeral user (outside namespace)
└─ setpriv --reuid --regid         # drop to ephemeral user
   └─ unshare --uts --net --pid --ipc --fork --map-root-user
      ├─ ip link set lo up
      ├─ hostname localhost
      └─ exec <build command>
userdel fmr_<random>               # cleanup (trap EXIT)
```

The user is created before entering the namespace because `useradd`
needs access to `/etc/passwd` and `/etc/shadow` on the real
filesystem. `setpriv` drops privileges before `unshare` so the UID
switch happens outside the namespace where the real UID is mapped.

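The nesting above can be sketched as a single argument list. This is a minimal illustration, not the proposal's script: the function name `isolation_argv` and the use of `sh -c` for the in-namespace steps are assumptions, while the `setpriv` and `unshare` flags are the ones named in this section.

```python
import shlex
from typing import List


def isolation_argv(uid: int, gid: int, build_cmd: List[str]) -> List[str]:
    """Compose setpriv -> unshare -> build command as one argv list."""
    # Steps that run inside the new namespaces as the mapped root user.
    inner = (
        "ip link set lo up && hostname localhost && exec "
        + shlex.join(build_cmd)
    )
    return [
        # Drop to the ephemeral user while still outside the namespaces.
        "setpriv", f"--reuid={uid}", f"--regid={gid}", "--clear-groups",
        # Enter the namespaces; --map-root-user maps the user to UID 0.
        "unshare", "--uts", "--net", "--pid", "--ipc", "--fork",
        "--map-root-user",
        "sh", "-c", inner,
    ]
```

The caller would still create the user beforehand and delete it afterwards, since both steps need the real `/etc/passwd`.
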
### Environment variable scrubbing

Downstream build systems may have sensitive environment variables
(registry tokens, CI credentials) that should not be visible to
build backends. Rather than hardcoding a list in fromager, scrubbing
is controlled by the deployer:

```bash
# In the container image or CI environment
export FROMAGER_SCRUB_ENV_VARS="NGC_API_KEY,TWINE_PASSWORD,CI_JOB_TOKEN"
```

When `--build-isolation` is active, `external_commands.run()` reads
this comma-separated list and removes the named variables from the
subprocess environment before invoking the build.

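The scrubbing step might look roughly like the following. This is a sketch under assumptions: the function name `scrub_env` is hypothetical, and tolerating whitespace and empty entries in the list is a choice made here, not something the proposal specifies.

```python
from typing import Dict, Mapping, Optional


def scrub_env(env: Mapping[str, str], spec: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of env without the variables named in spec.

    spec is a comma-separated list; by default it is read from
    FROMAGER_SCRUB_ENV_VARS in env itself.
    """
    if spec is None:
        spec = env.get("FROMAGER_SCRUB_ENV_VARS", "")
    # Tolerate whitespace and empty entries such as "A, B,".
    names = {name.strip() for name in spec.split(",") if name.strip()}
    return {key: value for key, value in env.items() if key not in names}
```

A subprocess runner could then pass `env=scrub_env(os.environ)` to `subprocess.run`, leaving the parent environment untouched.
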
### Integration points

#### CLI (`__main__.py`)

- Build isolation availability is detected at import time (same
  pattern as network isolation)
- `--build-isolation/--no-build-isolation` option on the `main`
  group, stored on `WorkContext`
- Fails early with a clear message if the platform does not support
  build isolation

#### WorkContext (`context.py`)

- New `build_isolation: bool` field (default `False`)

#### BuildEnvironment (`build_environment.py`)

- `run()` method accepts `build_isolation` parameter, defaults to
  `ctx.build_isolation`
- `install()` method explicitly passes `build_isolation=False`
  because dependency installation needs access to the local PyPI
  mirror

#### Build backend hooks (`dependencies.py`)

- `_run_hook_with_extra_environ` passes `ctx.build_isolation` to
  `build_env.run()`

#### Subprocess runner (`external_commands.py`)

- `run()` accepts `build_isolation: bool` parameter
- When active, prepends the isolation script to the command,
  sets `FROMAGER_BUILD_DIR` so the script can `chmod` the build
  directory for the ephemeral user, applies env scrubbing, and sets
  `CARGO_NET_OFFLINE=true`
- Build isolation supersedes network isolation but reuses the
  `NetworkIsolationError` detection for consistent error reporting

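The command wrapping described for the subprocess runner could be sketched as below. The script path `ISOLATION_SCRIPT` and the function name `wrap_for_isolation` are hypothetical (the proposal does not name either); only the two environment variables come from the text above, and env scrubbing is left out to keep the example small.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical location of the isolation script; the proposal does not name one.
ISOLATION_SCRIPT = "/usr/local/libexec/fromager-build-isolation.sh"


def wrap_for_isolation(
    cmd: Sequence[str], env: Dict[str, str], build_dir: str
) -> Tuple[List[str], Dict[str, str]]:
    """Return (command, environment) for an isolated build invocation."""
    new_env = dict(env)  # copy so the caller's environment is not mutated
    # Lets the script chmod the build directory for the ephemeral user.
    new_env["FROMAGER_BUILD_DIR"] = build_dir
    # Tells cargo not to attempt network access inside the sandbox.
    new_env["CARGO_NET_OFFLINE"] = "true"
    return [ISOLATION_SCRIPT, *cmd], new_env
```

Returning a new command and environment, rather than mutating the inputs, keeps non-isolated calls such as `install()` unaffected.
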
### What is and is not isolated

| Aspect | Protected | Notes |
| --- | --- | --- |
| `.netrc` / credentials | Yes | Ephemeral user cannot read `root:root` mode `600` files |
| Network access | Yes | No routing in network namespace |
| Process visibility | Yes | PID namespace; only build processes visible |
| IPC (shm, semaphores) | Yes | IPC namespace |
| Env var leakage | Configurable | Via `FROMAGER_SCRUB_ENV_VARS` |
| `.pth` / shell profile backdoors | Yes | Ephemeral user cannot write to site-packages or home directory |
| Persistent background process | Yes | When the namespace's init process (the build) exits, the kernel kills every remaining process in the PID namespace |
| `/tmp` cross-build leakage | Partial | Sticky bit stops other users from deleting or renaming files, but file modes still govern reads; no mount namespace |
| Filesystem write access | No | Ephemeral user has world-writable access to build dir |
| Trojan in build output | No | Malicious code in the built wheel is not detected |

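The credential row depends on the file really being `root:root` with mode `600`. A deployer could check that assumption with a small probe like the one below; the function names are hypothetical and the check is not part of the proposal.

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """True if the file grants no permissions to group or others (e.g. 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


def is_root_owned(path: str) -> bool:
    """True if the file belongs to root:root."""
    st = os.stat(path)
    return st.st_uid == 0 and st.st_gid == 0
```

If either check fails for a credential file, the ephemeral user may still be able to read it, and the protection in the table does not hold.
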
### Compatibility

Works in unprivileged Podman and Docker containers without
`--privileged` or `--cap-add SYS_ADMIN`. Docker's default seccomp
profile may block `unshare`; Podman's policy allows it. On Ubuntu
24.04, `sysctl kernel.apparmor_restrict_unprivileged_userns=0` is
required.

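An availability check of the kind the CLI performs at import time might look like the sketch below. The function name, the tool list, and the injectable `system` and `which` parameters are assumptions for illustration. The check only confirms that the platform is Linux and the tools are on `PATH`; it does not detect a seccomp or AppArmor policy that blocks `unshare`.

```python
import platform
import shutil
from typing import Callable, Optional

# Tools used by the isolation steps described in this proposal.
REQUIRED_TOOLS = ("useradd", "userdel", "setpriv", "unshare")


def build_isolation_available(
    system: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> bool:
    """Return True when the platform looks able to run the isolation script."""
    if (system or platform.system()) != "Linux":
        return False
    # Every required tool must resolve to an executable on PATH.
    return all(which(tool) is not None for tool in REQUIRED_TOOLS)
```

The injectable parameters exist only so the check can be tested without changing the host; production code would call it with no arguments.
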
## Examples

```bash
# Build with full isolation
fromager --build-isolation bootstrap -r requirements.txt

# Build with isolation and env scrubbing
FROMAGER_SCRUB_ENV_VARS="NGC_API_KEY,TWINE_PASSWORD" \
  fromager --build-isolation bootstrap -r requirements.txt
```

## Findings

A proof-of-concept package
([build-attack-test](https://github.com/pavank63/build-attack-test))
was used to validate the attack surface. It runs security probes from
`setup.py` during `build_sdist` / `build_wheel` to test what a
malicious build backend can access. Testing was performed with
`--network-isolation` enabled.

### Results without build isolation

| Attack vector | Result | Risk |
| --- | --- | --- |
| Credential file access (`.netrc`) | **Vulnerable** | Build process can read credential files containing auth tokens |
| Sensitive environment variables | **Vulnerable** | Build system variables (registry paths, tokens) visible to backends |
| Network access | Blocked | Already mitigated by `--network-isolation` |
| Process visibility (PID) | **Vulnerable** | Build can see all running processes including fromager, parallel builds, and their command-line arguments |
| IPC (shared memory, semaphores) | **Vulnerable** | Build can see and potentially attach to shared memory segments from other processes |
| Hostname | **Vulnerable** | Real hostname visible, leaks build infrastructure identity |
| Build cache read/write | **Vulnerable** | Build can read and write to shared compiler caches like ccache and cargo, enabling cache poisoning |
| Package settings files | **Vulnerable** | Build can read all package override configuration files |
| Persistent background process | **Vulnerable** | Build can spawn a daemon that continues running after the build finishes |
| Python `.pth` backdoor | **Vulnerable** | Build can drop a `.pth` file into site-packages that runs code on every Python startup |
| Shell profile injection | **Vulnerable** | Build can append to `.bashrc` / `.profile` to run code on every shell login |
| pip config poisoning | **Vulnerable** | Build can write `pip.conf` to redirect dependency installs to an attacker-controlled index |

### Key takeaways

1. **Network isolation alone is insufficient.** A build can steal
   credentials from `.netrc` and embed them in the built wheel. The
   credentials leave the build system when the wheel is distributed,
   bypassing network controls entirely.

2. **Builds can leave persistent backdoors.** `.pth` files, shell
   profile entries, pip config changes, and background daemons all
   survive the build and can compromise subsequent builds or the
   host.

3. **Build cache poisoning is possible.** A poisoned compiler cache
   entry (ccache, cargo) can inject malicious code into future
   builds of unrelated packages.

### Supply-chain amplification

The persistence attacks above are especially dangerous because
fromager builds many packages sequentially in the same environment.
A single malicious package built early in the bootstrap can
compromise every package built after it:

- A `.pth` file dropped into site-packages runs on every subsequent
  Python invocation, including fromager building the next package.
  It can silently modify source files or inject code into build
  outputs.
- A poisoned `pip.conf` redirects dependency installs for all
  subsequent builds to an attacker-controlled index.
- A poisoned compiler cache entry (ccache/cargo) injects malicious
  code into any later package that compiles the same source file.
- A background daemon can watch the build directory and modify
  source code for the next package before its build starts.

The published wheels for those downstream packages would contain
the injected code even though their source is clean.

Build isolation breaks most of this chain. Each build runs as a
separate ephemeral user in its own PID, IPC, and network namespace,
so it cannot write to site-packages, modify pip config, or leave
daemons behind. Shared compiler caches are the exception (see
Remaining gaps). When fromager runs parallel builds, each gets its
own ephemeral user (`fmr_<random>`) and its own set of namespaces,
so parallel builds cannot see or interfere with each other.

### Remaining gaps

Build cache poisoning and package settings access are **not fully
addressed** by this proposal, as the ephemeral user still needs
write access to the build directory. Addressing these would require
mount namespace isolation, which is incompatible with the current
`pyproject_hooks` IPC mechanism (see Non-goals).

docs/proposals/index.rst

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ Fromager Enhancement Proposals
 .. toctree::
    :maxdepth: 1

+   build-isolation
    new-patcher-config
    new-resolver-config
    release-cooldown
