| 1 | +# Build isolation for sandboxing build backends |
| 2 | + |
| 3 | +- Author: Pavan Kalyan Reddy Cherupally |
| 4 | +- Created: 2026-04-21 |
| 5 | +- Status: Open |
| 6 | +- Issue: [#1019](https://github.com/python-wheel-build/fromager/issues/1019) |
| 7 | + |
| 8 | +## What |
| 9 | + |
| 10 | +A `--build-isolation` flag that sandboxes PEP 517 build backend |
| 11 | +subprocesses (`build_sdist`, `build_wheel`) so they cannot read |
| 12 | +credentials, access the network, or interfere with the host system. |
| 13 | + |
| 14 | +## Why |
| 15 | + |
| 16 | +Fromager executes upstream-controlled code (`setup.py`, build backends) |
| 17 | +during wheel builds. A compromised or malicious package can: |
| 18 | + |
| 19 | +- Read credential files like `$HOME/.netrc` and exfiltrate tokens |
| 20 | +- Access sensitive environment variables (registry keys, API tokens) |
| 21 | +- Reach the network to upload stolen data or download payloads |
| 22 | +- Signal or inspect other processes via `/proc` or shared IPC |
| 23 | +- Interfere with parallel builds through shared `/tmp` |
| 24 | +- Leave persistent backdoors: `.pth` files that run on every Python |
| 25 | + startup, shell profile entries that run on every login, or |
| 26 | + background daemons that survive the build |
| 27 | + |
| 28 | +The existing `--network-isolation` flag blocks network access but does |
| 29 | +not protect against credential theft, process/IPC visibility, or |
| 30 | +persistent backdoors. |
| 31 | + |
| 32 | +Build isolation wraps each build backend invocation in a sandbox that |
| 33 | +combines file-level credential protection with OS-level namespace |
| 34 | +isolation. Only the PEP 517 hook calls are sandboxed; download, |
| 35 | +installation, and upload steps run normally. |
| 36 | + |
| 37 | +## Goals |
| 38 | + |
| 39 | +- A `--build-isolation/--no-build-isolation` CLI flag (default off) |
| 40 | + that supersedes `--network-isolation` for build steps |
| 41 | +- Credential protection: build processes cannot read `.netrc` or |
| 42 | + other root-owned credential files |
| 43 | +- Network isolation: no routing in the build namespace |
| 44 | +- Process isolation: build cannot see or signal other processes |
| 45 | +- IPC isolation: separate shared memory, semaphores, message queues |
| 46 | +- Persistence protection: build cannot drop `.pth` backdoors, modify |
| 47 | + shell profiles, or leave background daemons running after the build |
| 48 | +- Environment scrubbing: downstream build systems can strip sensitive |
| 49 | + environment variables via `FROMAGER_SCRUB_ENV_VARS` |
| 50 | +- Works in unprivileged containers (Podman/Docker) without |
| 51 | + `--privileged` or `--cap-add SYS_ADMIN` |
| 52 | +- Minimal overhead (< 50ms per build invocation) |
| 53 | + |
| 54 | +## Non-goals |
| 55 | + |
| 56 | +- **Mount namespace isolation.** Mounting tmpfs over `$HOME` or |
| 57 | + making `/usr` read-only was explored but abandoned. The |
| 58 | + `pyproject_hooks` library creates temporary files in `/tmp` for |
| 59 | + IPC between the parent process and the build backend |
| 60 | + (`input.json`/`output.json`). A mount namespace with a fresh |
| 61 | + `/tmp` hides these files and breaks the build. Bind-mounting the |
| 62 | + specific IPC directory is fragile and couples fromager to |
| 63 | + `pyproject_hooks` internals. |
| 64 | +- **bubblewrap (bwrap).** bwrap provides stronger filesystem |
| 65 | + isolation but requires `CAP_SYS_ADMIN` or a privileged container, |
| 66 | + which is unavailable in the standard unprivileged Podman/Docker |
| 67 | + build environment. |
| 68 | +- **Hardcoded list of sensitive environment variables.** Fromager is |
| 69 | + an upstream tool; the specific variables that are sensitive depend |
| 70 | + on the downstream build system. Scrubbing is controlled entirely |
| 71 | + by the deployer via `FROMAGER_SCRUB_ENV_VARS`. |
| 72 | +- **macOS / Windows support.** Linux namespaces and `unshare` are |
| 73 | + Linux-only. The flag is unavailable on other platforms. |
| 74 | + |
| 75 | +## How |
| 76 | + |
| 77 | +### Isolation mechanism |
| 78 | + |
| 79 | +Build isolation combines two complementary techniques: |
| 80 | + |
| 81 | +#### 1. Ephemeral Unix user |
| 82 | + |
| 83 | +Before each build invocation, the isolation script creates a |
| 84 | +short-lived system user with `useradd` and removes it with `userdel` |
| 85 | +on exit (via `trap EXIT`). The user has: |
| 86 | + |
| 87 | +- No home directory (`-M -d /nonexistent`) |
| 88 | +- No login shell (`-s /sbin/nologin`) |
| 89 | +- A randomized name (`fmr_<random>`) to avoid collisions |
| 90 | + |
| 91 | +This provides file-level credential protection: `.netrc` is owned by |
| 92 | +`root:root` with mode `600`, so the ephemeral user cannot read it. |
| 93 | +The overhead is approximately 10ms for `useradd` and 10ms for |
| 94 | +`userdel`. |
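The user lifecycle described above can be sketched as a pair of argument lists. This is only an illustration, not the proposal's isolation script: the helper names are hypothetical, and `--system` is an assumption based on the phrase "system user". The other `useradd` options are the ones listed in this section.

```python
import secrets


def ephemeral_username(prefix: str = "fmr_") -> str:
    """Return a randomized name such as 'fmr_3fa9c1d2' to avoid collisions."""
    return prefix + secrets.token_hex(4)  # 4 random bytes -> 8 hex characters


def useradd_command(username: str) -> list[str]:
    """Build the useradd argv for a user with no home and no login shell."""
    return [
        "useradd",
        "--system",            # assumption: the design only says "system user"
        "-M",                  # do not create a home directory
        "-d", "/nonexistent",  # home path that does not exist
        "-s", "/sbin/nologin", # no login shell
        username,
    ]


def userdel_command(username: str) -> list[str]:
    """Build the matching cleanup command, run on exit."""
    return ["userdel", username]
```

A caller would run `useradd_command(name)` before the build and `userdel_command(name)` in a `finally` block, mirroring the `trap EXIT` cleanup.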
| 95 | + |
| 96 | +#### 2. Linux namespaces via unshare |
| 97 | + |
| 98 | +After dropping to the ephemeral user with `setpriv`, the script |
| 99 | +enters new namespaces with `unshare`: |
| 100 | + |
| 101 | +| Namespace | Flag | Purpose | |
| 102 | +| --- | --- | --- | |
| 103 | +| Network | `--net` | No routing; blocks all network access | |
| 104 | +| PID | `--pid --fork` | Build sees only its own processes | |
| 105 | +| IPC | `--ipc` | Isolated shared memory and semaphores | |
| 106 | +| UTS | `--uts` | Separate hostname | |
| 107 | + |
| 108 | +`--map-root-user` maps the ephemeral user to UID 0 inside the |
| 109 | +namespace, giving it enough privilege to bring up the loopback |
| 110 | +interface and set the hostname without requiring real root. |
| 111 | + |
| 112 | +#### Why setpriv instead of runuser |
| 113 | + |
| 114 | +`runuser` calls `setgroups()`, which is denied inside user namespaces |
| 115 | +(the kernel blocks it to prevent group membership escalation). |
| 116 | +`setpriv --reuid --regid --clear-groups` avoids this call entirely. |
| 117 | + |
| 118 | +#### Order of operations |
| 119 | + |
| 120 | +``` |
| 121 | +useradd fmr_<random> # create ephemeral user (outside namespace) |
| 122 | + └─ setpriv --reuid --regid # drop to ephemeral user |
| 123 | + └─ unshare --uts --net --pid --ipc --fork --map-root-user |
| 124 | + ├─ ip link set lo up |
| 125 | + ├─ hostname localhost |
| 126 | + └─ exec <build command> |
| 127 | +userdel fmr_<random> # cleanup (trap EXIT) |
| 128 | +``` |
| 129 | + |
| 130 | +The user is created before entering the namespace because `useradd` |
| 131 | +needs access to `/etc/passwd` and `/etc/shadow` on the real |
| 132 | +filesystem. `setpriv` drops privileges before `unshare` so the UID |
| 133 | +switch happens outside the namespace where the real UID is mapped. |
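The nesting in the diagram can be sketched as a single argument list, with `setpriv` wrapping `unshare`, which in turn wraps a small shell snippet. This is a minimal sketch under stated assumptions: the function name is hypothetical, and using `sh -c` for the in-namespace steps is an illustrative choice, not necessarily how the isolation script does it.

```python
import shlex


def isolation_command(uid: int, gid: int, build_cmd: list[str]) -> list[str]:
    """Compose setpriv -> unshare -> (loopback, hostname, exec build)."""
    # Steps that run inside the new namespaces, as UID 0 in the user namespace.
    inner = (
        "ip link set lo up && hostname localhost && exec "
        + shlex.join(build_cmd)  # quote each argument for the shell
    )
    return [
        # Drop to the ephemeral user first, outside the namespace.
        "setpriv", f"--reuid={uid}", f"--regid={gid}", "--clear-groups",
        # Then enter the namespaces listed in the table above.
        "unshare", "--uts", "--net", "--pid", "--ipc", "--fork",
        "--map-root-user",
        "sh", "-c", inner,
    ]
```

The list only describes the invocation. Creating and deleting the user still happens around it, outside the namespace.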
| 134 | + |
| 135 | +### Environment variable scrubbing |
| 136 | + |
| 137 | +Downstream build systems may have sensitive environment variables |
| 138 | +(registry tokens, CI credentials) that should not be visible to |
| 139 | +build backends. Rather than hardcoding a list in fromager, scrubbing |
| 140 | +is controlled by the deployer: |
| 141 | + |
| 142 | +```bash |
| 143 | +# In the container image or CI environment |
| 144 | +export FROMAGER_SCRUB_ENV_VARS="NGC_API_KEY,TWINE_PASSWORD,CI_JOB_TOKEN" |
| 145 | +``` |
| 146 | + |
| 147 | +When `--build-isolation` is active, `external_commands.run()` reads |
| 148 | +this comma-separated list and removes the named variables from the |
| 149 | +subprocess environment before invoking the build. |
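The scrubbing step might look like the following sketch. The function name and signature are hypothetical and are not taken from `external_commands.py`. Only the variable name `FROMAGER_SCRUB_ENV_VARS` and its comma-separated format come from this proposal.

```python
from collections.abc import Mapping
from typing import Optional


def scrub_environment(
    env: Mapping[str, str], scrub_spec: Optional[str] = None
) -> dict[str, str]:
    """Return a copy of env without the variables named in the scrub list."""
    if scrub_spec is None:
        # Fall back to the deployer-provided list, if any.
        scrub_spec = env.get("FROMAGER_SCRUB_ENV_VARS", "")
    # Tolerate whitespace and empty entries such as "A, B,,C".
    names = {name.strip() for name in scrub_spec.split(",") if name.strip()}
    return {key: value for key, value in env.items() if key not in names}
```

Calling `scrub_environment(os.environ)` would yield a dictionary suitable for the `env` argument of `subprocess.run()`. The input mapping is left unmodified.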
| 150 | + |
| 151 | +### Integration points |
| 152 | + |
| 153 | +#### CLI (`__main__.py`) |
| 154 | + |
| 155 | +- Build isolation availability is detected at import time (same |
| 156 | + pattern as network isolation) |
| 157 | +- `--build-isolation/--no-build-isolation` option on the `main` |
| 158 | + group, stored on `WorkContext` |
| 159 | +- Fails early with a clear message if the platform does not support |
| 160 | + build isolation |
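One way to picture the early availability check is shown below. It is a sketch only: the function name, the list of required tools, and the messages are assumptions, and the real detection may also need to probe whether `unshare` is actually permitted in the current container.

```python
import shutil
import sys

# Tools the isolation steps in this proposal rely on.
REQUIRED_TOOLS = ("useradd", "userdel", "setpriv", "unshare")


def build_isolation_unavailable_reason(platform=sys.platform, which=shutil.which):
    """Return a message explaining why isolation is unavailable, or None."""
    if not platform.startswith("linux"):
        return f"build isolation requires Linux, not {platform}"
    missing = [tool for tool in REQUIRED_TOOLS if which(tool) is None]
    if missing:
        return "missing required tools: " + ", ".join(missing)
    return None
```

The `platform` and `which` parameters exist only so the check can be exercised in tests without depending on the host.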
| 161 | + |
| 162 | +#### WorkContext (`context.py`) |
| 163 | + |
| 164 | +- New `build_isolation: bool` field (default `False`) |
| 165 | + |
| 166 | +#### BuildEnvironment (`build_environment.py`) |
| 167 | + |
| 168 | +- `run()` method accepts `build_isolation` parameter, defaults to |
| 169 | + `ctx.build_isolation` |
| 170 | +- `install()` method explicitly passes `build_isolation=False` |
| 171 | + because dependency installation needs access to the local PyPI |
| 172 | + mirror |
| 173 | + |
| 174 | +#### Build backend hooks (`dependencies.py`) |
| 175 | + |
| 176 | +- `_run_hook_with_extra_environ` passes `ctx.build_isolation` to |
| 177 | + `build_env.run()` |
| 178 | + |
| 179 | +#### Subprocess runner (`external_commands.py`) |
| 180 | + |
| 181 | +- `run()` accepts `build_isolation: bool` parameter |
| 182 | +- When active, prepends the isolation script to the command, |
| 183 | + sets `FROMAGER_BUILD_DIR` so the script can `chmod` the build |
| 184 | + directory for the ephemeral user, applies env scrubbing, and sets |
| 185 | + `CARGO_NET_OFFLINE=true` |
| 186 | +- Build isolation supersedes network isolation but reuses the |
| 187 | + `NetworkIsolationError` detection for consistent error reporting |
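The command and environment changes listed above could be combined roughly as follows. This is a hedged sketch: `prepare_isolated_call` is a hypothetical helper, and the path of the isolation script is passed in by the caller because this document does not specify it.

```python
def prepare_isolated_call(cmd, env, build_dir, isolation_script):
    """Return (command, environment) for running cmd under build isolation."""
    new_env = dict(env)  # copy so the caller's environment is untouched
    # Lets the script adjust permissions on the build directory.
    new_env["FROMAGER_BUILD_DIR"] = str(build_dir)
    # Tell cargo not to attempt network access inside the sandbox.
    new_env["CARGO_NET_OFFLINE"] = "true"
    # Prepend the isolation script so it wraps the original command.
    return [str(isolation_script), *cmd], new_env
```

Environment scrubbing via `FROMAGER_SCRUB_ENV_VARS` would be applied to `new_env` before the subprocess is started.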
| 188 | + |
| 189 | +### What is and is not isolated |
| 190 | + |
| 191 | +| Aspect | Protected | Notes | |
| 192 | +| --- | --- | --- | |
| 193 | +| `.netrc` / credentials | Yes | Ephemeral user cannot read root:root 600 files | |
| 194 | +| Network access | Yes | No routing in network namespace | |
| 195 | +| Process visibility | Yes | PID namespace; only build processes visible | |
| 196 | +| IPC (shm, semaphores) | Yes | IPC namespace | |
| 197 | +| Env var leakage | Configurable | Via `FROMAGER_SCRUB_ENV_VARS` | |
| 198 | +| `.pth` / shell profile backdoors | Yes | Ephemeral user cannot write to site-packages or home directory | |
| 199 | +| Persistent background process | Yes | PID namespace kills all processes when the build exits | |
| 200 | +| `/tmp` cross-build leakage | Partial | Sticky bit prevents other users from deleting or renaming files; no mount namespace | |
| 201 | +| Filesystem write access | No | Build directory is made writable for the ephemeral user | |
| 202 | +| Trojan in build output | No | Malicious code in the built wheel is not detected | |
| 203 | + |
| 204 | +### Compatibility |
| 205 | + |
| 206 | +Works in unprivileged Podman and Docker containers without |
| 207 | +`--privileged` or `--cap-add SYS_ADMIN`. Docker's default seccomp |
| 208 | +profile may block `unshare`; Podman's policy allows it. On Ubuntu |
| 209 | +24.04 hosts, unprivileged user namespaces must first be allowed with |
| 210 | +`sysctl -w kernel.apparmor_restrict_unprivileged_userns=0`. |
| 211 | + |
| 212 | +## Examples |
| 213 | + |
| 214 | +```bash |
| 215 | +# Build with full isolation |
| 216 | +fromager --build-isolation bootstrap -r requirements.txt |
| 217 | + |
| 218 | +# Build with isolation and env scrubbing |
| 219 | +FROMAGER_SCRUB_ENV_VARS="NGC_API_KEY,TWINE_PASSWORD" \ |
| 220 | + fromager --build-isolation bootstrap -r requirements.txt |
| 221 | +``` |
| 222 | + |
| 223 | +## Findings |
| 224 | + |
| 225 | +A proof-of-concept package |
| 226 | +([build-attack-test](https://github.com/pavank63/build-attack-test)) |
| 227 | +was used to validate the attack surface. It runs security probes from |
| 228 | +`setup.py` during `build_sdist` / `build_wheel` to test what a |
| 229 | +malicious build backend can access. Testing was performed with |
| 230 | +`--network-isolation` enabled. |
| 231 | + |
| 232 | +### Results without build isolation |
| 233 | + |
| 234 | +| Attack vector | Result | Risk | |
| 235 | +| --- | --- | --- | |
| 236 | +| Credential file access (`.netrc`) | **Vulnerable** | Build process can read credential files containing auth tokens | |
| 237 | +| Sensitive environment variables | **Vulnerable** | Build system variables (registry paths, tokens) visible to backends | |
| 238 | +| Network access | Blocked | Already mitigated by `--network-isolation` | |
| 239 | +| Process visibility (PID) | **Vulnerable** | Build can see all running processes including fromager, parallel builds, and their command-line arguments | |
| 240 | +| IPC (shared memory, semaphores) | **Vulnerable** | Build can see and potentially attach to shared memory segments from other processes | |
| 241 | +| Hostname | **Vulnerable** | Real hostname visible, leaks build infrastructure identity | |
| 242 | +| Build cache read/write | **Vulnerable** | Build can read and write to shared compiler caches like ccache and cargo, enabling cache poisoning | |
| 243 | +| Package settings files | **Vulnerable** | Build can read all package override configuration files | |
| 244 | +| Persistent background process | **Vulnerable** | Build can spawn a daemon that continues running after the build finishes | |
| 245 | +| Python `.pth` backdoor | **Vulnerable** | Build can drop a `.pth` file into site-packages that runs code on every Python startup | |
| 246 | +| Shell profile injection | **Vulnerable** | Build can append to `.bashrc` / `.profile` to run code on every shell login | |
| 247 | +| pip config poisoning | **Vulnerable** | Build can write `pip.conf` to redirect dependency installs to an attacker-controlled index | |
| 248 | + |
| 249 | +### Key takeaways |
| 250 | + |
| 251 | +1. **Network isolation alone is insufficient.** A build can steal |
| 252 | + credentials from `.netrc` and embed them in the built wheel. The |
| 253 | + credentials leave the build system when the wheel is distributed, |
| 254 | + bypassing network controls entirely. |
| 255 | + |
| 256 | +2. **Builds can leave persistent backdoors.** `.pth` files, shell |
| 257 | + profile entries, pip config changes, and background daemons all |
| 258 | + survive the build and can compromise subsequent builds or the |
| 259 | + host. |
| 260 | + |
| 261 | +3. **Build cache poisoning is possible.** A poisoned compiler cache |
| 262 | + entry (ccache, cargo) can inject malicious code into future |
| 263 | + builds of unrelated packages. |
| 264 | + |
| 265 | +### Supply-chain amplification |
| 266 | + |
| 267 | +The persistence attacks above are especially dangerous because |
| 268 | +fromager builds many packages sequentially in the same environment. |
| 269 | +A single malicious package built early in the bootstrap can |
| 270 | +compromise every package built after it: |
| 271 | + |
| 272 | +- A `.pth` file dropped into site-packages runs on every subsequent |
| 273 | + Python invocation, including fromager building the next package. |
| 274 | + It can silently modify source files or inject code into build |
| 275 | + outputs. |
| 276 | +- A poisoned `pip.conf` redirects dependency installs for all |
| 277 | + subsequent builds to an attacker-controlled index. |
| 278 | +- A poisoned compiler cache entry (ccache/cargo) injects malicious |
| 279 | + code into any later package that compiles the same source file. |
| 280 | +- A background daemon can watch the build directory and modify |
| 281 | + source code for the next package before its build starts. |
| 282 | + |
| 283 | +The published wheels for those downstream packages would contain |
| 284 | +the injected code even though their source is clean. |
| 285 | + |
| 286 | +Build isolation breaks most of this chain. Each build runs as a |
| 287 | +separate ephemeral user in its own PID, IPC, and network namespace, |
| 288 | +so it cannot write to site-packages, modify pip config, or leave |
| 289 | +daemons behind; cache poisoning is only partly mitigated (see |
| 290 | +Remaining gaps). When fromager runs parallel builds, each gets its |
| 291 | +own ephemeral user (`fmr_<random>`) and its own set of namespaces, |
| 292 | +so parallel builds cannot see or signal each other's processes. |
| 293 | + |
| 294 | +### Remaining gaps |
| 295 | + |
| 296 | +Build cache poisoning and package settings access are **not fully |
| 297 | +addressed** by this proposal, as the ephemeral user still needs |
| 298 | +write access to the build directory. Addressing these would require |
| 299 | +mount namespace isolation, which is incompatible with the current |
| 300 | +`pyproject_hooks` IPC mechanism (see Non-goals). |