fix(security): Restrict sandbox loopback access to proxy and DNS only #95

Open: usnavy13 wants to merge 1 commit into dev from fix/restrict-sandbox-loopback-access

@usnavy13 usnavy13 commented May 7, 2026

Summary

When ENABLE_SANDBOX_NETWORK=true, nsjail disables the network namespace clone (--disable_clone_newnet) so sandbox processes share the container's network namespace. The egress firewall is supposed to lock sandboxes down to only the allowlist proxy, but an overly broad iptables rule (-o lo -j ACCEPT) granted sandbox processes access to every port on loopback — including the API itself on 127.0.0.1:8000.

This PR replaces the blanket loopback rule with targeted rules that only allow DNS resolution (127.0.0.53:53 UDP/TCP).

Vulnerability Details

Root cause: src/services/sandbox/egress_firewall.py lines 120–135 contained:

-A OUTPUT -m owner --uid-owner <sandbox_uid> -o lo -j ACCEPT

This was intended for DNS and "localhost-only services" but inadvertently opened all 65,535 loopback ports to the sandbox UID.

Attack path (demonstrated on the live dev instance):

  1. Submit a Python execution request with ENABLE_SANDBOX_NETWORK=true:

    import urllib.request, ssl
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    resp = urllib.request.urlopen("https://127.0.0.1:8000/health", timeout=5, context=ctx)
    print(resp.status, resp.read().decode())
  2. Result: 200 {"status":"healthy","version":"1.2.0","service":"code-interpreter-api"} — sandbox code reached the API.

  3. A port scan from inside the sandbox confirmed:

    | Port  | Service      | Status                      |
    |-------|--------------|-----------------------------|
    | 8000  | API (HTTPS)  | OPEN                        |
    | 18443 | Egress Proxy | OPEN (intended)             |
    | 6379  | Redis        | Closed (separate container) |
    | 3900  | Garage/S3    | Closed (separate container) |
  4. With AUTH_ENABLED=true (current config), calling /exec from the sandbox returns 401 — auth blocks escalation. However, if AUTH_ENABLED=false (a documented, supported configuration for trusted-network deployments), sandbox code would get full unauthenticated API access: executing code in other sessions, accessing other sessions' files, etc.
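The loopback check in step 3 can be reproduced with a few lines of standard-library Python. This is a minimal sketch, not the script that was actually run: the helper names `probe_loopback_port` and `scan_loopback` are hypothetical, and the port list is simply the one from the table above.

```python
import socket
from typing import Dict, Iterable


def probe_loopback_port(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise,
        # so a refused or rejected connection simply yields False.
        return sock.connect_ex((host, port)) == 0


def scan_loopback(ports: Iterable[int]) -> Dict[int, bool]:
    """Probe each port once and map it to its reachability."""
    return {port: probe_loopback_port(port) for port in ports}
```

Calling `scan_loopback((8000, 18443, 6379, 3900))` from inside the sandbox before the fix would be expected to report 8000 and 18443 as reachable; after the fix only the proxy port should remain reachable.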

Mitigating factors:

  • ENABLE_SANDBOX_NETWORK defaults to false — only affects deployments that enable it for skill installs
  • AUTH_ENABLED defaults to true — full escalation requires auth to be disabled
  • nsjail PID/mount namespace isolation prevents the API key from being discoverable via /proc or environment variables

What Changed

src/services/sandbox/egress_firewall.py — replaced one rule with two:

| Before | After |
|--------|-------|
| -o lo -j ACCEPT (all loopback ports) | -d 127.0.0.53 -p udp --dport 53 -j ACCEPT (DNS only) |
| | -d 127.0.0.53 -p tcp --dport 53 -j ACCEPT (DNS over TCP) |

The existing rule #1 (proxy port on 127.0.0.1) and rule #3 (REJECT everything else) are unchanged. The net effect is that the sandbox UID can now only reach:

  • 127.0.0.1:<proxy_port> — the egress allowlist proxy (pip, npm, go, cargo)
  • 127.0.0.53:53 — DNS resolution via systemd-resolved

All other loopback ports (including 8000/API) are now rejected.
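To make the resulting rule set concrete, here is a rough sketch of how the four argument lists could be assembled. It is an illustration only: `build_sandbox_egress_rules` is a hypothetical name, the exact arguments of the proxy rule are assumed (the PR does not show them), and the real module may structure this differently. Only the DNS arguments and the owner match mirror what the description quotes.

```python
from typing import List


def build_sandbox_egress_rules(
    sandbox_uid: int, proxy_port: int, dns_ip: str = "127.0.0.53"
) -> List[List[str]]:
    """Return iptables argument lists in the order they must be appended."""
    # Every rule is scoped to packets sent by the sandbox uid.
    owner = ["-A", "OUTPUT", "-m", "owner", "--uid-owner", str(sandbox_uid)]
    return [
        # Rule 1 (assumed shape): the egress allowlist proxy on loopback.
        owner + ["-d", "127.0.0.1", "-p", "tcp", "--dport", str(proxy_port), "-j", "ACCEPT"],
        # Rule 2a: DNS over UDP to the configured resolver address.
        owner + ["-d", dns_ip, "-p", "udp", "--dport", "53", "-j", "ACCEPT"],
        # Rule 2b: DNS over TCP to the same address.
        owner + ["-d", dns_ip, "-p", "tcp", "--dport", "53", "-j", "ACCEPT"],
        # Rule 3: everything else from the sandbox uid is rejected.
        owner + ["-j", "REJECT"],
    ]
```

Order matters here: iptables evaluates rules top to bottom, so the catch-all REJECT has to come last, and every ACCEPT is pinned to a specific destination and port rather than to the whole `lo` interface.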

Test Plan

  • pytest tests/unit/test_egress_proxy.py — 21 tests pass
  • flake8 and black --check clean
  • Rebuild container and re-run the sandbox loopback test — urllib.request.urlopen("https://127.0.0.1:8000/health") should fail with connection refused/rejected
  • Verify skill installs still work (pip install, npm install) through the proxy

🤖 Generated with Claude Code

The blanket `-o lo -j ACCEPT` iptables rule allowed sandbox processes to
reach the API on 127.0.0.1:8000 when ENABLE_SANDBOX_NETWORK=true (shared
network namespace). Replace with targeted rules permitting only the egress
proxy port and DNS (127.0.0.53:53), closing the SSRF/escalation path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@usnavy13 usnavy13 left a comment

Review — sandbox egress loopback fix

Net: the core fix is correct. Replacing the blanket -o lo ACCEPT with narrow ACCEPTs + the unchanged catch-all REJECT does close the SSRF/escalation path — the API on 127.0.0.1:8000 and the bridge IP are now both rejected for the sandbox uid (verified: no IPv4 bypass; proxy runs as root so skill installs still work; REPL/PTC use stdio so nothing breaks). Two substantive refinements + a couple of nits below; none are hard blockers.

Should-fix

  • DNS rule targets the wrong resolver (inline). In this Docker image the in-container resolver is 127.0.0.11 (Docker embedded DNS), not 127.0.0.53 (that's the host's systemd-resolved, shown only as ExtServers in /etc/resolv.conf). Verified on the live container. So the new rule is effectively dead, and direct sandbox-side getaddrinfo() (which the sandbox inherits via the container's /etc/resolv.conf) now hits the catch-all REJECT. Cleanest fix: drop the DNS exception entirely — the proxy resolves DNS as root for all HTTPS_PROXY-aware tools, and the old -o lo rule matched 0 packets in practice. Otherwise retarget to 127.0.0.11.
  • IPv6 is unprotected (inline). These are iptables (IPv4) rules only; ip6tables OUTPUT policy is ACCEPT with no rules. Not exploitable in the default deploy (Docker bridge IPv6 off, API binds IPv4-only, internal services in separate netns, seccomp blocks bind), so it's a defense-in-depth gap rather than a live hole — but worth closing by mirroring the rules in ip6tables (or asserting IPv6 is disabled).

Nits

  • No test exercises this module. The cited "21 tests pass" are in test_egress_proxy.py (the proxy, not the firewall). A pure-Python test_egress_firewall.py that mocks _run_iptables and asserts the 4 rule arg-lists + order + comment + the rollback path would be cheap and high-value.
  • Stale comments: :94 ("catch-all DROP") and :159 ("Drop everything else") say DROP but the action is REJECT; :86 docstring still says "can only reach the proxy" (DNS is also allowed now).
  • SANDBOX_UID=0 footgun: _get_sandbox_user_id() accepts 0; if misconfigured it'd firewall the root API/proxy itself. Consider rejecting uid 0.
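A sketch of what such a pure-Python test could look like, together with a uid 0 guard. Everything here is a stand-in: `apply_rules` and `validate_sandbox_uid` are hypothetical helpers written only so the example is self-contained, the uid 1001 is made up, and the real module's `_run_iptables` signature and rollback logic may differ.

```python
from unittest.mock import Mock, call


def validate_sandbox_uid(uid: int) -> int:
    """Hypothetical guard: refuse uid 0 so root traffic is never firewalled."""
    if uid <= 0:
        raise ValueError(f"sandbox uid must be a positive non-root uid, got {uid}")
    return uid


def apply_rules(run_iptables, rules):
    """Hypothetical installer: append rules in order, roll back on failure."""
    applied = []
    try:
        for rule in rules:
            run_iptables(["-A", "OUTPUT", *rule])
            applied.append(rule)
    except RuntimeError:
        # Delete whatever was already appended, newest first, then re-raise.
        for rule in reversed(applied):
            run_iptables(["-D", "OUTPUT", *rule])
        raise


ACCEPT_PROXY = ["-m", "owner", "--uid-owner", "1001", "-d", "127.0.0.1",
                "-p", "tcp", "--dport", "18443", "-j", "ACCEPT"]
REJECT_REST = ["-m", "owner", "--uid-owner", "1001", "-j", "REJECT"]


def test_rules_are_appended_in_order():
    run = Mock()
    apply_rules(run, [ACCEPT_PROXY, REJECT_REST])
    assert run.call_args_list == [
        call(["-A", "OUTPUT", *ACCEPT_PROXY]),
        call(["-A", "OUTPUT", *REJECT_REST]),
    ]


def test_failure_rolls_back_applied_rules():
    # First append succeeds, second fails, third call is the rollback delete.
    run = Mock(side_effect=[None, RuntimeError("iptables failed"), None])
    try:
        apply_rules(run, [ACCEPT_PROXY, REJECT_REST])
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected RuntimeError")
    assert run.call_args_list[-1] == call(["-D", "OUTPUT", *ACCEPT_PROXY])


def test_uid_zero_is_rejected():
    try:
        validate_sandbox_uid(0)
    except ValueError:
        pass
    else:
        raise AssertionError("uid 0 should be refused")
```

Because the iptables runner is injected and mocked, tests like these need neither root nor a container, which is what makes them cheap to run in CI.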

Test plan note: the rebuild + live re-test boxes are still unchecked, and the running container is still on the pre-PR rules — worth validating end-to-end before merge.

- ALLOW the sandbox uid → 127.0.0.1:<proxy_port> (so pip etc. work)
- DROP everything else from the sandbox uid
- ALLOW the sandbox uid → 127.0.0.53:53 (DNS via systemd-resolved)
- REJECT everything else from the sandbox uid

IPv6 not covered (defense-in-depth gap). This REJECT — and all rules here — are iptables (IPv4) only. ip6tables OUTPUT policy is ACCEPT with no rules, so over IPv6 the sandbox uid is unrestricted. Not exploitable in the default deploy (Docker bridge IPv6 off, API binds 0.0.0.0/IPv4-only, internal services in separate netns, seccomp blocks bind for non-bash) — but the invariant is absent for v6. Consider mirroring these rules via ip6tables (or asserting IPv6 is disabled) so it holds if an operator ever enables Docker IPv6.
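One way the "assert IPv6 is disabled" option might look is a startup check against the kernel sysctl. This is a sketch under assumptions: `ipv6_disabled` is a hypothetical helper, it only consults the `all` sysctl (per-interface settings can differ), and it treats an unreadable file as "not proven disabled" so the caller fails closed.

```python
from pathlib import Path


def ipv6_disabled(sysctl_path: str = "/proc/sys/net/ipv6/conf/all/disable_ipv6") -> bool:
    """Return True only if the sysctl file explicitly reports IPv6 as disabled."""
    try:
        return Path(sysctl_path).read_text().strip() == "1"
    except OSError:
        # If the file cannot be read we cannot prove IPv6 is off,
        # so report False and let the caller decide how to fail.
        return False
```

A firewall installer could call this once and either refuse to enable sandbox networking or mirror the rules via ip6tables when it returns False.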

# localhost-only services, etc.). The proxy enforces hostname allowlist
# for actual outbound; this just keeps the sandbox uid able to talk
# to itself if it ever needs to.
# Allow DNS to systemd-resolved on loopback (some tools resolve

This comment is inaccurate for the Docker runtime: systemd-resolved does not run inside the container (no systemd; CMD is python3 -m src.main), so nothing listens on 127.0.0.53. 127.0.0.53 is the host's stub resolver — see the # ExtServers: [host(127.0.0.53)] line in the container's /etc/resolv.conf. The actual in-container resolver is 127.0.0.11.

"-o",
"lo",
"-d",
"127.0.0.53",

Wrong resolver address. Verified on the live container: /etc/resolv.conf is nameserver 127.0.0.11 (Docker embedded DNS); 127.0.0.53 has no listener inside the container. Two consequences:

  1. This ACCEPT is effectively dead — it never matches real DNS traffic.
  2. The sandbox inherits the container's /etc/resolv.conf (chroot to /, the unshare --mount wrapper never remounts /etc), so any tool doing direct getaddrinfo() queries 127.0.0.11:53 → caught by the catch-all REJECT. So direct sandbox-side DNS that worked before this PR now fails.

Not a security regression (fails closed) and the main path is fine (the proxy resolves DNS as root for HTTPS_PROXY tools; the old -o lo rule matched 0 packets). Recommendation: drop both DNS rules entirely and update the docstring — the proxy already covers DNS. If you'd rather keep direct resolution, change 127.0.0.53 → 127.0.0.11 in both rules instead.
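If direct resolution is kept, the resolver address could be read from /etc/resolv.conf rather than hardcoded. A minimal sketch, assuming a hypothetical helper `parse_nameservers`; the sample text in the usage note is illustrative and not a capture from the live container.

```python
from typing import List


def parse_nameservers(resolv_conf_text: str) -> List[str]:
    """Extract the addresses from `nameserver` lines, ignoring comments."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # resolv.conf comment lines start with '#' or ';'.
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers
```

For text such as `"nameserver 127.0.0.11\n# ExtServers: [host(127.0.0.53)]\n"` this yields only `127.0.0.11`, since the ExtServers line is a comment; the firewall could then build its DNS rules from that list instead of a fixed address.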

"--uid-owner",
str(sandbox_uid),
"-d",
"127.0.0.53",

Same wrong-address issue as the UDP rule above (127.0.0.53 → should be 127.0.0.11, or drop both DNS rules).
