Commit 1bdf943

devenv: Add subuid/subgid configuration for nested rootless podman
When running inside a constrained UID namespace (e.g., devaipod's rootless podman), the container only has access to a limited UID range. The previous configuration only handled cgroups but not the subuid/subgid mappings, causing nested podman to fail with 'newuidmap: write to uid_map failed'.

This adds a Python script at /usr/lib/devenv/userns-setup which:
- Detects constrained UID namespaces by parsing /proc/self/uid_map
- Configures /etc/subuid and /etc/subgid with the available UID range
- Uses SUDO_USER to correctly configure for the target user when run via sudo
- Resets podman storage if mappings change
- Preserves other users' entries when updating subuid/subgid
- Creates containers.conf with cgroups=disabled for constrained namespaces

The devenv-init.sh becomes a thin wrapper calling the Python implementation.

Also adds chmod u+s for newuidmap/newgidmap on C10s, as shadow-utils doesn't set setuid by default (unlike Debian's uidmap package).

Also adds a GitHub Actions workflow and just target to test nested podman by pulling and running busybox in the devcontainer images.

Tested by running bootc workspace inside devaipod and successfully pulling and running containers with nested podman.

Assisted-by: OpenCode (Claude claude-opus-4-5-20250514)
Signed-off-by: Colin Walters <walters@verbum.org>

# Conflicts:
#	devenv/.dockerignore
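As a worked example of the detection and range calculation described above, here is a minimal sketch that mirrors the logic of the userns-setup script in this commit but operates on strings, so it can be run outside a container. The function names `max_mapped_uid` and `subuid_entry`, the user name `devenv`, and the sample map contents are assumptions made for illustration; they are not part of the commit.

```python
from __future__ import annotations


def max_mapped_uid(uid_map_text: str) -> int:
    """Return the end (exclusive) of the highest UID range mapped in the namespace."""
    max_uid = 0
    for line in uid_map_text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            # Columns: UID inside the namespace, UID outside, length of the range.
            inside, count = int(parts[0]), int(parts[2])
            max_uid = max(max_uid, inside + count)
    return max_uid


def subuid_entry(user: str, uid: int, max_uid: int) -> str | None:
    """Build the subuid/subgid line, or None if no override should be written."""
    if not (1000 < max_uid < 100000):
        return None  # treated as a full namespace; default podman config applies
    start = uid + 1
    count = max_uid - start
    if count < 1000:
        return None  # too few UIDs left over for nested podman
    return f"{user}:{start}:{count}"


# Hypothetical constrained namespace: 65536 UIDs mapped starting at 0.
constrained_map = "         0     100000      65536\n"
# Hypothetical unconstrained namespace: the full 32-bit range is mapped.
full_map = "         0          0 4294967295\n"

entry = subuid_entry("devenv", 1000, max_mapped_uid(constrained_map))
print(entry)
print(subuid_entry("devenv", 1000, max_mapped_uid(full_map)))
```

With the hypothetical 65536-UID map and a user whose UID is 1000, the range starts at 1001 and covers the remaining 64535 UIDs; with the full map no entry is produced.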
1 parent 5a09005 commit 1bdf943

9 files changed

Lines changed: 327 additions & 28 deletions

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+name: Test DevContainer
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    paths:
+      - 'devenv/**'
+      - 'common/.devcontainer/**'
+      - '.github/workflows/test-devcontainer.yml'
+
+env:
+  REGISTRY: ghcr.io
+
+jobs:
+  test:
+    runs-on: ubuntu-24.04
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [debian, c10s]
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v6
+
+      - name: Set up runner
+        uses: bootc-dev/actions/bootc-ubuntu-setup@main
+
+      - name: Login to GitHub Container Registry
+        uses: docker/login-action@v3
+        with:
+          registry: ${{ env.REGISTRY }}
+          username: ${{ github.repository_owner }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Test nested podman in devcontainer
+        run: just devcontainer-test ${{ env.REGISTRY }}/${{ github.repository_owner }}/devenv-${{ matrix.os }}:latest

Justfile

Lines changed: 11 additions & 0 deletions
@@ -12,3 +12,14 @@ devenv-build-c10s:
 
 # Build devenv image with local tag (defaults to Debian)
 devenv-build: devenv-build-debian
+
+# Test nested podman and VMs work in a devcontainer image
+# Usage: just devcontainer-test <image>
+# Example: just devcontainer-test ghcr.io/bootc-dev/devenv-debian:latest
+# Note: We avoid --privileged by using minimal security options for nested containers:
+# - label=disable: Required for mounting /proc in nested user namespace
+# - unmask=/proc/*: Allows access to /proc paths needed for nested containers
+# - /dev/net/tun: Required for pasta networking in nested containers
+# - /dev/kvm: Required for running VMs with bcvk
+devcontainer-test image:
+    docker run --rm --security-opt label=disable --security-opt unmask=/proc/* --device /dev/net/tun --device /dev/kvm "{{ image }}" /usr/libexec/devenv-selftest.sh

common/.devcontainer/devcontainer.json

Lines changed: 8 additions & 3 deletions
@@ -13,9 +13,14 @@
   },
   "features": {},
   "runArgs": [
-    // Because we want to be able to run podman and also use e.g. /dev/kvm
-    // among other things
-    "--privileged"
+    // Minimal security options for nested podman (avoids --privileged):
+    // - label=disable: Required for mounting /proc in nested user namespace
+    // - unmask=/proc/*: Allows access to /proc paths needed for nested containers
+    "--security-opt", "label=disable",
+    "--security-opt", "unmask=/proc/*",
+    // Device access for nested containers and VMs
+    "--device", "/dev/net/tun",
+    "--device", "/dev/kvm"
   ],
   "postCreateCommand": {
     // Our init script

devenv/.dockerignore

Lines changed: 2 additions & 0 deletions
@@ -15,3 +15,5 @@
 !fetch-tools.sh
 !install-rust.sh
 !install-kani.sh
+!devenv-selftest.sh
+!userns-setup

devenv/Containerfile.c10s

Lines changed: 2 additions & 0 deletions
@@ -71,6 +71,8 @@ ENV RUSTUP_HOME=/usr/local/rustup
 ENV KANI_HOME=/usr/local/kani
 # Setup for codespaces
 COPY devenv-init.sh /usr/local/bin/
+COPY userns-setup /usr/lib/devenv/userns-setup
+COPY devenv-selftest.sh /usr/libexec/
 
 WORKDIR /
 # Create user before declaring volumes so home directory has correct ownership

devenv/Containerfile.debian

Lines changed: 2 additions & 0 deletions
@@ -71,6 +71,8 @@ ENV RUSTUP_HOME=/usr/local/rustup
 ENV KANI_HOME=/usr/local/kani
 # Setup for codespaces
 COPY devenv-init.sh /usr/local/bin/
+COPY userns-setup /usr/lib/devenv/userns-setup
+COPY devenv-selftest.sh /usr/libexec/
 
 WORKDIR /
 # Create user before declaring volumes so home directory has correct ownership

devenv/devenv-init.sh

Lines changed: 2 additions & 25 deletions
@@ -1,26 +1,3 @@
 #!/bin/bash
-set -euo pipefail
-# Set things up so that podman can run nested inside the privileged
-# docker container of a codespace or devpod.
-
-# Fix the propagation - only needed in some environments (e.g., codespaces)
-# In devpod with rootless podman, / may already have shared propagation
-# or we may not have permission to remount it.
-propagation=$(findmnt -J -o TARGET,PROPAGATION / | jq -r '.filesystems[0].propagation // "unknown"')
-if [ "$propagation" = "private" ]; then
-    if mount -o remount --make-shared / 2>/dev/null; then
-        echo "Set / to shared propagation"
-    else
-        echo "Warning: Could not set / to shared propagation (may not be needed)"
-    fi
-fi
-
-# This is actually safe to expose to all users really, like Fedora derivatives do
-if [ -e /dev/kvm ]; then
-    chmod a+rw /dev/kvm 2>/dev/null || true
-fi
-
-# Handle nested cgroups - update containers.conf if it exists and has the settings commented out
-if [ -f /usr/share/containers/containers.conf ]; then
-    sed -i -e 's,^#cgroups =.*,cgroups = "no-conmon",' -e 's,^#cgroup_manager =.*,cgroup_manager = "cgroupfs",' /usr/share/containers/containers.conf
-fi
+# Thin wrapper that calls the Python implementation
+exec python3 /usr/lib/devenv/userns-setup "$@"

devenv/devenv-selftest.sh

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+#!/bin/bash
+# Test that nested podman and VMs work correctly in this devcontainer.
+# This script is designed to be run inside the container after devenv-init.sh.
+set -euo pipefail
+
+echo "=== Testing nested podman and VMs ==="
+
+echo "Running devenv-init.sh..."
+sudo /usr/local/bin/devenv-init.sh
+
+echo "Podman version:"
+podman --version
+
+echo "Podman info (rootless):"
+podman info --format '{{.Host.Security.Rootless}}'
+
+# Use CentOS Stream 10 as the test image for both container and VM
+image="quay.io/centos-bootc/centos-bootc:stream10"
+
+echo "Pulling $image..."
+podman pull "$image"
+
+echo "Running nested container..."
+podman run --rm "$image" echo "Hello from nested podman!"
+
+echo "=== Nested container test passed ==="
+
+# Test bcvk (VM) if available and /dev/kvm exists
+if command -v bcvk >/dev/null 2>&1 && [ -e /dev/kvm ]; then
+    echo ""
+    echo "=== Testing bcvk VM ==="
+    echo "bcvk version:"
+    bcvk --version
+
+    echo "Running bcvk ephemeral VM with SSH..."
+    bcvk ephemeral run-ssh "$image" -- echo "Hello from bcvk VM!"
+
+    echo "=== bcvk VM test passed ==="
+else
+    echo ""
+    echo "=== Skipping bcvk VM test (bcvk not available or /dev/kvm missing) ==="
+fi
+
+echo ""
+echo "=== All tests passed ==="

devenv/userns-setup

Lines changed: 217 additions & 0 deletions
@@ -0,0 +1,217 @@
+#!/usr/bin/env python3
+"""
+Set up nested podman inside privileged docker/podman containers (codespaces, devpod).
+
+This handles:
+- Mount propagation fixes
+- /dev/kvm permissions
+- subuid/subgid configuration for constrained UID namespaces
+- containers.conf configuration for nested operation
+"""
+
+import argparse
+import json
+import os
+import shutil
+import subprocess
+import sys
+from pathlib import Path
+
+
+def run_cmd(cmd: list[str], check: bool = True, capture: bool = False) -> subprocess.CompletedProcess:
+    """Run a command, optionally capturing output."""
+    return subprocess.run(cmd, check=check, capture_output=capture, text=True)
+
+
+def get_mount_propagation(target: str) -> str:
+    """Get mount propagation type for a given mount point."""
+    result = run_cmd(["findmnt", "-J", "-o", "TARGET,PROPAGATION", target], capture=True, check=False)
+    if result.returncode != 0:
+        return "unknown"
+    try:
+        data = json.loads(result.stdout)
+        return data.get("filesystems", [{}])[0].get("propagation", "unknown")
+    except (json.JSONDecodeError, IndexError, KeyError):
+        return "unknown"
+
+
+def fix_mount_propagation() -> None:
+    """Fix root mount propagation if needed (e.g., in codespaces)."""
+    propagation = get_mount_propagation("/")
+    if propagation == "private":
+        result = run_cmd(["mount", "-o", "remount", "--make-shared", "/"], check=False)
+        if result.returncode == 0:
+            print("Set / to shared propagation")
+        else:
+            print("Warning: Could not set / to shared propagation (may not be needed)")
+
+
+def fix_kvm_permissions() -> None:
+    """Make /dev/kvm accessible to all users (safe, like Fedora derivatives do)."""
+    kvm = Path("/dev/kvm")
+    if kvm.exists():
+        try:
+            kvm.chmod(0o666)
+        except PermissionError:
+            pass
+
+
+def detect_constrained_namespace() -> tuple[bool, int]:
+    """
+    Detect whether we're in a constrained UID namespace.
+
+    Returns:
+        (is_constrained, max_uid): True if constrained (1000-100000 UIDs available),
+        along with the maximum usable UID.
+    """
+    max_uid = 0
+    try:
+        with open("/proc/self/uid_map") as f:
+            for line in f:
+                parts = line.split()
+                if len(parts) >= 3:
+                    inside = int(parts[0])
+                    count = int(parts[2])
+                    end = inside + count
+                    if end > max_uid:
+                        max_uid = end
+    except (OSError, ValueError):
+        return False, 0
+
+    # Constrained if between 1000 and 100000 UIDs
+    is_constrained = 1000 < max_uid < 100000
+    return is_constrained, max_uid
+
+
+def configure_subuid_subgid(target_user: str | None = None) -> None:
+    """
+    Configure subuid/subgid for nested rootless podman in constrained UID namespaces.
+
+    Args:
+        target_user: Username to configure. Defaults to SUDO_USER or current user.
+    """
+    # Only proceed if podman is available
+    if not shutil.which("podman"):
+        return
+
+    # Check for newuidmap/newgidmap
+    if not shutil.which("newuidmap"):
+        print("Warning: newuidmap not found, nested podman may fail")
+
+    is_constrained, max_uid = detect_constrained_namespace()
+    if not is_constrained:
+        print(f"Full UID namespace available (max={max_uid}), using default podman config")
+        return
+
+    # Determine target user
+    if target_user is None:
+        target_user = os.environ.get("SUDO_USER")
+    if target_user is None:
+        import pwd
+        target_user = pwd.getpwuid(os.getuid()).pw_name
+
+    # Get target user's UID
+    import pwd
+    try:
+        target_uid = pwd.getpwnam(target_user).pw_uid
+    except KeyError:
+        print(f"Warning: User {target_user} not found")
+        return
+
+    # Calculate subuid range
+    subuid_start = target_uid + 1
+    subuid_count = max_uid - subuid_start
+
+    if subuid_count < 1000:
+        print(f"Insufficient UID range for nested podman (only {subuid_count} UIDs available)")
+        return
+
+    expected = f"{target_user}:{subuid_start}:{subuid_count}"
+
+    # Check if already configured correctly
+    subuid_path = Path("/etc/subuid")
+    if subuid_path.exists():
+        current = None
+        for line in subuid_path.read_text().splitlines():
+            if line.startswith(f"{target_user}:"):
+                current = line
+                break
+        if current == expected:
+            print(f"Nested podman subuid/subgid already configured for {target_user}")
+            return
+
+    print(f"Configuring nested podman for {target_user} (subuid {subuid_start}:{subuid_count})")
+
+    # Configure subuid/subgid
+    for path in [Path("/etc/subuid"), Path("/etc/subgid")]:
+        lines = []
+        if path.exists():
+            lines = [line for line in path.read_text().splitlines()
+                     if not line.startswith(f"{target_user}:")]
+        lines.append(expected)
+        path.write_text("\n".join(lines) + "\n")
+
+    # Reset podman storage if it exists (may have wrong UID mappings)
+    import pwd
+    user_home = Path(pwd.getpwnam(target_user).pw_dir)
+    storage_dir = user_home / ".local/share/containers/storage"
+    if storage_dir.exists():
+        print("Resetting podman storage for new UID mappings")
+        shutil.rmtree(storage_dir)
+
+    print("Nested podman subuid/subgid configured successfully")
+
+
+def configure_containers_conf() -> None:
+    """Configure containers.conf for nested container operation."""
+    if not shutil.which("podman"):
+        return
+
+    is_constrained, _ = detect_constrained_namespace()
+
+    if not is_constrained:
+        # Full namespace - just update the shipped config
+        conf_path = Path("/usr/share/containers/containers.conf")
+        if conf_path.exists():
+            content = conf_path.read_text()
+            content = content.replace("#cgroups =", 'cgroups = "no-conmon" #')
+            content = content.replace("#cgroup_manager =", 'cgroup_manager = "cgroupfs" #')
+            conf_path.write_text(content)
+    else:
+        # Constrained namespace - create full config for nested operation
+        conf_dir = Path("/etc/containers")
+        conf_dir.mkdir(parents=True, exist_ok=True)
+        conf_path = conf_dir / "containers.conf"
+        conf_path.write_text("""\
+# Generated for nested container support in constrained UID namespace
+[containers]
+cgroups = "disabled"
+utsns = "host"
+
+[engine]
+cgroup_manager = "cgroupfs"
+""")
+        print("Configured containers.conf for constrained UID namespace")
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(
+        description="Configure nested podman for devcontainers"
+    )
+    parser.add_argument(
+        "user",
+        nargs="?",
+        help="Target user for subuid/subgid configuration (default: SUDO_USER or current user)",
+    )
+    args = parser.parse_args()
+
+    fix_mount_propagation()
+    fix_kvm_permissions()
+    configure_subuid_subgid(args.user)
+    configure_containers_conf()
+
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
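To illustrate how the subuid/subgid rewrite in the script above preserves other users' entries, the following sketch applies the same filter-and-append step to an in-memory string instead of /etc/subuid. The helper name `merge_subid` and the sample users `alice` and `devenv` are hypothetical and appear only in this example.

```python
def merge_subid(existing: str, user: str, entry: str) -> str:
    """Drop any existing lines for `user`, keep all other lines, append `entry`."""
    lines = [line for line in existing.splitlines()
             if not line.startswith(f"{user}:")]
    lines.append(entry)
    return "\n".join(lines) + "\n"


# Hypothetical file contents before the update.
before = "alice:100000:65536\ndevenv:100000:65536\n"
after = merge_subid(before, "devenv", "devenv:1001:64535")
print(after, end="")
```

The entry for `alice` is left untouched, while the stale `devenv` line is replaced by the new range. Matching on the `user:` prefix, including the colon, avoids clobbering a different user whose name merely starts with the same characters.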
