Commit ce44df9

andrewdunndev authored and cgwalters committed
fix(install): join host IPC namespace to prevent dm semaphore deadlock
Inside a container with an isolated IPC namespace (the podman/docker default), udevd on the host cannot see the container's semaphores, causing cryptsetup luksOpen/luksClose to deadlock on semop().

The primary fix is adding --ipc=host to the documented podman invocations. As defense-in-depth, call setns() into /proc/1/ns/ipc at the very start of global_init() when the process is in a different IPC namespace than pid 1, so that devmapper's udev synchronization works correctly even if the caller omits --ipc=host.

Signed-off-by: Andrew Dunn <andrew@dunn.dev>
1 parent d8785c6 commit ce44df9

2 files changed, with 21 additions and 3 deletions

crates/lib/src/cli.rs

Lines changed: 18 additions & 0 deletions
@@ -5,6 +5,7 @@
 use std::ffi::{CString, OsStr, OsString};
 use std::fs::File;
 use std::io::{BufWriter, Seek};
+use std::os::fd::AsFd;
 use std::os::unix::process::CommandExt;
 use std::process::Command;
 
@@ -1534,6 +1535,23 @@ async fn usroverlay(access_mode: FilesystemOverlayAccessMode) -> Result<()> {
 /// in the standard `main` function.
 #[allow(unsafe_code)]
 pub fn global_init() -> Result<()> {
+    // Join the host IPC namespace if we're in an isolated one. Inside a
+    // container with a separate IPC namespace (the podman/docker default),
+    // udevd on the host cannot see the container's semaphores, causing
+    // cryptsetup operations to deadlock on semop(). The primary fix is to
+    // run the install container with --ipc=host; this is defense-in-depth
+    // for cases where the caller forgets that flag.
+    let ns_pid1 = std::fs::read_link("/proc/1/ns/ipc").context("reading /proc/1/ns/ipc")?;
+    let ns_self = std::fs::read_link("/proc/self/ns/ipc").context("reading /proc/self/ns/ipc")?;
+    if ns_pid1 != ns_self {
+        let pid1ipcns = std::fs::File::open("/proc/1/ns/ipc").context("open pid1 ipcns")?;
+        rustix::thread::move_into_link_name_space(
+            pid1ipcns.as_fd(),
+            Some(rustix::thread::LinkNameSpaceType::InterProcessCommunication),
+        )
+        .context("setns(ipc)")?;
+        tracing::debug!("Joined pid1 IPC namespace");
+    }
     // In some cases we re-exec with a temporary binary,
     // so ensure that the syslog identifier is set.
     ostree::glib::set_prgname(bootc_utils::NAME.into());
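
The namespace comparison in this hunk can be tried in isolation. The following is a minimal, std-only sketch and not code from this commit: the helper names `parse_ns_link` and `ipc_ns_inode` are hypothetical, and it assumes the usual Linux link format `ipc:[<inode>]` for `/proc/<pid>/ns/ipc`. It only reports whether the current process shares an IPC namespace with pid 1 and does not call setns(). Reading `/proc/1/ns/ipc` generally needs root, so an unprivileged run will likely print the error branch.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Parse a namespace link target such as `ipc:[4026531839]` into its
/// namespace type and inode number. Returns `None` on unexpected input.
/// (Hypothetical helper, not part of bootc.)
fn parse_ns_link(target: &str) -> Option<(&str, u64)> {
    let (kind, rest) = target.split_once(":[")?;
    let inode = rest.strip_suffix(']')?.parse().ok()?;
    Some((kind, inode))
}

/// Read the IPC namespace inode for a /proc entry such as "1" or "self".
/// (Hypothetical helper, not part of bootc.)
fn ipc_ns_inode(proc_entry: &str) -> io::Result<u64> {
    let link = fs::read_link(Path::new("/proc").join(proc_entry).join("ns/ipc"))?;
    let text = link.to_string_lossy();
    match parse_ns_link(&text) {
        Some(("ipc", inode)) => Ok(inode),
        _ => Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("unexpected ns link: {text}"),
        )),
    }
}

fn main() {
    // Compare our IPC namespace with that of pid 1, as the commit does
    // before deciding whether to join the host namespace.
    match (ipc_ns_inode("1"), ipc_ns_inode("self")) {
        (Ok(host), Ok(me)) if host == me => {
            println!("same IPC namespace as pid 1 ({me})")
        }
        (Ok(host), Ok(me)) => {
            println!("isolated IPC namespace: self={me}, pid1={host}")
        }
        (Err(e), _) | (_, Err(e)) => eprintln!("could not inspect namespaces: {e}"),
    }
}
```

With `--pid=host`, pid 1 inside the container is the host's init, which is why comparing against `/proc/1/ns/ipc` identifies the host IPC namespace.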

docs/src/bootc-install.md

Lines changed: 3 additions & 3 deletions
@@ -60,15 +60,15 @@ to an existing system and install your container image. Failure to run
 Here's an example of using `bootc install` (root/elevated permission required):
 
 ```bash
-podman run --rm --privileged --pid=host -v /var/lib/containers:/var/lib/containers -v /dev:/dev --security-opt label=type:unconfined_t <image> bootc install to-disk /path/to/disk
+podman run --rm --privileged --pid=host --ipc=host -v /var/lib/containers:/var/lib/containers -v /dev:/dev --security-opt label=type:unconfined_t <image> bootc install to-disk /path/to/disk
 ```
 
 Note that while `--privileged` is used, this command will not perform any
 destructive action on the host system. Among other things, `--privileged`
 makes sure that all host devices are mounted into container. `/path/to/disk` is
 the host's block device where `<image>` will be installed on.
 
-The `--pid=host --security-opt label=type:unconfined_t` today
+The `--pid=host --ipc=host --security-opt label=type:unconfined_t` today
 make it more convenient for bootc to perform some privileged
 operations; in the future these requirements may be dropped.
 
@@ -191,7 +191,7 @@ process, you can create a raw disk image that you can boot via virtualization. R
 
 ```bash
 truncate -s 10G myimage.raw
-podman run --rm --privileged --pid=host --security-opt label=type:unconfined_t -v /dev:/dev -v /var/lib/containers:/var/lib/containers -v .:/output <yourimage> bootc install to-disk --generic-image --via-loopback /output/myimage.raw
+podman run --rm --privileged --pid=host --ipc=host --security-opt label=type:unconfined_t -v /dev:/dev -v /var/lib/containers:/var/lib/containers -v .:/output <yourimage> bootc install to-disk --generic-image --via-loopback /output/myimage.raw
 ```
 
 Notice that we use `--generic-image` for this use case.
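
For tooling that launches the installer programmatically, the documented `to-disk` invocation could be assembled along these lines. This is an illustrative sketch, not bootc code: `install_args` and `install_command` are hypothetical names, and the image and disk values are placeholders supplied by the caller. The flags themselves are copied from the updated documentation above. The sketch only constructs and prints the command; it never runs podman.

```rust
use std::process::Command;

/// Build the argument list for the documented `podman run` install
/// invocation, including the new `--ipc=host` flag.
/// (Hypothetical helper; `image` and `disk` are caller-supplied.)
fn install_args(image: &str, disk: &str) -> Vec<String> {
    let mut args: Vec<String> = [
        "run",
        "--rm",
        "--privileged",
        "--pid=host",
        "--ipc=host",
        "-v",
        "/var/lib/containers:/var/lib/containers",
        "-v",
        "/dev:/dev",
        "--security-opt",
        "label=type:unconfined_t",
    ]
    .iter()
    .map(|s| s.to_string())
    .collect();
    // The image name must come after all podman options.
    args.push(image.to_string());
    args.extend(["bootc", "install", "to-disk"].map(String::from));
    args.push(disk.to_string());
    args
}

/// Wrap the arguments in a `Command` without spawning it.
fn install_command(image: &str, disk: &str) -> Command {
    let mut cmd = Command::new("podman");
    cmd.args(install_args(image, disk));
    cmd
}

fn main() {
    // Print the command for inspection instead of executing it.
    let cmd = install_command("<image>", "/path/to/disk");
    println!("{cmd:?}");
}
```

Keeping the flag list in one place makes it harder for a caller to drop `--ipc=host` by accident, although per the commit message bootc also joins the host IPC namespace itself as a fallback.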
