Fix epoll_ctl dropping registrations in multi-threaded guests
In a multi-threaded guest, host_fd_ref_open() hands back a dup of the
target fd that host_fd_ref_close() closes when the syscall returns.
sys_epoll_ctl() used that transient dup as the kqueue knote ident, and
the kernel drops a knote the moment its fd is closed -- so every epoll
registration made while multi-threaded was torn down the instant
epoll_ctl() returned, and epoll_pwait() never reported readiness again.
Single-threaded guests borrow the raw fd (no dup, no close) and never
hit it. Node's libuv DelayedTaskScheduler (eventfd + epoll backing
uv_async_send) relied on this path and hung forever at process exit:
the main thread blocked in WorkerThreadsTaskRunner::Shutdown ->
uv_thread_join on a scheduler thread that could no longer be woken.
Key the knote on the persistent host fd from the fd table. Take it from
the same atomic fd_snapshot() that validates the fd, so the ident comes
from the entry that was validated rather than a second fd_to_host()
lookup that could race a concurrent close/reopen. Result mapping already
uses udata (the guest fd), so the ident only needs to stay open and
refer to the same open file description.
Guard the close+reopen ABA with a per-slot generation counter. fd_table
entries now carry a monotonic generation bumped on every allocation;
epoll registrations stamp it at ADD/MOD. If the guest closes a watched
fd and reopens it (reusing the guest fd number), the kernel has already
dropped the original knote, yet reg->active still looks live -- a later
DEL/MOD would EV_DELETE the wrong knote on the reused host fd. A
mismatched generation now marks the registration gone, so DEL/MOD report
ENOENT (matching Linux's auto-removal on close) and ADD starts fresh.
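
The generation check can be sketched roughly as follows. This is a minimal model, not the code from this change: the struct layouts and the names slot_alloc(), slot_close(), reg_live(), ctl_add() and ctl_del() are hypothetical, the table is a fixed array with no locking, and the kqueue calls are left out so that only the stamp-and-compare logic remains.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOTS 8

/* Hypothetical fd-table slot: gen is bumped on every allocation. */
struct slot { bool open; uint64_t gen; };
/* Hypothetical epoll registration: remembers the generation it was stamped with. */
struct reg { bool active; uint64_t gen; };

static struct slot table[SLOTS];
static struct reg regs[SLOTS];

/* Allocate (or reuse) a guest fd number; each allocation gets a new generation. */
static int slot_alloc(int fd) {
    table[fd].open = true;
    table[fd].gen++;
    return fd;
}

static void slot_close(int fd) { table[fd].open = false; }

/* A registration whose stamp no longer matches the slot is treated as gone:
 * the watched fd was closed and the number reused, so nothing is left to delete. */
static bool reg_live(int fd) {
    if (regs[fd].active && regs[fd].gen != table[fd].gen)
        regs[fd].active = false;
    return regs[fd].active;
}

/* ADD: starts fresh if the old registration was stale, else EEXIST. */
static int ctl_add(int fd) {
    if (!table[fd].open) return -EBADF;
    if (reg_live(fd)) return -EEXIST;
    regs[fd].active = true;
    regs[fd].gen = table[fd].gen;   /* stamp the current generation */
    return 0;
}

/* DEL: reports ENOENT instead of deleting a knote on the reused fd. */
static int ctl_del(int fd) {
    if (!table[fd].open) return -EBADF;
    if (!reg_live(fd)) return -ENOENT;
    regs[fd].active = false;
    return 0;
}
```

In this model, closing guest fd 3 and allocating it again makes a following ctl_del(3) return -ENOENT, while ctl_add(3) succeeds and stamps the new generation.
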
Also implement the FIONBIO / FIOCLEX / FIONCLEX ioctls, which were
falling through to ENOTTY. libuv's uv_pipe_open() sets non-blocking via
FIONBIO, so Node's console.log() to a pipe threw "open ENOTTY". FIONBIO
maps to F_SETFL O_NONBLOCK (status flag, shared across the dup).
FIOCLEX/FIONCLEX mirror F_SETFD by toggling the fd_table cloexec bit
rather than the host fd's FD_CLOEXEC, which is per-descriptor and lost on
the dup. They need no host fd, so they dispatch before
host_fd_ref_open_regular_io() -- which rejects O_PATH (FD_PATH) with
EBADF, while Linux allows these ioctls (like F_SETFD) on O_PATH fds --
and validate the slot and flip the flag in a single fd_lock section, so
there is no validate-then-mutate window for a concurrent close/reuse to
flip cloexec on a different file.
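
The difference between the two flag kinds can be observed on a plain host with a small probe like the one below. It is an illustrative sketch only, assuming a POSIX host that defines FIONBIO and FIOCLEX in <sys/ioctl.h>; the helper names are hypothetical and nothing here is taken from the emulator itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Returns 1 if O_NONBLOCK set with FIONBIO on one fd is visible through a
 * dup of it (status flags live on the shared open file description),
 * 0 if it is not, -1 on setup failure. */
static int fionbio_shared_across_dup(void) {
    int p[2], on = 1, result = -1;
    if (pipe(p) != 0) return -1;
    int d = dup(p[0]);
    if (d >= 0 && ioctl(p[0], FIONBIO, &on) == 0) {
        int fl = fcntl(d, F_GETFL);          /* read the flags via the dup */
        if (fl >= 0) result = (fl & O_NONBLOCK) ? 1 : 0;
    }
    if (d >= 0) close(d);
    close(p[0]);
    close(p[1]);
    return result;
}

/* Returns 1 if FD_CLOEXEC set with FIOCLEX stays on the original descriptor
 * only (descriptor flags are per-fd and do not reach the dup),
 * 0 otherwise, -1 on setup failure. */
static int fioclex_is_per_descriptor(void) {
    int p[2], result = -1;
    if (pipe(p) != 0) return -1;
    int d = dup(p[0]);
    if (d >= 0 && ioctl(p[0], FIOCLEX) == 0) {
        int orig = fcntl(p[0], F_GETFD);
        int copy = fcntl(d, F_GETFD);
        if (orig >= 0 && copy >= 0)
            result = ((orig & FD_CLOEXEC) && !(copy & FD_CLOEXEC)) ? 1 : 0;
    }
    if (d >= 0) close(d);
    close(p[0]);
    close(p[1]);
    return result;
}
```

Both helpers are expected to return 1 on such a host. That asymmetry is why FIONBIO can be forwarded to the host fd while the cloexec state has to be tracked in the fd table.
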
Add tests/test-epoll-mt.c: a CLONE_THREAD sibling keeps the guest
multi-threaded across epoll_ctl, then asserts a registered eventfd and
pipe still deliver an EPOLLIN edge. It fails without the poll.c fix.
Add tests/test-ioctl-cloexec.c covering FIOCLEX/FIONCLEX round-trip on
both a regular and an O_PATH fd. Both are listed in tests/manifest.txt
so the driver runs them under make check.
With these, node:alpine (node v26.3.0) runs JavaScript, timers, the
libuv threadpool, and promises, and exits cleanly.