
Fix vsock file descriptor lifetime across async boundaries#552

Closed
DePasqualeOrg wants to merge 3 commits into apple:main from DePasqualeOrg:fix-vsock-ebadf-crash

Conversation

@DePasqualeOrg
Contributor

@DePasqualeOrg DePasqualeOrg commented Feb 23, 2026

Note: I encountered a crash while using this package. The root cause analysis from Claude Code pointed to this potential issue, but since the specifics are beyond my understanding, I'm marking this as a draft PR. I put it through several rounds of review with Claude Code and Codex, which suggested that this fixes a legitimate issue. Also, PR #403 mentions that a fix remains to be found for EBADF panics. If this isn't helpful, please close this PR.

Problem

The gRPC client crashes with a precondition failure in NIO:

NIOPosix/System.swift:262: Precondition failed: unacceptable errno 9 Bad file descriptor
  in fcntl(descriptor:command:value:)

The crash occurs in BaseSocket.ignoreSIGPIPE() when NIO calls fcntl(fd, F_SETNOSIGPIPE, 1) on a file descriptor that has been invalidated.

Root cause

VZVirtioSocketConnection is not a raw POSIX socket – it bridges the process to the Virtualization daemon via XPC. When close() is called, the framework signals the hypervisor to tear down the host-to-guest vsock mapping, which invalidates all file descriptors pointing to the underlying kernel object – including dup'd ones.

The existing dupHandle() method calls self.close() immediately after dup(). This is safe when the fd is used synchronously, but dialAgent() passes the fd to gRPC's ClientConnection(.connectedSocket(fd)), which defers NIO channel creation until the first RPC call. By that time, the fd is invalid.

The same pattern exists in waitForAgent(), where the agent's first gRPC call (setTime via TimeSyncer) can be deferred by up to 30 seconds when Rosetta is not enabled.

Fix

Keep the VZVirtioSocketConnection alive until the consumer is done with the fd.

  • VsockTransport (new): A thread-safe wrapper that retains the VZVirtioSocketConnection and provides explicit, idempotent close semantics. Includes a deinit safety net.
  • VsockConnection (new): A public wrapper that pairs a dup'd fd with its VsockTransport, providing a FileHandle-like API with automatic lifetime management. Replaces FileHandle in VirtualMachineInstance.dial(), VsockListener, LinuxProcess stdio, and UnixSocketRelay.
  • retainedConnection() (new): Extension on VZVirtioSocketConnection that returns a VsockConnection bundling the dup'd fd with a VsockTransport. Used by dial() and VsockListener.
  • dupFileDescriptor() (new): Like dupHandle(), but does not close the connection. The caller is responsible for keeping the connection alive. Used by dialAgent() and waitForAgent(), where Vminitd manages the VsockTransport separately.
  • Vminitd: Gains an optional VsockTransport field. close() uses defer to ensure the transport is closed even if gRPC shutdown throws.
  • dialAgent(): Uses dupFileDescriptor() + VsockTransport instead of dupHandle().
  • waitForAgent(): Same migration – returns (FileHandle, VsockTransport) so start() can pass the transport to Vminitd.
  • VirtualMachineInstance.dial(): Now returns VsockConnection instead of FileHandle. Uses retainedConnection() so callers (stdio relay, copy, Unix socket relay) get correct lifetime management automatically.
  • VsockListener: Yields VsockConnection instead of FileHandle.
  • UnixSocketRelay: Retains the guest VsockConnection in an ActiveRelay struct for the duration of each active relay. Cleans up on relay completion and on stop().

After the fix, both the original fd (in VZVirtioSocketConnection) and the dup'd fd (used by NIO/gRPC) are open simultaneously. NIO owns and closes the dup'd fd during channel teardown. Vminitd.close() shuts down gRPC first, then closes the transport. The dup() is still necessary to avoid a double-close.
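
VsockTransport itself is Swift; its close contract can be sketched language-neutrally in C (hypothetical names, assuming a mutex-guarded closed flag, not the PR's code): close may be called from any thread, any number of times, and releases the resource exactly once.

```c
/* Hypothetical sketch of the idempotent-close contract VsockTransport
 * provides: transport_close() may be called from any thread, any number
 * of times, and the underlying fd is released exactly once. */
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

typedef struct {
    pthread_mutex_t lock;
    int fd;            /* stands in for the retained connection */
    bool closed;
} transport_t;

void transport_init(transport_t *t, int fd) {
    pthread_mutex_init(&t->lock, NULL);
    t->fd = fd;
    t->closed = false;
}

/* Returns true if this call actually closed the fd, false if it was a no-op. */
bool transport_close(transport_t *t) {
    pthread_mutex_lock(&t->lock);
    bool first = !t->closed;
    if (first) {
        close(t->fd);
        t->closed = true;
    }
    pthread_mutex_unlock(&t->lock);
    return first;
}
```

Under this contract, a deinit safety net reduces to one more transport_close() call that is harmlessly a no-op if an explicit close already ran.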

Tests

  • Unit tests (VsockTransportTests.swift): Verify fd lifecycle invariants using Unix socket pairs – the exact fcntl(F_SETNOSIGPIPE) call that triggers the NIO crash, read/write through dup'd fds, correct teardown ordering, and peer EOF behavior.
  • Integration test (testExecDeferredConnectionStability): Runs 10 sequential exec calls with 100ms delays between creating the gRPC connection and making the first RPC, exercising the exact code path that was crashing.

@dcantah
Member

dcantah commented Feb 24, 2026

Do you know what was occurring when you got the crash? Were you going to stop the vm/container? Was it just running and randomly crashed? It'd be useful to narrow that down. I personally haven't seen any EBADFs in quite a while now. Our last set was solely due to interactions with our (no longer exposed) pause/resume functionality.

@DePasqualeOrg
Contributor Author

I think I was starting or resuming a container, or trying to run a command in a container. Sorry that I can't be more specific than that.

@DePasqualeOrg force-pushed the fix-vsock-ebadf-crash branch from 4f81487 to 8c047dc on March 13, 2026 at 10:14
@DePasqualeOrg
Contributor Author

I realized that I can in fact offer more details based on crash reports. I had Codex analyze three crash reports from my app that uses this package, and these were the findings. Hopefully this helps shed more light on the issue.

  • Two of the crashes fault on NIO event-loop threads in ClientBootstrap.withConnectedSocket(...), with the top of the stack going through BaseSocket.ignoreSIGPIPE() -> Posix.fcntl(...) -> preconditionIsNotUnacceptableErrno(...). Those were the EBADF cases.
  • Another crash is an EXC_GUARD (GUARD_TYPE_FD) with CLOSE on file descriptor 41, again on an NIO event-loop thread in the ClientBootstrap.withConnectedSocket(...) path, with BaseSocket.close() / channel setup cleanup in the top frames.
  • Across all three crashes, the fault point is during connection establishment / first channel setup, not an obvious stop/pause path.

These reports support the fd-lifetime issue this PR is fixing: the connection appears to already be invalid by the time NIO tries to use it or take ownership of it. The reports do not prove whether the bad descriptor came specifically from dialAgent() or waitForAgent(), but they do seem consistent with the duplicated fd remaining in user space after the underlying vsock endpoint was torn down too early.

@DePasqualeOrg force-pushed the fix-vsock-ebadf-crash branch from 8c047dc to 633b8b2 on March 15, 2026 at 20:56
@DePasqualeOrg
Contributor Author

I'm still seeing crashes when tearing down containers. This is the latest crash analysis and summary of fixes from Codex. The changes went through several rounds of review with Codex and Claude Code:

The latest crash report still points at the same vsock fd-lifetime bug. It crashes on NIO-ELT-2-#3 with EXC_GUARD (GUARD_TYPE_FD): CLOSE on file descriptor 31, and the relevant stack is BaseSocket.close -> BaseStreamSocketChannel.close0 -> ClientBootstrap.withConnectedSocket -> DefaultChannelProvider.makeChannel -> ConnectionManager.startConnecting. That is the gRPC/vsock connection setup path.

That stack is also consistent with containerization's container stop/cleanup paths, which still go through relay shutdown plus vm.withAgent / guest cleanup work. So this still fits the same vsock/gRPC lifetime bug family.

The root cause is broader than dialAgent() alone: a VZVirtioSocketConnection could be dup'd and then closed before NIO or other async code actually used the dup'd fd. The same lifetime mistake was still present in dial(_:), VsockListener, and UnixSocketRelay.

The fixes keep the originating VZVirtioSocketConnection alive for waitForAgent() and dialAgent(), keep raw vsock dials and accepted listener connections alive across async or deferred-use boundaries via VsockConnection, make UnixSocketRelay retain the guest-side VsockConnection for the full relay lifetime, and give BidirectionalRelay its own dup of the guest fd so relay shutdown does not double-close the same descriptor.

Smaller cleanup fixes from review: StdioHandles.close() now clears readabilityHandler before close and still closes all stdio handles even if one close throws, and the old unsafe dupHandle() helper was removed.

@DePasqualeOrg changed the title from "Fix EBADF crash in vsock connection handling" to "Fix vsock file descriptor lifetime across async boundaries" on Mar 15, 2026
Exercises the dialAgent() → gRPC RPC path with deliberate delays
between creating the connection and making the first RPC call. This
reproduces a crash where NIO hits a precondition failure (EBADF) in
fcntl(F_SETNOSIGPIPE) because the VZVirtioSocketConnection was closed
before the gRPC client created the NIO channel.

When dialAgent() creates a gRPC connection via vsock, it dup's the
file descriptor and immediately closes the VZVirtioSocketConnection.
The Virtualization framework tears down the vsock endpoint when the
connection is closed, which can invalidate the dup'd descriptor. Since
gRPC defers NIO channel creation until the first RPC, the fd may be
invalid by then, causing a precondition failure in NIO's
fcntl(F_SETNOSIGPIPE).

The fix introduces VsockTransport, a Sendable wrapper that retains
the VZVirtioSocketConnection until Vminitd.close() explicitly shuts
it down after the gRPC channel. A new dupFileDescriptor() method dups
without closing, and dialAgent() passes the connection as transport.
@DePasqualeOrg force-pushed the fix-vsock-ebadf-crash branch from c4fa3e6 to d379af4 on April 5, 2026 at 11:34
@DePasqualeOrg
Contributor Author

I rebased on main and resolved conflicts. The grpc-swift-2 upgrade (#578) was reverted in #588, so the fix now works with the grpc-swift-1 API:

  • Vminitd: Keeps the VsockTransport? property, but uses the original SandboxContextAsyncClient-based init instead of manually bootstrapping a gRPC channel. close() defers transport?.close() before client.close().
  • UnixSocketRelay.stop(): Now explicitly closes guest connections and clears activeRelays synchronously, rather than deferring cleanup to an async Task waiting on dispatch cancel handlers. This fixed a flaky test where the asynchronous cleanup chain sometimes never completed.

@DePasqualeOrg DePasqualeOrg marked this pull request as ready for review April 8, 2026 04:27
@DePasqualeOrg
Contributor Author

I've been using these changes for the past few weeks, and they appear to prevent the crashes that I was seeing before. I think they're worth reviewing.

@dcantah
Member

dcantah commented Apr 8, 2026

@DePasqualeOrg I'll take a look today. I think the biggest thing for me is that the last time I traced this (across the entire stack), I physically could not see how it's possible that we'd get an invalid FD. The virtualization stack does the right thing as far as I could tell. There's nothing special about the fd that wouldn't allow it to be dup(2)'able, and nowhere in NIO is it closed out from under us. I'll review this, however, if it's resolving the crashes. It definitely seems like a race condition, because I have never seen it on an M1 machine, but have on an M4 Max.

Comment on lines +22 to +27
///
/// When a vsock connection's file descriptor is dup'd and handed to gRPC/NIO,
/// the original VZVirtioSocketConnection must remain open. The Virtualization
/// framework tears down the host-to-guest vsock mapping when the connection is
/// closed, which invalidates dup'd descriptors. This wrapper captures the
/// connection's close operation and provides explicit, idempotent close semantics.
Member

@dcantah dcantah Apr 8, 2026


This is not accurate, otherwise this project would not work at all

@DePasqualeOrg
Contributor Author

Thanks for taking another look. I'll close this, try to reproduce the crashes, and post an issue with a description of the crash reports the next time this happens.

@dcantah
Member

dcantah commented Apr 8, 2026

I think the work definitely has merit if it's preventing the crashes. Let me try and see what might make sense.
