Fix vsock file descriptor lifetime across async boundaries#552
Fix vsock file descriptor lifetime across async boundaries#552DePasqualeOrg wants to merge 3 commits intoapple:mainfrom
Conversation
|
Do you know what was occurring when you got the crash? Were you going to stop the vm/container? Was it just running and randomly crashed? It'd be useful to narrow that down. I personally haven't seen any EBADFs in quite awhile now. Our last set was solely due to interactions with our (no longer exposed) pause/resume functionality. |
|
I think I was starting or resuming a container, or trying to run a command in a container. Sorry that I can't be more specific than that. |
4f81487 to
8c047dc
Compare
|
I realized that I can in fact offer more details based on crash reports. I had Codex analyze three crash reports from my app that uses this package, and these were the findings. Hopefully this helps shed more light on the issue.
|
8c047dc to
633b8b2
Compare
|
I'm still seeing crashes when tearing down containers. This is the latest crash analysis and summary of fixes from Codex. The changes went through several rounds of review with Codex and Claude Code:
|
Exercises the dialAgent() → gRPC RPC path with deliberate delays between creating the connection and making the first RPC call. This reproduces a crash where NIO hits a precondition failure (EBADF) in fcntl(F_SETNOSIGPIPE) because the VZVirtioSocketConnection was closed before the gRPC client created the NIO channel.
When dialAgent() creates a gRPC connection via vsock, it dup's the file descriptor and immediately closes the VZVirtioSocketConnection. The Virtualization framework tears down the vsock endpoint when the connection is closed, which can invalidate the dup'd descriptor. Since gRPC defers NIO channel creation until the first RPC, the fd may be invalid by then, causing a precondition failure in NIO's fcntl(F_SETNOSIGPIPE). The fix introduces VsockTransport, a Sendable wrapper that retains the VZVirtioSocketConnection until Vminitd.close() explicitly shuts it down after the gRPC channel. A new dupFileDescriptor() method dups without closing, and dialAgent() passes the connection as transport.
c4fa3e6 to
d379af4
Compare
|
I rebased on main and resolved conflicts. The grpc-swift-2 upgrade (#578) was reverted in #588, so the fix now works with the grpc-swift-1 API:
|
|
I've been using these changes for the past few weeks, and they appear to prevent the crashes that I was seeing before. I think they're worth reviewing. |
|
@DePasqualeOrg I'll take a look today. I think the biggest thing for me is last time I traced this (across the entire stack), I physically could not see how this is possible that we'd get an invalid FD. The virtualization stack does the right thing as far as I could tell. There's nothing special about the fd that wouldn't allow it to be dup(2)'able, and nowhere in NIO is it closed out from under us. I'll review this however if this is resolving the crashes. It definitely seems like a race condition because I have never seen it on an M1 machine, but have on an M4 Max |
| /// | ||
| /// When a vsock connection's file descriptor is dup'd and handed to gRPC/NIO, | ||
| /// the original VZVirtioSocketConnection must remain open. The Virtualization | ||
| /// framework tears down the host-to-guest vsock mapping when the connection is | ||
| /// closed, which invalidates dup'd descriptors. This wrapper captures the | ||
| /// connection's close operation and provides explicit, idempotent close semantics. |
There was a problem hiding this comment.
This is not accurate, otherwise this project would not work at all
|
Thanks for taking another look. I'll close this, try to reproduce the crashes, and post an issue with a description of the crash reports the next time this happens. |
|
I think the work definitely has merit if it's preventing the crashes. Let me try and see what might make sense.. |
Problem
The gRPC client crashes with a precondition failure in NIO:
The crash occurs in
BaseSocket.ignoreSIGPIPE()when NIO callsfcntl(fd, F_SETNOSIGPIPE, 1)on a file descriptor that has been invalidated.Root cause
VZVirtioSocketConnectionis not a raw POSIX socket – it bridges the process to the Virtualization daemon via XPC. Whenclose()is called, the framework signals the hypervisor to tear down the host-to-guest vsock mapping, which invalidates all file descriptors pointing to the underlying kernel object – including dup'd ones.The existing
dupHandle()method callsself.close()immediately afterdup(). This is safe when the fd is used synchronously, butdialAgent()passes the fd to gRPC'sClientConnection(.connectedSocket(fd)), which defers NIO channel creation until the first RPC call. By that time, the fd is invalid.The same pattern exists in
waitForAgent(), where the agent's first gRPC call (setTimeviaTimeSyncer) can be deferred by up to 30 seconds when Rosetta is not enabled.Fix
Keep the
VZVirtioSocketConnectionalive until the consumer is done with the fd.VsockTransport(new): A thread-safe wrapper that retains theVZVirtioSocketConnectionand provides explicit, idempotent close semantics. Includes adeinitsafety net.VsockConnection(new): A public wrapper that pairs a dup'd fd with itsVsockTransport, providing aFileHandle-like API with automatic lifetime management. ReplacesFileHandleinVirtualMachineInstance.dial(),VsockListener,LinuxProcessstdio, andUnixSocketRelay.retainedConnection()(new): Extension onVZVirtioSocketConnectionthat returns aVsockConnectionbundling the dup'd fd with aVsockTransport. Used bydial()andVsockListener.dupFileDescriptor()(new): LikedupHandle(), but does not close the connection. The caller is responsible for keeping the connection alive. Used bydialAgent()andwaitForAgent(), whereVminitdmanages theVsockTransportseparately.Vminitd: Gains an optionalVsockTransportfield.close()usesdeferto ensure the transport is closed even if gRPC shutdown throws.dialAgent(): UsesdupFileDescriptor()+VsockTransportinstead ofdupHandle().waitForAgent(): Same migration – returns(FileHandle, VsockTransport)sostart()can pass the transport toVminitd.VirtualMachineInstance.dial(): Now returnsVsockConnectioninstead ofFileHandle. UsesretainedConnection()so callers (stdio relay, copy, Unix socket relay) get correct lifetime management automatically.VsockListener: YieldsVsockConnectioninstead ofFileHandle.UnixSocketRelay: Retains the guestVsockConnectionin anActiveRelaystruct for the duration of each active relay. Cleans up on relay completion and onstop().After the fix, both the original fd (in
VZVirtioSocketConnection) and the dup'd fd (used by NIO/gRPC) are open simultaneously. NIO owns and closes the dup'd fd during channel teardown.Vminitd.close()shuts down gRPC first, then closes the transport. Thedup()is still necessary to avoid a double-close.Tests
VsockTransportTests.swift): Verify fd lifecycle invariants using Unix socket pairs – the exactfcntl(F_SETNOSIGPIPE)call that triggers the NIO crash, read/write through dup'd fds, correct teardown ordering, and peer EOF behavior.testExecDeferredConnectionStability): Runs 10 sequential exec calls with 100ms delays between creating the gRPC connection and making the first RPC, exercising the exact code path that was crashing.