Skip to content

fix: defer PTY slave drop to prevent macOS data loss race#164

Merged
branchseer merged 2 commits intomainfrom
fix/pty-macos-slave-drop-race
Feb 16, 2026
Merged

fix: defer PTY slave drop to prevent macOS data loss race#164
branchseer merged 2 commits intomainfrom
fix/pty-macos-slave-drop-race

Conversation

@branchseer
Copy link
Copy Markdown
Member

Summary

Fixes a flaky is_terminal test failure on macOS (pty_terminal::tests::terminal::is_terminal) observed in CI run #22038626101.

Root Cause

Terminal::spawn previously dropped the parent's slave PTY fd immediately after spawning the child:

drop(pty_pair.slave); // Critical: drop slave so EOF is signaled when child exits

On macOS, when a child process writes output and exits quickly, all slave references close before the reader ever calls read() on the master. macOS then returns EIO on the master PTY without draining the output buffer, causing the child's output to be permanently lost. portable-pty converts EIO to Ok(0) (EOF), so read_to_end returns with zero bytes and screen_contents() is empty.

The failure is a race between child exit and the first read():

Scenario Result
read() pending before child exits Data delivered normally
read() called after all slave fds close macOS returns EIO, data lost

Reproduction

Adding thread::sleep(Duration::from_secs(2)) before read_to_end in the is_terminal test deterministically reproduces the failure — the child always exits during the sleep, and the first read() returns 0 with empty screen contents.

Fix

Move the slave fd into the background monitoring thread instead of dropping it immediately in spawn(). The thread holds the slave alive until after child.wait() returns, then drops writer and slave in sequence:

thread::spawn(move || {
    child.wait();       // wait for exit
    writer = None;      // close writer
    drop(slave);        // NOW trigger EOF on the reader
});

This keeps the PTY in a "connected" state while the child runs, so the kernel output buffer is preserved regardless of when the reader starts.

Verification

  • Stable repro: With a 2s sleep before read_to_end, the old code fails 100% of the time; the fix passes 100%.
  • macOS: 30/30 runs of the full pty_terminal test suite pass.
  • Windows (cargo xtest): 9/9 tests pass on aarch64-pc-windows-msvc.
  • Full suite: cargo test (all crates) passes with no regressions.

On macOS, when the parent's slave fd is closed immediately after spawn
and the child exits quickly, all slave references close before the
reader issues its first read(). macOS returns EIO on the master PTY
without draining the output buffer, causing data loss.

Move the slave fd into the background monitoring thread and drop it
after child.wait() returns. This keeps the PTY in a connected state
while the child runs, guaranteeing buffered output is preserved
regardless of reader timing.
@branchseer branchseer merged commit bddad36 into main Feb 16, 2026
6 checks passed
@branchseer branchseer deleted the fix/pty-macos-slave-drop-race branch February 16, 2026 04:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant