fix: resolve flaky PTY tests on macOS and Windows#166
Closed
branchseer wants to merge 1 commit intomainfrom
Closed
fix: resolve flaky PTY tests on macOS and Windows#166branchseer wants to merge 1 commit intomainfrom
branchseer wants to merge 1 commit intomainfrom
Conversation
Defer PTY slave and master cleanup to background thread after child.wait() instead of dropping them eagerly during spawn. macOS (is_terminal flaky empty output): dropping the slave fd immediately after spawn meant that when the child exited quickly, all slave references closed before the reader's first read(). macOS returns EIO without draining the output buffer, causing portable-pty to report Ok(0) and read_to_end to return empty data. Windows (read_to_end_returns_exit_status_nonzero timeout): the reader pipe only gets EOF when ClosePseudoConsole is called, which requires all Arc<Mutex<Inner>> references (held by master and slave) to be dropped. The master was held in PtyWriter and never explicitly dropped, so EOF depended on undocumented ConPTY auto-close behavior that is unreliable under CI load. The fix wraps master in Arc<Mutex<Option<...>>> (like writer) and drops writer, master, and slave sequentially in the background thread after child.wait() returns. This deterministically triggers EOF on both platforms regardless of timing.
Member
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes two flaky PTY test failures by deferring slave and master cleanup to the background thread after
child.wait()returns, instead of dropping them eagerly during spawn.Bug 1: macOS
is_terminal— empty output (CI #22038626101)Root cause: The parent's slave fd was dropped immediately after spawn (
drop(pty_pair.slave)). When the child exited quickly, all slave references closed before the reader's firstread(). macOS returns EIO on the master fd without draining the output buffer. portable-pty converts EIO →Ok(0), soread_to_endreturns empty data.Reproduction: Adding
thread::sleep(2s)beforeread_to_endin theis_terminaltest deterministically produced empty output (child exits during the sleep, all slave refs close, reader gets EIO on first read).Fix: Keep the slave alive in the background thread. It is dropped only after
child.wait()returns, ensuring the output buffer is fully drained before EIO can occur.Bug 2: Windows
read_to_end_returns_exit_status_nonzero— 5s timeout (CI #22038894925)Root cause: The reader pipe gets EOF only when
ClosePseudoConsoleis called, which requires allArc<Mutex<Inner>>references (held by both master and slave) to reach zero. The slave was already dropped, but the master lived insidePtyWriterand was never explicitly dropped. EOF depended on undocumented ConPTY auto-close behavior that is unreliable under CI load.Reproduction: Adding a 10s sleep in the background thread before closing the writer caused
read_to_endto hang and the test to timeout at 5s — proving EOF depends on cleanup timing, not child exit. With the fix applied, the same test passes even with a 3s delay because the master is explicitly dropped.Fix: Wrap master in
Arc<Mutex<Option<...>>>(like writer) and drop it from the background thread afterchild.wait(). This callsClosePseudoConsole→ closes the stdout pipe → reader gets EOF deterministically.Changes
PtyWriter.master:Box<dyn MasterPty + Send>→Arc<Mutex<Option<Box<dyn MasterPty + Send>>>>Terminal::spawn: background thread now drops writer, master, and slave in sequence afterchild.wait()PtyWriter::resize: handlesOption(returns error if master already dropped)Validation