Commit 014b32a

TTTPOB and claude committed
fix(test): prevent Python 3.10 post-test hang in CI
On Python 3.10, closing a WebSocket fd from another thread does not interrupt selectors.EpollSelector.select(), so WSKernelClient's non-daemon connection_thread can block for up to 10 s after stop_channels() is called. Python waits for all non-daemon threads at exit, causing the test suite to hang 10–20 s after the last test.

Patch _run_websocket in conftest to pass ping_timeout=2 so the internal Dispatcher uses sel.select(2) instead of sel.select(10); the thread then exits within ≤2 s of close() and the join() succeeds cleanly. Also add timeout-minutes: 10 to the CI step as a safety net.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 76161d4 commit 014b32a

2 files changed

Lines changed: 41 additions & 0 deletions

.github/workflows/test.yml

Lines changed: 1 addition & 0 deletions
@@ -25,3 +25,4 @@ jobs:
 
       - name: Run tests
         run: uv run pytest -v
+        timeout-minutes: 10

tests/conftest.py

Lines changed: 40 additions & 0 deletions
@@ -10,6 +10,46 @@
 
 import pytest
 
+# ---------------------------------------------------------------------------
+# Workaround: Python 3.10 selectors.EpollSelector.select() is not interrupted
+# when a registered socket fd is closed from another thread.  WSKernelClient
+# creates a non-daemon connection_thread that calls run_forever() with no
+# ping_timeout, so the internal Dispatcher uses sel.select(10) — the thread
+# can block for up to 10 s after stop_channels() is called.  Because the
+# thread is non-daemon, Python will wait for it at process exit, causing the
+# test suite to hang for ~10–20 s after the last test.
+#
+# Fix: patch _run_websocket to pass ping_timeout=2 so the Dispatcher uses
+# sel.select(2) instead.  The thread will exit within ≤2 s of close(), well
+# within the REQUEST_TIMEOUT join window (10 s).  Python 3.12 handles the
+# close() interruption correctly and is unaffected by this patch.
+# ---------------------------------------------------------------------------
+try:
+    from jupyter_kernel_client.wsclient import WSKernelClient
+
+    def _fast_run_websocket(self):
+        if self.kernel_socket is None:
+            self.log.error("No websocket defined.")
+            return
+        try:
+            self.kernel_socket.run_forever(
+                ping_interval=self.ping_interval,
+                reconnect=self.reconnect_interval,
+                ping_timeout=2,  # keeps sel.select() timeout short so close() unblocks quickly
+            )
+        except ValueError as e:
+            self.log.error(
+                "Unable to open websocket connection with %s",
+                self.kernel_socket.url,
+                exc_info=e,
+            )
+        except BaseException as e:
+            self.log.error("Websocket listener thread stopped.", exc_info=e)
+
+    WSKernelClient._run_websocket = _fast_run_websocket
+except ImportError:
+    pass
+
 
 def _find_free_port() -> int:
     with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
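
The timing argument in the commit message can be illustrated with a small standalone sketch. It does not use jupyter_kernel_client or websocket-client, and it does not reproduce the Python 3.10 fd-close behavior. It only models, as an assumption, a non-daemon thread that rechecks a stop flag after each select() timeout, which is roughly how the comment above describes the Dispatcher loop. The names measure_shutdown_delay and poll_loop are hypothetical.

```python
import selectors
import socket
import threading
import time


def measure_shutdown_delay(select_timeout: float) -> float:
    """Return how long a polling thread takes to exit after a stop request.

    Hypothetical stand-in for the listener thread: it blocks in
    select(select_timeout) and only notices the stop flag once that
    call returns.
    """
    stop = threading.Event()
    started = threading.Event()
    sel = selectors.DefaultSelector()

    # A connected socket pair that never carries data, so every select()
    # call runs until its timeout expires.
    left, right = socket.socketpair()
    sel.register(left, selectors.EVENT_READ)

    def poll_loop() -> None:
        started.set()
        while not stop.is_set():
            sel.select(select_timeout)

    # Non-daemon by default: the interpreter would wait for this thread at exit.
    worker = threading.Thread(target=poll_loop)
    worker.start()
    started.wait()
    time.sleep(0.05)  # give the worker time to enter select()

    t0 = time.monotonic()
    stop.set()      # analogous to asking the client to stop
    worker.join()   # blocks until the current select() call times out
    elapsed = time.monotonic() - t0

    sel.unregister(left)
    sel.close()
    left.close()
    right.close()
    return elapsed


if __name__ == "__main__":
    for timeout in (0.1, 0.6):
        delay = measure_shutdown_delay(timeout)
        print(f"select timeout {timeout:.1f} s -> join took {delay:.2f} s")
```

In this model the join delay is bounded by the select timeout, which is the same reasoning the patch relies on when it lowers the effective timeout from 10 s to 2 s.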