fix(transcript_mirror): handle CancelledError in eager-flush done callback#931
Merged
ashwin-ant merged 1 commit intoMay 14, 2026
Conversation
…lback The ``add_done_callback`` lambda on the eager-flush ``_flush_task`` called ``t.exception()`` unconditionally. In Python 3.8+, ``Task.exception()`` raises ``CancelledError`` for cancelled tasks, and the raise from inside a done-callback surfaces as a noisy "Exception in callback" log every time the SDK shuts down with pending eager flushes (visible in anthropics#928's failing-test traceback). Replace the lambda with a module-level ``_swallow_done_exception`` helper that no-ops on cancelled tasks and otherwise retrieves the exception so asyncio doesn't warn about an unretrieved exception on a fire-and-forget task. Closes anthropics#930 Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
3 tasks
ashwin-ant
pushed a commit
that referenced
this pull request
May 14, 2026
…ic wait (#933) ## Summary Fixes #928. The two eager-flush tests assumed 2 `await asyncio.sleep(0)` yields between consecutive `enqueue` calls were enough for each drain to complete and append. Under lock contention between drains the path from `enqueue` to `store.append` needs ~4 turns (drain releases lock → next drain acquires it → `wait_for(store.append)` schedules its inner task → record). Both tests fail 5/5 locally on Python 3.11.14 / macOS arm64; CI got lucky on event-loop scheduling at merge time of #905. See #928 for the full probe and yield-count sweep. ## Changes ### Unit-level test (`test_eager_mode_flushes_per_frame`) Replace fixed `sleep(0)` count with a new `_wait_until(predicate, timeout=1.0)` helper that yields until `len(store.append_calls)` reaches the expected value, with a 1-second deadline. Deterministic — works regardless of Python / pytest-asyncio / OS scheduling differences. ### Integration-level test (`test_eager_flush_mode_appends_per_frame_before_result`) Convert `_make_mock_transport`'s `yield_between: bool` to `yields_between: int` (default `0`) and pass `yields_between=10` for this test, so the mock yields the loop enough times between frames for each eager flush to drain before the next frame arrives. Robust headroom — 4 was the observed minimum, 10 leaves room for slower environments. The signature change touches only one caller (this same test); other callers omit the parameter and behave identically to before. ## Test plan - [x] `for i in 1 2 3 4 5; do uv run pytest <both tests> -q; done` → 5/5 passed (was 5/5 failed before) - [x] `uv run pytest tests/test_transcript_mirror.py` → 42/42 passed - [x] `ruff check / ruff format` clean ## Related issues / PRs - Filed alongside two other fixes from the same audit pass: #929 (stderr callback swallow → PR #932), #930 (cancellation log noise → PR #931). Independent of those. Co-authored-by: Xian Zheng <xian.zheng@challenger.gauntletai.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
ashwin-ant
approved these changes
May 14, 2026
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #931 +/- ##
=======================================
Coverage ? 89.10%
=======================================
Files ? 23
Lines ? 3983
Branches ? 0
=======================================
Hits ? 3549
Misses ? 434
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #930.
The
add_done_callbacklambda on the eager-flush_flush_taskcalledt.exception()unconditionally. In Python 3.8+,Task.exception()raisesCancelledErrorfor cancelled tasks, and the raise from inside a done-callback surfaces as a noisyException in callbacklog every time the SDK shuts down with pending eager flushes — visible in #928's failing-testCaptured log teardownblock.Changes
src/claude_agent_sdk/_internal/transcript_mirror_batcher.py: replacelambda t: t.exception()with a module-level_swallow_done_exceptionhelper that no-ops on cancelled tasks and otherwise retrieves the exception so asyncio doesn't warn.tests/test_transcript_mirror.py: addTestSwallowDoneExceptionwith three cases — cancelled task (must not raise), failed task (retrieves exception, doesn't re-raise), successful task (no-op).Test plan
uv run pytest tests/test_transcript_mirror.py::TestSwallowDoneException— 3 passeduv run mypy src/— cleanruff check / ruff format— cleanNotes
The fix is intentionally narrow — same call site, same intent (silence "Task exception was never retrieved" warnings), just cancellation-safe. No behavior change on the happy path.