[F12] Fix asyncio race in SDK dispatch when SDK turn aborts at error_max_turns #257

@sriumcp

Problem

When the SDK turn aborts at error_max_turns, nous's dispatcher logs the following:

ERROR:asyncio:an error occurred during closing of asynchronous generator
RuntimeError: aclose(): asynchronous generator is already running
WARNING:asyncio:Loop ... that handles pid 70842 is closed

This is a race between the abort signal and the normal stream-consumption loop: the abort path tries to close the async generator while the consumer is still pulling from it. The retry succeeded in this run, so there was no observed user impact. However, a generator that is never closed cleanly could matter in a more sensitive case (e.g., a non-idempotent side effect, or a process that holds file handles), and the noisy traceback obscures the real root-cause log line in the abort report.

Reproducible whenever the SDK aborts hard mid-turn (max_turns, manual kill, etc.).
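
A minimal standalone sketch of the same failure mode, for anyone who wants to see the RuntimeError outside of nous. It assumes Python 3.8 or newer. `fake_sdk_stream` is a hypothetical stand-in for the SDK's message stream, not the real dispatcher code.

```python
import asyncio


async def fake_sdk_stream():
    """Hypothetical stand-in for the SDK's message stream."""
    n = 0
    while True:
        await asyncio.sleep(0.05)  # the generator counts as "running" while suspended here
        yield n
        n += 1


async def reproduce() -> str:
    gen = fake_sdk_stream()

    async def consume():
        async for _ in gen:
            pass

    consumer = asyncio.create_task(consume())
    await asyncio.sleep(0.01)  # let the consumer enter gen.__anext__()

    message = ""
    try:
        await gen.aclose()  # abort path closes while the consumer is mid-pull
    except RuntimeError as exc:
        message = str(exc)

    # Clean up so the demo itself leaves nothing dangling.
    consumer.cancel()
    try:
        await consumer
    except asyncio.CancelledError:
        pass
    return message


if __name__ == "__main__":
    print(asyncio.run(reproduce()))
```

In the dispatcher the same error appears as an ERROR:asyncio log line rather than a caught exception, presumably because the close is attempted during loop or generator teardown.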

Desired behavior

The abort path in orchestrator.sdk_dispatch (or wherever the SDK turn loop and abort handler live) should ensure the async generator is fully drained or properly cancelled before it is closed. Likely a one- or two-line fix:

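import asyncio

try:
    await asyncio.wait_for(gen.aclose(), timeout=5.0)
except (asyncio.TimeoutError, RuntimeError):
    # RuntimeError: generator is already running; let the loop tear down naturally.
    # CancelledError is deliberately not caught, so task cancellation still propagates.
    pass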

Or use a sentinel pattern where the consumer loop checks an abort flag and exits gracefully, allowing aclose() to find the generator quiescent.
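
A rough sketch of the sentinel pattern, assuming the abort flag is an asyncio.Event and that the consumer loop owns the close. All names here (`fake_sdk_stream`, `consume`, `run_turn`) are hypothetical and only illustrate the shape of the fix.

```python
import asyncio


async def fake_sdk_stream(closed: list):
    """Hypothetical stand-in for the SDK stream; records when it is closed."""
    n = 0
    try:
        while True:
            await asyncio.sleep(0.01)
            yield n
            n += 1
    finally:
        closed.append(True)


async def consume(gen, abort: asyncio.Event, seen: list) -> None:
    """Pull messages until the abort flag is set, then close from this task."""
    try:
        async for msg in gen:
            seen.append(msg)
            if abort.is_set():
                break  # generator is suspended at `yield`, so it is quiescent
    finally:
        await gen.aclose()  # safe: no other task is inside __anext__()


async def run_turn() -> tuple:
    closed, seen = [], []
    abort = asyncio.Event()
    consumer = asyncio.create_task(consume(fake_sdk_stream(closed), abort, seen))
    await asyncio.sleep(0.05)
    abort.set()     # the abort path only signals; it never touches the generator
    await consumer  # wait for the loop to exit and close the stream
    return closed, seen


if __name__ == "__main__":
    print(asyncio.run(run_turn()))
```

One limitation of this shape: the flag is only checked between messages, so a stream that stalls for a long time delays the abort until the next message arrives.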

Suggested implementation sketch

  1. Locate the SDK turn dispatcher (likely orchestrator/sdk_dispatch.py or orchestrator/phases/execute_analyze.py).
  2. Find the abort path that triggers aclose() on the async generator.
  3. Refactor to either (a) await the consumer loop's exit before aclose, or (b) catch the "already running" RuntimeError defensively.
  4. Add a regression test that simulates max_turns abort and verifies the dispatcher cleans up without warning logs.
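
Option (a) from step 3 might look roughly like the following, assuming the consumer loop runs as its own asyncio task. `abort_turn` and `fake_sdk_stream` are hypothetical names, not existing nous functions.

```python
import asyncio


async def abort_turn(consumer: asyncio.Task, gen) -> None:
    """Hypothetical abort helper: stop the consumer first, then close."""
    consumer.cancel()
    await asyncio.wait([consumer])  # returns once the consumer has left __anext__()
    await gen.aclose()              # generator is no longer running; close is safe


async def fake_sdk_stream(closed: list):
    """Hypothetical stand-in for the SDK stream; records when it is closed."""
    try:
        while True:
            await asyncio.sleep(0.05)
            yield "msg"
    finally:
        closed.append(True)


async def demo() -> list:
    closed = []
    gen = fake_sdk_stream(closed)

    async def consume():
        async for _ in gen:
            pass

    consumer = asyncio.create_task(consume())
    await asyncio.sleep(0.01)  # consumer is now mid-pull
    await abort_turn(consumer, gen)
    return closed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

asyncio.wait is used here instead of awaiting the task directly so the helper does not need to catch CancelledError itself, and cancellation of the abort path still propagates normally.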

Acceptance criteria

  • No RuntimeError: aclose(): asynchronous generator is already running log lines on max_turns abort.
  • No ERROR:asyncio:an error occurred during closing of asynchronous generator log lines on max_turns abort.
  • Regression test exists for the abort path's cleanup.
  • Friction report F12 row in the tracking issue is checked off.
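
A possible shape for the regression test, using only the standard library. It attaches a handler to the "asyncio" logger and asserts that nothing at WARNING or above is emitted during an abort. `abort_scenario` is a hypothetical stand-in; the real test would drive the actual dispatcher with a simulated max_turns abort.

```python
import asyncio
import logging


class ListHandler(logging.Handler):
    """Collects WARNING-and-above records so the test can inspect them."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.records = []

    def emit(self, record):
        self.records.append(record)


async def abort_scenario():
    """Hypothetical scenario: the consumer is mid-pull when the turn aborts."""

    async def stream():
        while True:
            await asyncio.sleep(0.05)
            yield "msg"

    gen = stream()

    async def consume():
        async for _ in gen:
            pass

    consumer = asyncio.create_task(consume())
    await asyncio.sleep(0.01)
    consumer.cancel()
    await asyncio.wait([consumer])  # consumer exits before the close
    await gen.aclose()


def test_abort_cleanup_is_silent():
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        asyncio.run(abort_scenario())
    finally:
        logger.removeHandler(handler)
    messages = [record.getMessage() for record in handler.records]
    assert messages == [], messages
```

The function name follows the pytest naming convention, so it should be collectable as-is if the project uses pytest.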

Severity

LOW: cosmetic in observed cases, but a real race condition that could matter in non-idempotent scenarios.

Source

friction-report.md F12, paper-memorytime-mirage campaign (2026-05).


Part of friction-report tracking issue #245.

Labels: bug (Something isn't working), friction-report (From external campaign-author friction reports)