
ProcessError exit_code/stderr stripped during error propagation; users see generic Exception with hard-coded "Check stderr output for details" placeholder #800

@ty13r

Summary

When the bundled CLI subprocess exits non-zero, the original ProcessError (with structured exit_code and stderr fields) is destroyed during propagation through the message reader and re-raised as a generic Exception carrying only a string. Users cannot catch ProcessError, cannot access exit_code, and cannot get actual subprocess stderr — the stderr field of the original error is itself a hard-coded literal "Check stderr output for details", never the real subprocess output.

These are three defects compounding across two files (subprocess_cli.py and query.py), and together they make diagnosing "Command failed with exit code 1" failures effectively impossible without forking the SDK.

Verified reproduction

import asyncio
from claude_agent_sdk import ClaudeAgentOptions, ProcessError, query

async def main():
    options = ClaudeAgentOptions(
        model="claude-haiku-4-5-20241022",  # not a current model alias
        max_turns=1,
        permission_mode="dontAsk",
    )
    try:
        async for _msg in query(prompt="hi", options=options):
            pass
    except Exception as exc:
        print(f"caught type:    {type(exc).__name__}")
        print(f"is ProcessError: {isinstance(exc, ProcessError)}")
        print(f"message:        {exc}")
        print(f"has .stderr:    {hasattr(exc, 'stderr')}")
        print(f"has .exit_code: {hasattr(exc, 'exit_code')}")

asyncio.run(main())

Actual output:

caught type:    Exception
is ProcessError: False
message:        Command failed with exit code 1 (exit code: 1)
Error output: Check stderr output for details
has .stderr:    False
has .exit_code: False

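Expected output (illustrative; the exc.stderr and exc.exit_code lines show the attribute values a fix should expose, which the script above does not print):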

caught type:    ProcessError
is ProcessError: True
message:        Command failed with exit code 1
has .stderr:    True
exc.stderr:     "Error: 'claude-haiku-4-5-20241022' is not a valid model alias.\n  Did you mean 'claude-haiku-4-5-20251001'?"
has .exit_code: True
exc.exit_code:  1

Defect 1 — stderr never captured

src/claude_agent_sdk/_internal/transport/subprocess_cli.py lines 612–618 (current main):

if returncode is not None and returncode != 0:
    self._exit_error = ProcessError(
        f"Command failed with exit code {returncode}",
        exit_code=returncode,
        stderr="Check stderr output for details",   # ← hard-coded literal
    )
    raise self._exit_error

The literal string is what user code sees as the stderr field on the resulting error. The actual stderr stream from self._process is never read into this argument. The SDK only pipes stderr at all when the user explicitly opts in via options.stderr=callback or extra_args["debug-to-stderr"] (lines 371–377). In the default case, the subprocess inherits the parent's stderr, so its output goes straight to the parent's terminal and cannot be retrieved from the exception.
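The difference between inherited and piped stderr can be reproduced with the standard library alone. This is a minimal sketch using subprocess.run with a throwaway child command (a stand-in, not the bundled CLI); it only illustrates why the default case leaves nothing for the exception to carry.

```python
import subprocess
import sys

# Stand-in child process: writes one line to stderr and exits with code 1.
CHILD = [sys.executable, "-c", "import sys; sys.stderr.write('boom\\n'); sys.exit(1)"]

# Default: the child inherits the parent's stderr, so the text goes straight
# to the terminal and the parent never receives it as data.
inherited = subprocess.run(CHILD)

# Piped: the parent owns the read end and can attach the text to an error.
piped = subprocess.run(CHILD, stderr=subprocess.PIPE, text=True)

print(repr(inherited.stderr))  # None
print(repr(piped.stderr))      # 'boom\n'
```

Both runs report returncode 1; only the piped run has any stderr text to put on an exception.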

Defect 2 — Original exception class destroyed in message reader

src/claude_agent_sdk/_internal/query.py lines 247–256:

except Exception as e:
    logger.error(f"Fatal error in message reader: {e}")
    # Signal all pending control requests so they fail fast instead of timing out
    for request_id, event in list(self.pending_control_responses.items()):
        if request_id not in self.pending_control_results:
            self.pending_control_results[request_id] = e
            event.set()
    # Put error in stream so iterators can handle it
    await self._message_send.send({"type": "error", "error": str(e)})

The original ProcessError (with its exit_code and stderr attributes) is reduced to str(e) and stuffed into a dict. All structured information is lost.
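The loss can be shown without the SDK. In this sketch, ProcessError is a stand-in class whose message format is an assumption modeled on the actual output shown earlier; it is not the SDK's class.

```python
class ProcessError(Exception):
    """Stand-in; the message format is assumed from the observed output."""

    def __init__(self, message, exit_code=None, stderr=None):
        self.exit_code = exit_code
        self.stderr = stderr
        if exit_code is not None:
            message = f"{message} (exit code: {exit_code})"
        if stderr:
            message = f"{message}\nError output: {stderr}"
        super().__init__(message)


original = ProcessError("Command failed with exit code 1", exit_code=1, stderr="real stderr text")

# What the message reader does today: only the string form is forwarded.
flattened = {"type": "error", "error": str(original)}

# What the consumer can rebuild from that dict: a plain Exception.
rebuilt = Exception(flattened["error"])

print(isinstance(rebuilt, ProcessError))  # False
print(hasattr(rebuilt, "exit_code"))      # False
print(str(rebuilt) == str(original))      # True
```

The text survives the round trip, but the class and both attributes do not.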

Defect 3 — Re-raised as generic Exception

src/claude_agent_sdk/_internal/query.py lines 724–733:

async def receive_messages(self) -> AsyncIterator[dict[str, Any]]:
    """Receive SDK messages (not control messages)."""
    async for message in self._message_receive:
        if message.get("type") == "end":
            break
        elif message.get("type") == "error":
            raise Exception(message.get("error", "Unknown error"))
        yield message

This raises a bare Exception, not even ClaudeSDKError. Users who want to write except ProcessError as e: handle_subprocess_failure(e) cannot, because by the time the error reaches them it's no longer a ProcessError and no longer has exit_code or stderr attributes.

Why this matters

  • Concurrent setups: when running multiple query() calls in parallel and one or more fail, the parent terminal interleaves stderr from N subprocesses. Users have no programmatic way to associate a stderr line with a specific failed call. Each exception carries no per-call diagnostic.
  • CI / non-interactive: in CI, parent stderr may be captured but not associated with the exception. Structured error handling is impossible.
  • Closed issues: "Haiku model fails with exit code 1 through SDK while working via CLI" (#515), "CANNOT SUPPORT LONG SYSTEM PROMPT" (#282), and "multi subagents with long prompt do not work on windows" (#238) all hit variants of this. Users keep filing bugs against the literal placeholder string because they have no other diagnostic.
  • Type checking: try: ... except ProcessError as e: is the documented pattern but doesn't actually work because the type is destroyed.
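
To make the concurrent case concrete, here is a sketch of what per-call diagnostics could look like once the exception keeps its fields. ProcessError and run_call are hypothetical stand-ins defined in the snippet; nothing here calls the SDK.

```python
import asyncio


class ProcessError(Exception):
    """Hypothetical stand-in carrying the two structured fields."""

    def __init__(self, message, exit_code=None, stderr=None):
        super().__init__(message)
        self.exit_code = exit_code
        self.stderr = stderr


async def run_call(call_id: str, fail: bool) -> str:
    """Hypothetical stand-in for one query() call."""
    await asyncio.sleep(0)
    if fail:
        raise ProcessError("Command failed", exit_code=1, stderr=f"{call_id}: bad model alias")
    return f"{call_id}: ok"


async def collect_failures(specs: dict[str, bool]) -> dict[str, ProcessError]:
    # return_exceptions=True keeps results aligned with the input order,
    # so each exception can be matched to the call that produced it.
    results = await asyncio.gather(
        *(run_call(cid, fail) for cid, fail in specs.items()),
        return_exceptions=True,
    )
    return {cid: r for cid, r in zip(specs, results) if isinstance(r, ProcessError)}


failures = asyncio.run(collect_failures({"a": False, "b": True, "c": True}))
for call_id, err in failures.items():
    print(call_id, err.exit_code, err.stderr)
```

With the current behavior the isinstance filter would match nothing, and the interleaved terminal output would be the only record of which call failed and why.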

Proposed fix

Three small, backward-compatible changes:

Fix 1 — Capture stderr to an internal ring buffer in subprocess_cli.py

Even when the user hasn't set options.stderr=callback, pipe the subprocess's stderr through a small (e.g., 8 KB) ring buffer. On non-zero exit, include the captured tail in the ProcessError:

captured_stderr = self._stderr_buffer.read_tail()  # last 8KB
self._exit_error = ProcessError(
    f"Command failed with exit code {returncode}",
    exit_code=returncode,
    stderr=captured_stderr or "(no stderr captured)",
)
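
One possible shape for the buffer behind read_tail() is sketched below. StderrTailBuffer and its write method are hypothetical names for illustration, and this is a simple bounded tail buffer rather than a true ring buffer; the 8 KB default mirrors the size suggested above.

```python
class StderrTailBuffer:
    """Keeps only the last max_bytes bytes written to it."""

    def __init__(self, max_bytes: int = 8192) -> None:
        self._max_bytes = max_bytes
        self._data = bytearray()

    def write(self, chunk: bytes) -> None:
        # Append, then drop the oldest bytes beyond the cap.
        self._data += chunk
        overflow = len(self._data) - self._max_bytes
        if overflow > 0:
            del self._data[:overflow]

    def read_tail(self) -> str:
        # errors="replace" because trimming may cut a multi-byte character.
        return self._data.decode("utf-8", errors="replace")


# Tiny cap so the trimming is visible in a short example.
buf = StderrTailBuffer(max_bytes=16)
buf.write(b"warning: noisy startup line\n")
buf.write(b"Error: bad alias\n")
print(repr(buf.read_tail()))
```

Memory stays bounded regardless of how much the subprocess writes, and the most recent output, which usually holds the actual error, is what survives.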

Fix 2 — Preserve the exception object through the message reader

In query.py:255, store the actual exception (not str(e)) in the message:

await self._message_send.send({"type": "error", "exception": e})

Fix 3 — Re-raise the original exception in receive_messages

In query.py:731:

elif message.get("type") == "error":
    exc = message.get("exception")
    if exc is not None:
        raise exc
    raise ClaudeSDKError(message.get("error", "Unknown error"))

This preserves the original ProcessError (or any other ClaudeSDKError subclass) all the way to the user, with exit_code and stderr intact. isinstance(exc, ProcessError) works again.
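
Fixes 2 and 3 can be exercised together in isolation. The sketch below uses asyncio.Queue as a stand-in for the SDK's internal message stream, and ClaudeSDKError, ProcessError, and reader are stand-ins defined locally, so it demonstrates the technique rather than the SDK's actual plumbing.

```python
import asyncio


class ClaudeSDKError(Exception):
    """Stand-in base class."""


class ProcessError(ClaudeSDKError):
    """Stand-in with the two structured fields."""

    def __init__(self, message, exit_code=None, stderr=None):
        super().__init__(message)
        self.exit_code = exit_code
        self.stderr = stderr


async def reader(queue: asyncio.Queue) -> None:
    try:
        raise ProcessError("Command failed with exit code 1", exit_code=1, stderr="bad alias")
    except Exception as e:
        # Fix 2: forward the exception object itself, not str(e).
        await queue.put({"type": "error", "exception": e})


async def receive_messages(queue: asyncio.Queue):
    while True:
        message = await queue.get()
        if message.get("type") == "end":
            break
        if message.get("type") == "error":
            # Fix 3: re-raise the original object when one was forwarded.
            exc = message.get("exception")
            if exc is not None:
                raise exc
            raise ClaudeSDKError(message.get("error", "Unknown error"))
        yield message


async def demo() -> ProcessError:
    queue: asyncio.Queue = asyncio.Queue()
    await reader(queue)
    try:
        async for _ in receive_messages(queue):
            pass
    except ProcessError as err:
        return err
    raise AssertionError("expected ProcessError")


caught = asyncio.run(demo())
print(type(caught).__name__, caught.exit_code, caught.stderr)
```

Because the same object is re-raised, except ProcessError matches and both attributes are still populated.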

Environment

  • claude-agent-sdk 0.1.x (current as of April 2026)
  • Python 3.12 on macOS 14, arm64
  • Bundled CLI 2.x

Happy to PR all three fixes if maintainers are open — they're localized and ~30 LOC total.
