fix(bedrock): convert non-200 SSE error events to APIStatusError#1555
fix(bedrock): convert non-200 SSE error events to APIStatusError#1555nileshpatil6 wants to merge 2 commits into
Conversation
When Bedrock returns a non-200 status code inside an SSE event (e.g. internalServerException with status 400), the stream decoder was raising a raw ValueError which users could not catch via the standard anthropic.APIError hierarchy. Instead, yield a ServerSentEvent with event="error" containing the Bedrock error body so the existing error-handling path in Stream and AsyncStream calls _make_status_error and raises the appropriate APIStatusError subclass. Fixes anthropics#1477
Zawwarsami16
left a comment
There was a problem hiding this comment.
Direction is right — the raw ValueError was uncatchable through the SDK exception hierarchy and that needed fixing. But I think the status-code mapping the PR description claims will not actually fire with the current shape, and there is no test that would catch the regression. Two specific concerns and a minor one.
1. The error-subclass mapping is keyed on the HTTP response, not the SSE frame
The PR body says:
The existing
Stream.__stream__andAsyncStream.__stream__already handleevent == "error"by callingself._client._make_status_error(...), which maps the exception to the correctAPIStatusErrorsubclass (InternalServerError,BadRequestError, etc.).
That mapping lives in _client.py _make_status_error (and BaseMantleClient._make_status_error in lib/bedrock/_mantle.py), and both versions branch on response.status_code:
if response.status_code == 400:
return _exceptions.BadRequestError(...)
...
if response.status_code >= 500:
return _exceptions.InternalServerError(...)
return APIStatusError(err_msg, response=response, body=body)The response passed in is self.response from Stream/AsyncStream, which is the HTTP-level httpx.Response. For a Bedrock streaming call where the HTTP response was 200 OK and the error lives inside an SSE frame, response.status_code is 200, every branch above misses, and we land on the final fallback APIStatusError. So a Bedrock 400-frame yields a generic APIStatusError rather than BadRequestError, and a 500-frame yields generic APIStatusError rather than InternalServerError.
The end result is still catchable via anthropic.APIError, which is a genuine improvement over the previous ValueError. But the more specific subclass mapping the PR body promises will not happen.
Two cleanest fix paths I can see:
(a) Embed the Bedrock status code in the error body and override BaseMantleClient._make_status_error to consult it first. Then the SSE error decoder includes something like {"_bedrock_status": 400, "type": "error", "error": {...}}, and _make_status_error checks isinstance(body, dict) and body.get("_bedrock_status") before falling back to response.status_code.
(b) Raise the typed exception directly from _parse_message_from_event (or right after the decoder returns the error SSE), bypassing _make_status_error entirely for the Bedrock path. This requires the decoder to import _exceptions but keeps the routing local to the Bedrock code that already understands the frame format.
Either one is fine; (a) is the more general pattern.
2. No test coverage for the new path
The one file changed is the decoder itself; no new test exercises the non-200 frame path. The existing decoder tests presumably only cover 200 frames, so neither the old ValueError nor the new ServerSentEvent shape was protected against regression. A unit test with a synthesized EventStreamMessage carrying status_code=400 + :exception-type: validationException + a JSON body would lock the contract.
3. Minor: bytes fallback formatting
In the JSON-decode fallback:
except Exception:
err_message = str(raw_body)If raw_body is bytes and JSON parsing fails, this yields the bytes-literal repr (b'{\n "..."') inside the error message. raw_body.decode(errors="replace") (with a small isinstance guard) gives a cleaner string for the caller to read.
Happy to PR (a) or (b) on top of this branch if helpful, but worth confirming the maintainer's preference first since the choice affects how downstream callers catch the error.
Bedrock streaming errors arrive as non-200 event frames inside 200 OK HTTP responses. The previous code raised ValueError (untyped) or always fell through to generic APIStatusError because _make_status_error only checked response.status_code (always 200 for Bedrock streams). - embed _bedrock_status in the error SSE body so the real status code propagates through the streaming pipeline - update BaseBedrockClient and BaseMantleClient._make_status_error to prefer _bedrock_status over response.status_code when present - fix bytes fallback: use .decode(errors='replace') instead of str() to avoid b'...' repr noise in error messages - add unit tests covering the non-200 frame path and status mapping
|
Thanks for the detailed review @Zawwarsami16 -- all three points addressed in the latest commit: 1. Status-code-based exception mapping The approach: 2. Unit tests Added
3. Changed to |
Problem
When Bedrock returns a non-200 status code inside an SSE event frame (for example
internalServerExceptionwithstatus_code: 400orinternalServerExceptionwithstatus_code: 500),AWSEventStreamDecoder._parse_message_from_eventraises a rawValueError:This
ValueErroris not part of the SDK error hierarchy, so user code that catchesanthropic.APIError/anthropic.InternalServerError/ etc. silently misses the error and has to special-case a raw Python built-in exception type.Fixes #1477
Root cause
_parse_message_from_eventusedraise ValueError(...)directly instead of routing the error through the existing SSE error path.Fix
Instead of raising
ValueError, the method now returns aServerSentEvent(event="error", data=<json>). The existingStream.__stream__andAsyncStream.__stream__already handleevent == "error"by callingself._client._make_status_error(...), which maps the exception to the correctAPIStatusErrorsubclass (InternalServerError,BadRequestError, etc.).The error body is constructed from the Bedrock event headers (
:exception-type) and the JSON body (messagefield), giving callers the full context they need.What changed
src/anthropic/lib/bedrock/_stream_decoder.py: replaceraise ValueError(...)with a yieldedServerSentEvent(event="error", ...)containing the Bedrock exception type and message. Also changed the return type of_parse_message_from_eventfromstr | NonetoServerSentEvent | None(the callers already just checked truthiness and yielded the result), and updatediter_bytes/aiter_bytesto yield the returnedServerSentEventdirectly rather than constructing a new one.Before / after
Before:
After: