## Summary
When using `stream_async()` with AgentCore (following the official example), events yielded to the caller contain non-JSON-serializable objects (`Agent`, OpenTelemetry `Span`, etc.) merged into the event dict by `prepare()`. This causes AgentCore to fall back to `repr()` serialization, producing massive Python repr strings on the SSE wire that the consumer cannot parse.
## Observed behavior
On long-running agent invocations (~13 tool-calling turns with extended thinking enabled), the SSE response stream hits the AgentCore 100 MB response payload limit and is truncated with an HTTP/2 `INTERNAL_ERROR` before the final `structured_result` can be emitted.
Measured breakdown of a typical stream:
| Content | Events | Size | % of stream |
|---|---|---|---|
| `ModelStreamEvent` subclasses (repr-serialized) | 782 | 99.8 MB | 99.8% |
| `ModelStreamChunkEvent` (valid JSON) | 956 | 200 KB | 0.2% |
The 782 repr-serialized events average 131 KB each because they contain the full chain-of-thought reasoning text. They appear on the wire as Python repr (single quotes) instead of JSON, e.g.:
```
data: {"reasoningText": "Let me think...", "delta": {"reasoningContent": {"text": "Let me think..."}}, "reasoning": true, "agent": <strands.agent.agent.Agent object at 0x...>, "event_loop_cycle_span": <opentelemetry.trace.Span object at 0x...>, ...}
```
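The fallback is easy to reproduce in isolation. A minimal sketch (the `Agent` stand-in and the `repr()` fallback are our assumptions about what AgentCore does, not its actual code):

```python
import json

class Agent:
    """Stand-in for strands.agent.agent.Agent (hypothetical)."""

event = {
    "reasoningText": "Let me think...",
    "delta": {"reasoningContent": {"text": "Let me think..."}},
    "agent": Agent(),  # non-serializable object merged in by prepare()
}

try:
    wire = json.dumps(event)
except TypeError:
    # Fallback comparable to the observed behavior: the whole event is
    # stringified with repr(), yielding single quotes instead of JSON.
    wire = repr(event)
```

The resulting `wire` string is a Python repr of the dict, which a JSON SSE consumer cannot parse.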
## Root cause
`ModelStreamEvent.prepare()` (src/strands/types/_events.py) merges `invocation_state` directly into the event dict:

```python
class ModelStreamEvent(TypedEvent):
    def prepare(self, invocation_state: dict) -> None:
        if "delta" in self:
            self.update(invocation_state)
```
`invocation_state` contains non-serializable objects (the `Agent` instance, `Span`, etc.). Since all `ModelStreamEvent` subclasses (`TextStreamEvent`, `ReasoningTextStreamEvent`, `ToolUseStreamEvent`, `ReasoningSignatureStreamEvent`) have a `"delta"` key, they all get these fields merged in.
`ModelStreamChunkEvent` is unaffected because it has no `prepare()` override.
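The mechanics can be modeled with a minimal dict-subclass sketch (class bodies simplified; only the `prepare()` logic mirrors the snippet above, the rest are stand-ins):

```python
import json

class TypedEvent(dict):
    def prepare(self, invocation_state: dict) -> None:
        pass  # no override: the event dict is left untouched

class ModelStreamEvent(TypedEvent):
    def prepare(self, invocation_state: dict) -> None:
        if "delta" in self:
            self.update(invocation_state)

class Agent:
    pass  # non-serializable stand-in

state = {"agent": Agent(), "event_loop_cycle_span": object()}

ev = ModelStreamEvent({"delta": {"text": "hi"}})
ev.prepare(state)
# The internal objects are now keys of the event dict itself,
# so json.dumps(ev) raises TypeError.

chunk = TypedEvent({"event": {"contentBlockDelta": {"delta": {"text": "hi"}}}})
chunk.prepare(state)
json.dumps(chunk)  # still serializable: nothing was merged in
```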
## Reproduction
Follow the AgentCore streaming example with an agent that uses extended thinking and makes multiple tool calls. The stream will contain both valid JSON events and Python repr events. On long invocations the repr events accumulate past 100 MB and the stream is truncated.
## Current workaround
We filter events inside our streaming entrypoint before yielding to AgentCore:

```python
async for event in agent.stream_async(prompt):
    try:
        json.dumps(event)
    except (TypeError, ValueError):
        continue  # drop events carrying non-serializable internals
    yield event
```
This works but feels like a workaround for something the SDK should handle.
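A per-key variant of the same workaround keeps the serializable fields of each event and strips only the offending keys, so the reasoning text is not lost along with the internal objects (a sketch of our own, not an SDK API):

```python
import json

def drop_unserializable(event: dict) -> dict:
    """Return a copy of the event containing only JSON-serializable values."""
    clean = {}
    for key, value in event.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            continue  # e.g. the merged Agent or Span objects
        clean[key] = value
    return clean
```

Applied before `yield event`, this would preserve `reasoningText` and `delta` while removing keys like `agent` and `event_loop_cycle_span`.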
## Possible fixes (open question — we may be missing something)
We are not sure if there is a reason `prepare()` needs to merge into the dict itself, or if there is a recommended pattern for filtering events before yielding to AgentCore that we have missed. If so, we would appreciate guidance.
Some ideas if this is indeed unintended:
- Store internal state separately — e.g. `event._context = invocation_state` instead of `self.update(invocation_state)`, keeping the dict serializable
- Strip internal fields in `stream_async()` before yielding to the caller
- Document the filtering requirement in the AgentCore streaming example
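A sketch of the first idea, keeping the same `prepare()` call sites (the `_context` attribute name is our suggestion, not existing SDK code; class bodies are simplified stand-ins):

```python
class TypedEvent(dict):
    """Simplified stand-in for the SDK base class."""

class ModelStreamEvent(TypedEvent):
    def prepare(self, invocation_state: dict) -> None:
        if "delta" in self:
            # Internal consumers read event._context; the dict itself
            # carries only the serializable stream payload.
            self._context = invocation_state
```

Dict subclasses accept instance attributes, so the event stays fully JSON-serializable while the internal state remains reachable in-process.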
## Environment
- strands-agents >= 0.1.0
- AgentCore Runtime (us-east-1)
- Agent uses Bedrock Claude with extended thinking and structured output
- ~13 tool-calling turns per invocation