feat: parse SSE streaming responses for response plugins #138
Closed
noyitz wants to merge 2 commits into
9e42655 to 15a7ee9
When the response body is not valid JSON (e.g., SSE/Server-Sent Events from streaming providers like Anthropic), parse the SSE data lines to extract usage and model information. This enables response plugins (usage-tracking, metering) to process streaming responses that were previously skipped with "Failed to parse response body as JSON".

Also fixes two issues with streaming response handling:
1. Always respond to response headers so Envoy proceeds with body chunks (previously returned nil, causing per-message timeout)
2. Send an immediate ack for each non-EoS response body chunk so Envoy continues forwarding subsequent chunks instead of blocking

Signed-off-by: Noy Itzikowitz <nitzikow@redhat.com>
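As a rough illustration of the SSE parsing described in this commit message, here is a minimal sketch. It is not the PR's `parseSSEResponseBody`: the helper names `extractSSEFields` and `collect` are hypothetical, and the assumption that `usage` and `model` appear either at the top level of an event or nested under `message` is an illustrative guess at the event shape, not something the PR confirms.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"errors"
	"strings"
)

// extractSSEFields is a hypothetical helper, not the PR's parseSSEResponseBody.
// It scans an SSE body line by line, decodes every "data:" payload that is a
// JSON object, and gathers "model" and "usage" values found at the top level
// or under a nested "message" object (an assumed event shape).
func extractSSEFields(body []byte) (map[string]any, error) {
	out := map[string]any{}
	sc := bufio.NewScanner(bytes.NewReader(body))
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // tolerate long data lines
	for sc.Scan() {
		line := strings.TrimRight(sc.Text(), "\r")
		payload, ok := strings.CutPrefix(line, "data:")
		if !ok {
			continue // "event:", comments, and blank separator lines
		}
		var event map[string]any
		if err := json.Unmarshal([]byte(strings.TrimSpace(payload)), &event); err != nil {
			continue // sentinels such as "[DONE]" or non-object payloads
		}
		collect(out, event)
		if msg, ok := event["message"].(map[string]any); ok {
			collect(out, msg)
		}
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	if len(out) == 0 {
		return nil, errors.New("no usage or model found in SSE data lines")
	}
	return out, nil
}

// collect copies "model" and merges "usage" keys from src into dst, so usage
// reported across several events ends up in one map.
func collect(dst, src map[string]any) {
	if m, ok := src["model"].(string); ok {
		dst["model"] = m
	}
	if u, ok := src["usage"].(map[string]any); ok {
		merged, _ := dst["usage"].(map[string]any)
		if merged == nil {
			merged = map[string]any{}
			dst["usage"] = merged
		}
		for k, v := range u {
			merged[k] = v
		}
	}
}
```

Merging usage keys rather than keeping only the last event is a design choice of this sketch, on the assumption that a provider may report input and output token counts in different events.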
15a7ee9 to 72b5454
shmuelk
reviewed
Jun 7, 2026
Comment on lines 67 to 77
  if err := json.Unmarshal(responseBodyBytes, &reqCtx.Response.Body); err != nil {
-     logger.Error(err, "Failed to parse response body as JSON, skipping response plugins")
-     return s.generateEmptyResponseBodyResponse(responseBodyBytes), nil
+     // Try parsing as SSE (Server-Sent Events) — streaming responses from providers
+     // like Anthropic use SSE format which isn't valid JSON.
+     if sseBody, sseErr := parseSSEResponseBody(responseBodyBytes); sseErr == nil && sseBody != nil {
+         reqCtx.Response.Body = sseBody
+         logger.V(logutil.VERBOSE).Info("parsed SSE response body for response plugins")
+     } else {
+         logger.Error(err, "Failed to parse response body as JSON or SSE, skipping response plugins")
+         return s.generateEmptyResponseBodyResponse(responseBodyBytes), nil
+     }
  }
Collaborator
Why try to parse a JSON and fail? Why not look at the content-type header and parse accordingly?
Address review feedback from @shmuelk: instead of trying JSON parse and falling back to SSE on failure, check the Content-Type response header upfront to select the correct parser.
- text/event-stream → SSE parser (parseSSEResponseBody)
- anything else → JSON parser (json.Unmarshal)

Also fix streaming tests:
- JSON body tests now use content-type: application/json (not text/event-stream)
- Tests receive response header ack before sending body chunks
- Tests receive chunk acks for non-final streaming chunks
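The header-based parser selection described in this commit could look roughly like the sketch below. The names `bodyFormat`, `formatJSON`, `formatSSE`, and `selectBodyFormat` are hypothetical and do not come from the PR; only the rule itself (text/event-stream selects the SSE parser, anything else selects JSON) is taken from the commit message.

```go
package main

import (
	"mime"
	"strings"
)

// bodyFormat names the parser to use for a response body (hypothetical type).
type bodyFormat int

const (
	formatJSON bodyFormat = iota
	formatSSE
)

// selectBodyFormat is a hypothetical helper that picks a parser from the
// Content-Type response header value: text/event-stream selects SSE,
// everything else (including a missing header) selects JSON.
func selectBodyFormat(contentType string) bodyFormat {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		// Malformed or empty header: fall back to a loose match on the
		// part before any ";" parameters.
		mediaType = strings.ToLower(strings.TrimSpace(strings.SplitN(contentType, ";", 2)[0]))
	}
	if mediaType == "text/event-stream" {
		return formatSSE
	}
	return formatJSON
}
```

Using `mime.ParseMediaType` rather than a plain string comparison means parameters such as `charset=utf-8` and differences in letter case do not change the outcome.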
Contributor
Author
Summary
Problem
When providers like Anthropic return streaming responses (SSE format with text/event-stream), the response body starts with "event:", not "{". The current code fails JSON parsing and skips all response plugins. Usage-tracking, metering, and any other response plugins never execute for streaming requests. Additionally, HandleResponseHeaders returns nil for streaming, causing Envoy per-message_timeout_exceeded.
Solution
pkg/handlers/response.go: parse "data:" lines for JSON objects, extract usage and model fields
pkg/handlers/server.go:
3. Send immediate BodyResponse{} ack for non-EoS response body chunks so Envoy continues streaming
Client sees streamed output in real-time. Chunks are accumulated in memory (bounded by max_tokens) and parsed at EoS.
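The accumulate-then-parse behaviour described above can be sketched with plain Go types. This is an illustration only: `responseAccumulator`, `chunkAction`, and `onChunk` are hypothetical stand-ins, not the PR's handler or Envoy's ext_proc types, and in the PR the "ack" is an empty BodyResponse{} sent back to Envoy.

```go
package main

import "bytes"

// chunkAction tells the caller what to do after a body chunk (hypothetical).
type chunkAction int

const (
	ackAndContinue  chunkAction = iota // reply immediately so more chunks flow
	parseAndRespond                    // end of stream: parse the full body
)

// responseAccumulator buffers streamed response body chunks in memory.
type responseAccumulator struct {
	buf bytes.Buffer
}

// onChunk stores the chunk. For a non-final chunk it returns ackAndContinue
// with no body; at end of stream it returns parseAndRespond together with a
// copy of everything accumulated, then resets the buffer.
func (a *responseAccumulator) onChunk(chunk []byte, endOfStream bool) (chunkAction, []byte) {
	a.buf.Write(chunk)
	if !endOfStream {
		return ackAndContinue, nil
	}
	full := append([]byte(nil), a.buf.Bytes()...)
	a.buf.Reset()
	return parseAndRespond, full
}
```

The sketch applies no size limit of its own; it relies on the bound stated above (the response length being limited by max_tokens).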
Test plan
Generated with Claude Code