Skip to content

feat: parse SSE streaming responses for response plugins#4

Closed
Mohammad-nassar10 wants to merge 5 commits into
mainfrom
pr-138
Closed

feat: parse SSE streaming responses for response plugins#4
Mohammad-nassar10 wants to merge 5 commits into
mainfrom
pr-138

Conversation

@Mohammad-nassar10

Copy link
Copy Markdown
Contributor

What type of PR is this?
/kind bug

What this PR does / why we need it:
Clone the PR #138 from main.

noyitz and others added 2 commits June 1, 2026 17:43
When the response body is not valid JSON (e.g., SSE/Server-Sent Events
from streaming providers like Anthropic), parse the SSE data lines to
extract usage and model information. This enables response plugins
(usage-tracking, metering) to process streaming responses that were
previously skipped with "Failed to parse response body as JSON".

Also fixes two issues with streaming response handling:
1. Always respond to response headers so Envoy proceeds with body
   chunks (previously returned nil, causing per-message timeout)
2. Send an immediate ack for each non-EoS response body chunk so
   Envoy continues forwarding subsequent chunks instead of blocking

Signed-off-by: Noy Itzikowitz <nitzikow@redhat.com>
@github-actions github-actions Bot added the size/M label Jun 2, 2026
@Mohammad-nassar10 Mohammad-nassar10 changed the title Pr 138 feat: parse SSE streaming responses for response plugins Jun 2, 2026
Signed-off-by: Mohammad <mohammad.nassar@ibm.com>
Signed-off-by: Mohammad <mohammad.nassar@ibm.com>
@ronenkat ronenkat requested a review from davidbreitgand June 4, 2026 06:50
Comment thread pkg/handlers/response.go
Co-authored-by: David Breitgand <davidbreitgand@users.noreply.github.com>
Signed-off-by: David Breitgand <davidbreitgand@users.noreply.github.com>
@github-actions github-actions Bot added size/L and removed size/M labels Jun 4, 2026
@davidbreitgand

davidbreitgand commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

@Mohammad-nassar10 , @ronenkat , PR looks good, but there is a subtle bug that I spotted. Left a suggestion (validated locally on my side) how to fix.

IN SHORT:
The exact faulty assumption is that there is only one data, but in SSE it may need to be the concatenation of several consecutive data: lines from the same event before attempting unmarshalling.

You can verify that by adding the attached unit test to response_test.go

// This test locks in the multiline SSE fix by asserting that there is one logical SSE event
// split across multiple `data:` lines is reconstructed before JSON parsing.
func TestParseSSEResponseBody_MultilineEvent(t *testing.T) {
	body := []byte("data: {\"response\":\n" +
		"data: {\"model\":\"gpt-4o\",\"usage\":{\"input_tokens\":12,\"output_tokens\":34}}}\n\n" +
		"data: [DONE]\n")

	got, err := parseSSEResponseBody(body)
	if err != nil {
		t.Fatalf("parseSSEResponseBody returned unexpected error: %v", err)
	}

	want := map[string]any{
		"model": "gpt-4o",
		"usage": map[string]any{
			"input_tokens":  float64(12),
			"output_tokens": float64(34),
		},
	}
	if diff := cmp.Diff(want, got); diff != "" {
		t.Fatalf("parseSSEResponseBody returned unexpected result, diff(-want, +got): %s", diff)
	}
}

This line breaking is permitted in response and actually is often happening for SSE streaming

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants