
Handle incomplete/failed streaming responses and expose usage in metadata.#122

Open
homin-bt wants to merge 4 commits into main from issue115
Conversation

@homin-bt

@homin-bt homin-bt commented Apr 28, 2026

Previously, the streaming response parser only processed response.completed events, silently dropping terminal events for truncated or failed responses. This meant that when a streaming call hit max_output_tokens or encountered a server error, no span metadata was recorded.

Changes:

  • trace/contrib/openai/responses.go: Extend parseStreamingResponse to handle response.incomplete and response.failed events in addition to response.completed. Add "status" and "usage" to the metadata fields captured in handleResponseCompletedMessage, so token counts and terminal status are always visible in the Braintrust UI.
  • trace/contrib/openai/responses_test.go: Unit tests covering all three terminal event types, verifying status and usage appear in span metadata.
  • examples/openai/main.go: Update to use the Responses API (replacing the Chat Completions API), with explicit handling for completed, incomplete, and failed status.

Resolves #115
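The parsing change described above can be sketched as follows. This is a minimal illustration, not the actual `parseStreamingResponse` implementation: the function name `extractMetadata` is hypothetical, and the `{"response": {"status": ..., "usage": ...}}` payload shape is assumed from the OpenAI Responses API streaming event format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// terminalEvents lists the streaming event types that end a Responses API
// stream. Previously only response.completed was handled, so incomplete
// and failed streams recorded no span metadata.
var terminalEvents = map[string]bool{
	"response.completed":  true,
	"response.failed":     true,
	"response.incomplete": true,
}

// extractMetadata pulls "status" and "usage" out of a terminal event
// payload so they can be attached to the span. The payload shape
// ({"response": {...}}) is an assumption based on the Responses API docs.
func extractMetadata(msgType string, raw []byte) (map[string]any, bool) {
	if !terminalEvents[msgType] {
		return nil, false
	}
	var payload struct {
		Response map[string]json.RawMessage `json:"response"`
	}
	if err := json.Unmarshal(raw, &payload); err != nil {
		return nil, false
	}
	meta := map[string]any{}
	for _, field := range []string{"status", "usage"} {
		if v, ok := payload.Response[field]; ok {
			var decoded any
			if json.Unmarshal(v, &decoded) == nil {
				meta[field] = decoded
			}
		}
	}
	return meta, true
}

func main() {
	raw := []byte(`{"response":{"status":"incomplete","usage":{"output_tokens":16}}}`)
	meta, ok := extractMetadata("response.incomplete", raw)
	fmt.Println(ok, meta["status"]) // true incomplete
}
```

With a terminal event like `response.incomplete`, the token counts and terminal status become visible even though no `response.completed` event ever arrives.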

@homin-bt homin-bt marked this pull request as ready for review April 28, 2026 14:53
@homin-bt homin-bt changed the title Issue115 Handle incomplete/failed streaming responses and expose usage in metadata. Apr 28, 2026
Comment thread trace/contrib/openai/responses_test.go Outdated
func TestResponsesIncompleteStreaming(t *testing.T) {
rt, exporter := newTestResponsesTracer(t)

sseBody := `event: response.output_text.delta
Contributor


I think rather than having a strict unit test for parsing the SSE stream, we should try to make a real request to the server (using VCR, etc.) that reproduces this case (e.g. with a very low max_output_tokens). Look at the other tests to see the pattern to follow.

@@ -181,6 +180,7 @@ func (rt *responsesTracer) handleResponseCompletedMessage(span trace.Span, rawMs
metadataFields := []string{
Member


Now that we are collecting response.incomplete, I think we should add incomplete_details as a field here as well.

// parse the other messages too?
if msgType == "response.completed" {
switch msgType {
case "response.completed", "response.failed", "response.incomplete":
Member


For the response.failed case, can we capture the error somehow?
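One way the error could be captured (a sketch only: the helper name `extractError` is hypothetical, and the `{"response": {"error": {"code", "message"}}}` shape is an assumption based on the OpenAI Responses API streaming documentation for failed responses):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractError pulls the error code and message out of a response.failed
// event payload, if present, so it can be recorded as span metadata.
// The nested "response.error" shape is an assumption, not confirmed
// against the tracer's actual payloads.
func extractError(raw []byte) (string, bool) {
	var payload struct {
		Response struct {
			Error *struct {
				Code    string `json:"code"`
				Message string `json:"message"`
			} `json:"error"`
		} `json:"response"`
	}
	if err := json.Unmarshal(raw, &payload); err != nil || payload.Response.Error == nil {
		return "", false
	}
	return fmt.Sprintf("%s: %s", payload.Response.Error.Code, payload.Response.Error.Message), true
}

func main() {
	raw := []byte(`{"response":{"status":"failed","error":{"code":"server_error","message":"boom"}}}`)
	msg, ok := extractError(raw)
	fmt.Println(ok, msg) // true server_error: boom
}
```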



Development

Successfully merging this pull request may close these issues.

[bot] OpenAI Responses API streaming tracer ignores response.failed and response.incomplete terminal events

3 participants