Summary
The OpenAI Chat Completions streaming aggregator does not track choice indices from streamed chunks. When n > 1 is set (requesting multiple completions), deltas from all choices are concatenated into a single output entry, producing garbled content. The non-streaming path correctly preserves all choices separately.
What is missing
1. Streaming aggregation uses a single accumulator for all choice indices
In trace/contrib/openai/chatcompletions.go, postprocessStreamingResults() (lines 170–275) maintains one set of accumulators for all chunks:
var role *string
var content string
var toolCalls []interface{}
var finishReason interface{}
Each streaming chunk contains a choices array with a single element whose index field identifies which logical choice it belongs to. The code processes choices[0] from every chunk (line 183) without checking the index field:
if choiceMap, ok := choices[0].(map[string]any); ok {
    delta, ok := choiceMap["delta"].(map[string]any)
    // ...
    if deltaContent, ok := delta["content"].(string); ok {
        content += deltaContent // all choice indices mixed together
    }
}
When n=2, streaming chunks arrive interleaved:
choices: [{"index": 0, "delta": {"content": "Hello"}}]
choices: [{"index": 1, "delta": {"content": "Goodbye"}}]
choices: [{"index": 0, "delta": {"content": " world"}}]
choices: [{"index": 1, "delta": {"content": " world"}}]
The aggregated content becomes "HelloGoodbye world world" — garbled output from both choices concatenated.
2. Output hardcodes index 0
The final output (lines 263–274) always returns a single choice with hardcoded "index": 0:
return []map[string]interface{}{
    {
        "index": 0,
        "message": map[string]interface{}{
            "role":       finalRole,
            "content":    content,        // garbled when n > 1
            "tool_calls": finalToolCalls, // also garbled when n > 1
        },
        "logprobs":      nil,
        "finish_reason": finishReason,
    },
}
3. Non-streaming path works correctly
handleChatCompletionResponse() (lines 290–328) passes the full choices array from the API response as-is, correctly preserving all n choices with their separate content, tool_calls, and finish_reason.
4. The n parameter is captured in request metadata
The metadata fields list (line 58) includes "n", confirming this is a recognized request parameter. The streaming output should faithfully represent the response for all supported request configurations.
Braintrust docs status
Braintrust docs describe capturing "inputs, outputs, model parameters, token usage, and costs" for LLM calls but do not specifically address multi-choice (n > 1) output handling. Status: unclear.
Upstream sources
- OpenAI Chat Completions API reference: https://platform.openai.com/docs/api-reference/chat/create documents the n parameter: "How many chat completion choices to generate for each input message"
- OpenAI streaming format: each SSE chunk contains choices[].index identifying which completion the delta belongs to
- openai-go SDK v1.12.0 (used by this repo) supports the N parameter in ChatCompletionNewParams
Braintrust docs sources
Local repo files inspected
- trace/contrib/openai/chatcompletions.go — postprocessStreamingResults() lines 170–275: single accumulator set for all choices, choices[0] processed without checking the index field; output hardcodes "index": 0 at line 265
- trace/contrib/openai/chatcompletions.go — handleChatCompletionResponse() lines 290–328: non-streaming path correctly passes all choices as-is
- trace/contrib/openai/chatcompletions.go — metadata fields line 58: "n" is captured in request metadata
- trace/contrib/openai/traceopenai_test.go — no streaming test with n > 1