[bot] OpenAI Chat Completions streaming aggregation corrupts output when `n > 1` (multiple choices) #121

## Summary

The OpenAI Chat Completions streaming aggregator does not track choice indices from streamed chunks. When `n > 1` is set (requesting multiple completions), deltas from all choices are concatenated into a single output entry, producing garbled content. The non-streaming path correctly preserves all choices separately.

## What is missing

### 1. Streaming aggregation uses a single accumulator for all choice indices

In `trace/contrib/openai/chatcompletions.go`, `postprocessStreamingResults()` (lines 170–275) maintains one set of accumulators for all chunks:

```go
var role *string
var content string
var toolCalls []interface{}
var finishReason interface{}
```

Each streaming chunk contains a `choices` array, typically with a single element, whose `index` field identifies which logical choice the delta belongs to. The code processes only `choices[0]` from every chunk (line 183) and never reads the `index` field:

```go
if choiceMap, ok := choices[0].(map[string]any); ok {
    delta, ok := choiceMap["delta"].(map[string]any)
    // ...
    if deltaContent, ok := delta["content"].(string); ok {
        content += deltaContent  // all choice indices mixed together
    }
    // ...
}
```

When `n=2`, chunks for the two choices can arrive interleaved, for example:
```
choices: [{"index": 0, "delta": {"content": "Hello"}}]
choices: [{"index": 1, "delta": {"content": "Goodbye"}}]
choices: [{"index": 0, "delta": {"content": " world"}}]
choices: [{"index": 1, "delta": {"content": " world"}}]
```

The aggregated `content` becomes `"HelloGoodbye world world"`: the deltas of both choices concatenated in arrival order, instead of `"Hello world"` for index 0 and `"Goodbye world"` for index 1.
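One possible direction for a fix is to key the accumulators by choice index. The following is a minimal sketch, not the repository's code: `accumulateContent`, `choiceIndex`, and `contentChunk` are hypothetical names, chunks are assumed to be available as `map[string]any` values, and only `content` is handled (role, tool calls, and finish reason would need the same per-index treatment).

```go
package main

import "fmt"

// choiceIndex reads the "index" field of a streamed choice.
// Values decoded by encoding/json arrive as float64; int and int64 are
// accepted so hand-built maps work too. A missing index is treated as 0
// (an assumption, chosen to match the current single-choice behavior).
func choiceIndex(choice map[string]any) int {
	switch v := choice["index"].(type) {
	case float64:
		return int(v)
	case int:
		return v
	case int64:
		return int(v)
	}
	return 0
}

// accumulateContent is a hypothetical helper that concatenates delta
// content separately for each choice index.
func accumulateContent(chunks []map[string]any) map[int]string {
	contents := map[int]string{}
	for _, ch := range chunks {
		choices, _ := ch["choices"].([]any)
		for _, c := range choices { // visit every element, not only choices[0]
			choice, ok := c.(map[string]any)
			if !ok {
				continue
			}
			delta, ok := choice["delta"].(map[string]any)
			if !ok {
				continue
			}
			if s, ok := delta["content"].(string); ok {
				contents[choiceIndex(choice)] += s
			}
		}
	}
	return contents
}

// contentChunk builds one streamed chunk carrying a single content delta.
func contentChunk(index int, content string) map[string]any {
	return map[string]any{
		"choices": []any{
			map[string]any{"index": index, "delta": map[string]any{"content": content}},
		},
	}
}

func main() {
	got := accumulateContent([]map[string]any{
		contentChunk(0, "Hello"),
		contentChunk(1, "Goodbye"),
		contentChunk(0, " world"),
		contentChunk(1, " world"),
	})
	fmt.Printf("%q\n", got[0]) // "Hello world"
	fmt.Printf("%q\n", got[1]) // "Goodbye world"
}
```

Iterating over the whole `choices` slice rather than only `choices[0]` also covers chunks that carry more than one choice or an empty `choices` array.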

### 2. Output hardcodes index 0

The final output (lines 263–274) always returns a single choice with hardcoded `"index": 0`:

```go
return []map[string]interface{}{
    {
        "index": 0,
        "message": map[string]interface{}{
            "role":       finalRole,
            "content":    content,        // garbled when n > 1
            "tool_calls": finalToolCalls, // also garbled when n > 1
        },
        "logprobs":      nil,
        "finish_reason": finishReason,
    },
}
```

### 3. Non-streaming path works correctly

`handleChatCompletionResponse()` (lines 290–328) passes the full `choices` array from the API response as-is, correctly preserving all `n` choices with their separate `content`, `tool_calls`, and `finish_reason`.

### 4. The `n` parameter is captured in request metadata

The metadata fields list (line 58) includes `"n"`, confirming this is a recognized request parameter. The streaming output should faithfully represent the response for all supported request configurations.

## Braintrust docs status

Braintrust docs describe capturing "inputs, outputs, model parameters, token usage, and costs" for LLM calls but do not specifically address multi-choice (`n > 1`) output handling. Status: **unclear**.

## Upstream sources

- OpenAI Chat Completions API reference: https://platform.openai.com/docs/api-reference/chat/create — documents `n` parameter: "How many chat completion choices to generate for each input message"
- OpenAI streaming format: each SSE chunk contains `choices[].index` identifying which completion the delta belongs to
- `openai-go` SDK v1.12.0 (used by this repo) supports `N` parameter in `ChatCompletionNewParams`

## Braintrust docs sources

- https://www.braintrust.dev/docs/guides/tracing (general tracing overview)
- https://www.braintrust.dev/docs/integrations/ai-providers/openai (OpenAI integration)

## Local repo files inspected

- `trace/contrib/openai/chatcompletions.go` — `postprocessStreamingResults()` lines 170–275: single accumulator set for all choices, `choices[0]` processed without checking `index` field; output hardcodes `"index": 0` at line 265
- `trace/contrib/openai/chatcompletions.go` — `handleChatCompletionResponse()` lines 290–328: non-streaming path correctly passes all choices as-is
- `trace/contrib/openai/chatcompletions.go` — metadata fields line 58: `"n"` is captured in request metadata
- `trace/contrib/openai/traceopenai_test.go` — no streaming test with `n > 1`
