
[bot] Anthropic streaming tracer never flushes updated metadata to span (stop_reason, resolved model lost) #120

@braintrust-bot

Summary

The Anthropic Messages streaming tracer updates mt.metadata with stop_reason during stream processing but never writes the updated metadata back to the span. Additionally, the resolved model name and message id from the message_start event are not captured at all. The non-streaming path correctly updates and flushes metadata with all response fields.

This means Anthropic streaming spans have metadata frozen at request time — missing stop_reason (which indicates why the model stopped: end_turn, max_tokens, tool_use, etc.) and the resolved model version (e.g., "claude-haiku-4-5-20251001" instead of the alias sent in the request).

What is missing

1. Streaming path never writes updated metadata to span

In trace/contrib/anthropic/messages.go, parseStreamingResponse (lines 117–199) sets braintrust.output_json and braintrust.metrics but never calls SetJSONAttr(span, "braintrust.metadata", ...):

func (mt *messagesTracer) parseStreamingResponse(span trace.Span, body io.Reader) error {
    // ... scan events ...

    output := mt.postprocessStreamingResults(allResults)
    if len(output) > 0 {
        internal.SetJSONAttr(span, "braintrust.output_json", output)  // ✅ set
    }

    // ... metrics ...
    internal.SetJSONAttr(span, "braintrust.metrics", metrics)  // ✅ set

    // ❌ braintrust.metadata is NEVER updated after StartSpan
    return scanner.Err()
}

Meanwhile, postprocessStreamingResults (line 315) updates mt.metadata["stop_reason"] but this change is orphaned — never written to the span.
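One possible shape for the fix, sketched in isolation: after the stop reason has been recorded in the metadata map, write the map to the span just as output and metrics are written. The jsonAttrSetter interface, the recordingSpan type, and finishStream are hypothetical stand-ins so the sketch compiles without the repo; they are not the real trace.Span or internal.SetJSONAttr. Only the braintrust.metadata key and the stop_reason field come from the code above.

```go
package main

import "encoding/json"

// jsonAttrSetter is a hypothetical stand-in for the span plus
// internal.SetJSONAttr. It is NOT the real API.
type jsonAttrSetter interface {
	SetJSONAttr(key string, value any) error
}

// recordingSpan (hypothetical) stores every attribute as a JSON string
// so the flushed value can be inspected afterwards.
type recordingSpan struct {
	attrs map[string]string
}

func (s *recordingSpan) SetJSONAttr(key string, value any) error {
	b, err := json.Marshal(value)
	if err != nil {
		return err
	}
	if s.attrs == nil {
		s.attrs = map[string]string{}
	}
	s.attrs[key] = string(b)
	return nil
}

// finishStream (hypothetical) mirrors the tail of a streaming parser:
// it records the stop reason and then flushes the whole metadata map,
// which is the step the issue reports as missing.
func finishStream(span jsonAttrSetter, metadata map[string]any, stopReason string) error {
	metadata["stop_reason"] = stopReason
	return span.SetJSONAttr("braintrust.metadata", metadata)
}
```

Calling finishStream with a recordingSpan and a metadata map leaves the serialized map, including stop_reason, under the braintrust.metadata key. In the real tracer the equivalent would presumably be a single SetJSONAttr call before the return in parseStreamingResponse.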

2. Non-streaming path flushes metadata correctly

handleMessageResponse (lines 360–409) updates metadata and writes it:

mt.metadata["stop_reason"] = rawMsg["stop_reason"]
mt.metadata["stop_sequence"] = rawMsg["stop_sequence"]
mt.metadata["model"] = rawMsg["model"]
internal.SetJSONAttr(span, "braintrust.metadata", mt.metadata)  // ✅ flushed

3. message_start event carries model and id but neither is captured

VCR cassettes confirm message_start includes the resolved model:

{"type":"message_start","message":{"model":"claude-haiku-4-5-20251001","id":"msg_01XaNC...",...}}

The current message_start handler (lines 153–162) only extracts usage:

case "message_start":
    if message, ok := chunk["message"].(map[string]any); ok {
        if curUsage, ok := message["usage"].(map[string]any); ok {
            // only usage is captured
        }
    }

The resolved model name (which may differ from the request alias) and the message id are ignored.
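A minimal sketch of what capturing these fields could look like, assuming the chunk has already been decoded into a map[string]any as in the handler above. The helper names captureMessageStart and decodeChunk and the metadata key "message_id" are assumptions for illustration; the issue does not say which key the id should be stored under.

```go
package main

import "encoding/json"

// decodeChunk (hypothetical helper) parses one streaming event payload
// into a generic map, matching the chunk shape used by the handler.
func decodeChunk(data string) (map[string]any, error) {
	var chunk map[string]any
	if err := json.Unmarshal([]byte(data), &chunk); err != nil {
		return nil, err
	}
	return chunk, nil
}

// captureMessageStart (hypothetical helper) copies the resolved model and
// the message id from a message_start chunk into metadata. Missing or
// non-string fields are skipped, so the request-time values survive.
func captureMessageStart(chunk map[string]any, metadata map[string]any) {
	message, ok := chunk["message"].(map[string]any)
	if !ok {
		return
	}
	if model, ok := message["model"].(string); ok && model != "" {
		metadata["model"] = model // overwrites the request alias
	}
	if id, ok := message["id"].(string); ok && id != "" {
		metadata["message_id"] = id // key name is an assumption
	}
}
```

Overwriting metadata["model"] matches what the non-streaming path does with rawMsg["model"]. Whether the request alias should also be preserved under a separate key is a design choice this sketch leaves open.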

4. Impact

| Field | Non-streaming | Streaming |
| --- | --- | --- |
| stop_reason | ✅ In metadata | ❌ Computed but never written to span |
| Resolved model | ✅ Updated from response | ❌ Only request-time alias |
| stop_sequence | ✅ In metadata | ❌ Not captured |
| Message id | ❌ Not captured | ❌ Not captured |

stop_reason is critical for understanding model behavior — it distinguishes end_turn (natural completion), max_tokens (truncated), and tool_use (model wants to call a tool). Without it, users cannot differentiate truncated responses from complete ones in streaming traces.
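To illustrate the distinction, here is a hedged sketch that assumes the streaming message_delta event carries stop_reason and stop_sequence under its delta object, as described in Anthropic's streaming documentation. The helper names captureMessageDelta and wasTruncated are hypothetical and do not exist in the repo.

```go
package main

// captureMessageDelta (hypothetical helper) copies stop_reason and
// stop_sequence from a decoded message_delta chunk into metadata.
// Null or missing fields are skipped rather than stored.
func captureMessageDelta(chunk map[string]any, metadata map[string]any) {
	delta, ok := chunk["delta"].(map[string]any)
	if !ok {
		return
	}
	if reason, ok := delta["stop_reason"].(string); ok {
		metadata["stop_reason"] = reason
	}
	if seq, ok := delta["stop_sequence"].(string); ok {
		metadata["stop_sequence"] = seq
	}
}

// wasTruncated (hypothetical helper) reports whether the metadata marks
// a response that stopped because it hit the token limit.
func wasTruncated(metadata map[string]any) bool {
	return metadata["stop_reason"] == "max_tokens"
}
```

A check like wasTruncated only works on streaming traces once the metadata map is actually written to the span; with the current behavior the field never reaches it.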

5. Comparable integrations

  • OpenAI Responses API streaming: correctly flushes metadata after processing response.completed event (handleResponseCompletedMessage calls SetJSONAttr(span, "braintrust.metadata", ...))
  • OpenAI Chat Completions streaming: has the same metadata flush gap (covered by issue #108, "[bot] OpenAI Chat Completions streaming tracer does not capture response-level metadata")
  • Bedrock ConverseStream: correctly sets metadata in finalize() via setJSONAttr(o.log, o.span, "braintrust.metadata", o.metadata)

Braintrust docs status

The Braintrust advanced tracing docs specify that model and similar settings belong in metadata. The Anthropic integration docs describe automatic tracing of "all API calls." The streaming metadata gap is not documented. Status: unclear.

Upstream sources

Braintrust docs sources

Local repo files inspected

  • trace/contrib/anthropic/messages.go: parseStreamingResponse() lines 117–199 sets output_json and metrics but never flushes metadata; postprocessStreamingResults() line 315 updates mt.metadata["stop_reason"] but the update is orphaned; handleMessageResponse() lines 360–409 is the non-streaming path and correctly flushes metadata
  • trace/contrib/anthropic/messages.go: message_start handler lines 153–162 only extracts usage, ignores model and id
  • trace/contrib/anthropic/testdata/cassettes/TestStreamingWithCitations.yaml line 52: confirms message_start contains resolved model
  • trace/contrib/openai/responses.go: handleResponseCompletedMessage() is a reference showing correct metadata flush in streaming
  • trace/contrib/bedrockruntime/stream.go: finalize() is a reference showing correct metadata flush in streaming
