Summary
The Mistral Conversations API (`client.beta.conversations.start()`, `.append()`, etc.) was recently instrumented (per #273), but server-side tool executions within conversation responses are not decomposed into child `SpanTypeAttribute.TOOL` spans. Tool execution entries (code interpreter, web search, image generation, document library) appear as opaque entries in the span's output array, without individual tool spans that would allow users to drill into each execution.
This is an asymmetry within the Mistral integration itself: the chat and agents paths now create child TOOL spans via `_log_completion_tool_spans()` (per #378), but the conversations finalization path (`_finalize_conversation_response()`) does not call any tool span logic.
What is missing
`_finalize_conversation_response()` at line 1034 of `py/src/braintrust/integrations/mistral/tracing.py` logs the conversation output and ends the span, but never calls `_log_completion_tool_spans()` or an equivalent:
```python
def _finalize_conversation_response(span, request_metadata, response, start_time):
    response_data = _normalized_mistral_dict(response)
    response_metadata = _conversation_response_data_to_metadata(response_data)
    usage = response_data.get("usage") if response_data else None
    _log_and_end_span(
        span,
        output=_conversation_outputs_data(response_data),  # Full outputs array — tool executions are opaque
        metrics=_merge_metrics(start_time, usage),
        metadata={**request_metadata, **response_metadata},
    )
```
Contrast with the chat/agents paths (lines 1025 and 1097), which call `_log_completion_tool_spans(response_data, parent_span=span)` before `_log_and_end_span()`.
Conversation outputs include tool executions
The Mistral Conversations API returns an `outputs` array containing mixed entry types:
- Message entries (assistant text responses)
- Tool execution entries (code interpreter output, web search results, image generation results)
- Function call entries (custom tool invocations)
- Agent handoff entries
Each tool execution entry in `outputs` has a type (e.g., `tool_execution`), the tool name, input/output, and status. These should be decomposed into child TOOL spans, matching how the chat/agents paths now work.
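The decomposition step can be sketched as a pure filter over the `outputs` array. The entry shapes and type strings below are assumptions based on the description above, not the exact Mistral response schema:

```python
# Sketch: pick out the entries in a conversation `outputs` array that should
# become child TOOL spans. Field names ("type", "name", "input", "output",
# "status") are assumptions, not the verified Mistral schema.

def extract_tool_executions(outputs):
    """Return the outputs entries that should become child TOOL spans."""
    tool_entries = []
    for entry in outputs or []:
        # Tool executions and custom function calls both warrant tool spans;
        # message and handoff entries stay in the parent span's output.
        if entry.get("type") in ("tool_execution", "function.call"):
            tool_entries.append(
                {
                    "name": entry.get("name"),
                    "input": entry.get("arguments") or entry.get("input"),
                    "output": entry.get("output"),
                    "status": entry.get("status"),
                }
            )
    return tool_entries


outputs = [
    {"type": "message.output", "content": "Here is the result."},
    {"type": "tool_execution", "name": "code_interpreter",
     "input": "print(2 + 2)", "output": "4", "status": "completed"},
]
print(extract_tool_executions(outputs))
```

Keeping the extraction separate from span creation makes it easy to unit-test against recorded response fixtures.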
Comparison within the Mistral integration
| Mistral API surface | Tool calls in output? | Child TOOL spans? |
| --- | --- | --- |
| `client.chat.complete()` / `.stream()` | Yes | Yes (via `_log_completion_tool_spans`) |
| `client.agents.complete()` / `.stream()` | Yes | Yes (via `_log_completion_tool_spans`) |
| `client.beta.conversations.start()` / `.append()` | Yes (in `outputs` array) | No |
Comparison with other providers' agentic surfaces
| Provider | Agentic surface | Tool span decomposition? |
| --- | --- | --- |
| OpenAI (Responses API) | `responses.create()` | Yes |
| Anthropic (Managed Agents) | `beta.sessions.events.stream()` | Yes |
| Google GenAI (Interactions) | `interactions.create()` | Yes |
| Mistral (Conversations) | `beta.conversations.start()` | No |
Minimum fix
- Add a `_log_conversation_tool_spans()` function (or adapt `_log_completion_tool_spans()`) that iterates over conversation `outputs` entries and creates child TOOL spans for tool execution entries
- Call it from `_finalize_conversation_response()` before `_log_and_end_span()`
- Apply the same logic to the streaming conversation aggregation path
- Add VCR-backed test for a conversation with server-side tool execution
Braintrust docs status
not_found — The Mistral integration page does not mention the Conversations API or tool span decomposition for conversations.
Upstream sources
Local files inspected
`py/src/braintrust/integrations/mistral/tracing.py`:
- `_finalize_conversation_response()` (line 1034) — does NOT call `_log_completion_tool_spans()` or equivalent
- `_log_completion_tool_spans()` (line 990) — exists and works for chat/agents; not called for conversations
- `_conversation_outputs_data()` (line 459) — returns the raw `outputs` array without tool span extraction
- `_aggregate_conversation_events()` (line 919) — streaming aggregation; no tool span logic

`py/src/braintrust/integrations/mistral/test_mistral.py`:
- `test_wrap_mistral_chat_complete_tool_spans` (line 249) — validates chat tool spans exist
- No equivalent test for conversations tool spans