Description
What happened?
When a Foundry-hosted agent invokes an MCP tool through a Foundry Toolbox connector (`FoundryChatClient.get_toolbox(...)`), the W3C trace context of the active OpenTelemetry span is not forwarded to the MCP server reached by the toolbox proxy. As a result, the span produced by the downstream MCP service (in our case a Logic Apps "OAuth Identity Passthrough" MCP server) starts a brand-new trace instead of continuing the agent's trace.
In Datadog APM and Datadog LLM Observability we see two disconnected traces:
- one trace for the Hosted Agent (`POST /responses`, `gen_ai.*` spans, etc.),
- a separate, unlinked trace on the Logic Apps side for the actual tool execution.
The same code path with a direct `MCPStreamableHTTPTool` (no toolbox proxy) propagates `traceparent`/`tracestate` correctly via `params._meta` of `tools/call`, as documented in `python/samples/02-agents/observability/README.md` and implemented in `python/packages/core/agent_framework/_mcp.py` (`_inject_otel_into_mcp_meta`). Switching the same agent from `tools=[MCPStreamableHTTPTool(...)]` to `tools=toolbox` is enough to lose the linkage.
The toolbox path is server-side: the agent process never opens the MCP connection itself. The Foundry platform proxy at `FOUNDRY_AGENT_TOOLSET_ENDPOINT` (the same component handled by the .NET `FoundryToolboxBearerTokenHandler`, see `dotnet/src/Microsoft.Agents.AI.Foundry.Hosting/FoundryToolboxBearerTokenHandler.cs`) only injects `Authorization` and `Foundry-Features` and does not appear to forward `traceparent`/`tracestate` headers (or inject them into the MCP `params._meta`) when calling the remote MCP server.
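For reference, the injection the direct path performs can be sketched in a few lines. This is an illustrative sketch, not the actual `_inject_otel_into_mcp_meta` implementation; the function name and parameters below are hypothetical:

```python
# Illustrative sketch (NOT the real _inject_otel_into_mcp_meta): serialize the
# active span's W3C trace context into the params._meta of a tools/call request.

def inject_trace_context(params: dict, trace_id: int, span_id: int, sampled: bool = True) -> dict:
    """Add a version-00 W3C traceparent to the MCP request's params._meta."""
    flags = 0x01 if sampled else 0x00
    # W3C format: version(2) - trace-id(32) - parent-id(16) - trace-flags(2), hex.
    traceparent = f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"
    params.setdefault("_meta", {})["traceparent"] = traceparent
    return params
```

It is exactly this step that appears to be missing on the toolbox proxy's outbound `tools/call`.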
What did you expect to happen?
The Foundry Toolbox proxy should propagate the active W3C trace context to the underlying MCP server, either:
- by forwarding the inbound `traceparent`/`tracestate` HTTP headers received from the agent process to the MCP server, and/or
- by injecting them into the MCP `params._meta` of `tools/call`, equivalent to what `_inject_otel_into_mcp_meta` does for direct MCP tools.
That way Datadog APM, Datadog LLM Observability, and Application Insights can show a single end-to-end trace: agent → toolbox proxy → MCP server, instead of two disconnected ones.
Steps to reproduce the issue
- Build a Foundry hosted agent with `agent-framework-foundry-hosting==1.0.0a260428`, `agent-framework-core==1.2.1`, `agent-framework-foundry==1.2.1`.
- Configure OpenTelemetry with the W3C trace context propagator (the default) and an OTLP/HTTP exporter to Datadog (or any backend with distributed tracing).
- Wire two variants of the same agent:
  - Variant A (works): `tools=[MCPStreamableHTTPTool(name="...", url=MCP_URL)]` pointing directly to the MCP server.
  - Variant B (broken): `toolbox = await client.get_toolbox(TOOLBOX_NAME); tools=toolbox` against a Foundry Toolbox connector that points to the same MCP server.
- Send a prompt that triggers the tool in both variants.
- In Datadog APM, observe:
  - Variant A: a single trace with the agent span as parent and the MCP server span as child.
  - Variant B: the agent span and the MCP server span end up in two unrelated traces; the MCP server's `traceparent` is freshly generated.
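A quick way to confirm the break mechanically is to compare the trace-id segment (the second dash-separated field) of the agent's outbound `traceparent` with the one the MCP server receives; `same_trace` below is a hypothetical helper, not part of any framework:

```python
def same_trace(agent_traceparent: str, server_traceparent: str) -> bool:
    """True when two W3C traceparent headers (version-traceid-parentid-flags)
    belong to the same distributed trace, i.e. share the trace-id field."""
    return agent_traceparent.split("-")[1] == server_traceparent.split("-")[1]

# Variant A: the server's span-id differs (it is a child span) but the
# trace-id matches -> linked. Variant B: the server-side trace-id is freshly
# generated -> unlinked.
```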
Code Sample
Same agent, two tool wirings. Only the `tools=` value differs.
```python
from agent_framework import MCPStreamableHTTPTool
from agent_framework_foundry import FoundryChatClient
from agent_framework_foundry_hosting import InMemoryResponseProvider, ResponsesHostServer
from azure.identity.aio import ManagedIdentityCredential

# Variant A — direct MCP tool, W3C context flows end-to-end.
mcp_tool = MCPStreamableHTTPTool(name="research_tools", url=MCP_URL)


async def main() -> None:
    async with ManagedIdentityCredential() as credential:
        chat_client = FoundryChatClient(
            project_endpoint=PROJECT_ENDPOINT,
            model=MODEL_DEPLOYMENT_NAME,
            credential=credential,
            allow_preview=True,
        )
        # Variant B — Foundry Toolbox proxy, W3C context is dropped at the proxy.
        toolbox = await chat_client.get_toolbox(TOOLBOX_NAME, version=TOOLBOX_VERSION)
        async with chat_client.as_agent(
            name="ea-ai-hosted-agent-baseline-python",
            instructions=SYSTEM_PROMPT,
            tools=toolbox,  # swap to [mcp_tool] to compare
            default_options={
                "store": False,
                "tool_choice": "required",
                "temperature": 0.2,
                "max_tokens": 2000,
            },
        ) as agent:
            server = ResponsesHostServer(agent, store=InMemoryResponseProvider())
            await server.run_async()
```
OpenTelemetry is configured at process start with the default W3C `TraceContextTextMapPropagator` and an OTLP/HTTP exporter to the Datadog agentless intake.
Error Messages / Stack Traces
There is no exception. The symptom is a missing parent/child link in the backend.
In Datadog APM, the agent's run shows a `gen_ai.*` span tree ending at the `mcp.tool.call` (or equivalent) span. A separate trace, with no parent, contains the Logic Apps spans for the actual MCP execution. The two carry the same wall-clock window and the same caller identity but no shared `trace_id`.
Inbound HTTP headers received by the MCP server confirm the issue: `traceparent` is either absent or contains a brand-new trace-id that does not match the agent's outbound `traceparent`.
Package Versions
- agent-framework-core: 1.2.1
- agent-framework-foundry: 1.2.1
- agent-framework-openai: 1.2.1
- agent-framework-foundry-hosting: 1.0.0a260428
- mcp: >=1.24,<2
- azure-ai-agentserver-core: 2.0.0b3
- azure-ai-agentserver-responses: 1.0.0b5
- opentelemetry-api: >=1.40,<1.41
- opentelemetry-sdk: >=1.40,<1.41
- opentelemetry-exporter-otlp-proto-http: >=1.40,<1.41
- azure-monitor-opentelemetry-exporter: >=1.0.0b51
Python Version
Python 3.12
Additional Context
- Backend used to detect the missing link: Datadog APM and Datadog LLM Observability. The same gap is observable in Application Insights end-to-end transaction view.
- The MCP server behind the toolbox is a Logic Apps "OAuth Identity Passthrough" workflow protected by Entra ID (auth code + PKCE). It logs the inbound `traceparent` header on every request, which is how we confirmed the value is not the one emitted by the agent.
- Direct MCP path works as advertised in `python/samples/02-agents/observability/README.md`: "Whenever there is an active OpenTelemetry span context, Agent Framework automatically propagates trace context to MCP servers via the `params._meta` field of `tools/call` requests." The toolbox path bypasses that injection because the `tools/call` is issued by the Foundry proxy, not by the agent process.
- Possible fix surfaces:
  - Have the toolbox proxy forward the `traceparent`/`tracestate` HTTP headers received from the agent to the downstream MCP server.
  - Or have it re-inject the trace context into the MCP `params._meta` from the inbound HTTP headers, mirroring what `_inject_otel_into_mcp_meta` does on the client side.
- Workaround we are using meanwhile: when full distributed tracing is required for a given environment, switch to the direct `MCPStreamableHTTPTool` path and forgo the toolbox-managed OAuth identity passthrough. This is not viable in production for OAuth-protected MCP servers.
- Related: ADR `docs/decisions/0025-foundry-toolbox-support.md` covers toolbox span enrichment but does not address span linking across the proxy boundary.
- Happy to share a sanitized HAR or OTel export from both variants privately if useful.