Commit 6e57489

.Net: Add OpenTelemetry Support and Samples (#182)
* Adding sample and implementation similar to MEAI approach
* Add Telemetry UnitTests
* Fix Async suffix
* Add ADR with the proposal
* Address merge changes
* Fixing const visibility + coverage
* Increase test coverage, add metrics collection code paths
* Fix warnings
* WIP
* Convention adherence
* Add gen-ai.system logic + UT
* Add convention reference
* Address PR comments
* Addressing PR comments, Agent name optional
* Remove constant
* Update dotnet/src/Microsoft.Extensions.AI.Agents/ChatCompletion/ChatClientAgent.cs
  Co-authored-by: westey <164392973+westey-m@users.noreply.github.com>
* GetLoggingName

---------

Co-authored-by: westey <164392973+westey-m@users.noreply.github.com>
1 parent dd32a6d commit 6e57489

12 files changed

Lines changed: 1896 additions & 6 deletions
Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
---
status: proposed
contact: rogerbarreto
date: 2025-07-14
deciders: stephentoub, markwallace-microsoft, rogerbarreto, westey-m
informed: {}
---

# Agent OpenTelemetry Instrumentation

## Context and Problem Statement

Currently, the Agent Framework lacks comprehensive observability and telemetry capabilities, making it difficult for developers to monitor agent performance, track usage patterns, debug issues, and gain insights into agent behavior in production environments. While the underlying ChatClient implementations may have their own telemetry, there is no standardized way to capture agent-specific metrics and traces that provide visibility into agent operations, token usage, response times, and error patterns at the agent abstraction level.

## Decision Drivers

- **Compliance**: The implementation should adhere to established OpenTelemetry semantic conventions for agents, ensuring consistency and interoperability with existing telemetry systems.
- **Observability Requirements**: Developers need comprehensive telemetry to monitor agent performance, track usage patterns, and debug issues in production environments.
- **Standardization**: The solution must follow established OpenTelemetry semantic conventions and integrate seamlessly with existing .NET telemetry infrastructure.
- **Microsoft.Extensions.AI Alignment**: The implementation should follow the exact patterns and conventions established by Microsoft.Extensions.AI's OpenTelemetry instrumentation.
- **Non-Intrusive Design**: Telemetry should be optional and not impact the core agent functionality or performance when disabled.
- **Agent-Level Insights**: The telemetry should capture agent-specific operations without duplicating underlying ChatClient telemetry.
- **Extensibility**: The solution should support future enhancements and additional telemetry scenarios.

## Considered Options

### Option 1: Direct Integration into Core Agent Classes

Embed OpenTelemetry instrumentation directly into the base `Agent` class and `ChatClientAgent` implementations.

#### Pros
- Automatic telemetry for all agent implementations
- No additional wrapper classes needed
- Consistent telemetry across all agents

#### Cons
- Violates single responsibility principle
- Increases complexity of core agent classes
- Makes telemetry mandatory rather than optional
- Harder to test and maintain
- Couples telemetry concerns with business logic

### Option 2: Aspect-Oriented Programming (AOP) Approach

Use interceptors or AOP frameworks to inject telemetry behavior into agent methods.

#### Pros
- Clean separation of concerns
- Non-intrusive to existing code
- Can be applied selectively

#### Cons
- Adds complexity with AOP framework dependencies
- Runtime overhead for interception
- Harder to debug and understand
- Not consistent with Microsoft.Extensions.AI patterns

### Option 3: OpenTelemetryAgent Wrapper Pattern

Create a delegating `OpenTelemetryAgent` wrapper class that implements the `Agent` interface and wraps any existing agent with telemetry instrumentation, following the exact pattern of Microsoft.Extensions.AI's `OpenTelemetryChatClient`.

#### Pros
- Follows established Microsoft.Extensions.AI patterns exactly
- Clean separation of concerns
- Optional and non-intrusive
- Easy to test and maintain
- Consistent with .NET telemetry conventions
- Supports any agent implementation
- Provides agent-level telemetry without duplicating ChatClient telemetry

#### Cons
- Requires explicit wrapping of agents
- Additional object allocation for wrapper

## Decision Outcome

Chosen option: "OpenTelemetryAgent Wrapper Pattern", because it follows the established Microsoft.Extensions.AI patterns exactly, provides clean separation of concerns, maintains optional telemetry, and offers the best balance of functionality, maintainability, and consistency with existing .NET telemetry infrastructure.
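
To illustrate the delegating wrapper idea, here is a minimal sketch. It does not reproduce the real `OpenTelemetryAgent`: `ISketchAgent`, `EchoAgent`, `TracingAgent`, and the source name `Sketch.Agents` are hypothetical stand-ins, and only `System.Diagnostics` APIs are used.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Usage: wrap any agent; callers keep using the same interface.
using var agent = new TracingAgent(new EchoAgent(), sourceName: "Sketch.Agents");
Console.WriteLine(await agent.RunAsync("hello")); // prints "echo: hello"

// Hypothetical stand-in for the framework's agent abstraction.
public interface ISketchAgent
{
    string Name { get; }
    Task<string> RunAsync(string input);
}

// Trivial inner agent used only for demonstration.
public sealed class EchoAgent : ISketchAgent
{
    public string Name => "echo";
    public Task<string> RunAsync(string input) => Task.FromResult($"echo: {input}");
}

// Delegating wrapper: starts a span around the inner call and forwards the rest.
public sealed class TracingAgent : ISketchAgent, IDisposable
{
    private readonly ISketchAgent _inner;
    private readonly ActivitySource _source;

    public TracingAgent(ISketchAgent inner, string sourceName)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _source = new ActivitySource(sourceName);
    }

    public string Name => _inner.Name;

    public async Task<string> RunAsync(string input)
    {
        // StartActivity returns null when no listener is subscribed to the source.
        using var activity = _source.StartActivity("agent.run", ActivityKind.Client);
        activity?.SetTag("agent.request.name", _inner.Name);
        try
        {
            return await _inner.RunAsync(input);
        }
        catch (Exception ex)
        {
            // Record the failure on the span, then let the caller see the exception.
            activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
            throw;
        }
    }

    public void Dispose() => _source.Dispose();
}
```

Because the inner agent is untouched, telemetry stays optional: code that never wraps an agent never creates an `ActivitySource`, which is the trade-off the "explicit wrapping" con refers to.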

### Implementation Details

The implementation includes:

1. **OpenTelemetryAgent Wrapper Class**: A delegating agent that wraps any `Agent` implementation with telemetry instrumentation
2. **AgentOpenTelemetryConsts**: Comprehensive constants for telemetry attribute names and metric definitions
3. **Extension Methods**: `.WithOpenTelemetry()` extension method for easy agent wrapping
4. **Comprehensive Test Suite**: Full test coverage following Microsoft.Extensions.AI testing patterns

### Telemetry Data Captured

**Activities/Spans:**
- `agent.operation.name` (agent.run, agent.run_streaming)
- `agent.request.id`, `agent.request.name`, `agent.request.instructions`
- `agent.request.message_count`, `agent.request.thread_id`
- `agent.response.id`, `agent.response.message_count`, `agent.response.finish_reason`
- `agent.usage.input_tokens`, `agent.usage.output_tokens`
- Error information and activity status codes
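
The following sketch shows how a few of the attributes listed above could be attached to an `Activity`. The attribute names come from the list; the `SpanTags` helper, its parameters, and the sample values are assumptions made for illustration and are not taken from the implementation.

```csharp
using System;
using System.Diagnostics;

// An Activity can be started directly, so the sketch needs no listener or exporter.
var activity = new Activity("agent.run");
activity.Start();
SpanTags.ApplyRequest(activity, agentName: "TelemetryAgent", messageCount: 1, threadId: null);
SpanTags.ApplyUsage(activity, inputTokens: 12, outputTokens: 34);
activity.Stop();

foreach (var tag in activity.TagObjects)
{
    Console.WriteLine($"{tag.Key} = {tag.Value}");
}

// Hypothetical helper; the real attribute-setting code may be organized differently.
public static class SpanTags
{
    public static void ApplyRequest(Activity? activity, string? agentName, int messageCount, string? threadId)
    {
        if (activity is null) return; // no listener, so there is nothing to tag

        activity.SetTag("agent.operation.name", activity.OperationName);
        if (agentName is not null) activity.SetTag("agent.request.name", agentName);
        activity.SetTag("agent.request.message_count", messageCount);
        if (threadId is not null) activity.SetTag("agent.request.thread_id", threadId);
    }

    public static void ApplyUsage(Activity? activity, long? inputTokens, long? outputTokens)
    {
        if (activity is null) return;

        // Token counts are only recorded when the provider reported them.
        if (inputTokens is long input) activity.SetTag("agent.usage.input_tokens", input);
        if (outputTokens is long output) activity.SetTag("agent.usage.output_tokens", output);
    }
}
```

Optional values are skipped instead of being written as empty tags, which keeps spans small when a provider omits usage data or the agent has no name.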

**Metrics:**
- Operation duration histogram with proper buckets
- Token usage histogram (input/output tokens)
- Request count counter
- All metrics tagged with operation type and agent name
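
A rough sketch of the three instruments described above, built on `System.Diagnostics.Metrics`. The instrument names, units, the `sketch.token.type` tag, and the `AgentMetrics` class are hypothetical; the source does not list the actual metric names, and bucket configuration is not covered here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

using var meter = new Meter("Sketch.Agents");
var metrics = new AgentMetrics(meter);

// A MeterListener prints measurements so the sketch is observable without an exporter.
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Sketch.Agents") l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
    Console.WriteLine($"{instrument.Name} = {value}"));
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
    Console.WriteLine($"{instrument.Name} = {value}"));
listener.Start();

metrics.Record("agent.run", "TelemetryAgent", TimeSpan.FromMilliseconds(250), inputTokens: 12, outputTokens: 34);

// Hypothetical metrics holder: one duration histogram, one token histogram, one counter.
public sealed class AgentMetrics
{
    private readonly Histogram<double> _duration;
    private readonly Histogram<long> _tokens;
    private readonly Counter<long> _requests;

    public AgentMetrics(Meter meter)
    {
        _duration = meter.CreateHistogram<double>("sketch.agent.operation.duration", unit: "s");
        _tokens = meter.CreateHistogram<long>("sketch.agent.token.usage", unit: "{token}");
        _requests = meter.CreateCounter<long>("sketch.agent.request.count");
    }

    public void Record(string operation, string agentName, TimeSpan elapsed, long inputTokens, long outputTokens)
    {
        // Every measurement carries the operation type and the agent name.
        var op = new KeyValuePair<string, object?>("agent.operation.name", operation);
        var name = new KeyValuePair<string, object?>("agent.request.name", agentName);

        _requests.Add(1, op, name);
        _duration.Record(elapsed.TotalSeconds, op, name);
        _tokens.Record(inputTokens, op, name, new KeyValuePair<string, object?>("sketch.token.type", "input"));
        _tokens.Record(outputTokens, op, name, new KeyValuePair<string, object?>("sketch.token.type", "output"));
    }
}
```

Recording input and output tokens on one histogram, separated by a tag, is one plausible reading of "Token usage histogram (input/output tokens)"; two separate histograms would work as well.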

### Consequences

- **Good**: Provides comprehensive agent-level observability following established patterns
- **Good**: Non-intrusive and optional implementation that doesn't affect core functionality
- **Good**: Consistent with Microsoft.Extensions.AI telemetry conventions
- **Good**: Easy to integrate with existing OpenTelemetry infrastructure
- **Good**: Supports debugging, monitoring, and performance analysis
- **Neutral**: Requires explicit wrapping of agents with `.WithOpenTelemetry()`
- **Neutral**: Additional object allocation for telemetry wrapper

## Validation

The implementation is validated through:

1. **Comprehensive Unit Tests**: 16 test methods covering all scenarios including success, error, streaming, and edge cases
2. **Integration Testing**: Step05 telemetry sample demonstrating real-world usage
3. **Pattern Compliance**: Exact adherence to Microsoft.Extensions.AI OpenTelemetry patterns
4. **Semantic Convention Compliance**: Follows OpenTelemetry semantic conventions for telemetry data
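
One way a unit test can observe emitted spans is to subscribe an `ActivityListener` and collect stopped activities. The sketch below assumes that approach; the actual tests may rely on the `OpenTelemetry.Exporter.InMemory` package instead, and `SpanCapture` and the source name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

var captured = SpanCapture.Run("Sketch.Agents", source =>
{
    using var activity = source.StartActivity("agent.run");
    activity?.SetTag("agent.request.name", "TelemetryAgent");
});

Console.WriteLine(captured.Count);            // prints "1"
Console.WriteLine(captured[0].OperationName); // prints "agent.run"

// Hypothetical test helper: runs a delegate and returns the spans it produced.
public static class SpanCapture
{
    public static List<Activity> Run(string sourceName, Action<ActivitySource> body)
    {
        var stopped = new List<Activity>();

        using var listener = new ActivityListener
        {
            // Only listen to the source under test.
            ShouldListenTo = s => s.Name == sourceName,
            // Sample everything so StartActivity never returns null for this source.
            Sample = (ref ActivityCreationOptions<ActivityContext> options) => ActivitySamplingResult.AllData,
            ActivityStopped = stopped.Add,
        };
        ActivitySource.AddActivityListener(listener);

        using var source = new ActivitySource(sourceName);
        body(source);
        return stopped;
    }
}
```

Assertions can then inspect `TagObjects`, `Status`, and `Duration` on each captured activity to cover the success, error, and streaming paths.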

## More Information

### Usage Example

```csharp
// Create TracerProvider
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .AddSource(AgentOpenTelemetryConsts.DefaultSourceName)
    .AddConsoleExporter()
    .Build();

// Create and wrap agent with telemetry
var baseAgent = new ChatClientAgent(chatClient, options);
using var telemetryAgent = baseAgent.WithOpenTelemetry();

// Use agent normally - telemetry is captured automatically
var response = await telemetryAgent.RunAsync(messages);
```

### Integration with AppContext Switch

The implementation integrates with the standard .NET telemetry enablement pattern:

```csharp
AppContext.SetSwitch("Microsoft.Extensions.AI.Agents.EnableTelemetry", true);
```
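
For context, a switch set this way can be read back with `AppContext.TryGetSwitch`. The sketch below shows that lookup; the `TelemetrySwitch` class is hypothetical, and how the library itself consumes the switch is not shown in this change.

```csharp
using System;

AppContext.SetSwitch(TelemetrySwitch.Name, true);
Console.WriteLine(TelemetrySwitch.IsEnabled); // prints "True"

// Hypothetical helper that reads the switch named in the ADR.
public static class TelemetrySwitch
{
    public const string Name = "Microsoft.Extensions.AI.Agents.EnableTelemetry";

    // TryGetSwitch returns false when the switch was never set, so the default is "disabled".
    public static bool IsEnabled => AppContext.TryGetSwitch(Name, out bool enabled) && enabled;
}
```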

### Relationship to Microsoft.Extensions.AI

This implementation follows the exact patterns established by Microsoft.Extensions.AI's OpenTelemetry instrumentation, ensuring consistency across the AI ecosystem and leveraging proven patterns for telemetry integration.

dotnet/Directory.Packages.props

Lines changed: 6 additions & 0 deletions
@@ -15,6 +15,11 @@
     <PackageVersion Include="System.Diagnostics.DiagnosticSource" Version="9.0.7" />
     <PackageVersion Include="System.Threading.Channels" Version="9.0.7" />
     <PackageVersion Include="System.Threading.Tasks.Extensions" Version="4.6.3" />
+    <!-- OpenTelemetry -->
+    <PackageVersion Include="OpenTelemetry" Version="1.9.0" />
+    <PackageVersion Include="OpenTelemetry.Exporter.Console" Version="1.9.0" />
+    <PackageVersion Include="OpenTelemetry.Exporter.InMemory" Version="1.9.0" />
+    <PackageVersion Include="OpenTelemetry.Extensions.Hosting" Version="1.9.0" />
     <!-- Microsoft.Extensions.* -->
     <PackageVersion Include="Microsoft.Bcl.HashCode" Version="6.0.0" />
     <PackageVersion Include="Microsoft.Extensions.AI" Version="9.7.0" />
@@ -31,6 +36,7 @@
     <PackageVersion Include="Microsoft.Extensions.Hosting" Version="8.0.1" />
     <PackageVersion Include="Microsoft.Extensions.Logging" Version="8.0.1" />
     <PackageVersion Include="Microsoft.Extensions.Logging.Abstractions" Version="9.0.7" />
+    <PackageVersion Include="Microsoft.Extensions.Logging.Testing" Version="9.0.7" />
     <!-- Agent SDKs -->
     <PackageVersion Include="Microsoft.Agents.CopilotStudio.Client" Version="1.1.125-beta" />
     <!-- Identity -->

dotnet/agent-framework-dotnet.slnx

Lines changed: 9 additions & 0 deletions
@@ -51,6 +51,15 @@
     <File Path="../.github/workflows/dotnet-check-coverage.ps1" />
     <File Path="../.github/workflows/dotnet-format.yml" />
   </Folder>
+  <Folder Name="/Solution Items/docs/" />
+  <Folder Name="/Solution Items/docs/decisions/">
+    <File Path="../docs/decisions/0001-agent-run-response.md" />
+    <File Path="../docs/decisions/0001-agent-tools.md" />
+    <File Path="../docs/decisions/0002-agent-opentelemetry-instrumentation.md" />
+    <File Path="../docs/decisions/adr-short-template.md" />
+    <File Path="../docs/decisions/adr-template.md" />
+    <File Path="../docs/decisions/README.md" />
+  </Folder>
   <Folder Name="/Solution Items/eng/" />
   <Folder Name="/Solution Items/eng/MSBuild/">
     <File Path="eng/MSBuild/LegacySupport.props" />

dotnet/samples/GettingStarted/GettingStarted.csproj

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,10 @@
     <PackageReference Include="Microsoft.Extensions.Configuration.Json" />
     <PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" />
     <PackageReference Include="Microsoft.Extensions.Logging.Abstractions" />
+    <PackageReference Include="System.Diagnostics.DiagnosticSource" />
+    <PackageReference Include="OpenTelemetry" />
+    <PackageReference Include="OpenTelemetry.Exporter.Console" />
+    <PackageReference Include="OpenTelemetry.Extensions.Hosting" />
   </ItemGroup>
 
   <ItemGroup>
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
// Copyright (c) Microsoft. All rights reserved.

using Microsoft.Extensions.AI.Agents;
using OpenTelemetry;
using OpenTelemetry.Trace;

namespace Steps;

/// <summary>
/// Demonstrates how to use telemetry with <see cref="ChatClientAgent"/> using OpenTelemetry.
/// </summary>
public sealed class Step05_ChatClientAgent_Telemetry(ITestOutputHelper output) : AgentSample(output)
{
    /// <summary>
    /// Demonstrates OpenTelemetry tracing with Agent Framework.
    /// </summary>
    [Theory]
    [InlineData(ChatClientProviders.AzureAIAgentsPersistent)]
    [InlineData(ChatClientProviders.AzureOpenAI)]
    [InlineData(ChatClientProviders.OpenAIAssistant)]
    [InlineData(ChatClientProviders.OpenAIChatCompletion)]
    [InlineData(ChatClientProviders.OpenAIResponses)]
    public async Task RunWithTelemetry(ChatClientProviders provider)
    {
        // Enable telemetry
        AppContext.SetSwitch("Microsoft.Extensions.AI.Agents.EnableTelemetry", true);

        // Create TracerProvider with console exporter
        string sourceName = Guid.NewGuid().ToString();

        using var tracerProvider = Sdk.CreateTracerProviderBuilder()
            .AddSource(sourceName)
            .AddConsoleExporter()
            .Build();

        // Define agent
        var agentOptions = new ChatClientAgentOptions
        {
            Name = "TelemetryAgent",
            Instructions = "You are a helpful assistant.",
        };

        // Create the server-side agent Id when applicable (depending on the provider).
        agentOptions.Id = await base.AgentCreateAsync(provider, agentOptions);

        using var chatClient = base.GetChatClient(provider, agentOptions);
        var baseAgent = new ChatClientAgent(chatClient, agentOptions);

        // Wrap the agent with OpenTelemetry instrumentation
        using var agent = baseAgent.WithOpenTelemetry(sourceName: sourceName);
        var thread = agent.GetNewThread();

        // Run agent interactions
        await agent.RunAsync("What is artificial intelligence?", thread);
        await agent.RunAsync("How does machine learning work?", thread);

        // Clean up
        await base.AgentCleanUpAsync(provider, baseAgent, thread);
    }
}
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
// Copyright (c) Microsoft. All rights reserved.

namespace Microsoft.Extensions.AI.Agents;

/// <summary>
/// Extension methods for <see cref="Agent"/>.
/// </summary>
public static class AgentExtensions
{
    /// <summary>
    /// Wraps the agent with OpenTelemetry instrumentation.
    /// </summary>
    /// <param name="agent">The agent to wrap.</param>
    /// <param name="sourceName">An optional source name that will be used on the telemetry data.</param>
    /// <returns>An <see cref="OpenTelemetryAgent"/> that wraps the original agent with telemetry.</returns>
    public static OpenTelemetryAgent WithOpenTelemetry(this Agent agent, string? sourceName = null)
    {
        return new OpenTelemetryAgent(agent, sourceName);
    }
}
