2 changes: 2 additions & 0 deletions .vscode/cspell.json
@@ -1472,6 +1472,8 @@
"FILLER",
"foundry",
"FOUNDRY",
"genai",
"GENAI",
"Unpooled",
"viseme",
"VISEME",
11 changes: 11 additions & 0 deletions sdk/voicelive/azure-ai-voicelive/CHANGELOG.md
@@ -4,6 +4,17 @@

### Features Added

- Added built-in OpenTelemetry tracing support for voice sessions following GenAI Semantic Conventions:
  - `VoiceLiveClientBuilder.openTelemetry(OpenTelemetry)` method for providing a custom OpenTelemetry instance
  - Defaults to `GlobalOpenTelemetry.getOrNoop()` for automatic Java agent detection with zero-cost no-op fallback
  - Emits spans for `connect`, `send`, `recv`, and `close` operations with Python-aligned VoiceLive telemetry semantics
  - Session-level counters: turn count, interruption count, audio bytes sent/received, first token latency, MCP call/list-tools counts
  - Tracks response and item hierarchy IDs (`response_id`, `conversation_id`, `item_id`, `call_id`, `previous_item_id`, `output_index`) on send/recv spans
  - Captures agent/session config attributes on connect spans (`gen_ai.agent.*`, `gen_ai.system_instructions`, `gen_ai.request.*`)
  - Adds OpenTelemetry metrics (`gen_ai.client.operation.duration`, `gen_ai.client.token.usage`) with provider/server/model dimensions
  - Content recording controlled via `enableContentRecording(boolean)` or `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` (with legacy `AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED` fallback)
- Added `TelemetrySample.java` demonstrating OpenTelemetry integration patterns

### Breaking Changes

### Bugs Fixed
193 changes: 192 additions & 1 deletion sdk/voicelive/azure-ai-voicelive/README.md
@@ -126,6 +126,9 @@ The following sections provide code snippets for common scenarios:
* [Handle event types](#handle-event-types)
* [Voice configuration](#voice-configuration)
* [Function calling](#function-calling)
* [MCP tool integration](#mcp-tool-integration)
* [Azure AI Foundry agent session](#azure-ai-foundry-agent-session)
* [Telemetry and tracing](#telemetry-and-tracing)
* [Complete voice assistant with microphone](#complete-voice-assistant-with-microphone)

### Focused Sample Files
@@ -166,6 +169,29 @@ For easier learning, explore these focused samples in order:
- Execute functions locally and return results
- Continue conversation with function results

7. **telemetry/ExplicitTracingSample.java** - Explicit OpenTelemetry tracing
   - Explicit OpenTelemetry instance via builder
   - Content recording with `--enable-recording` flag
   - Custom console span exporter
   - Azure Monitor integration example

8. **telemetry/GlobalTracingSample.java** - Automatic tracing via GlobalOpenTelemetry
   - Zero builder configuration: uses `buildAndRegisterGlobal()`
   - Same span output as explicit tracing

9. **MCPSample.java** - Model Context Protocol (MCP) tool integration
   - Configure MCP servers for external tool access
   - Handle MCP call events and tool execution
   - Handle MCP approval requests for tool calls
   - Process MCP call results and continue conversations

10. **AgentV2Sample.java** - Azure AI Foundry agent session
    - Connect directly to an Azure AI Foundry agent via AgentSessionConfig
    - Real-time audio capture and playback
    - Sequence number based audio for interrupt handling
    - Azure noise suppression and echo cancellation
    - Conversation logging to file

> **Note:** To run audio samples (AudioPlaybackSample, MicrophoneInputSample, VoiceAssistantSample, FunctionCallingSample):
> ```bash
> mvn exec:java -Dexec.mainClass=com.azure.ai.voicelive.FunctionCallingSample -Dexec.classpathScope=test
@@ -397,6 +423,171 @@ client.startSession("gpt-4o-realtime-preview")
* Results are sent back to continue the conversation
* See `FunctionCallingSample.java` for a complete working example

### MCP tool integration

Use [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) servers to give the AI access to external tools during a voice session. The service calls the MCP server directly — your code only needs to handle approval requests when required:

```java com.azure.ai.voicelive.mcp
// Configure MCP servers as tools
MCPServer mcpServer = new MCPServer("deepwiki", "https://mcp.deepwiki.com/mcp")
    .setRequireApproval(BinaryData.fromObject(MCPApprovalType.ALWAYS));

VoiceLiveSessionOptions options = new VoiceLiveSessionOptions()
    .setTools(Arrays.asList(mcpServer))
    .setInstructions("You have access to external tools via MCP. Use them when asked.");

// Handle MCP approval requests in your event loop
session.receiveEvents().subscribe(event -> {
    if (event instanceof SessionUpdateResponseOutputItemDone) {
        SessionUpdateResponseOutputItemDone itemDone = (SessionUpdateResponseOutputItemDone) event;
        SessionResponseItem item = itemDone.getItem();

        if (item instanceof ResponseMCPApprovalRequestItem) {
            // Approve the tool call
            ResponseMCPApprovalRequestItem approvalRequest = (ResponseMCPApprovalRequestItem) item;
            MCPApprovalResponseRequestItem approval = new MCPApprovalResponseRequestItem(
                approvalRequest.getId(), true);
            ClientEventConversationItemCreate createItem = new ClientEventConversationItemCreate()
                .setItem(approval);
            session.sendEvent(createItem).subscribe();
            session.sendEvent(new ClientEventResponseCreate()).subscribe();
        }
    }
});
```
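
The snippet above approves every request unconditionally. One possible refinement is an allowlist, sketched below with only the JDK. `McpApprovalPolicy` is a hypothetical helper, not part of the SDK, and it assumes the server label and tool name can be read from the approval request (the accessor names are not shown in this README, so none are used here). The tool name `search_docs` is likewise only a placeholder.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not part of the SDK): decides whether to approve an MCP tool call. */
final class McpApprovalPolicy {
    // Allowed tool names per MCP server label; anything not listed is rejected.
    private final Map<String, Set<String>> allowedToolsByServer;

    McpApprovalPolicy(Map<String, Set<String>> allowedToolsByServer) {
        this.allowedToolsByServer = Map.copyOf(allowedToolsByServer);
    }

    /** Returns true only when the server is known and the tool is on its allowlist. */
    boolean shouldApprove(String serverLabel, String toolName) {
        if (serverLabel == null || toolName == null) {
            return false;
        }
        Set<String> allowed = allowedToolsByServer.get(serverLabel);
        return allowed != null && allowed.contains(toolName);
    }
}
```

The boolean returned by `shouldApprove` would take the place of the hard-coded `true` passed to `MCPApprovalResponseRequestItem`.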

> See `MCPSample.java` for a complete working example with MCP server configuration.

### Azure AI Foundry agent session

Connect directly to an Azure AI Foundry agent using `AgentSessionConfig`. The agent becomes the primary responder for the voice session:

```java com.azure.ai.voicelive.agentsession
// Configure agent connection
AgentSessionConfig agentConfig = new AgentSessionConfig("my-agent", "my-project")
    .setAgentVersion("1.0");

// Start session with agent config (uses DefaultAzureCredential)
VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
    .endpoint(endpoint)
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildAsyncClient();

client.startSession(agentConfig)
    .flatMap(session -> {
        session.receiveEvents().subscribe(event -> handleEvent(event));
        return Mono.just(session);
    })
    .block();
```

> See `AgentV2Sample.java` for a full implementation with audio capture, playback, and conversation logging.

### Telemetry and tracing

The SDK has built-in [OpenTelemetry](https://opentelemetry.io/) tracing that emits a span for each WebSocket operation (`connect`, `send`, `recv`, and `close`). When no OpenTelemetry SDK is present, all tracing calls fall back to no-op implementations, so the overhead is negligible.

#### Automatic tracing (recommended)

When `.openTelemetry()` is not called on the builder, the SDK defaults to `GlobalOpenTelemetry.getOrNoop()`.
Tracing is then active whenever a global OpenTelemetry instance exists (for example, via the
[OpenTelemetry Java agent](https://opentelemetry.io/docs/languages/java/automatic/) or
`OpenTelemetrySdk.builder().buildAndRegisterGlobal()`) and falls back to a no-op otherwise:

```java com.azure.ai.voicelive.tracing.automatic
// No special configuration needed — tracing is picked up from GlobalOpenTelemetry
VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
    .endpoint(endpoint)
    .credential(new AzureKeyCredential(apiKey))
    .buildAsyncClient();
```

#### Explicit OpenTelemetry instance

Pass your own `OpenTelemetry` instance directly to the builder for full control. This is useful
when you want different clients to use different tracer configurations:

```java com.azure.ai.voicelive.tracing.explicit
VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
    .endpoint(endpoint)
    .credential(new AzureKeyCredential(apiKey))
    .openTelemetry(otel)
    .buildAsyncClient();
```

#### Span structure

When tracing is active, the following span hierarchy is emitted for each voice session:

```
connect gpt-4o-realtime-preview ← session lifetime span
├── send session.update ← one span per sent event
├── send input_audio_buffer.append
├── send response.create
├── recv session.created ← one span per received event
├── recv session.updated
├── recv response.audio.delta
├── recv response.done ← includes token usage attributes
├── recv rate_limits.updated ← rate limit info
└── close
```
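
The naming pattern in the hierarchy above (an operation name followed by the model or the event type) can be sketched as a small helper. `VoiceSpanNames` is hypothetical and only mirrors the names shown here; it is not the SDK's tracer. Falling back to a bare `connect` when no model is known is an assumption.

```java
/** Hypothetical helper mirroring the span names shown in the hierarchy above. */
final class VoiceSpanNames {
    private VoiceSpanNames() {
    }

    /** Session lifetime span, e.g. "connect gpt-4o-realtime-preview". */
    static String connect(String model) {
        // Assumption: with no model available, the bare operation name is used.
        return (model == null || model.isEmpty()) ? "connect" : "connect " + model;
    }

    /** One span per event sent by the client, e.g. "send session.update". */
    static String send(String eventType) {
        return "send " + eventType;
    }

    /** One span per event received from the service, e.g. "recv response.done". */
    static String recv(String eventType) {
        return "recv " + eventType;
    }

    /** Span emitted when the session is closed. */
    static String close() {
        return "close";
    }
}
```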

**Common attributes** (on all spans): `gen_ai.system`, `gen_ai.operation.name`, `gen_ai.provider.name`, `gen_ai.request.model`, `az.namespace`, `server.address`, `server.port`

**Session-level attributes** (on the connect span, flushed at session close):
- `gen_ai.voice.session_id` — Voice session ID
- `gen_ai.voice.input_audio_format` / `gen_ai.voice.output_audio_format` — Audio formats (e.g., `pcm16`)
- `gen_ai.voice.input_sample_rate` — Input audio sampling rate (Hz)
- `gen_ai.voice.turn_count` — Completed response turns
- `gen_ai.voice.interruption_count` — User interruptions
- `gen_ai.voice.audio_bytes_sent` / `gen_ai.voice.audio_bytes_received` — Audio payload bytes
- `gen_ai.voice.first_token_latency_ms` — Time to first audio response
- `gen_ai.conversation.id` — Conversation ID
- `gen_ai.response.id` / `gen_ai.response.finish_reasons` — Last response metadata
- `gen_ai.system_instructions` / `gen_ai.request.temperature` / `gen_ai.request.max_output_tokens` / `gen_ai.request.tools` — Session config from `session.update`
- `gen_ai.agent.name` / `gen_ai.agent.id` / `gen_ai.agent.version` / `gen_ai.agent.project_name` / `gen_ai.agent.thread_id` — Agent metadata (when using `AgentSessionConfig`)
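
To illustrate how counters like these can be accumulated during a session and written out once at close, here is a minimal JDK-only sketch. `SessionCounters` and its method names are hypothetical and do not describe the SDK's internal tracer. Measuring first-token latency from the moment a response is requested is an assumption; the README only describes it as the time to the first audio response.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical accumulator for session-level counters; not the SDK's internal tracer. */
final class SessionCounters {
    private long turnCount;
    private long interruptionCount;
    private long audioBytesSent;
    private long audioBytesReceived;
    private Long requestNanos;             // set when a response is requested
    private long firstTokenLatencyMs = -1; // stays -1 until the first audio chunk arrives

    synchronized void onResponseRequested(long nowNanos) {
        requestNanos = nowNanos;
    }

    synchronized void onAudioSent(int byteCount) {
        audioBytesSent += byteCount;
    }

    synchronized void onAudioReceived(int byteCount, long nowNanos) {
        audioBytesReceived += byteCount;
        // Assumption: latency runs from the response request to the first audio chunk.
        if (firstTokenLatencyMs < 0 && requestNanos != null) {
            firstTokenLatencyMs = (nowNanos - requestNanos) / 1_000_000L;
        }
    }

    synchronized void onResponseDone() {
        turnCount++;
    }

    synchronized void onInterruption() {
        interruptionCount++;
    }

    /** Snapshot of the counters, keyed by the attribute names listed above. */
    synchronized Map<String, Long> toAttributes() {
        Map<String, Long> attributes = new LinkedHashMap<>();
        attributes.put("gen_ai.voice.turn_count", turnCount);
        attributes.put("gen_ai.voice.interruption_count", interruptionCount);
        attributes.put("gen_ai.voice.audio_bytes_sent", audioBytesSent);
        attributes.put("gen_ai.voice.audio_bytes_received", audioBytesReceived);
        if (firstTokenLatencyMs >= 0) {
            attributes.put("gen_ai.voice.first_token_latency_ms", firstTokenLatencyMs);
        }
        return attributes;
    }
}
```

A caller would pass `System.nanoTime()` as the timestamp and copy the returned map onto the connect span when the session closes.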

#### Content recording

By default, message payloads are not recorded in spans for privacy. Enable content recording via the builder or environment variable:

```java com.azure.ai.voicelive.tracing.contentrecording
// Enable content recording to capture full JSON payloads in span events
VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
    .endpoint(endpoint)
    .credential(new AzureKeyCredential(apiKey))
    .openTelemetry(otel)
    .enableContentRecording(true)
    .buildAsyncClient();

// Or via environment variables (no code changes needed):
// OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true
// (legacy fallback) AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED=true
```
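
The interaction between the builder flag and the two environment variables can be sketched as follows. This is an illustration under stated assumptions, not the SDK's implementation: `ContentRecordingResolver` is a hypothetical name, and the precedence shown (explicit builder value first, then the OTel variable, then the legacy variable, otherwise disabled) is an assumption based on the word "fallback" above.

```java
import java.util.Map;

/** Hypothetical resolver for the content-recording setting; precedence order is an assumption. */
final class ContentRecordingResolver {
    static final String OTEL_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT";
    static final String LEGACY_VAR = "AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED";

    private ContentRecordingResolver() {
    }

    /**
     * @param builderValue value passed to the builder, or null when it was never set
     * @param env environment variables, e.g. System.getenv()
     */
    static boolean resolve(Boolean builderValue, Map<String, String> env) {
        if (builderValue != null) {
            return builderValue; // an explicit builder setting wins (assumed)
        }
        String otel = env.get(OTEL_VAR);
        if (otel != null && !otel.trim().isEmpty()) {
            return Boolean.parseBoolean(otel.trim());
        }
        String legacy = env.get(LEGACY_VAR);
        // Recording stays off unless one of the variables is set to "true" (case-insensitive).
        return legacy != null && Boolean.parseBoolean(legacy.trim());
    }
}
```

For example, `ContentRecordingResolver.resolve(null, System.getenv())` would reflect only the environment, keeping the privacy-preserving default when neither variable is set.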

> See `telemetry/ExplicitTracingSample.java` and `telemetry/GlobalTracingSample.java` for complete tracing examples.
>
> **Run the telemetry samples** to see tracing in action:
> ```bash
> # Explicit tracing (prints span names and attributes):
> mvn exec:java -Dexec.mainClass="com.azure.ai.voicelive.telemetry.ExplicitTracingSample" -Dexec.classpathScope=test -Dexec.args="--enable-tracing"
>
> # Explicit tracing + content recording (also prints full JSON payloads):
> mvn exec:java -Dexec.mainClass="com.azure.ai.voicelive.telemetry.ExplicitTracingSample" -Dexec.classpathScope=test -Dexec.args="--enable-tracing --enable-recording"
>
> # Automatic tracing via GlobalOpenTelemetry:
> mvn exec:java -Dexec.mainClass="com.azure.ai.voicelive.telemetry.GlobalTracingSample" -Dexec.classpathScope=test
> ```
>
> Sample output with `--enable-tracing`:
> ```
> 'send session.update' : {gen_ai.operation.name=send, gen_ai.voice.event_type=session.update, ...}
> 'recv session.created' : {gen_ai.operation.name=recv, gen_ai.voice.event_type=session.created, ...}
> 'recv response.done' : {gen_ai.usage.input_tokens=100, gen_ai.usage.output_tokens=50, ...}
> 'close' : {gen_ai.operation.name=close, ...}
> 'connect gpt-4o-realtime-preview' : {gen_ai.voice.session_id=..., gen_ai.voice.turn_count=1, ...}
> ```

### Complete voice assistant with microphone

A full example demonstrating real-time microphone input and audio playback:
@@ -442,7 +633,7 @@ client.startSession("gpt-4o-realtime-preview")
// Subscribe to receive server events
session.receiveEvents()
.subscribe(
-event -> handleEvent(event, session),
+event -> handleEvent(event),
error -> System.err.println("Error: " + error.getMessage())
);

14 changes: 14 additions & 0 deletions sdk/voicelive/azure-ai-voicelive/checkstyle-suppressions.xml
@@ -3,6 +3,20 @@
<!-- This file is generated by the /eng/scripts/linting_suppression_generator.py script. -->

<suppressions>
<!-- OpenTelemetry tracing imports -->
<suppress files="com.azure.ai.voicelive.VoiceLiveTracer.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.VoiceLiveAsyncClient.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.VoiceLiveClientBuilder.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.VoiceLiveClientBuilder.java" checks="io.clientcore.linting.extensions.checkstyle.checks.ExternalDependencyExposedCheck" />
<suppress files="com.azure.ai.voicelive.VoiceLiveSessionAsyncClient.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.VoiceLiveTracerTest.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.VoiceLiveClientBuilderTest.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.ReadmeSamples.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.telemetry.ExplicitTracingSample.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.telemetry.GlobalTracingSample.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.VoiceAssistantSample.java" checks="IllegalImportCheck" />
<suppress files="com.azure.ai.voicelive.telemetry.VoiceLiveTelemetryAttributeKeys.java" checks="IllegalImportCheck" />

<suppress files="com.azure.ai.voicelive.models.AssistantMessageItem.java" checks="io.clientcore.linting.extensions.checkstyle.checks.EnforceFinalFieldsCheck" />
<suppress files="com.azure.ai.voicelive.models.MessageItem.java" checks="io.clientcore.linting.extensions.checkstyle.checks.EnforceFinalFieldsCheck" />
<suppress files="com.azure.ai.voicelive.models.SystemMessageItem.java" checks="io.clientcore.linting.extensions.checkstyle.checks.EnforceFinalFieldsCheck" />
42 changes: 42 additions & 0 deletions sdk/voicelive/azure-ai-voicelive/pom.xml
@@ -56,6 +56,11 @@ Code generated by Microsoft (R) TypeSpec Code Generator.
<artifactId>azure-core-http-netty</artifactId>
<version>1.16.3</version> <!-- {x-version-update;com.azure:azure-core-http-netty;dependency} -->
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-api</artifactId>
<version>1.58.0</version> <!-- {x-version-update;io.opentelemetry:opentelemetry-api;external_dependency} -->
</dependency>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-core-test</artifactId>
@@ -82,5 +87,42 @@ Code generated by Microsoft (R) TypeSpec Code Generator.
<version>3.7.11</version> <!-- {x-version-update;io.projectreactor:reactor-test;external_dependency} -->
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-sdk</artifactId>
<version>1.58.0</version> <!-- {x-version-update;io.opentelemetry:opentelemetry-sdk;external_dependency} -->
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-sdk-testing</artifactId>
<version>1.58.0</version> <!-- {x-version-update;io.opentelemetry:opentelemetry-sdk-testing;external_dependency} -->
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-exporter-logging</artifactId>
<version>1.58.0</version> <!-- {x-version-update;io.opentelemetry:opentelemetry-exporter-logging;external_dependency} -->
<scope>test</scope>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-enforcer-plugin</artifactId>
<version>3.6.1</version> <!-- {x-version-update;org.apache.maven.plugins:maven-enforcer-plugin;external_dependency} -->
<configuration>
<rules>
<bannedDependencies>
<includes>
<include>io.opentelemetry:opentelemetry-api:[1.58.0]</include> <!-- {x-include-update;io.opentelemetry:opentelemetry-api;external_dependency} -->
</includes>
</bannedDependencies>
</rules>
</configuration>
</plugin>
</plugins>
</build>
</project>