
Commit 9825fdc

xitzhang and Xiting Zhang authored
[VoiceLive] Release GA version of 1.2.0 (#46960)
* [VoiceLive] Release GA version of 1.2.0
* Update API version to 2026-04-10 and rename AvatarConfig.type to avatar_type
  - Updated default API version from 2026-01-01-preview to 2026-04-10 in aio/_patch.py
  - Renamed AvatarConfig.type field to avatar_type to avoid conflict with Python built-in
  - Updated documentation string and docstring references
  - Updated all sample code (async_mcp_sample.py) with new API version
  - Updated test parametrization (test_live_realtime_service.py) with new API version
  - Updated CHANGELOG.md with breaking change note
  - Updated apiview-properties.json metadata

  304 unit tests pass, no new test failures introduced.
* Update Foundry connect kwargs API
* fix change log
* add patch for legacy audio format
* fix pylint
* update docs
* update release date
* Update VoiceLive samples

---------

Co-authored-by: Xiting Zhang <xitzhang@microsoft.com>
1 parent 2e19283 commit 9825fdc

20 files changed

Lines changed: 810 additions & 527 deletions

sdk/voicelive/azure-ai-voicelive/CHANGELOG.md

Lines changed: 11 additions & 6 deletions
@@ -1,6 +1,6 @@
 # Release History

-## 1.2.0 (Unreleased)
+## 1.2.0 (2026-05-22)

 ### Features Added

@@ -34,7 +34,7 @@
 - Response & function call ID tracking for end-to-end tracing
 - Agent v2 telemetry with agent identity and configuration tracking
 - MCP telemetry with tool call and approval flow tracking
-- **Agent Session Configuration**: Added `AgentSessionConfig` for configuring Azure AI Foundry agents
+- **Agent Session Configuration**: Added flattened `connect()` keyword arguments for configuring Azure AI Foundry agents
 at connection time with `agent_name`, `project_name`, `agent_version`, `conversation_id`, and more
 - **Transcription Improvements**:
 - Added `TranscriptionPhrase` and `TranscriptionWord` models for detailed transcription data
@@ -56,7 +56,12 @@
 ### Breaking Changes

 - Removed Foundry Agent Tool classes (`FoundryAgentTool`, `ResponseFoundryAgentCallItem`, etc.) —
-use `AgentSessionConfig` with `connect()` instead
+use flattened Azure AI Foundry keyword arguments with `connect()` instead
+- **Audio Format Values**: Changed `OutputAudioFormat` enum values to use underscore format
+(`pcm16_8000hz`, `pcm16_16000hz`) instead of the previous hyphenated values.
+This is a breaking change for code that compares, persists, or serializes the raw enum values.
+Legacy hyphenated values continue to deserialize for backward compatibility.
+- Renamed `AvatarConfig.type` field to `avatar_type` to avoid conflict with Python's built-in `type`

 ### Other Changes

@@ -100,7 +105,7 @@
 - **Agent v2 Telemetry**: Added agent identity and configuration tracking on the connect span:
 - `gen_ai.agent.id` and `gen_ai.agent.thread_id` extracted from `session.created`/`session.updated`
 server events.
-- `gen_ai.agent.version` and `gen_ai.agent.project_name` from `AgentSessionConfig` at connect time.
+- `gen_ai.agent.version` and `gen_ai.agent.project_name` from Azure AI Foundry `connect()` keyword arguments at connect time.
 - **MCP (Model Context Protocol) Telemetry**: Added tracking for MCP tool calls and approval flows:
 - Per-event: `gen_ai.voice.mcp.server_label`, `gen_ai.voice.mcp.tool_name`,
 `gen_ai.voice.mcp.approval_request_id`, `gen_ai.voice.mcp.approve` on recv/send spans.
@@ -116,7 +121,7 @@

 ### Features Added

-- **Agent Session Configuration**: Added `AgentSessionConfig` TypedDict for configuring Azure AI Foundry agents at connection time:
+- **Agent Session Configuration**: Added flattened `connect()` keyword arguments for configuring Azure AI Foundry agents at connection time:
 - `agent_name`: The name of the agent (required)
 - `project_name`: The Foundry project containing the agent (required)
 - `agent_version`: Optional version specification
@@ -129,7 +134,7 @@
 ### Breaking Changes

 - **Removed Foundry Agent Tools**: The following classes and enums related to Foundry agent tools have been removed:
-- `FoundryAgentTool` - Use `AgentSessionConfig` with `connect()` instead
+- `FoundryAgentTool` - Use flattened Azure AI Foundry keyword arguments with `connect()` instead
 - `ResponseFoundryAgentCallItem`
 - `FoundryAgentContextType` enum
 - `ToolType.FOUNDRY_AGENT` enum value
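
For callers migrating off `AgentSessionConfig`, one possible adaptation is sketched below. It relies on the removed TypedDict having used the same key names that `connect()` now accepts as keyword arguments (per the `aio/_patch.py` diff in this commit), so an existing config mapping can be unpacked with `**`. The helper name `flatten_agent_config` and the sample values are hypothetical and not part of the SDK.

```python
from typing import Any, Mapping

# Foundry keyword names taken from the connect() signature in this commit.
FOUNDRY_KEYWORDS = frozenset(
    {
        "agent_name",
        "project_name",
        "agent_version",
        "conversation_id",
        "authentication_identity_client_id",
        "foundry_resource_override",
    }
)


def flatten_agent_config(agent_config: Mapping[str, str]) -> dict[str, Any]:
    """Hypothetical helper: turn a legacy agent_config mapping into connect() kwargs."""
    unexpected = set(agent_config) - FOUNDRY_KEYWORDS
    if unexpected:
        raise KeyError(f"Not a Foundry connect() keyword: {sorted(unexpected)}")
    return dict(agent_config)


# Before this commit: connect(..., agent_config={"agent_name": ..., "project_name": ...})
legacy = {"agent_name": "my-agent", "project_name": "my-project"}  # placeholder values
foundry_kwargs = flatten_agent_config(legacy)
# After this commit:  connect(credential=..., endpoint=..., **foundry_kwargs)
```

Rejecting unknown keys up front is a design choice of this sketch, intended to surface typos before they reach the service.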

sdk/voicelive/azure-ai-voicelive/apiview-properties.json

Lines changed: 1 addition & 1 deletion
@@ -195,5 +195,5 @@
     "azure.ai.voicelive.models.RequestImageContentPartDetail": "VoiceLive.RequestImageContentPartDetail",
     "azure.ai.voicelive.models.ServerEventType": "VoiceLive.ServerEventType"
   },
-  "CrossLanguageVersion": "86299c665983"
+  "CrossLanguageVersion": "4f7c08a38aa5"
 }

sdk/voicelive/azure-ai-voicelive/azure/ai/voicelive/aio/_patch.py

Lines changed: 118 additions & 54 deletions
@@ -14,7 +14,7 @@
 import logging
 from contextlib import AbstractAsyncContextManager
 from urllib.parse import urlparse, urlunparse, urlencode, parse_qs
-from typing import Any, Mapping, Optional, Union, AsyncIterator, cast
+from typing import Any, Mapping, Optional, Sequence, Union, AsyncIterator, cast, overload

 # === Third-party ===
 from typing_extensions import TypedDict
@@ -42,19 +42,18 @@
 from azure.core.credentials import AzureKeyCredential
 from azure.core.credentials_async import AsyncTokenCredential
 from azure.core.exceptions import AzureError
+from azure.ai.voicelive.models import ClientEvent, ServerEvent, RequestSession

 # === Local ===
-from ..models import ClientEvent, ServerEvent, RequestSession

 if sys.version_info >= (3, 11):
-    from typing import NotRequired, Required  # noqa: F401
+    from typing import NotRequired  # noqa: F401
 else:
-    from typing_extensions import NotRequired, Required  # noqa: F401
+    from typing_extensions import NotRequired  # noqa: F401

 __all__: list[str] = [
     "connect",
     "WebsocketConnectionOptions",
-    "AgentSessionConfig",
     "VoiceLiveConnection",
     "SessionResource",
     "ResponseResource",
@@ -94,33 +93,31 @@ def _json_default(o: Any) -> Any:
     raise TypeError(f"{type(o).__name__} is not JSON serializable")


-class AgentSessionConfig(TypedDict, total=False):
-    """
-    Configuration for agent session connection.
-
-    This TypedDict defines the parameters needed to connect to a Voice Live
-    session using an Azure AI Foundry agent.
-
-    :keyword agent_name: The name of the agent to use. Required.
-    :type agent_name: str
-    :keyword project_name: The name of the Foundry project containing the agent. Required.
-    :type project_name: str
-    :keyword agent_version: The version of the agent to use.
-    :type agent_version: str
-    :keyword conversation_id: The ID of an existing conversation to continue.
-    :type conversation_id: str
-    :keyword authentication_identity_client_id: The client ID for authentication identity.
-    :type authentication_identity_client_id: str
-    :keyword foundry_resource_override: An optional override for the Foundry resource.
-    :type foundry_resource_override: str
-    """
-
-    agent_name: Required[str]
-    project_name: Required[str]
-    agent_version: NotRequired[str]
-    conversation_id: NotRequired[str]
-    authentication_identity_client_id: NotRequired[str]
-    foundry_resource_override: NotRequired[str]
+def _build_foundry_agent_config(
+    *,
+    agent_name: Optional[str],
+    project_name: Optional[str],
+    agent_version: Optional[str],
+    conversation_id: Optional[str],
+    authentication_identity_client_id: Optional[str],
+    foundry_resource_override: Optional[str],
+) -> Optional[dict[str, str]]:
+    agent_config = {
+        "agent_name": agent_name,
+        "project_name": project_name,
+        "agent_version": agent_version,
+        "conversation_id": conversation_id,
+        "authentication_identity_client_id": authentication_identity_client_id,
+        "foundry_resource_override": foundry_resource_override,
+    }
+
+    if not any(value is not None for value in agent_config.values()):
+        return None
+
+    if agent_name is None or project_name is None:
+        raise ValueError("Both 'agent_name' and 'project_name' are required when connecting to an Azure AI Foundry agent.")
+
+    return {key: value for key, value in agent_config.items() if value is not None}


 class ConnectionError(AzureError):
@@ -673,21 +670,21 @@ def __init__(
         *,
         credential: Union["AzureKeyCredential", "AsyncTokenCredential"],
         endpoint: str,
-        api_version: str = "2026-01-01-preview",
+        api_version: str = "2026-04-10",
         model: Optional[str] = None,
-        agent_config: Optional[AgentSessionConfig] = None,
+        agent_config: Optional[Mapping[str, str]] = None,
         extra_query: Mapping[str, Any],
         extra_headers: Mapping[str, Any],
         connection_options: Optional[WebsocketConnectionOptions] = None,
-        **kwargs: Any,
+        credential_scopes: Optional[Union[str, Sequence[str]]] = None,
     ) -> None:
         self._credential = credential
         self._endpoint = endpoint
-        raw_scopes = kwargs.pop("credential_scopes", ["https://ai.azure.com/.default"])
+        raw_scopes = credential_scopes if credential_scopes is not None else ["https://ai.azure.com/.default"]
         self.__credential_scopes = [raw_scopes] if isinstance(raw_scopes, str) else list(raw_scopes)
         self.__api_version = api_version
         self.__model = model
-        self.__agent_config = agent_config
+        self.__agent_config = dict(agent_config) if agent_config is not None else None

         self.__connection: Optional["VoiceLiveConnection"] = None
         self.__extra_query = extra_query
@@ -869,16 +866,58 @@ async def __aexit__(self, exc_type, exc, exc_tb) -> None:
             await self.__connection.close()


+@overload
+def connect(
+    *,
+    credential: Union[AzureKeyCredential, AsyncTokenCredential],
+    endpoint: str,
+    api_version: str = "2026-04-10",
+    model: Optional[str] = None,
+    query: Optional[Mapping[str, Any]] = None,
+    headers: Optional[Mapping[str, Any]] = None,
+    connection_options: Optional[WebsocketConnectionOptions] = None,
+    credential_scopes: Optional[Union[str, Sequence[str]]] = None,
+) -> AbstractAsyncContextManager["VoiceLiveConnection"]:
+    ...
+
+
+@overload
+def connect(
+    *,
+    credential: Union[AzureKeyCredential, AsyncTokenCredential],
+    endpoint: str,
+    api_version: str = "2026-04-10",
+    model: Optional[str] = None,
+    agent_name: str,
+    project_name: str,
+    agent_version: Optional[str] = None,
+    conversation_id: Optional[str] = None,
+    authentication_identity_client_id: Optional[str] = None,
+    foundry_resource_override: Optional[str] = None,
+    query: Optional[Mapping[str, Any]] = None,
+    headers: Optional[Mapping[str, Any]] = None,
+    connection_options: Optional[WebsocketConnectionOptions] = None,
+    credential_scopes: Optional[Union[str, Sequence[str]]] = None,
+) -> AbstractAsyncContextManager["VoiceLiveConnection"]:
+    ...
+
+
 def connect(
     *,
     credential: Union[AzureKeyCredential, AsyncTokenCredential],
     endpoint: str,
-    api_version: str = "2026-01-01-preview",
+    api_version: str = "2026-04-10",
     model: Optional[str] = None,
-    agent_config: Optional[AgentSessionConfig] = None,
+    agent_name: Optional[str] = None,
+    project_name: Optional[str] = None,
+    agent_version: Optional[str] = None,
+    conversation_id: Optional[str] = None,
+    authentication_identity_client_id: Optional[str] = None,
+    foundry_resource_override: Optional[str] = None,
     query: Optional[Mapping[str, Any]] = None,
     headers: Optional[Mapping[str, Any]] = None,
     connection_options: Optional[WebsocketConnectionOptions] = None,
+    credential_scopes: Optional[Union[str, Sequence[str]]] = None,
     **kwargs: Any,
 ) -> AbstractAsyncContextManager["VoiceLiveConnection"]:
     """
@@ -890,33 +929,58 @@ def connect(
     - Establishes the connection and yields a :class:`~azure.ai.voicelive.aio.VoiceLiveConnection`.
     - Automatically cleans up the connection when the context exits.

+    Additional legacy keyword arguments are accepted for backward compatibility.
+    Unknown values are ignored.
+
     :keyword credential: The credential used to authenticate with the service.
-    :paramtype type credential: ~azure.core.credentials.AzureKeyCredential or ~azure.core.credentials.AsyncTokenCredential
+    :paramtype credential: ~azure.core.credentials.AzureKeyCredential or ~azure.core.credentials.AsyncTokenCredential
     :keyword endpoint: Service endpoint, e.g., ``https://<region>.api.cognitive.microsoft.com``.
-    :paramtype type endpoint: str
-    :keyword api_version: The API version to use. Defaults to ``"2026-01-01-preview"``.
-    :paramtype type api_version: str
+    :paramtype endpoint: str
+    :keyword api_version: The API version to use. Defaults to ``"2026-04-10"``.
+    :paramtype api_version: str
     :keyword model: Model identifier to use for the session.
         In most scenarios, this parameter is required.
         It may be omitted only when connecting through an **Agent** scenario,
         in which case the service will use the model associated with the Agent.
     :paramtype model: str
-    :keyword agent_config: Optional agent session configuration for connecting to an Azure AI
-        Foundry agent. When provided, the connection will be established with the specified agent
-        and the ``model`` parameter may be omitted.
-    :paramtype agent_config: ~azure.ai.voicelive.aio.AgentSessionConfig
+    :keyword agent_name: Optional Azure AI Foundry agent name. When set, ``project_name`` is also required.
+    :paramtype agent_name: str
+    :keyword project_name: Azure AI Foundry project name. Required when ``agent_name`` is provided.
+    :paramtype project_name: str
+    :keyword agent_version: Optional Azure AI Foundry agent version.
+    :paramtype agent_version: str
+    :keyword conversation_id: Optional Azure AI Foundry conversation ID to continue.
+    :paramtype conversation_id: str
+    :keyword authentication_identity_client_id: Optional client ID used for Foundry authentication identity.
+    :paramtype authentication_identity_client_id: str
+    :keyword foundry_resource_override: Optional override for the Azure AI Foundry resource.
+    :paramtype foundry_resource_override: str
     :keyword query: Optional query parameters to include in the WebSocket URL.
-    :paramtype type query: Mapping[str, Any]
+    :paramtype query: Mapping[str, Any]
     :keyword headers: Optional HTTP headers to include in the WebSocket handshake.
-    :paramtype type headers: Mapping[str, Any]
+    :paramtype headers: Mapping[str, Any]
     :keyword connection_options: Optional advanced WebSocket options compatible with :mod:`aiohttp`.
-    :paramtype type connection_options: ~azure.ai.voicelive.aio.WebsocketConnectionOptions
+    :paramtype connection_options: ~azure.ai.voicelive.aio.WebsocketConnectionOptions
+    :keyword credential_scopes: Optional scope override for token-based authentication.
+    :paramtype credential_scopes: str | Sequence[str]
     :return: An async context manager yielding a connected :class:`~azure.ai.voicelive.aio.VoiceLiveConnection`.
     :rtype: collections.abc.AsyncContextManager[~azure.ai.voicelive.aio.VoiceLiveConnection]
-
-    .. note::
-        Additional keyword arguments can be passed and will be forwarded to the underlying connection.
+    :raises ValueError: If only one of ``agent_name`` or ``project_name`` is provided.
     """
+    credential_scopes = cast(
+        Optional[Union[str, Sequence[str]]],
+        kwargs.pop("credential_scopes", credential_scopes),
+    )
+
+    agent_config = _build_foundry_agent_config(
+        agent_name=agent_name,
+        project_name=project_name,
+        agent_version=agent_version,
+        conversation_id=conversation_id,
+        authentication_identity_client_id=authentication_identity_client_id,
+        foundry_resource_override=foundry_resource_override,
+    )
+
     return _VoiceLiveConnectionManager(
         credential=credential,
         endpoint=endpoint,
@@ -926,7 +990,7 @@ def connect(
         extra_query=query or {},
         extra_headers=headers or {},
         connection_options=connection_options or {},
-        **kwargs,
+        credential_scopes=credential_scopes,
     )

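
The all-or-nothing validation that `connect()` delegates to `_build_foundry_agent_config` can be exercised in isolation. The sketch below copies that helper's logic from the diff into a standalone function so the three outcomes are visible without the SDK installed. The name `build_agent_config` and the `**optional_fields` shorthand are illustrative simplifications, not the SDK's signature.

```python
from typing import Optional


def build_agent_config(
    *,
    agent_name: Optional[str] = None,
    project_name: Optional[str] = None,
    **optional_fields: Optional[str],
) -> Optional[dict[str, str]]:
    """Standalone copy of the diff's validation logic, for illustration only."""
    candidates = {"agent_name": agent_name, "project_name": project_name, **optional_fields}

    # No Foundry argument supplied: treat as a plain (non-agent) connection.
    if all(value is None for value in candidates.values()):
        return None

    # Any Foundry argument supplied: both identifying names become mandatory.
    if agent_name is None or project_name is None:
        raise ValueError("Both 'agent_name' and 'project_name' are required.")

    # Drop unset optionals so only real values are forwarded.
    return {key: value for key, value in candidates.items() if value is not None}


print(build_agent_config())
print(build_agent_config(agent_name="my-agent", project_name="my-project", agent_version=None))
try:
    build_agent_config(agent_version="2")
except ValueError as exc:
    print(f"rejected: {exc}")
```

As the last call shows, the logic in the diff also raises when only an optional field such as `agent_version` is supplied, which is slightly broader than the `:raises ValueError:` wording in the `connect()` docstring.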

sdk/voicelive/azure-ai-voicelive/azure/ai/voicelive/models/_enums.py

Lines changed: 14 additions & 0 deletions
@@ -245,6 +245,12 @@ class OpenAIVoiceName(str, Enum, metaclass=CaseInsensitiveEnumMeta):
     """Cedar voice."""


+_LEGACY_OUTPUT_AUDIO_FORMAT_VALUES = {
+    "pcm16-8000hz": "pcm16_8000hz",
+    "pcm16-16000hz": "pcm16_16000hz",
+}
+
+
 class OutputAudioFormat(str, Enum, metaclass=CaseInsensitiveEnumMeta):
     """Output audio format types supported."""

@@ -259,6 +265,14 @@ class OutputAudioFormat(str, Enum, metaclass=CaseInsensitiveEnumMeta):
     G711_ALAW = "g711_alaw"
     """G.711 A-law audio format at 8kHz sampling rate."""

+    @classmethod
+    def _missing_(cls, value):
+        if isinstance(value, str):
+            current_value = _LEGACY_OUTPUT_AUDIO_FORMAT_VALUES.get(value.lower())
+            if current_value is not None:
+                return cls(current_value)
+        return None
+

 class PersonalVoiceModels(str, Enum, metaclass=CaseInsensitiveEnumMeta):
     """PersonalVoice models."""

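The `_missing_` hook added to `OutputAudioFormat` is what keeps legacy hyphenated values deserializing. A minimal sketch of the same technique follows, using only the standard-library `Enum` (the SDK's `CaseInsensitiveEnumMeta` is omitted). The class name `DemoAudioFormat` and its member names are assumptions; the string values and the mapping come from the diff and changelog.

```python
from enum import Enum

# Same legacy-to-current mapping as in the diff.
_LEGACY_VALUES = {
    "pcm16-8000hz": "pcm16_8000hz",
    "pcm16-16000hz": "pcm16_16000hz",
}


class DemoAudioFormat(str, Enum):
    """Stand-in enum; member names are assumed, values come from the changelog."""

    PCM16_8000HZ = "pcm16_8000hz"
    PCM16_16000HZ = "pcm16_16000hz"

    @classmethod
    def _missing_(cls, value):
        # Enum calls this hook only after the normal value lookup has failed.
        if isinstance(value, str):
            current = _LEGACY_VALUES.get(value.lower())
            if current is not None:
                return cls(current)
        return None  # Enum then raises ValueError for genuinely unknown values


legacy = DemoAudioFormat("pcm16-8000hz")  # old hyphenated spelling still resolves
print(legacy is DemoAudioFormat.PCM16_8000HZ)
print(legacy == "pcm16-8000hz")  # raw comparison against the old string no longer matches
```

The second comparison illustrates why the changelog flags this as breaking for code that compares or persists raw values: lookup is backward compatible, but the member's own value is now the underscore form.
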
sdk/voicelive/azure-ai-voicelive/azure/ai/voicelive/models/_models.py

Lines changed: 5 additions & 5 deletions
@@ -512,8 +512,8 @@ def __init__(self, *args: Any, **kwargs: Any) -> None:
 class AvatarConfig(_Model):
     """Configuration for avatar streaming and behavior during the session.

-    :ivar type: Type of avatar to use. Known values are: "video-avatar" and "photo-avatar".
-    :vartype type: str or ~azure.ai.voicelive.models.AvatarConfigTypes
+    :ivar avatar_type: Type of avatar to use. Known values are: "video-avatar" and "photo-avatar".
+    :vartype avatar_type: str or ~azure.ai.voicelive.models.AvatarConfigTypes
     :ivar ice_servers: Optional list of ICE servers to use for WebRTC connection establishment.
     :vartype ice_servers: list[~azure.ai.voicelive.models.IceServer]
     :ivar character: The character name or ID used for the avatar. Required.
@@ -537,8 +537,8 @@ class AvatarConfig(_Model):
     :vartype output_audit_audio: bool
     """

-    type: Optional[Union[str, "_models.AvatarConfigTypes"]] = rest_field(
-        visibility=["read", "create", "update", "delete", "query"]
+    avatar_type: Optional[Union[str, "_models.AvatarConfigTypes"]] = rest_field(
+        name="type", visibility=["read", "create", "update", "delete", "query"]
     )
     """Type of avatar to use. Known values are: \"video-avatar\" and \"photo-avatar\"."""
     ice_servers: Optional[list["_models.IceServer"]] = rest_field(
@@ -575,7 +575,7 @@ def __init__(
         *,
         character: str,
         customized: bool,
-        type: Optional[Union[str, "_models.AvatarConfigTypes"]] = None,
+        avatar_type: Optional[Union[str, "_models.AvatarConfigTypes"]] = None,
         ice_servers: Optional[list["_models.IceServer"]] = None,
         style: Optional[str] = None,
         model: Optional[Union[str, "_models.PhotoAvatarBaseModes"]] = None,
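
The `AvatarConfig` rename changes only the Python attribute: the diff passes `name="type"` to `rest_field`, which reads as keeping the serialized key as `type`. The sketch below illustrates that attribute-versus-wire-name split with a plain dataclass. It does not use the SDK's `rest_field` or `_Model`, and `DemoAvatarConfig`, `to_wire`, and the `wire_name` metadata key are hypothetical stand-ins.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Optional


@dataclass
class DemoAvatarConfig:
    """Stand-in for AvatarConfig, using dataclass metadata instead of rest_field."""

    character: str
    customized: bool
    # Python-side name is avatar_type; the serialized key is assumed to stay "type".
    avatar_type: Optional[str] = field(default=None, metadata={"wire_name": "type"})

    def to_wire(self) -> dict[str, Any]:
        payload: dict[str, Any] = {}
        for f in fields(self):
            value = getattr(self, f.name)
            if value is not None:
                # Fall back to the attribute name when no wire name is declared.
                payload[f.metadata.get("wire_name", f.name)] = value
        return payload


config = DemoAvatarConfig(character="demo-character", customized=False, avatar_type="video-avatar")
print(config.to_wire())
```

Under that assumption, callers only need to rename the keyword argument and attribute access from `type` to `avatar_type`; payloads already stored or sent with a `type` key would be unaffected.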
