Commit 7b6cf62

Merge pull request #36 from AgoraIO/release/v2.2.0
Release/v2.2.0
2 parents 55c2f7e + 474f1b2 commit 7b6cf62

28 files changed

Lines changed: 20388 additions & 195 deletions

.fern/replay.lock

Lines changed: 17989 additions & 2 deletions
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# Python AgentKit Snake Case API Audit

Scope: `agora-agents-python` public AgentKit wrappers, docs, and tests.

Search terms:

```bash
rg -n "apiKey|baseUrl|modelId|voiceId|groupId|keyTerm|turnDetection|inputAudioTranscription|greetingMessage|failureMessage|projectId|adcCredentialsString|sampleRate|targetLanguageCode|resourceName|deploymentName" agora-agents-python
```

## Result

No shipped camelCase public Python constructor kwargs were found in source or docs examples. No deprecated alias helper is required for this pass.

| File | Class / symbol | Public arg or example | Current spelling | Desired Python spelling | `to_config()` key | Wire key | Action | Compatibility needed | Test coverage |
|---|---|---|---|---|---|---|---|---|---|
| `src/agora_agent/agentkit/vendors/tts.py` | `GoogleTTS` | constructor arg | `voice_name` | `voice_name` | `params.VoiceSelectionParams` | `params.VoiceSelectionParams` | keep | no | `tests/custom/test_tts_vendors.py` |
| `src/agora_agent/agentkit/vendors/tts.py` | `RimeTTS` | constructor arg | `model_id` | `model_id` | `params.modelId` | `params.modelId` | keep | no | `tests/custom/test_tts_vendors.py` |
| `src/agora_agent/agentkit/vendors/tts.py` | `MurfTTS` | constructor arg | `voice_id` | `voice_id` | `params.voiceId` | `params.voiceId` | keep | no | `tests/custom/test_tts_vendors.py`, `tests/custom/test_request_body.py` |
| `src/agora_agent/types/rime_tts_params.py` | generated model | generated alias | `modelId` | n/a | `model_id` | `modelId` | keep | no | `tests/custom/test_tts_vendors.py` |
| `src/agora_agent/types/murf_tts_params.py` | generated model | generated alias | `voiceId` | n/a | `voice_id` | `voiceId` | keep | no | `tests/custom/test_tts_vendors.py` |
| `tests/custom/test_request_body.py` | wire assertion | payload key | `voiceId` | n/a | `params.voiceId` | `params.voiceId` | keep | no | request-body test |
| `tests/custom/test_tts_vendors.py` | wire assertion | payload key | `modelId`, `voiceId`, `VoiceSelectionParams` | n/a | generated model fields | wire aliases | keep | no | wire serialization test |

## Guardrail Added

`tests/custom/test_docs_snake_case.py` scans Python markdown code fences and fails on common camelCase kwargs such as `apiKey`, `baseUrl`, `modelId`, `voiceId`, `projectId`, and `greetingMessage`. JSON, TypeScript, Go, shell, and YAML examples are skipped so wire payload examples can retain required non-Python keys.
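
The contents of `tests/custom/test_docs_snake_case.py` are not shown in this diff, so the following is only a minimal sketch of how such a guard could work, not the shipped test. The names `CAMEL_CASE_KWARGS`, `iter_python_fences`, and `find_camel_case_kwargs` are hypothetical; the kwarg list is the subset named in the paragraph above.

```python
import re
from typing import Iterator, List, Tuple

# Hypothetical subset of camelCase kwargs to reject (taken from the audit text).
CAMEL_CASE_KWARGS = ("apiKey", "baseUrl", "modelId", "voiceId", "projectId", "greetingMessage")

# Build the fence marker programmatically so this example can itself live in markdown.
FENCE = "`" * 3
FENCE_RE = re.compile(r"^" + re.escape(FENCE) + r"(\w*)\s*$")
# Match only keyword-argument usage such as `modelId=...`, not arbitrary mentions.
KWARG_RE = re.compile(r"\b(" + "|".join(CAMEL_CASE_KWARGS) + r")\s*=")


def iter_python_fences(markdown: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for lines inside Python code fences only."""
    in_fence = False
    is_python = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE_RE.match(line.strip())
        if match:
            if not in_fence:
                # Opening fence: remember whether its info string marks Python.
                in_fence = True
                is_python = match.group(1).lower() in {"python", "py"}
            else:
                # Closing fence.
                in_fence = False
                is_python = False
            continue
        if in_fence and is_python:
            yield number, line


def find_camel_case_kwargs(markdown: str) -> List[Tuple[int, str]]:
    """Return (line_number, kwarg) pairs for camelCase kwargs in Python fences."""
    hits: List[Tuple[int, str]] = []
    for number, line in iter_python_fences(markdown):
        for match in KWARG_RE.finditer(line):
            hits.append((number, match.group(1)))
    return hits
```

A test built on this sketch would assert that `find_camel_case_kwargs` returns an empty list for every markdown file under the docs tree. Fences tagged as JSON, TypeScript, Go, shell, or YAML are never yielded, so wire-key examples pass through untouched.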

README.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ pip install agora-agents
2020
## Quick Start
2121

2222
Start with the `Agent` builder: create a client with app credentials, choose your ASR, LLM, and TTS providers, then start a session. Omit vendor API keys for supported Agora-managed models, or provide keys when you want BYOK.
23-
Set Agora interaction language with `turn_detection.language`; provider-specific STT language values remain under `asr.params`.
23+
Set Agora interaction language with `turn_detection.language`; provider-specific STT language values remain under `asr.params`. Ares uses only the REST `asr.language` value sourced from `turn_detection.language`.
2424

2525
```python
2626
import os

changelog.md

Lines changed: 22 additions & 2 deletions
@@ -4,6 +4,26 @@ All notable changes to this project will be documented in this file.
44

55
The format is based on [Keep a Changelog](https://keepachangelog.com/).
66

7+
## [v2.2.0] — 2026-06-05
8+
9+
### Added
10+
11+
- **Expanded provider surface** — Added generated API support for the latest Conversational AI vendors and configuration types, including Dify LLM and Generic Avatar.
12+
- **Interaction language handling** — AgentKit now consistently derives REST `asr.language` from `turn_detection.language` while keeping provider-specific STT language values under `asr.params`.
13+
- **Deepgram keyterm** — Added `keyterm` support on `DeepgramSTT`, serialized as `asr.params.keyterm`.
14+
15+
### Changed
16+
17+
- **MiniMax managed presets** — MiniMax preset-backed TTS now keeps the preset model as an internal hint while sending only supported partial TTS settings such as `voice_setting.voice_id`.
18+
- **Vertex AI LLM routing** — `VertexAILLM` now keeps project and location in the generated endpoint URL instead of duplicating them in `llm.params`.
19+
20+
### Fixed
21+
22+
- **Provider wire keys** — Corrected alias-sensitive TTS payloads so Google TTS emits `VoiceSelectionParams` and `AudioConfig`, Rime TTS emits `modelId`, and Murf TTS preserves `voiceId`.
23+
- **AgentKit request validation** — Start request validation now de-aliases REST-shaped provider dictionaries before constructing generated request models, while still allowing preset and pipeline-backed partial configs.
24+
- **Request body coverage** — Added regression tests for BYOK, preset-backed, mixed preset/BYOK, and pipeline override request shapes across provider configurations.
25+
- **Python docs examples** — Added a docs guard to keep Python examples on snake_case kwargs while allowing documented JSON wire keys.
26+
727
## [v2.1.0] — 2026-06-02
828

929
### Added
@@ -21,7 +41,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/).
2141
### Fixed
2242

2343
- **Managed-provider validation** — AgentKit validation now distinguishes preset-backed providers from BYOK providers so required provider fields are only required when credentials are caller-supplied.
24-
- **Language placement** — Provider-specific STT language values remain under `asr.params`, while Agora interaction language is emitted separately as `turn_detection.language`.
44+
- **Language placement** — Provider-specific STT language values remain under `asr.params`; the REST `asr.language` field is populated from `turn_detection.language`.
2545

2646
## [v2.0.0] — 2026-05-21
2747

@@ -114,7 +134,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/).
114134

115135
### Fixed
116136

117-
- **`AresSTT`** — Removed redundant `language` key from the `params` dict. Language is now emitted only at the top level. `params` is only included when `additional_params` is provided.
137+
- **`AresSTT`** — Removed redundant `language` key from the `params` dict. Ares only selects the provider; AgentKit populates REST `asr.language` from `turn_detection.language`. `params` is only included when `additional_params` is provided.
118138
- **`OpenAIRealtime` / `VertexAI` (MLLM)** — Agent-level `greeting` and `failure_message` defaults are now correctly applied when missing in MLLM mode. Previously these values were silently dropped.
119139
- **`VertexAI` (MLLM)** — `messages` is emitted at the MLLM top level, matching the generated core SDK contract.
120140
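
The language handling described in the v2.2.0 entries can be illustrated with a small standalone sketch. It mirrors the shape of `_resolve_asr_config` from this commit, but `resolve_asr_config` here is a hypothetical free function, and folding the `en-US` default (the value the vendor docs in this commit state) into it is a simplifying assumption; in the SDK the default is resolved as part of the turn detection config.

```python
from typing import Any, Dict, Optional

# Default interaction language as stated in the vendor docs of this commit.
DEFAULT_TURN_DETECTION_LANGUAGE = "en-US"


def resolve_asr_config(
    stt: Optional[Dict[str, Any]],
    turn_detection_language: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the REST `asr` block (hypothetical helper, not the SDK method).

    `asr.language` always mirrors the interaction language, while any
    provider-specific values stay untouched under `asr.params`.
    """
    asr = dict(stt or {})
    if not asr:
        # With no STT vendor configured, Ares is selected.
        asr["vendor"] = "ares"
    if turn_detection_language is None:
        turn_detection_language = DEFAULT_TURN_DETECTION_LANGUAGE
    asr["language"] = turn_detection_language
    return asr


# A Deepgram-style config: provider language and keyterm remain under params.
deepgram = {"vendor": "deepgram", "params": {"language": "fr", "keyterm": "Agora"}}
print(resolve_asr_config(deepgram, "fr-FR"))
print(resolve_asr_config(None))
```

The provider dictionary passed in is copied at the top level rather than mutated, so the same vendor config can be reused across agents with different interaction languages.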

compat/agora-agent-server-sdk/pyproject.toml

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@ name = "agora-agent-server-sdk"
33

44
[tool.poetry]
55
name = "agora-agent-server-sdk"
6-
version = "v2.1.1"
6+
version = "v2.2.0"
77
description = "Compatibility shim for the renamed agora-agents package."
88
readme = "README.md"
99
authors = []
@@ -35,7 +35,7 @@ Repository = 'https://github.com/AgoraIO-Conversational-AI/agent-server-sdk-pyth
3535

3636
[tool.poetry.dependencies]
3737
python = "^3.8"
38-
agora-agents = ">=2.1.1,<3.0.0"
38+
agora-agents = ">=2.2.0,<3.0.0"
3939

4040
[build-system]
4141
requires = ["poetry-core"]

docs/concepts/vendors.md

Lines changed: 2 additions & 2 deletions
@@ -75,12 +75,12 @@ tts = ElevenLabsTTS(
7575

7676
Used with `agent.with_stt()`.
7777

78-
Use `turn_detection.language` for Agora interaction language; it defaults to `en-US`. STT vendor `language` options are serialized under `asr.params` using each provider's own format.
78+
Use `turn_detection.language` for Agora interaction language; it defaults to `en-US`. STT vendor `language` options are serialized under `asr.params` using each provider's own format. Ares does not take a provider language option; AgentKit uses `turn_detection.language` for REST `asr.language`.
7979

8080
| Class | Provider | Required Parameters |
8181
|---|---|---|
8282
| `SpeechmaticsSTT` | Speechmatics | `api_key`, `language` |
83-
| `DeepgramSTT` | Deepgram | `model` for Agora-managed `nova-2`/`nova-3`; `api_key` for BYOK |
83+
| `DeepgramSTT` | Deepgram | `model` for Agora-managed `nova-2`/`nova-3`; `api_key` for BYOK; `language?`, `keyterm?` |
8484
| `MicrosoftSTT` | Microsoft Azure | `key`, `region`, `language` |
8585
| `OpenAISTT` | OpenAI | `api_key` |
8686
| `GoogleSTT` | Google Cloud | `project_id`, `location`, `adc_credentials_string`, `language` |

docs/reference/vendors.md

Lines changed: 2 additions & 2 deletions
@@ -318,7 +318,7 @@ The SDK also includes named helpers for the remaining Agora-supported LLM provid
318318

319319
## STT Vendors
320320

321-
Use `turn_detection.language` for Agora interaction language; it defaults to `en-US`. Provider-specific language values remain under `asr.params` and may use a different format.
321+
Use `turn_detection.language` for Agora interaction language; it defaults to `en-US`. Provider-specific language values remain under `asr.params` and may use a different format. AgentKit populates REST `asr.language` from `turn_detection.language`.
322322

323323
### `SpeechmaticsSTT`
324324

@@ -336,6 +336,7 @@ Use `turn_detection.language` for Agora interaction language; it defaults to `en
336336
| `api_key` | `str` | BYOK only | `None` | Deepgram API key. Optional only for Agora-managed `nova-2` and `nova-3`. |
337337
| `model` | `str` | No | `None` | Model (e.g., `nova-2`) |
338338
| `language` | `str` | No | `None` | Language code (e.g., `en-US`) |
339+
| `keyterm` | `str` | No | `None` | Boost specialized terms and brands; serialized as `asr.params.keyterm` |
339340
| `smart_format` | `bool` | No | `None` | Enable smart formatting |
340341
| `punctuation` | `bool` | No | `None` | Enable punctuation |
341342
| `additional_params` | `Dict[str, Any]` | No | `None` | Additional parameters |
@@ -396,7 +397,6 @@ For `nova-2` and `nova-3`, omit `api_key` to use Agora-managed credentials. For
396397

397398
| Parameter | Type | Required | Default | Description |
398399
|---|---|---|---|---|
399-
| `language` | `str` | No | `None` | Language code |
400400
| `additional_params` | `Dict[str, Any]` | No | `None` | Additional parameters |
401401

402402
### `SarvamSTT`

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ name = "agora-agents"
33

44
[tool.poetry]
55
name = "agora-agents"
6-
version = "v2.1.1"
6+
version = "v2.2.0"
77
description = ""
88
readme = "README.md"
99
authors = []

src/agora_agent/agentkit/agent.py

Lines changed: 22 additions & 18 deletions
@@ -76,6 +76,7 @@
7676
from ..agent_management.types.agent_think_agent_management_response import (
7777
AgentThinkAgentManagementResponse,
7878
)
79+
from ..core.pydantic_utilities import parse_obj_as
7980
from .vendors.base import BaseAvatar, BaseLLM, BaseMLLM, BaseSTT, BaseTTS
8081

8182
# Top-level aliases
@@ -188,6 +189,13 @@ class SessionOptions(typing_extensions.TypedDict, total=False):
188189
debug: bool
189190
warn: typing.Callable[[str], None]
190191

192+
193+
def _start_properties_from_mapping(
194+
properties: typing.Mapping[str, typing.Any],
195+
) -> StartAgentsRequestProperties:
196+
return parse_obj_as(StartAgentsRequestProperties, dict(properties))
197+
198+
191199
# LLM sub-type aliases
192200
LlmGreetingConfigs = typing.Dict[str, typing.Any]
193201
LlmGreetingConfigsMode = typing.Any
@@ -298,7 +306,7 @@ def _is_turn_detection_language(value: typing.Any) -> bool:
298306

299307
def _validate_turn_detection_language(value: typing.Any) -> TurnDetectionLanguage:
300308
if not _is_turn_detection_language(value):
301-
raise ValueError(f"Invalid interaction language: {value}")
309+
raise ValueError(f"Invalid turn_detection.language: {value}")
302310
return value # type: ignore[return-value]
303311

304312

@@ -896,7 +904,7 @@ def to_properties(
896904
if self._failure_message is not None:
897905
mllm_config.setdefault("failure_message", self._failure_message)
898906
base_kwargs["mllm"] = mllm_config
899-
return StartAgentsRequestProperties(**base_kwargs)
907+
return _start_properties_from_mapping(base_kwargs)
900908

901909
if skip_vendor_validation:
902910
warnings.warn(
@@ -919,12 +927,13 @@ def to_properties(
919927
allow_missing_llm = "llm" in allow_missing_categories
920928
allow_missing_tts = "tts" in allow_missing_categories
921929

930+
turn_detection_config = self._resolve_turn_detection_config()
922931
if not skip_asr_validation and (self._stt is not None or not allow_missing_asr):
923-
base_kwargs["asr"] = self._resolve_asr_config()
924-
base_kwargs["turn_detection"] = self._resolve_turn_detection_config()
932+
base_kwargs["asr"] = self._resolve_asr_config(turn_detection_config)
933+
base_kwargs["turn_detection"] = turn_detection_config
925934

926935
if skip_vendor_validation:
927-
return StartAgentsRequestProperties(**base_kwargs)
936+
return _start_properties_from_mapping(base_kwargs)
928937

929938
if self._tts is None and not (skip_tts_validation or allow_missing_tts):
930939
raise ValueError("TTS configuration is required. Use with_tts() to set it.")
@@ -937,39 +946,34 @@ def to_properties(
937946
if self._tts is not None and not skip_tts_validation:
938947
base_kwargs["tts"] = self._tts
939948

940-
return StartAgentsRequestProperties(**base_kwargs)
949+
return _start_properties_from_mapping(base_kwargs)
941950

942951
def _resolve_llm_config(self) -> typing.Dict[str, typing.Any]:
943952
llm_config = dict(self._llm or {})
944-
# Agent-level fields take priority over the vendor's defaults.
945-
# This matches the TS SDK where agent-level values override vendor config.
946-
if self._instructions is not None:
953+
if self._instructions is not None and "system_messages" not in llm_config:
947954
llm_config["system_messages"] = [{"role": "system", "content": self._instructions}]
948-
if self._greeting is not None:
955+
if self._greeting is not None and "greeting_message" not in llm_config:
949956
llm_config["greeting_message"] = self._greeting
950-
if self._greeting_configs is not None:
957+
if self._greeting_configs is not None and "greeting_configs" not in llm_config:
951958
llm_config["greeting_configs"] = _dump_optional_model(self._greeting_configs)
952-
if self._failure_message is not None:
959+
if self._failure_message is not None and "failure_message" not in llm_config:
953960
llm_config["failure_message"] = self._failure_message
954-
if self._max_history is not None:
961+
if self._max_history is not None and "max_history" not in llm_config:
955962
llm_config["max_history"] = self._max_history
956963
return llm_config
957964

958-
def _resolve_asr_config(self) -> typing.Dict[str, typing.Any]:
965+
def _resolve_asr_config(self, turn_detection_config: TurnDetectionConfig) -> typing.Dict[str, typing.Any]:
959966
asr_config = dict(self._stt or {})
960-
asr_config.pop("language", None)
961967
if not asr_config:
962968
asr_config["vendor"] = "ares"
969+
asr_config["language"] = self._field_value(turn_detection_config, "language")
963970
return asr_config
964971

965972
def _resolve_turn_detection_config(self) -> TurnDetectionConfig:
966-
existing_stt_language = self._stt.get("language") if self._stt is not None else None
967973
existing_turn_detection_language = self._field_value(self._turn_detection, "language")
968974
language = (
969975
existing_turn_detection_language
970976
if existing_turn_detection_language is not None
971-
else existing_stt_language
972-
if _is_turn_detection_language(existing_stt_language)
973977
else DEFAULT_TURN_DETECTION_LANGUAGE
974978
)
975979
language = _validate_turn_detection_language(language)
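
The `_resolve_llm_config` change in this file reverses the earlier precedence: agent-level values are now applied only when the vendor's LLM config does not already carry the key. A minimal sketch of that fill-if-missing merge follows; `merge_agent_defaults` is a hypothetical name and `greeting_configs` is omitted for brevity.

```python
from typing import Any, Dict, Optional


def merge_agent_defaults(
    llm: Optional[Dict[str, Any]],
    *,
    instructions: Optional[str] = None,
    greeting: Optional[str] = None,
    failure_message: Optional[str] = None,
    max_history: Optional[int] = None,
) -> Dict[str, Any]:
    """Fill agent-level defaults into an LLM config without overriding it."""
    merged = dict(llm or {})
    defaults: Dict[str, Any] = {
        "system_messages": (
            None
            if instructions is None
            else [{"role": "system", "content": instructions}]
        ),
        "greeting_message": greeting,
        "failure_message": failure_message,
        "max_history": max_history,
    }
    for key, value in defaults.items():
        # Only fill keys the vendor config left unset.
        if value is not None and key not in merged:
            merged[key] = value
    return merged


vendor_llm = {"greeting_message": "Hi from the vendor config"}
print(merge_agent_defaults(vendor_llm, greeting="Hi from the agent", max_history=10))
```

In this sketch the vendor's `greeting_message` survives and only `max_history` is added, which is the behavior the `"..." not in llm_config` guards introduce.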

src/agora_agent/agentkit/agent_session.py

Lines changed: 58 additions & 15 deletions
@@ -15,15 +15,15 @@
1515
AgentThinkAgentManagementResponse as AgentThinkResponse,
1616
)
1717
from ..agents.types.get_turns_agents_response import GetTurnsAgentsResponse
18-
from ..agents.types.start_agents_request_properties import StartAgentsRequestProperties
19-
from .agent import Agent, GetTurnsOptions, SayOptions, ThinkOptions
18+
from .agent import Agent, GetTurnsOptions, SayOptions, ThinkOptions, _start_properties_from_mapping
2019
from .avatar_types import (
2120
is_akool_avatar,
2221
is_anam_avatar,
2322
is_avatar_token_managed,
2423
is_generic_avatar,
2524
is_heygen_avatar,
2625
is_live_avatar_avatar,
26+
is_rtc_avatar,
2727
validate_avatar_config,
2828
validate_tts_sample_rate,
2929
)
@@ -333,22 +333,63 @@ def _build_start_properties(
333333
properties["tts"] = self._dump_model(self._agent.tts)
334334
if self._agent.llm is not None:
335335
llm = dict(self._agent.llm)
336-
if self._agent.instructions is not None:
336+
if self._agent.instructions is not None and "system_messages" not in llm:
337337
llm["system_messages"] = [{"role": "system", "content": self._agent.instructions}]
338-
if self._agent.greeting is not None:
338+
if self._agent.greeting is not None and "greeting_message" not in llm:
339339
llm["greeting_message"] = self._agent.greeting
340-
if self._agent.greeting_configs is not None:
340+
if self._agent.greeting_configs is not None and "greeting_configs" not in llm:
341341
llm["greeting_configs"] = self._dump_model(self._agent.greeting_configs)
342-
if self._agent.failure_message is not None:
342+
if self._agent.failure_message is not None and "failure_message" not in llm:
343343
llm["failure_message"] = self._agent.failure_message
344-
if self._agent.max_history is not None:
344+
if self._agent.max_history is not None and "max_history" not in llm:
345345
llm["max_history"] = self._agent.max_history
346346
properties["llm"] = llm
347347
if self._agent.stt is not None:
348348
properties["asr"] = self._dump_model(self._agent.stt)
349349

350350
return properties
351351

352+
@staticmethod
353+
def _request_properties_for_start(
354+
resolved_properties: typing.Dict[str, typing.Any],
355+
*,
356+
resolved_preset: typing.Optional[str],
357+
pipeline_id: typing.Optional[str],
358+
) -> typing.Any:
359+
try:
360+
return _start_properties_from_mapping(resolved_properties)
361+
except Exception as exc:
362+
if pipeline_id:
363+
return resolved_properties
364+
if resolved_preset:
365+
normalized_preset = normalize_preset_input(resolved_preset)
366+
if not normalized_preset:
367+
raise
368+
preset_categories = {
369+
category
370+
for item in normalized_preset.split(",")
371+
for category in [get_preset_category(item)]
372+
if category is not None
373+
}
374+
error_categories = _AgentSessionBase._validation_error_categories(exc)
375+
if error_categories and error_categories.issubset(preset_categories):
376+
return resolved_properties
377+
raise
378+
379+
@staticmethod
380+
def _validation_error_categories(exc: Exception) -> typing.Set[str]:
381+
errors = getattr(exc, "errors", None)
382+
if not callable(errors):
383+
return set()
384+
categories: typing.Set[str] = set()
385+
for error in errors():
386+
loc = error.get("loc") if isinstance(error, dict) else None
387+
if isinstance(loc, tuple) and loc:
388+
field = loc[0]
389+
if field in {"asr", "llm", "tts"}:
390+
categories.add(typing.cast(str, field))
391+
return categories
392+
352393
def _vendor_validation_categories(
353394
self,
354395
pipeline_id: typing.Optional[str],
@@ -513,10 +554,11 @@ def start(self) -> str:
513554
"properties": resolved_properties,
514555
})
515556

516-
try:
517-
request_properties: typing.Any = StartAgentsRequestProperties(**resolved_properties)
518-
except Exception:
519-
request_properties = resolved_properties
557+
request_properties = self._request_properties_for_start(
558+
resolved_properties,
559+
resolved_preset=resolved_preset,
560+
pipeline_id=pipeline_id,
561+
)
520562

521563
response = self._client.agents.start(
522564
self._app_id,
@@ -840,10 +882,11 @@ async def start(self) -> str:
840882
"properties": resolved_properties,
841883
})
842884

843-
try:
844-
request_properties: typing.Any = StartAgentsRequestProperties(**resolved_properties)
845-
except Exception:
846-
request_properties = resolved_properties
885+
request_properties = self._request_properties_for_start(
886+
resolved_properties,
887+
resolved_preset=resolved_preset,
888+
pipeline_id=pipeline_id,
889+
)
847890

848891
response = await self._client.agents.start(
849892
self._app_id,
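
The new start-request fallback replaces the old blanket `except Exception` with a narrower rule: a validation failure is tolerated only for pipeline-backed requests, or when every failing category (`asr`, `llm`, `tts`) is covered by a preset. The sketch below reproduces that decision with stand-ins. `FakeValidationError`, the `validate` callback, and the precomputed `preset_categories` argument are assumptions made to keep the example self-contained; the SDK derives preset categories from the preset string instead.

```python
from typing import Any, Callable, Dict, List, Optional, Set


class FakeValidationError(Exception):
    """Hypothetical stand-in for a pydantic-style error exposing `.errors()`."""

    def __init__(self, errors: List[Dict[str, Any]]) -> None:
        super().__init__("validation failed")
        self._errors = errors

    def errors(self) -> List[Dict[str, Any]]:
        return self._errors


def validation_error_categories(exc: Exception) -> Set[str]:
    """Collect the top-level categories (asr/llm/tts) named in error locations."""
    errors = getattr(exc, "errors", None)
    if not callable(errors):
        return set()
    categories: Set[str] = set()
    for error in errors():
        loc = error.get("loc") if isinstance(error, dict) else None
        if isinstance(loc, tuple) and loc and loc[0] in {"asr", "llm", "tts"}:
            categories.add(loc[0])
    return categories


def request_properties_for_start(
    properties: Dict[str, Any],
    validate: Callable[[Dict[str, Any]], Any],
    *,
    preset_categories: Set[str],
    pipeline_id: Optional[str] = None,
) -> Any:
    """Return the validated model, or the raw dict when a partial config is allowed."""
    try:
        return validate(properties)
    except Exception as exc:
        if pipeline_id:
            # Pipeline-backed requests may legitimately send partial configs.
            return properties
        failed = validation_error_categories(exc)
        if failed and failed.issubset(preset_categories):
            # Every failing category is supplied by a preset.
            return properties
        raise
```

With this shape, a missing TTS field passes when a TTS preset is active but still raises when only an LLM preset is set, so genuine BYOK mistakes are no longer silently forwarded to the API.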
