Commit cd1a749

Merge pull request #46 from seymourtang/feat/2.8-2.9
feat:add engine 2.8 & 2.9 features
2 parents 8b7aa4a + 99f82ab commit cd1a749

23 files changed

Lines changed: 713 additions & 13 deletions

docs/concepts/vendors.md

Lines changed: 5 additions & 0 deletions
@@ -66,8 +66,10 @@ Used with `agent.with_tts()`. Each TTS vendor produces audio at a specific sampl
 | `FishAudioTTS` | Fish Audio | `key`, `reference_id`, `backend` ||
 | `MurfTTS` | Murf | `key`, `voice_id`, `model` ||
 | `MiniMaxTTS` | MiniMax | `model` for supported Agora-managed global models; `key`, `group_id`, `model`, `voice_id`, `url` for BYOK ||
+| `GenericTTS` | Generic OpenAI-compatible TTS | `url`, `headers`, `model`, `voice` | Configurable |
 | `DeepgramTTS` | Deepgram | `api_key`, `model` | Configurable |
 | `SarvamTTS` | Sarvam | `api_key` ||
+| `XaiTTS` | xAI | `api_key`, `language` | Configurable |

 ### CN TTS Vendors

@@ -82,6 +84,7 @@ Used with `agent.with_tts()` when routing to `Area.CN`. Use `MiniMaxCNTTS` and `
 | `CosyVoiceTTS` | CosyVoice | `api_key`, `model`, `voice` ||
 | `BytedanceDuplexTTS` | ByteDance Duplex | `app_id`, `token`, `resource_id`, `speaker` ||
 | `StepFunTTS` | StepFun | `api_key`, `model`, `voice_id` ||
+| `GenericTTS` | Generic OpenAI-compatible TTS | `url`, `headers`, `model`, `voice` | Configurable |

 <!-- snippet: executable -->
 ```python
@@ -113,6 +116,7 @@ Use `turn_detection.language` for Agora interaction language; it defaults to `en
 | `AssemblyAISTT` | AssemblyAI | `api_key`, `language` |
 | `AresSTT` | Ares | — (all optional) |
 | `SarvamSTT` | Sarvam | `api_key`, `language` |
+| `XaiSTT` | xAI | `api_key` |

 ### CN STT Vendors

@@ -164,6 +168,7 @@ Used with `agent.with_avatar()` in the cascading ASR + LLM + TTS pipeline. Some
 | `AnamAvatar` | Anam | `api_key` | None |
 | `GenericAvatar` | Generic Avatar | `api_key`, `api_base_url`, `avatar_id`, `agora_uid` | None |
 | `SenseTimeAvatar` | SenseTime (CN) | `agora_uid`, `app_key`, `sceneList` | None |
+| `SpatiusAvatar` | Spatius (CN) | `spatius_api_key`, `spatius_app_id`, `spatius_avatar_id`, `agora_uid` | Optional avatar-declared sample rate |

 <!-- snippet: executable -->
 ```python

docs/guides/avatars.md

Lines changed: 50 additions & 1 deletion
@@ -18,12 +18,13 @@ Avatars are currently supported only with the cascading ASR + LLM + TTS pipeline
 | Anam | `AnamAvatar` | None |
 | Generic | `GenericAvatar` | None |
 | SenseTime (CN) | `SenseTimeAvatar` | None |
+| Spatius (CN) | `SpatiusAvatar` | Optional avatar-declared sample rate |

 ## Token Model

 The agent and avatar join the same RTC channel with separate UIDs. The agent token is scoped to `agent_uid`; `avatar.params.agora_token` is scoped to the avatar `agora_uid`.

-When using `AgentSession.start()`, `agora_token` is optional for LiveAvatar, HeyGen, Generic, and SenseTime avatars. If omitted, AgentKit generates it with the same ConvoAI token path as the agent, using the avatar UID. You can still pass `agora_token` explicitly.
+When using `AgentSession.start()`, `agora_token` is optional for LiveAvatar, HeyGen, Generic, SenseTime, and Spatius avatars. If omitted, AgentKit generates it with the same ConvoAI token path as the agent, using the avatar UID. You can still pass `agora_token` explicitly.

 ## Sample Rate Constraint

@@ -126,6 +127,39 @@ agent = (
 )
 ```

+## Spatius Avatar (CN)
+
+`SpatiusAvatar` is available for `Area.CN` sessions. Provide `spatius_api_key`, `spatius_app_id`, `spatius_avatar_id`, and `agora_uid` when constructing the avatar. `agora_token` is optional and is generated at session start when omitted, like SenseTime and Generic avatars.
+
+```python
+from agora_agent import Agora, Area, CNAgent, GenericTTS, SpatiusAvatar, TencentSTT
+
+client = Agora(
+    area=Area.CN,
+    app_id="your-app-id",
+    app_certificate="your-app-certificate",
+)
+
+agent = (
+    CNAgent(client=client)
+    .with_stt(TencentSTT(key="...", app_id="...", secret="...", engine_model_type="16k_zh", voice_id="..."))
+    .with_tts(GenericTTS(
+        url="https://tts.example.com/v1/audio",
+        headers={"Authorization": "Bearer token"},
+        model="tts-model",
+        voice="voice-1",
+    ))
+    .with_avatar(SpatiusAvatar(
+        spatius_api_key="your-spatius-api-key",
+        spatius_app_id="your-spatius-app-id",
+        spatius_avatar_id="your-spatius-avatar-id",
+        agora_uid="2",
+        region="cn-beijing",
+        sample_rate=16000,
+    ))
+)
+```
+
 ## Akool Avatar (16 kHz)

 Akool requires a TTS vendor configured at 16000 Hz:
@@ -246,3 +280,18 @@ If you call `with_avatar()` before `with_tts()`, the sample rate check is deferr
 | `appId` | `str` | No | SenseTime application ID |
 | `enable` | `bool` | No | Whether to enable the avatar |
 | `additional_params` | `Dict[str, Any]` | No | Additional SenseTime avatar parameters |
+
+## Spatius Options
+
+| Parameter | Type | Required | Description |
+|---|---|---|---|
+| `spatius_api_key` | `str` | Yes | Spatius API key |
+| `spatius_app_id` | `str` | Yes | Spatius application ID |
+| `spatius_avatar_id` | `str` | Yes | Spatius avatar ID |
+| `agora_uid` | `str` | Yes | Avatar publisher RTC UID |
+| `agora_token` | `str` | No | Avatar publisher RTC token; generated at session start when omitted |
+| `region` | `str` | No | Spatius service region, for example `cn-beijing` |
+| `sample_rate` | `int` | No | Optional avatar-declared sample rate. When set, TTS sample rate should match it. |
+| `session_expire_minutes` | `int` | No | Spatius session validity duration in minutes |
+| `enable` | `bool` | No | Whether to enable the avatar |
+| `additional_params` | `Dict[str, Any]` | No | Additional Spatius avatar parameters |

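The token model in `docs/guides/avatars.md` says that an omitted `agora_token` is generated at session start for the avatar's own `agora_uid`. The sketch below illustrates that defaulting rule only; it is not AgentKit's implementation. `fill_avatar_token` and the `make_token` callback are hypothetical names, and the avatar config is assumed to be a plain dict with a `params` mapping, as the `avatar.get("params", {})` call in `agent_session.py` suggests.

```python
from typing import Any, Callable, Dict


def fill_avatar_token(
    avatar: Dict[str, Any],
    make_token: Callable[[str], str],
) -> Dict[str, Any]:
    """Return a copy of `avatar` whose params carry an `agora_token`.

    An explicitly supplied token is kept; otherwise one is generated
    for the avatar's own `agora_uid` (not the agent's UID).
    """
    params = dict(avatar.get("params", {}))
    if not params.get("agora_token"):
        params["agora_token"] = make_token(params["agora_uid"])
    return {**avatar, "params": params}


# Hypothetical stand-in for the real token generator.
def fake_token(uid: str) -> str:
    return f"token-for-uid-{uid}"


generated = fill_avatar_token({"params": {"agora_uid": "2"}}, fake_token)
explicit = fill_avatar_token(
    {"params": {"agora_uid": "2", "agora_token": "my-token"}}, fake_token
)
print(generated["params"]["agora_token"])  # token-for-uid-2
print(explicit["params"]["agora_token"])   # my-token
```

The function returns a copy instead of mutating its input, so the caller's original avatar config stays unchanged.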
docs/guides/regional-routing.md

Lines changed: 2 additions & 2 deletions
@@ -39,8 +39,8 @@ Bind `client` into `Agent(client=client, ...)` and construct vendors directly wi

 | Client area | STT classes | LLM classes | TTS classes | Avatar classes |
 |---|---|---|---|---|
-| `Area.US`, `Area.EU`, `Area.AP` | `DeepgramSTT`, `SpeechmaticsSTT`, `MicrosoftSTT`, `OpenAISTT`, `GoogleSTT`, `AmazonSTT`, `AssemblyAISTT`, `AresSTT`, `SarvamSTT` | `OpenAI`, `AzureOpenAI`, `Anthropic`, `Gemini`, `Groq`, `VertexAILLM`, `AmazonBedrock`, `Dify`, `CustomLLM` | `ElevenLabsTTS`, `MicrosoftTTS`, `OpenAITTS`, `CartesiaTTS`, `GoogleTTS`, `AmazonTTS`, `DeepgramTTS`, `HumeAITTS`, `RimeTTS`, `FishAudioTTS`, `MiniMaxTTS`, `MurfTTS`, `SarvamTTS` | `LiveAvatarAvatar`, `HeyGenAvatar`, `AkoolAvatar`, `AnamAvatar`, `GenericAvatar` |
-| `Area.CN` | `FengmingSTT`, `TencentSTT`, `MicrosoftCNSTT`, `XfyunSTT`, `XfyunBigModelSTT`, `XfyunDialectSTT` | `AliyunLLM`, `BytedanceLLM`, `DeepSeekLLM`, `TencentLLM` | `MiniMaxCNTTS`, `TencentTTS`, `BytedanceTTS`, `MicrosoftCNTTS`, `CosyVoiceTTS`, `BytedanceDuplexTTS`, `StepFunTTS` | `SenseTimeAvatar` |
+| `Area.US`, `Area.EU`, `Area.AP` | `DeepgramSTT`, `SpeechmaticsSTT`, `MicrosoftSTT`, `OpenAISTT`, `GoogleSTT`, `AmazonSTT`, `AssemblyAISTT`, `AresSTT`, `SarvamSTT`, `XaiSTT` | `OpenAI`, `AzureOpenAI`, `Anthropic`, `Gemini`, `Groq`, `VertexAILLM`, `AmazonBedrock`, `Dify`, `CustomLLM` | `ElevenLabsTTS`, `MicrosoftTTS`, `OpenAITTS`, `CartesiaTTS`, `GoogleTTS`, `AmazonTTS`, `DeepgramTTS`, `HumeAITTS`, `RimeTTS`, `FishAudioTTS`, `MiniMaxTTS`, `MurfTTS`, `SarvamTTS`, `GenericTTS`, `XaiTTS` | `LiveAvatarAvatar`, `HeyGenAvatar`, `AkoolAvatar`, `AnamAvatar`, `GenericAvatar` |
+| `Area.CN` | `FengmingSTT`, `TencentSTT`, `MicrosoftCNSTT`, `XfyunSTT`, `XfyunBigModelSTT`, `XfyunDialectSTT` | `AliyunLLM`, `BytedanceLLM`, `DeepSeekLLM`, `TencentLLM` | `MiniMaxCNTTS`, `TencentTTS`, `BytedanceTTS`, `MicrosoftCNTTS`, `CosyVoiceTTS`, `BytedanceDuplexTTS`, `StepFunTTS`, `GenericTTS` | `SenseTimeAvatar`, `SpatiusAvatar` |

 Global client example:

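The updated routing table in `docs/guides/regional-routing.md` can be read as a lookup from client area to the avatar classes documented for it. The sketch below encodes only the Avatar column after this commit. `AVATARS_BY_AREA`, `area_group`, and `avatar_supported` are hypothetical helper names, and nothing here implies that the SDK enforces the table in this way.

```python
from typing import Dict, FrozenSet

# Avatar classes per table row, copied from the routing table after this commit.
AVATARS_BY_AREA: Dict[str, FrozenSet[str]] = {
    "global": frozenset({
        "LiveAvatarAvatar",
        "HeyGenAvatar",
        "AkoolAvatar",
        "AnamAvatar",
        "GenericAvatar",
    }),
    "cn": frozenset({"SenseTimeAvatar", "SpatiusAvatar"}),
}


def area_group(area: str) -> str:
    """Map an area name (US, EU, AP, CN) to its row in the table."""
    return "cn" if area.upper() == "CN" else "global"


def avatar_supported(area: str, avatar_class: str) -> bool:
    """Report whether the table lists `avatar_class` for `area`."""
    return avatar_class in AVATARS_BY_AREA[area_group(area)]


print(avatar_supported("CN", "SpatiusAvatar"))
print(avatar_supported("US", "SpatiusAvatar"))
```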
docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ The Agora Conversational AI Python SDK lets you build voice-powered AI agents on
 | [Vendors](./concepts/vendors.md) | Browse all LLM, TTS, STT, MLLM, and Avatar providers |
 | [Cascading Flow](./guides/cascading-flow.md) | Build an ASR -> LLM -> TTS pipeline |
 | [MLLM Flow](./guides/mllm-flow.md) | Use OpenAI Realtime, Gemini Live, Vertex AI, or xAI Grok for end-to-end audio |
-| [Avatars](./guides/avatars.md) | Add a digital avatar with LiveAvatar, Akool, Anam, or Generic Avatar |
+| [Avatars](./guides/avatars.md) | Add a digital avatar with LiveAvatar, Akool, Anam, Generic Avatar, SenseTime, or Spatius |
 | [Regional Routing](./guides/regional-routing.md) | Route requests to the nearest region |
 | [Error Handling](./guides/error-handling.md) | Handle API errors with ApiError |
 | [Pagination](./guides/pagination.md) | Iterate over paginated list endpoints |

docs/reference/vendors.md

Lines changed: 57 additions & 2 deletions
@@ -19,8 +19,8 @@ Construct vendors directly from `agora_agent`, then bind a client with `Agent(cl

 | Area | STT classes | LLM classes | TTS classes | Avatar classes |
 |---|---|---|---|---|
-| `Area.US`, `Area.EU`, `Area.AP` | `DeepgramSTT`, `SpeechmaticsSTT`, `MicrosoftSTT`, `OpenAISTT`, `GoogleSTT`, `AmazonSTT`, `AssemblyAISTT`, `AresSTT`, `SarvamSTT` | `OpenAI`, `AzureOpenAI`, `Anthropic`, `Gemini`, `Groq`, `VertexAILLM`, `AmazonBedrock`, `Dify`, `CustomLLM` | `ElevenLabsTTS`, `MicrosoftTTS`, `OpenAITTS`, `CartesiaTTS`, `GoogleTTS`, `AmazonTTS`, `DeepgramTTS`, `HumeAITTS`, `RimeTTS`, `FishAudioTTS`, `MiniMaxTTS`, `MurfTTS`, `SarvamTTS` | `LiveAvatarAvatar`, `HeyGenAvatar`, `AkoolAvatar`, `AnamAvatar`, `GenericAvatar` |
-| `Area.CN` | `FengmingSTT`, `TencentSTT`, `MicrosoftCNSTT`, `XfyunSTT`, `XfyunBigModelSTT`, `XfyunDialectSTT` | `AliyunLLM`, `BytedanceLLM`, `DeepSeekLLM`, `TencentLLM` | `MiniMaxCNTTS`, `TencentTTS`, `BytedanceTTS`, `MicrosoftCNTTS`, `CosyVoiceTTS`, `BytedanceDuplexTTS`, `StepFunTTS` | `SenseTimeAvatar` |
+| `Area.US`, `Area.EU`, `Area.AP` | `DeepgramSTT`, `SpeechmaticsSTT`, `MicrosoftSTT`, `OpenAISTT`, `GoogleSTT`, `AmazonSTT`, `AssemblyAISTT`, `AresSTT`, `SarvamSTT`, `XaiSTT` | `OpenAI`, `AzureOpenAI`, `Anthropic`, `Gemini`, `Groq`, `VertexAILLM`, `AmazonBedrock`, `Dify`, `CustomLLM` | `ElevenLabsTTS`, `MicrosoftTTS`, `OpenAITTS`, `CartesiaTTS`, `GoogleTTS`, `AmazonTTS`, `DeepgramTTS`, `HumeAITTS`, `RimeTTS`, `FishAudioTTS`, `MiniMaxTTS`, `MurfTTS`, `SarvamTTS`, `GenericTTS`, `XaiTTS` | `LiveAvatarAvatar`, `HeyGenAvatar`, `AkoolAvatar`, `AnamAvatar`, `GenericAvatar` |
+| `Area.CN` | `FengmingSTT`, `TencentSTT`, `MicrosoftCNSTT`, `XfyunSTT`, `XfyunBigModelSTT`, `XfyunDialectSTT` | `AliyunLLM`, `BytedanceLLM`, `DeepSeekLLM`, `TencentLLM` | `MiniMaxCNTTS`, `TencentTTS`, `BytedanceTTS`, `MicrosoftCNTTS`, `CosyVoiceTTS`, `BytedanceDuplexTTS`, `StepFunTTS`, `GenericTTS` | `SenseTimeAvatar`, `SpatiusAvatar` |

 Global example:

@@ -85,6 +85,7 @@ tts = MiniMaxCNTTS(
 | `max_tokens` | `int` | No | `None` | Maximum tokens to generate |
 | `system_messages` | `List[Dict]` | No | `None` | System messages |
 | `greeting_message` | `str` | No | `None` | Greeting message |
+| `greeting_audio_url` | `str` | No | `None` | Publicly accessible greeting audio URL |
 | `failure_message` | `str` | No | `None` | Failure message |
 | `input_modalities` | `List[str]` | No | `None` | Input modalities |
 | `output_modalities` | `List[str]` | No | `None` | Output modalities |
@@ -93,6 +94,8 @@ tts = MiniMaxCNTTS(
 | `greeting_configs` | `Dict[str, Any]` | No | `None` | Greeting playback configuration |
 | `template_variables` | `Dict[str, str]` | No | `None` | Template variables for messages |

+`greeting_configs` may also include `audio_download_timeout_ms`, `audio_pcm_sample_rate`, and `uninterruptible_asr_policy`.
+
 <!-- snippet: fragment -->
 ```python
 from agora_agent import OpenAI
@@ -378,6 +381,33 @@ The SDK also includes named helpers for the remaining Agora-supported LLM provid
 | `sample_rate` | `int` | No | `None` | Audio sample rate |
 | `skip_patterns` | `List[int]` | No | `None` | Skip patterns |

+### `GenericTTS`
+
+| Parameter | Type | Required | Default | Description |
+|---|---|---|---|---|
+| `url` | `str` | Yes || Callback address of the generic TTS service |
+| `headers` | `Dict[str, str]` | Yes || Custom headers to include in requests to the generic TTS service |
+| `model` | `str` | Yes || TTS model name |
+| `voice` | `str` | Yes || Voice name |
+| `api_key` | `str` | No | `None` | API key for the generic TTS service |
+| `speed` | `float` | No | `None` | Speech rate |
+| `sample_rate` | `int` | No | `None` | Output audio sample rate in Hz |
+| `response_format` | `str` | No | `None` | Output audio format; use `pcm` |
+| `instruction` | `str` | No | `None` | Additional voice style control instruction |
+| `additional_params` | `Dict[str, Any]` | No | `None` | Additional generic TTS parameters |
+| `skip_patterns` | `List[int]` | No | `None` | Skip patterns |
+
+### `XaiTTS`
+
+| Parameter | Type | Required | Default | Description |
+|---|---|---|---|---|
+| `api_key` | `str` | Yes || xAI API key |
+| `language` | `str` | Yes || BCP-47 language code for speech synthesis |
+| `voice_id` | `str` | No | `None` | xAI voice identifier |
+| `sample_rate` | `int` | No | `None` | Audio sample rate in Hz |
+| `additional_params` | `Dict[str, Any]` | No | `None` | Additional xAI TTS parameters |
+| `skip_patterns` | `List[int]` | No | `None` | Skip patterns |
+
 ---

 ## STT Vendors
@@ -471,6 +501,16 @@ For `nova-2` and `nova-3`, omit `api_key` to use Agora-managed credentials. For
 | `language` | `str` | Yes || Language code (e.g., `en`, `hi`) |
 | `additional_params` | `Dict[str, Any]` | No | `None` | Additional parameters |

+### `XaiSTT`
+
+| Parameter | Type | Required | Default | Description |
+|---|---|---|---|---|
+| `api_key` | `str` | Yes || xAI API key |
+| `base_url` | `str` | No | `None` | WebSocket endpoint URL for the xAI streaming STT API |
+| `sample_rate` | `int` | No | `None` | Audio sample rate in Hz |
+| `language` | `str` | No | `None` | Language code for speech recognition |
+| `additional_params` | `Dict[str, Any]` | No | `None` | Additional xAI STT parameters |
+
 ---

 ## CN Vendors
@@ -658,6 +698,21 @@ No constructor parameters. Use `FengmingSTT()`.
 | `enable` | `bool` | No | `None` | Whether to enable the avatar |
 | `additional_params` | `Dict[str, Any]` | No | `None` | Additional SenseTime avatar parameters |

+#### `SpatiusAvatar`
+
+| Parameter | Type | Required | Default | Description |
+|---|---|---|---|---|
+| `spatius_api_key` | `str` | Yes || Spatius API key |
+| `spatius_app_id` | `str` | Yes || Spatius application ID |
+| `spatius_avatar_id` | `str` | Yes || Spatius avatar ID |
+| `agora_uid` | `str` | Yes || Agora UID used by the avatar service |
+| `agora_token` | `str` | No | `None` | RTC token for avatar publisher; generated by AgentSession when omitted |
+| `region` | `str` | No | `None` | Spatius service region, for example `cn-beijing` |
+| `sample_rate` | `int` | No | `None` | Optional avatar-declared sample rate; TTS sample rate should match when set |
+| `session_expire_minutes` | `int` | No | `None` | Spatius session validity duration in minutes |
+| `enable` | `bool` | No | `None` | Whether to enable the avatar |
+| `additional_params` | `Dict[str, Any]` | No | `None` | Additional Spatius avatar parameters |
+
 ## MLLM Vendors

 ### `OpenAIRealtime`

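`docs/reference/vendors.md` describes `GenericTTS` as a generic OpenAI-compatible TTS vendor configured with `url`, `headers`, `model`, and `voice`, and recommends `pcm` as the `response_format`. The sketch below shows what a request to such an endpoint could look like, so you can check your own service against it. The JSON field names follow OpenAI's speech synthesis endpoint. Whether Agora sends exactly these fields is an assumption, and `build_speech_request` is a hypothetical helper that builds the request without sending it.

```python
import json
import urllib.request
from typing import Dict, Optional


def build_speech_request(
    url: str,
    headers: Dict[str, str],
    model: str,
    voice: str,
    text: str,
    speed: Optional[float] = None,
    response_format: str = "pcm",
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style speech synthesis request."""
    payload = {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
    }
    if speed is not None:
        payload["speed"] = speed
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **headers},
        method="POST",
    )


# Placeholder values taken from the Spatius example in docs/guides/avatars.md.
request = build_speech_request(
    url="https://tts.example.com/v1/audio",
    headers={"Authorization": "Bearer token"},
    model="tts-model",
    voice="voice-1",
    text="Hello",
)
body = json.loads(request.data)
print(request.get_method(), request.full_url)
print(body)
```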
src/agora_agent/__init__.py

Lines changed: 3 additions & 0 deletions
@@ -21,6 +21,7 @@
     CNAgent,
     GlobalAgent,
     GenericAvatar,
+    SpatiusAvatar,
     RegionalAgent,
     XaiGrok,
     generate_rtc_token,
@@ -39,6 +40,7 @@
     "AsyncAgora": ".pool_client",
     "AsyncAgentClient": ".pool_client",
     "GenericAvatar": ".agentkit",
+    "SpatiusAvatar": ".agentkit",
     "XaiGrok": ".agentkit",
     "GenerateTokenOptions": ".agentkit",
     "__version__": ".version",
@@ -62,6 +64,7 @@
     "AsyncAgora",
     "AsyncAgentClient",
     "GenericAvatar",
+    "SpatiusAvatar",
     "XaiGrok",
     "GenerateTokenOptions",
     "Pool",

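The second hunk in `src/agora_agent/__init__.py` adds `"SpatiusAvatar": ".agentkit"` to a mapping from exported names to submodules, which suggests the package resolves exports lazily. The code that consumes the mapping is not part of this diff, so the sketch below is an assumption: it shows one common way to do this with a module-level `__getattr__` hook (PEP 562). Standard-library modules stand in for the package's submodules, and `LAZY_EXPORTS` and `make_lazy_module` are hypothetical names.

```python
import importlib
import types
from typing import Any, Dict

# Exported name -> module that defines it. Standard-library modules stand in
# for the package's submodules so the sketch runs on its own.
LAZY_EXPORTS: Dict[str, str] = {
    "dumps": "json",
    "sqrt": "math",
}


def make_lazy_module(name: str, exports: Dict[str, str]) -> types.ModuleType:
    """Create a module that imports each exported name on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str) -> Any:
        if attr not in exports:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(exports[attr]), attr)
        setattr(module, attr, value)  # cache so later lookups skip this hook
        return value

    module.__getattr__ = __getattr__  # PEP 562 module-level hook
    module.__all__ = list(exports)
    return module


lazy = make_lazy_module("lazy_demo", LAZY_EXPORTS)
print(lazy.sqrt(9.0))  # 3.0
```

With relative targets such as `".agentkit"`, the lookup would pass the package name as the second argument to `importlib.import_module`.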
src/agora_agent/agentkit/__init__.py

Lines changed: 10 additions & 0 deletions
@@ -92,6 +92,7 @@
     is_live_avatar_avatar,
     is_rtc_avatar,
     is_sensetime_avatar,
+    is_spatius_avatar,
     validate_avatar_config,
     validate_tts_sample_rate,
 )
@@ -189,15 +190,19 @@
     OpenAISampleRate,
     OpenAISTT,
     OpenAITTS,
+    GenericTTS,
     RimeTTS,
     SampleRate,
     SarvamSTT,
     SarvamTTS,
     SpeechmaticsSTT,
+    XaiSTT,
+    XaiTTS,
     VertexAI,
     VertexAILLM,
     XaiGrok,
     LiveAvatarAvatar,
+    SpatiusAvatar,
 )
 from .vendors.cn import (
     AliyunLLM,
@@ -382,6 +387,7 @@
     "MicrosoftTTS",
     "MicrosoftCNTTS",
     "OpenAITTS",
+    "GenericTTS",
     "CartesiaTTS",
     "DeepgramTTS",
     "GoogleTTS",
@@ -398,6 +404,7 @@
     "StepFunTTS",
     "MurfTTS",
     "SarvamTTS",
+    "XaiTTS",
     "SpeechmaticsSTT",
     "DeepgramSTT",
     "MicrosoftSTT",
@@ -408,6 +415,7 @@
     "AssemblyAISTT",
     "AresSTT",
     "SarvamSTT",
+    "XaiSTT",
     "TencentSTT",
     "FengmingSTT",
     "XfyunBigModelSTT",
@@ -422,13 +430,15 @@
     "AkoolAvatar",
     "AnamAvatar",
     "GenericAvatar",
+    "SpatiusAvatar",
     "SenseTimeAvatar",
     "is_heygen_avatar",
     "is_live_avatar_avatar",
     "is_akool_avatar",
     "is_anam_avatar",
     "is_generic_avatar",
     "is_sensetime_avatar",
+    "is_spatius_avatar",
     "validate_avatar_config",
     "validate_tts_sample_rate",
 ]

src/agora_agent/agentkit/agent_session.py

Lines changed: 9 additions & 0 deletions
@@ -31,6 +31,7 @@
     is_live_avatar_avatar,
     is_rtc_avatar,
     is_sensetime_avatar,
+    is_spatius_avatar,
     validate_avatar_config,
     validate_tts_sample_rate,
 )
@@ -221,6 +222,7 @@ def _validate_avatar_config(self) -> None:
             or is_anam_avatar(avatar)
             or is_generic_avatar(avatar)
             or is_sensetime_avatar(avatar)
+            or is_spatius_avatar(avatar)
         ):
             validate_avatar_config(avatar)

@@ -249,6 +251,13 @@ def _validate_avatar_config(self) -> None:
                     "Warning: Akool avatar detected but TTS sample_rate is not explicitly set. "
                     "Akool requires 16,000 Hz. Please ensure your TTS provider is configured for 16kHz."
                 )
+            elif is_spatius_avatar(avatar):
+                avatar_sample_rate = avatar.get("params", {}).get("sample_rate")
+                if isinstance(avatar_sample_rate, int):
+                    self._warn(
+                        "Warning: Spatius avatar declares a sample_rate but TTS sample_rate is not explicitly set. "
+                        "Please ensure your TTS provider matches the avatar sample_rate."
+                    )

     def _enrich_avatar_for_session(
         self, properties: typing.Dict[str, typing.Any]

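The new Spatius branch in `agent_session.py` warns when the avatar declares an integer `sample_rate`. Its message indicates that it runs only when the TTS sample rate is not explicitly set. The standalone sketch below reproduces that decision so it can be exercised in isolation. `spatius_sample_rate_warning` is a hypothetical name, the "TTS rate is unset" condition is inferred from the warning text, and the explicit `bool` exclusion is an addition that the diff does not make.

```python
from typing import Any, Dict, Optional


def spatius_sample_rate_warning(
    avatar: Dict[str, Any],
    tts_sample_rate: Optional[int],
) -> Optional[str]:
    """Return a warning message, or None when there is nothing to report."""
    declared = avatar.get("params", {}).get("sample_rate")
    # bool is a subclass of int, so exclude it explicitly (the diff does not).
    declares_rate = isinstance(declared, int) and not isinstance(declared, bool)
    if tts_sample_rate is None and declares_rate:
        return (
            f"Spatius avatar declares sample_rate={declared} but TTS "
            "sample_rate is not explicitly set."
        )
    return None


avatar = {"params": {"sample_rate": 16000}}
print(spatius_sample_rate_warning(avatar, None))
print(spatius_sample_rate_warning(avatar, 16000))
```

The Spatius example in `docs/guides/avatars.md` sets `sample_rate=16000` on the avatar but no `sample_rate` on `GenericTTS`, which is the case this branch appears to target.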