Commit 1979c32

plutoless authored and seymourtang committed
Merge pull request AgoraIO#48 from seymourtang/remove-default-asr-vendor
fix:fix default ASR vendor fallback for global and CN profiles
2 parents 4f83e51 + 09412e2 commit 1979c32

11 files changed

Lines changed: 90 additions & 11 deletions

docs/concepts/agent.md

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ agent = Agent(client=client).with_llm(
 | `rtc` | `RtcConfig` | No | RTC media encryption |
 | `filler_words` | `FillerWordsConfig` | No | Filler words while waiting for LLM |

-When `client` is provided, `Agent(client=...)` returns `CNAgent` for `Area.CN` and `GlobalAgent` for global areas.
+When `client` is provided, `Agent(client=...)` returns `CNAgent` for `Area.CN` and `GlobalAgent` for global areas. If `with_stt()` is omitted, the bound client also determines the default ASR vendor: `Fengming` for `Area.CN`, otherwise `Ares`.

 ## Builder Methods

docs/concepts/vendors.md

Lines changed: 2 additions & 2 deletions
@@ -24,7 +24,7 @@ Used with `agent.with_llm()` for the cascading flow (ASR → LLM → TTS).
 | `OpenAI` | OpenAI | `model` for Agora-managed global models; `api_key`, `base_url`, `model` for BYOK |
 | `AzureOpenAI` | Azure OpenAI | `api_key`, `model`, `endpoint`, `deployment_name` |
 | `Anthropic` | Anthropic | `api_key`, `model`, `url`, `headers`, `max_tokens` |
-| `Gemini` | Google Gemini | `api_key`, `model` |
+| `Gemini` | Google Gemini | `api_key`, `model`; optional `url` |
 | `Groq` | Groq | `api_key`, `model`, `base_url` |
 | `VertexAILLM` | Google Vertex AI | `api_key`, `model`, `project_id`, `location` |
 | `AmazonBedrock` | Amazon Bedrock | `access_key`, `secret_key`, `region`, `model` |
@@ -103,7 +103,7 @@ tts = ElevenLabsTTS(

 Used with `agent.with_stt()`.

-Use `turn_detection.language` for Agora interaction language; it defaults to `en-US`. STT vendor `language` options are serialized under `asr.params` using each provider's own format. Ares does not take a provider language option; AgentKit uses `turn_detection.language` for REST `asr.language`.
+Use `turn_detection.language` for Agora interaction language; it defaults to `en-US`. STT vendor `language` options are serialized under `asr.params` using each provider's own format. If `with_stt()` is omitted, AgentKit defaults to `AresSTT` for global clients and `FengmingSTT` for `Area.CN` clients. Ares does not take a provider language option; AgentKit uses `turn_detection.language` for REST `asr.language`.

 | Class | Provider | Required Parameters |
 |---|---|---|

docs/guides/regional-routing.md

Lines changed: 2 additions & 0 deletions
@@ -36,6 +36,7 @@ client = Agora(
 ## Recommended vendors by area

 Bind `client` into `Agent(client=client, ...)` and construct vendors directly with SDK classes. The bound client selects `CNAgent` or `GlobalAgent` for IDE hints based on `area`, but does not restrict which vendor classes you can configure.
+If you omit `with_stt()`, AgentKit uses `FengmingSTT` by default for `Area.CN` clients and `AresSTT` for global clients.

 | Client area | STT classes | LLM classes | TTS classes | Avatar classes |
 |---|---|---|---|---|
@@ -125,6 +126,7 @@ agent = Agent(client=client, turn_detection={"language": "zh-CN"})
 ```

 `Agent(client=...)` returns `CNAgent` for `Area.CN` and `GlobalAgent` for global areas. A bound `client` is required. The SDK does not reject mismatched vendor classes at build time or session start.
+The same bound client also controls the default ASR vendor when `with_stt()` is omitted.

 ## How the domain pool works

docs/reference/agent.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ description: Full API reference for the Python Agent builder class.
 **Import:** `from agora_agent import Agent, CNAgent, GlobalAgent`

 Bind the client on every `Agent` builder via `Agent(client=client, ...)`, then pass vendor classes directly. The bound client sets the API routing region and provides area-specific IDE hints via `CNAgent` / `GlobalAgent`:
+it also selects the default ASR vendor when `with_stt()` is omitted (`Fengming` for `Area.CN`, otherwise `Ares`).

 > **`client` is required.** `create_session()` and `create_async_session()` raise `ValueError` if no client was bound on the agent.

docs/reference/vendors.md

Lines changed: 6 additions & 2 deletions
@@ -176,6 +176,7 @@ llm = Anthropic(
 |---|---|---|---|---|
 | `api_key` | `str` | Yes || Google AI API key |
 | `model` | `str` | Yes || Model name |
+| `url` | `str` | No | `None` | Custom Gemini `streamGenerateContent` URL. When omitted, the SDK constructs one from `model` and `api_key`. |
 | `temperature` | `float` | No | `None` | Sampling temperature (0.0–2.0) |
 | `top_p` | `float` | No | `None` | Nucleus sampling (0.0–1.0) |
 | `top_k` | `int` | No | `None` | Top-k sampling |
@@ -194,7 +195,10 @@ llm = Anthropic(
 ```python
 from agora_agent import Gemini

-llm = Gemini(api_key='your-google-key', model='gemini-2.0-flash-exp')
+llm = Gemini(
+    api_key='your-google-key',
+    model='gemini-2.0-flash-exp',
+)
 ```

 ### Other LLM vendors
@@ -412,7 +416,7 @@ The SDK also includes named helpers for the remaining Agora-supported LLM provid

 ## STT Vendors

-Use `turn_detection.language` for Agora interaction language; it defaults to `en-US`. Provider-specific language values remain under `asr.params` and may use a different format. AgentKit populates REST `asr.language` from `turn_detection.language`.
+Use `turn_detection.language` for Agora interaction language; it defaults to `en-US`. Provider-specific language values remain under `asr.params` and may use a different format. If `with_stt()` is omitted, AgentKit defaults to `AresSTT` for global clients and `FengmingSTT` for `Area.CN` clients. AgentKit populates REST `asr.language` from `turn_detection.language`.

 ### `SpeechmaticsSTT`

src/agora_agent/agentkit/agent.py

Lines changed: 2 additions & 1 deletion
@@ -1078,7 +1078,8 @@ def _resolve_llm_config(self) -> typing.Dict[str, typing.Any]:
     def _resolve_asr_config(self, turn_detection_config: TurnDetectionInput) -> typing.Dict[str, typing.Any]:
         asr_config = dict(self._stt or {})
         if not asr_config:
-            asr_config["vendor"] = "ares"
+            area_scope = getattr(self._client, "area_scope", None)
+            asr_config["vendor"] = "fengming" if area_scope == "cn" else "ares"
         asr_config["language"] = self._field_value(turn_detection_config, "language")
         return asr_config
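
The fallback in this hunk can be tried in isolation. What follows is a minimal sketch, not the SDK's code: `resolve_asr_config`, the `SimpleNamespace` stand-in clients, the `"global"` scope value, and the `"custom-vendor"` string are all hypothetical names introduced for illustration. It also assumes that the language is applied whether or not an STT vendor was configured, which the flattened diff does not make certain.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def resolve_asr_config(
    client: Any,
    stt: Optional[Dict[str, Any]],
    language: str,
) -> Dict[str, Any]:
    """Hypothetical standalone version of the area-aware ASR fallback."""
    asr_config = dict(stt or {})
    if not asr_config:
        # No explicit STT vendor: choose one from the client's area scope.
        area_scope = getattr(client, "area_scope", None)
        asr_config["vendor"] = "fengming" if area_scope == "cn" else "ares"
    # Assumption: the language is set in every case, not only on fallback.
    asr_config["language"] = language
    return asr_config


# Stand-in clients; only the `area_scope` attribute matters here.
cn_client = SimpleNamespace(area_scope="cn")
global_client = SimpleNamespace(area_scope="global")  # assumed scope value
bare_client = SimpleNamespace()  # no `area_scope` attribute at all

cn_default = resolve_asr_config(cn_client, None, "zh-CN")
global_default = resolve_asr_config(global_client, None, "en-US")
bare_default = resolve_asr_config(bare_client, None, "en-US")
explicit = resolve_asr_config(cn_client, {"vendor": "custom-vendor"}, "en-US")
```

Because `getattr` is called with a `None` default, a client that lacks `area_scope` falls through to `ares` instead of raising, and an explicitly configured vendor is never overridden.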

src/agora_agent/agentkit/vendors/llm.py

Lines changed: 7 additions & 3 deletions
@@ -304,8 +304,10 @@ def to_config(self) -> Dict[str, Any]:
             params["max_output_tokens"] = self.options.max_output_tokens

         config: Dict[str, Any] = {
-            "url": self.options.url or "https://generativelanguage.googleapis.com/v1beta/models",
-            "api_key": self.options.api_key,
+            "url": self.options.url or (
+                f"https://generativelanguage.googleapis.com/v1beta/models/"
+                f"{self.options.model}:streamGenerateContent?alt=sse&key={self.options.api_key}"
+            ),
             "params": params,
             "style": "gemini",
             "input_modalities": self.options.input_modalities or ["text"],
@@ -394,7 +396,9 @@ def to_config(self) -> Dict[str, Any]:
             f"{self.options.project_id}/locations/{self.options.location}/"
             f"publishers/google/models/{self.options.model}:streamGenerateContent?alt=sse"
         )
-        return Gemini(**options).to_config()
+        config = Gemini(**options).to_config()
+        config["api_key"] = self.options.api_key
+        return config


 class AmazonBedrockOptions(BaseModel):
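
The two hunks above change where the Gemini key travels: the default URL now embeds `model` and `api_key`, the top-level `api_key` field is dropped from the Gemini config, and the Vertex AI wrapper adds it back. A minimal sketch of that behavior follows, assuming simplified stand-ins: `gemini_config` and `vertex_config` are hypothetical helpers, not SDK functions, and the Vertex URL is passed in directly instead of being built from `project_id` and `location`.

```python
from typing import Any, Dict, Optional

GEMINI_MODELS_BASE = "https://generativelanguage.googleapis.com/v1beta/models"


def gemini_config(api_key: str, model: str, url: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical reduction of the Gemini config: an explicit URL wins,
    otherwise one is built from the model and key."""
    default_url = f"{GEMINI_MODELS_BASE}/{model}:streamGenerateContent?alt=sse&key={api_key}"
    return {
        "url": url or default_url,
        "params": {"model": model},
        "style": "gemini",
        # Note: no top-level "api_key"; the key is carried in the default URL.
    }


def vertex_config(api_key: str, model: str, url: str) -> Dict[str, Any]:
    """Hypothetical reduction of the Vertex AI wrapper: reuse the Gemini
    config, then restore the top-level api_key."""
    config = gemini_config(api_key=api_key, model=model, url=url)
    config["api_key"] = api_key
    return config


gemini = gemini_config(api_key="google-key", model="gemini-2.0-flash")
vertex = vertex_config(
    api_key="vertex-token",
    model="gemini-2.0-flash",
    url="https://vertex.example.com/custom-endpoint",
)
```

The values mirror the ones used in `tests/custom/test_llm_vendors.py`, so the sketch can be compared against the assertions added in this commit.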

tests/custom/test_llm_vendors.py

Lines changed: 26 additions & 0 deletions
@@ -69,6 +69,32 @@ def test_vertex_ai_llm_includes_project_routing() -> None:
     assert "location" not in config.get("params", {})


+def test_vertex_ai_llm_preserves_explicit_url() -> None:
+    config = VertexAILLM(
+        api_key="vertex-token",
+        model="gemini-2.0-flash",
+        project_id="project",
+        location="us-central1",
+        url="https://vertex.example.com/custom-endpoint",
+    ).to_config()
+
+    assert config["url"] == "https://vertex.example.com/custom-endpoint"
+    assert config["api_key"] == "vertex-token"
+    assert config["params"]["model"] == "gemini-2.0-flash"
+
+
+def test_gemini_constructs_url_from_api_key_and_model() -> None:
+    config = Gemini(
+        api_key="google-key",
+        model="gemini-2.0-flash",
+    ).to_config()
+
+    assert config["url"] == "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=google-key"
+    assert "api_key" not in config
+    assert config["style"] == "gemini"
+    assert config["params"]["model"] == "gemini-2.0-flash"
+
+
 def test_amazon_bedrock_serializes_as_bedrock_style() -> None:
     config = AmazonBedrock(
         access_key="aws-access",

tests/custom/test_regional_vendors.py

Lines changed: 22 additions & 0 deletions
@@ -66,6 +66,28 @@ def test_agent_constructor_auto_selects_area_aware_subclass() -> None:
     assert global_agent.__class__.__name__ == "GlobalAgent"


+def test_default_asr_vendor_is_area_aware_when_with_stt_is_omitted() -> None:
+    cn_properties = Agent(client=_client(Area.CN)).to_properties(
+        channel="room",
+        agent_uid="1",
+        remote_uids=["100"],
+        token="rtc-token",
+        allow_missing_vendor_categories={"llm", "tts"},
+    )
+    global_properties = Agent(client=_client(Area.US)).to_properties(
+        channel="room",
+        agent_uid="1",
+        remote_uids=["100"],
+        token="rtc-token",
+        allow_missing_vendor_categories={"llm", "tts"},
+    )
+
+    assert cn_properties.asr is not None
+    assert cn_properties.asr.vendor == "fengming"
+    assert global_properties.asr is not None
+    assert global_properties.asr.vendor == "ares"
+
+
 def test_cn_client_allows_global_only_vendor() -> None:
     client = _client(Area.CN)
     agent = Agent(client=client).with_stt(

tests/custom/test_request_body.py

Lines changed: 8 additions & 2 deletions
@@ -898,9 +898,15 @@ def test_byok_anthropic_llm_params() -> None:


 def test_byok_gemini_llm_params() -> None:
-    agent = Agent(test_client()).with_llm(Gemini(api_key="gemini-key", model="gemini-2.0-flash"))
+    agent = Agent(test_client()).with_llm(
+        Gemini(
+            api_key="gemini-key",
+            model="gemini-2.0-flash",
+        )
+    )
     props = build_properties(agent, allow_missing={"asr", "tts"})
-    assert props["llm"]["api_key"] == "gemini-key"
+    assert "api_key" not in props["llm"]
+    assert props["llm"]["url"] == "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=gemini-key"
     assert props["llm"]["style"] == "gemini"
     assert props["llm"]["params"]["model"] == "gemini-2.0-flash"
