Commit e37d049

updated docs
1 parent 4138183 commit e37d049

3 files changed

Lines changed: 382 additions & 3 deletions

docs/concepts/agent.md

Lines changed: 24 additions & 1 deletion
@@ -33,6 +33,10 @@ agent = Agent(
 | `sal` | `SalConfig` | No | SAL (Selective Attention Locking) configuration |
 | `advanced_features` | `Dict[str, Any]` | No | Advanced features (e.g., `{'enable_mllm': True}`) |
 | `parameters` | `SessionParams` | No | Additional session parameters |
+| `geofence` | `GeofenceConfig` | No | Regional access restriction |
+| `labels` | `Dict[str, str]` | No | Custom key-value labels (returned in callbacks) |
+| `rtc` | `RtcConfig` | No | RTC media encryption |
+| `filler_words` | `FillerWordsConfig` | No | Filler words while waiting for LLM |
 
 ## Builder Methods

@@ -55,7 +59,16 @@ Each `with_*` method returns a **new** `Agent` instance — the original is unch
 | `with_instructions(text)` | `str` | Override the system prompt |
 | `with_greeting(text)` | `str` | Override the greeting message |
 | `with_name(name)` | `str` | Override the agent name |
-| `with_turn_detection(config)` | `TurnDetectionConfig` | Override turn detection settings |
+| `with_turn_detection(config)` | `TurnDetectionConfig` | Override turn detection (use `config.start_of_speech` / `config.end_of_speech` for SOS/EOS) |
+| `with_sal(config)` | `SalConfig` | Set SAL configuration |
+| `with_advanced_features(features)` | `Dict[str, Any]` | Set advanced features |
+| `with_parameters(parameters)` | `SessionParams` | Set session parameters |
+| `with_failure_message(message)` | `str` | Set failure message |
+| `with_max_history(max_history)` | `int` | Set max history length |
+| `with_geofence(geofence)` | `GeofenceConfig` | Set geofence configuration |
+| `with_labels(labels)` | `Dict[str, str]` | Set custom labels |
+| `with_rtc(rtc)` | `RtcConfig` | Set RTC configuration |
+| `with_filler_words(filler_words)` | `FillerWordsConfig` | Set filler words configuration |
 
 ## Chaining Example

@@ -139,9 +152,19 @@ See [Avatar Integration](../guides/avatars.md) for details.
 | `agent.name` | `Optional[str]` | Agent name |
 | `agent.instructions` | `Optional[str]` | System prompt |
 | `agent.greeting` | `Optional[str]` | Greeting message |
+| `agent.failure_message` | `Optional[str]` | Message spoken when LLM fails |
+| `agent.max_history` | `Optional[int]` | Max conversation history length |
 | `agent.llm` | `Optional[Dict]` | LLM configuration dict |
 | `agent.tts` | `Optional[Dict]` | TTS configuration dict |
 | `agent.stt` | `Optional[Dict]` | STT configuration dict |
 | `agent.mllm` | `Optional[Dict]` | MLLM configuration dict |
+| `agent.avatar` | `Optional[Dict]` | Avatar configuration dict |
 | `agent.turn_detection` | `Optional[TurnDetectionConfig]` | Turn detection settings |
+| `agent.sal` | `Optional[SalConfig]` | SAL configuration |
+| `agent.advanced_features` | `Optional[Dict]` | Advanced features |
+| `agent.parameters` | `Optional[SessionParams]` | Session parameters |
+| `agent.geofence` | `Optional[GeofenceConfig]` | Geofence configuration |
+| `agent.labels` | `Optional[Dict[str, str]]` | Custom labels |
+| `agent.rtc` | `Optional[RtcConfig]` | RTC configuration |
+| `agent.filler_words` | `Optional[FillerWordsConfig]` | Filler words configuration |
 | `agent.config` | `Dict[str, Any]` | Full configuration dict |
Lines changed: 302 additions & 0 deletions
@@ -0,0 +1,302 @@
---
sidebar_position: 5
title: Agent Builder Features
description: Configure SAL, advanced features, parameters, geofence, labels, RTC, filler words, and more.
---

# Agent Builder Features

The Agent builder supports many configuration options beyond the core LLM, TTS, and STT vendors. This guide shows how to use each feature.

## Overview

| Feature | Method | Description |
|---|---|---|
| `sal` | `with_sal(config)` | Selective Attention Locking: speaker recognition and noise suppression |
| `advanced_features` | `with_advanced_features(features)` | Enable MLLM, RTM, SAL, tools |
| `parameters` | `with_parameters(params)` | Silence config, farewell config, data channel |
| `failure_message` | `with_failure_message(msg)` | Message spoken when LLM fails |
| `max_history` | `with_max_history(n)` | Max conversation turns in LLM context |
| `geofence` | `with_geofence(config)` | Restrict backend server regions |
| `labels` | `with_labels(labels)` | Custom key-value labels (returned in callbacks) |
| `rtc` | `with_rtc(config)` | RTC media encryption |
| `filler_words` | `with_filler_words(config)` | Filler words while waiting for LLM |

## SAL (Selective Attention Locking)

SAL helps the agent focus on the primary speaker and suppress background noise. Enable it via `advanced_features` and configure with `with_sal`:

```python
from agora_agent import Agora, Area
from agora_agent.agentkit import Agent
from agora_agent.agentkit.vendors import OpenAI, ElevenLabsTTS, DeepgramSTT

agent = (
    Agent(
        name='sal-assistant',
        instructions='You are a helpful assistant.',
        advanced_features={'enable_sal': True},
    )
    .with_sal({
        'sal_mode': 'locking',
        'sample_urls': {'primary-speaker': 'https://example.com/voiceprint.pcm'},
    })
    .with_llm(OpenAI(api_key='your-key', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='your-key', model_id='eleven_flash_v2_5', voice_id='your-voice-id', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='your-key', model='nova-2', language='en-US'))
)
```

`sal_mode` can be `'locking'` (speaker lock) or `'recognition'` (voiceprint recognition).

## Advanced Features

Enable MLLM, RTM, SAL, or tools:

```python
from agora_agent.agentkit import Agent
from agora_agent.agentkit.vendors import OpenAIRealtime

# MLLM mode (see mllm-flow guide)
agent = Agent(advanced_features={'enable_mllm': True}).with_mllm(OpenAIRealtime(api_key='...'))

# RTM signaling for custom data delivery
agent = Agent(advanced_features={'enable_rtm': True})

# Enable tool invocation via MCP
agent = Agent(advanced_features={'enable_tools': True})
```

## Session Parameters

Configure silence handling, farewell behavior, and data channel:

```python
from agora_agent.agentkit import Agent
from agora_agent.agentkit.vendors import OpenAI, ElevenLabsTTS, DeepgramSTT

agent = (
    Agent(name='params-agent')
    .with_parameters({
        'silence_config': {
            'timeout_ms': 10000,
            'action': 'speak',
            'content': "I'm still here. Take your time.",
        },
        'farewell_config': {
            'graceful_enabled': True,
            'graceful_timeout_seconds': 10,
        },
        'data_channel': 'rtm',  # or 'datastream'
    })
    .with_llm(OpenAI(api_key='...', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='...', model_id='...', voice_id='...', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='...', model='nova-2'))
)
```

## Failure Message and Max History

Set the message spoken when the LLM fails and cap the number of conversation turns kept in the LLM context, either in the constructor or via builder methods:

```python
agent = (
    Agent(
        name='assistant',
        failure_message='Sorry, I encountered an error. Please try again.',
        max_history=20,
    )
    .with_llm(OpenAI(api_key='...', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='...', model_id='...', voice_id='...', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='...', model='nova-2'))
)

# Or via builder methods
agent = (
    Agent()
    .with_failure_message('Something went wrong.')
    .with_max_history(15)
    .with_llm(OpenAI(api_key='...', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='...', model_id='...', voice_id='...', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='...', model='nova-2'))
)
```

## Geofence

Restrict which geographic regions the backend can use:

```python
agent = (
    Agent()
    .with_geofence({'area': 'NORTH_AMERICA'})
    .with_llm(OpenAI(api_key='...', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='...', model_id='...', voice_id='...', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='...', model='nova-2'))
)

# Global with exclusion
agent = (
    Agent()
    .with_geofence({'area': 'GLOBAL', 'exclude_area': 'EUROPE'})
    .with_llm(OpenAI(api_key='...', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='...', model_id='...', voice_id='...', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='...', model='nova-2'))
)
```

Valid `area` values: `'GLOBAL'`, `'NORTH_AMERICA'`, `'EUROPE'`, `'ASIA'`, `'INDIA'`, `'JAPAN'`.

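Because the geofence is passed as a plain dict, a typo in an area name may only surface once the session starts. A minimal sketch of a local pre-flight check, assuming the area list above is complete and that `exclude_area` draws from the same list (this guide does not say so). `check_geofence` is a hypothetical helper, not part of the SDK.

```python
from typing import Dict

# Area names copied from the list in this guide.
VALID_AREAS = {'GLOBAL', 'NORTH_AMERICA', 'EUROPE', 'ASIA', 'INDIA', 'JAPAN'}


def check_geofence(geofence: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical pre-flight check; raises ValueError on an unknown area."""
    area = geofence.get('area')
    if area not in VALID_AREAS:
        raise ValueError(f'invalid area: {area!r}')
    # Assumption: exclude_area accepts the same names as area.
    exclude = geofence.get('exclude_area')
    if exclude is not None and exclude not in VALID_AREAS:
        raise ValueError(f'invalid exclude_area: {exclude!r}')
    return geofence


geofence = check_geofence({'area': 'GLOBAL', 'exclude_area': 'EUROPE'})
```

The checked dict can then be handed to `with_geofence` unchanged.
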
## Labels

Attach custom labels returned in notification callbacks:

```python
agent = (
    Agent()
    .with_labels({
        'environment': 'production',
        'team': 'support',
        'version': '1.2.0',
    })
    .with_llm(OpenAI(api_key='...', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='...', model_id='...', voice_id='...', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='...', model='nova-2'))
)
```

## RTC Encryption

Configure RTC media encryption:

```python
agent = (
    Agent()
    .with_rtc({
        'encryption_key': 'your-32-byte-key',
        'encryption_mode': 5,  # AES_128_GCM
    })
    .with_llm(OpenAI(api_key='...', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='...', model_id='...', voice_id='...', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='...', model='nova-2'))
)
```

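The placeholder `'your-32-byte-key'` suggests a 32-byte string. One way to produce such a value is Python's standard `secrets` module, sketched below. Whether the SDK requires exactly this length or format is not stated in this guide, so treat the length as an assumption. The variable names are illustrative.

```python
import secrets

# token_hex(16) draws 16 random bytes and hex-encodes them,
# which yields a 32-character ASCII string.
encryption_key = secrets.token_hex(16)

rtc_config = {
    'encryption_key': encryption_key,
    'encryption_mode': 5,  # AES_128_GCM, as in the example above
}
```

The resulting dict has the same shape as the one passed to `with_rtc` above. In practice the key would be stored and distributed through a secrets manager rather than generated on every run.
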
## Filler Words

Play filler words while waiting for the LLM response:

```python
agent = (
    Agent()
    .with_filler_words({
        'enable': True,
        'trigger': {
            'mode': 'fixed_time',
            'fixed_time_config': {'response_wait_ms': 2000},
        },
        'content': {
            'mode': 'static',
            'static_config': {
                'phrases': ['Let me think...', 'One moment...', 'Hmm...'],
                'selection_rule': 'shuffle',
            },
        },
    })
    .with_llm(OpenAI(api_key='...', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='...', model_id='...', voice_id='...', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='...', model='nova-2'))
)
```

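The agent service plays filler words itself, so no client code is needed. As an aid to reading the config, the following is a local toy model of what a `fixed_time` trigger presumably does: if no reply has arrived after `response_wait_ms`, one phrase is spoken, and the reply follows when it is ready. This is an interpretation of the field names, not documented behavior. `reply_with_filler`, `slow_llm`, and `fast_llm` are hypothetical, and `selection_rule` is not modeled (a random choice stands in for it).

```python
import asyncio
import random
from typing import Awaitable, Callable, List


async def reply_with_filler(
    llm_call: Callable[[], Awaitable[str]],
    phrases: List[str],
    response_wait_ms: int,
    spoken: List[str],
) -> str:
    """Toy model: speak one filler phrase if the reply is late."""
    task = asyncio.create_task(llm_call())
    # Wait up to response_wait_ms for the reply without cancelling it.
    done, _pending = await asyncio.wait({task}, timeout=response_wait_ms / 1000)
    if not done:
        spoken.append(random.choice(phrases))
    reply = await task
    spoken.append(reply)
    return reply


async def slow_llm() -> str:
    await asyncio.sleep(0.05)  # stands in for a slow model response
    return 'Here is your answer.'


async def fast_llm() -> str:
    return 'Quick answer.'


PHRASES = ['Let me think...', 'One moment...']
slow_log: List[str] = []
fast_log: List[str] = []
asyncio.run(reply_with_filler(slow_llm, PHRASES, 10, slow_log))
asyncio.run(reply_with_filler(fast_llm, PHRASES, 1000, fast_log))
```

In this model the slow call logs a filler phrase followed by the reply, while the fast call logs only the reply.
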
## Properties (Getters)

Read back configuration via properties:

```python
agent = (
    Agent(max_history=20)
    .with_geofence({'area': 'EUROPE'})
    .with_labels({'env': 'staging'})
)

agent.name               # str | None
agent.max_history        # 20
agent.geofence           # {'area': 'EUROPE'}
agent.labels             # {'env': 'staging'}
agent.sal                # SalConfig | None
agent.advanced_features
agent.parameters
agent.failure_message
agent.rtc
agent.filler_words
agent.config             # Full read-only snapshot
```

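The concepts page changed in this commit notes that each `with_*` method returns a new `Agent` instance and leaves the original unchanged, so the getters on an earlier instance are unaffected by later chaining. A minimal sketch of that pattern, using a frozen dataclass as a toy stand-in. `MiniAgent` is hypothetical, models only two fields, and says nothing about how `Agent` is actually implemented.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class MiniAgent:
    """Toy stand-in for Agent; models only max_history and labels."""
    max_history: Optional[int] = None
    labels: Optional[Dict[str, str]] = None

    def with_max_history(self, max_history: int) -> 'MiniAgent':
        # replace() copies the instance with one field changed.
        return replace(self, max_history=max_history)

    def with_labels(self, labels: Dict[str, str]) -> 'MiniAgent':
        # Copy the dict so later caller-side mutation does not leak in.
        return replace(self, labels=dict(labels))


base = MiniAgent(max_history=20)
staging = base.with_labels({'env': 'staging'})
```

Here `base.labels` stays `None` while `staging` carries both the inherited `max_history` and the new labels, which is what makes a shared base configuration safe to reuse across variants.
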
## Chaining All Features

```python
from agora_agent import Agora, Area
from agora_agent.agentkit import Agent
from agora_agent.agentkit.vendors import OpenAI, ElevenLabsTTS, DeepgramSTT

client = Agora(
    area=Area.US,
    app_id='your-app-id',
    app_certificate='your-app-certificate',
)

agent = (
    Agent(
        name='full-featured-assistant',
        instructions='You are a helpful voice assistant.',
        greeting='Hello! How can I help?',
        failure_message='Sorry, I had trouble processing that.',
        max_history=20,
    )
    .with_llm(OpenAI(api_key='your-key', model='gpt-4o-mini'))
    .with_tts(ElevenLabsTTS(key='your-key', model_id='eleven_flash_v2_5', voice_id='your-voice-id', sample_rate=24000))
    .with_stt(DeepgramSTT(api_key='your-key', model='nova-2', language='en-US'))
    .with_advanced_features({'enable_rtm': True})
    .with_parameters({
        'silence_config': {
            'timeout_ms': 8000,
            'action': 'speak',
            'content': "I'm listening.",
        },
        'farewell_config': {
            'graceful_enabled': True,
            'graceful_timeout_seconds': 5,
        },
    })
    .with_geofence({'area': 'NORTH_AMERICA'})
    .with_labels({'app': 'voice-assistant', 'version': '2.0'})
    .with_filler_words({
        'enable': True,
        'trigger': {
            'mode': 'fixed_time',
            'fixed_time_config': {'response_wait_ms': 1500},
        },
        'content': {
            'mode': 'static',
            'static_config': {
                'phrases': ['Let me think...', 'One moment please.'],
                'selection_rule': 'shuffle',
            },
        },
    })
)

session = agent.create_session(
    client,
    channel='demo-room',
    agent_uid='1',
    remote_uids=['100'],
    idle_timeout=120,
)

agent_id = session.start()
```

## Next steps

- [Agent Reference](../reference/agent.md): full API signatures
- [Cascading Flow](./cascading-flow.md): ASR → LLM → TTS setup
- [MLLM Flow](./mllm-flow.md): multimodal flow with `enable_mllm`
- [Regional Routing](./regional-routing.md): client area and geofence
