| sidebar_position | 2 |
|---|---|
| title | Agent |
| description | Full API reference for the Python Agent builder class. |
Import: from agora_agent import Agent, CNAgent, GlobalAgent
Bind the client on every Agent builder via Agent(client=client, ...), then pass vendor classes directly. The bound client sets the API routing region and provides area-specific IDE hints via CNAgent / GlobalAgent:
client is required. create_session() and create_async_session() raise ValueError if no client was bound on the agent.
from agora_agent import Agent, Agora, Area
client = Agora(area=Area.US, app_id="...", app_certificate="...")
agent = Agent(client=client)

Agent(
    client: Agora | AsyncAgora,
    instructions: Optional[str] = None,
    turn_detection: Optional[TurnDetectionConfig] = None,
    interruption: Optional[InterruptionConfig] = None,
    sal: Optional[SalConfig] = None,
    advanced_features: Optional[Dict[str, Any]] = None,
    parameters: Optional[SessionParams] = None,
    greeting: Optional[str] = None,
    failure_message: Optional[str] = None,
    max_history: Optional[int] = None,
    geofence: Optional[GeofenceConfig] = None,
    labels: Optional[Dict[str, str]] = None,
    rtc: Optional[RtcConfig] = None,
    filler_words: Optional[FillerWordsConfig] = None,
    pipeline_id: Optional[str] = None,
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| client | Agora / AsyncAgora | - | Required. Authenticated client used by create_session() and create_async_session() |
| instructions | Optional[str] | None | Deprecated. Use LLM vendor system_messages instead. |
| turn_detection | Optional[TurnDetectionConfig] | None | Interaction language and turn detection configuration |
| interruption | Optional[InterruptionConfig] | None | Unified interruption control configuration |
| sal | Optional[SalConfig] | None | Selective Attention Locking (SAL) configuration |
| advanced_features | Optional[Dict[str, Any]] | None | Advanced features dict (e.g., {'enable_rtm': True}) |
| parameters | Optional[SessionParams] | None | Additional session parameters |
| greeting | Optional[str] | None | Deprecated. Use LLM/MLLM vendor greeting_message instead. |
| failure_message | Optional[str] | None | Deprecated. Use LLM/MLLM vendor failure_message instead. |
| max_history | Optional[int] | None | Deprecated. Use LLM vendor max_history instead. |
| geofence | Optional[GeofenceConfig] | None | Regional access restriction |
| labels | Optional[Dict[str, str]] | None | Custom key-value labels (returned in callbacks) |
| rtc | Optional[RtcConfig] | None | RTC media encryption |
| filler_words | Optional[FillerWordsConfig] | None | Filler words played while waiting for the LLM |
| pipeline_id | Optional[str] | None | Published AI Studio pipeline ID used as this agent's base configuration |
pipeline_id is an AI Studio base configuration. Explicit Agent config such as with_llm(), with_tts(), with_stt(), with_mllm(), advanced_features, and other builder options may send fields in properties that override the saved pipeline settings. Session-level pipeline_id overrides the agent-level value.
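The precedence rule for pipeline_id can be sketched as a small helper. This is only an illustration of the rule stated above, not the library's code: resolve_pipeline_id is a hypothetical name, and treating None as "not set" is an assumption.

```python
from typing import Optional


def resolve_pipeline_id(agent_level: Optional[str], session_level: Optional[str]) -> Optional[str]:
    """Hypothetical helper: the session-level value wins whenever it is set."""
    # Assumption: None means "not provided"; how the SDK treats "" is not documented here.
    return session_level if session_level is not None else agent_level


print(resolve_pipeline_id("pipeline-a", None))          # pipeline-a
print(resolve_pipeline_id("pipeline-a", "pipeline-b"))  # pipeline-b
```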
The Agent-level instructions, greeting, failure_message, max_history, and greeting_configs fields are compatibility shims. New code should configure those values on the LLM or MLLM vendor because that matches the core request schema.
All builder methods return a new Agent instance (immutable pattern).
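To make the immutable pattern concrete, here is a minimal sketch using a frozen dataclass. SketchAgent is a hypothetical stand-in written for this example; it is not the agora_agent Agent class and says nothing about how Agent is implemented internally.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class SketchAgent:
    """Illustrative stand-in for an immutable builder (hypothetical, not the SDK class)."""

    llm: Optional[Dict[str, Any]] = None
    tts: Optional[Dict[str, Any]] = None

    def with_llm(self, llm: Dict[str, Any]) -> "SketchAgent":
        # replace() returns a copy with one field changed; self is left untouched.
        return replace(self, llm=llm)

    def with_tts(self, tts: Dict[str, Any]) -> "SketchAgent":
        return replace(self, tts=tts)


base = SketchAgent()
configured = base.with_llm({"model": "gpt-4o-mini"}).with_tts({"voice_id": "your-voice-id"})

print(base.llm)        # None
print(configured.llm)  # {'model': 'gpt-4o-mini'}
```

The practical consequence is that the return value must be kept. Calling a builder method without assigning the result leaves the original agent unchanged, which is why the examples on this page write agent = agent.with_avatar(...).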
Set the LLM vendor for cascading flow.
from agora_agent import Agora, Area, Agent, OpenAI
client = Agora(area=Area.US, app_id='your-app-id', app_certificate='your-app-certificate')
agent = Agent(client=client).with_llm(OpenAI(model='gpt-4o-mini'))

Set the TTS vendor. Records the vendor's sample_rate for avatar validation.
from agora_agent import Agora, Area, Agent, ElevenLabsTTS
client = Agora(area=Area.US, app_id='your-app-id', app_certificate='your-app-certificate')
agent = Agent(client=client).with_tts(
    ElevenLabsTTS(
        key='your-key',
        model_id='eleven_flash_v2_5',
        voice_id='your-voice-id',
        base_url='wss://api.elevenlabs.io/v1',
    )
)

Set the STT (ASR) vendor.
from agora_agent import Agora, Area, Agent, DeepgramSTT
client = Agora(area=Area.US, app_id='your-app-id', app_certificate='your-app-certificate')
agent = Agent(client=client).with_stt(DeepgramSTT(api_key='your-key', language='en-US'))

Set the MLLM vendor for multimodal flow. Calling with_mllm() automatically sets mllm.enable = True. MLLM sessions do not require TTS, STT, or LLM vendors.
from agora_agent import Agent, Agora, Area, OpenAIRealtime
client = Agora(area=Area.US, app_id='your-app-id', app_certificate='your-app-certificate')
agent = Agent(client=client).with_mllm(OpenAIRealtime(api_key='your-key'))

Set the avatar vendor for the cascading ASR + LLM + TTS pipeline. Avatars are not supported when MLLM is enabled: combining with_mllm() and an enabled with_avatar() is rejected at to_properties() and AgentSession.start(). A disabled avatar (enable=False) is allowed alongside MLLM.
Raises ValueError if the TTS sample rate does not match the avatar's required_sample_rate.
from agora_agent import HeyGenAvatar
agent = agent.with_avatar(HeyGenAvatar(api_key='your-key', quality='medium', agora_uid='2'))

Raises: ValueError with the message "Avatar requires TTS sample rate of {required} Hz, but TTS is configured with {actual} Hz. Please update your TTS sample_rate to {required}."
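The two avatar rules described above (no enabled avatar alongside MLLM, and a TTS sample rate that matches the avatar's requirement) can be sketched as a standalone check. check_avatar is a hypothetical function, and the 16000 and 24000 Hz values are illustrative only. The sample-rate message follows the documented text, while the MLLM message and the choice to skip all checks for a disabled avatar are assumptions.

```python
def check_avatar(
    tts_sample_rate: int,
    required_sample_rate: int,
    avatar_enabled: bool = True,
    mllm_enabled: bool = False,
) -> None:
    """Hypothetical validation mirroring the avatar rules described above."""
    if not avatar_enabled:
        # Assumption: a disabled avatar skips every check, including the MLLM conflict.
        return
    if mllm_enabled:
        # Assumed wording; the SDK's actual message is not documented here.
        raise ValueError("Avatars are not supported when MLLM is enabled.")
    if tts_sample_rate != required_sample_rate:
        raise ValueError(
            f"Avatar requires TTS sample rate of {required_sample_rate} Hz, "
            f"but TTS is configured with {tts_sample_rate} Hz. "
            f"Please update your TTS sample_rate to {required_sample_rate}."
        )


check_avatar(tts_sample_rate=24000, required_sample_rate=24000)  # passes silently

try:
    check_avatar(tts_sample_rate=16000, required_sample_rate=24000)
except ValueError as exc:
    print(exc)
```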
Override cascading-flow turn detection settings. Use language for the Agora interaction language, config.start_of_speech and config.end_of_speech for SOS/EOS detection, with_interruption() for interruption behavior, and MLLM vendor turn_detection for MLLM turn detection.
Pause-state detection is configured under semantic end-of-speech:
agent = agent.with_turn_detection({
    "mode": "default",
    "config": {
        "end_of_speech": {
            "mode": "semantic",
            "semantic_config": {
                "pause_state_enabled": True,
            },
        },
    },
})

Configure unified interruption behavior using the top-level interruption object. Use this for start_of_speech and keywords interruption modes.
Deprecated. Configure system_messages on the LLM vendor instead.
Deprecated. Configure greeting_message on the LLM or MLLM vendor instead.
Set SAL (Selective Attention Locking) configuration.
Set advanced features (e.g. {'enable_rtm': True}).
When enable_rtm=True, AgentKit defaults parameters.data_channel to "rtm" unless you explicitly set another data channel.
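The defaulting rule for the data channel can be sketched as follows. apply_rtm_default is a hypothetical helper, and modelling advanced_features and parameters as plain dicts is an assumption made to keep the example self-contained.

```python
from typing import Any, Dict, Optional


def apply_rtm_default(
    advanced_features: Optional[Dict[str, Any]],
    parameters: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: default data_channel to "rtm" only when RTM is on and unset."""
    params = dict(parameters or {})  # copy so the caller's dict is not mutated
    rtm_on = (advanced_features or {}).get("enable_rtm") is True
    if rtm_on and "data_channel" not in params:
        params["data_channel"] = "rtm"
    return params


print(apply_rtm_default({"enable_rtm": True}, None))
# {'data_channel': 'rtm'}
print(apply_rtm_default({"enable_rtm": True}, {"data_channel": "datastream"}))
# {'data_channel': 'datastream'}
```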
Enable or disable MCP tool invocation by setting advanced_features.enable_tools.
Set session parameters (silence config, farewell config, data channel, audio scenario, etc.).
Set parameters.audio_scenario without replacing existing session parameters.
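Setting one key "without replacing existing session parameters" amounts to a merge rather than an overwrite. The sketch below shows that behavior on a plain dict. set_audio_scenario is a hypothetical name, and "your-scenario" is a placeholder value, not a documented scenario.

```python
from typing import Any, Dict, Optional


def set_audio_scenario(parameters: Optional[Dict[str, Any]], audio_scenario: str) -> Dict[str, Any]:
    """Hypothetical helper: add or update audio_scenario and keep every other key."""
    merged = dict(parameters or {})  # shallow copy; existing keys survive
    merged["audio_scenario"] = audio_scenario
    return merged


existing = {"data_channel": "rtm"}
print(set_audio_scenario(existing, "your-scenario"))
# {'data_channel': 'rtm', 'audio_scenario': 'your-scenario'}
```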
Deprecated. Configure failure_message on the LLM or MLLM vendor instead.
Deprecated. Configure max_history on the LLM vendor instead.
Set geofence configuration (restricts backend server regions).
Set custom labels (key-value pairs returned in notification callbacks).
Set RTC configuration.
Set filler words configuration (played while waiting for LLM response).
create_session(
    channel: str,
    agent_uid: str,
    remote_uids: List[str],
    name: Optional[str] = None,
    token: Optional[str] = None,
    idle_timeout: Optional[int] = None,
    enable_string_uid: Optional[bool] = None,
    preset: Optional[Union[str, Sequence[str]]] = None,
    pipeline_id: Optional[str] = None,
    expires_in: Optional[int] = None,
) -> AgentSession

Creates an AgentSession using the client already bound to Agent(client=...). Pass the agent instance name with the name parameter.
| Parameter | Type | Required | Description |
|---|---|---|---|
| channel | str | Yes | Channel name |
| agent_uid | str | Yes | UID for the agent |
| remote_uids | List[str] | Yes | UIDs of remote participants |
| name | Optional[str] | No | Session name sent to the Start Agent API (defaults to agent-{timestamp} if omitted) |
| token | Optional[str] | No | Pre-built RTC+RTM token |
| expires_in | Optional[int] | No | Token lifetime in seconds (default: 86400 = 24 h, Agora max). Only applies when the token is auto-generated. Use expires_in_hours() or expires_in_minutes() for clarity. Valid range: 1–86400. |
| idle_timeout | Optional[int] | No | Idle timeout in seconds |
| enable_string_uid | Optional[bool] | No | Enable string UIDs |
| preset | Optional[Union[str, Sequence[str]]] | No | Advanced preset value for project-specific routing |
| pipeline_id | Optional[str] | No | Published AI Studio pipeline ID for this session. Overrides agent.pipeline_id. |
pipeline_id is sent as the top-level /join field pipeline_id, not inside properties.
create_session() requires that the agent was constructed with client=.... If no client is bound, it raises ValueError.
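The expires_in range (1 to 86400 seconds) and the hour or minute conversions can be illustrated with a few local helpers. These are hypothetical stand-ins written for this example. They are not the SDK's expires_in_hours() or expires_in_minutes(), whose exact signatures are not shown on this page, and raising on out-of-range values (rather than clamping) is an assumption.

```python
MAX_EXPIRES_IN = 86400  # 24 h, the documented upper bound in seconds


def validate_expires_in(seconds: int) -> int:
    """Hypothetical check for the documented 1..86400 range."""
    if not 1 <= seconds <= MAX_EXPIRES_IN:
        # Assumption: out-of-range values are rejected rather than clamped.
        raise ValueError(f"expires_in must be between 1 and {MAX_EXPIRES_IN} seconds, got {seconds}")
    return seconds


def hours_to_seconds(hours: int) -> int:
    return validate_expires_in(hours * 3600)


def minutes_to_seconds(minutes: int) -> int:
    return validate_expires_in(minutes * 60)


print(hours_to_seconds(2))     # 7200
print(minutes_to_seconds(30))  # 1800
```

Under these assumptions, hours_to_seconds(24) returns exactly 86400, while hours_to_seconds(25) raises ValueError.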
Example:
import time
session = agent.create_session(
    channel=f"demo-channel-{int(time.time())}",
    agent_uid="1",
    remote_uids=["100"],
    name=f"conversation-{int(time.time())}",
)

Returns: AgentSession
Same parameters and behavior as create_session(), but returns AsyncAgentSession for asyncio applications.
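import asyncio
import time

async def main():
    # await is only valid inside a coroutine, so the calls are wrapped in async def.
    session = agent.create_async_session(
        channel=f"demo-channel-{int(time.time())}",
        agent_uid="1",
        remote_uids=["100"],
        name=f"conversation-{int(time.time())}",
    )
    agent_id = await session.start()

asyncio.run(main())

Returns: AsyncAgentSession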
Requires client=... on the agent builder, same as create_session().
When you omit credentials for supported Agora-managed global models, AgentKit sends the matching Agora-managed configuration automatically:
- Deepgram STT: nova-2, nova-3
- OpenAI LLM: gpt-4o-mini, gpt-4.1-mini, gpt-5-nano, gpt-5-mini
- OpenAI TTS: tts-1
- MiniMax TTS: speech-2.6-turbo, speech-2.8-turbo, speech_2_6_turbo, speech_2_8_turbo
If you provide your own vendor API key for those same models, AgentKit keeps the request in BYOK mode.
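The choice between Agora-managed configuration and BYOK can be sketched as a lookup over the model lists above. request_mode and the vendor keys such as "deepgram_stt" are hypothetical names for this example. What the SDK does when a key is omitted for an unlisted model is not stated here, so raising in that case is an assumption.

```python
from typing import Optional

# Model names copied from the list above; the dict keys are illustrative labels.
MANAGED_MODELS = {
    "deepgram_stt": {"nova-2", "nova-3"},
    "openai_llm": {"gpt-4o-mini", "gpt-4.1-mini", "gpt-5-nano", "gpt-5-mini"},
    "openai_tts": {"tts-1"},
    "minimax_tts": {"speech-2.6-turbo", "speech-2.8-turbo", "speech_2_6_turbo", "speech_2_8_turbo"},
}


def request_mode(vendor: str, model: str, api_key: Optional[str] = None) -> str:
    """Hypothetical helper: return "byok" when a key is given, else "managed" if supported."""
    if api_key:
        return "byok"  # providing your own key always keeps the request in BYOK mode
    if model in MANAGED_MODELS.get(vendor, set()):
        return "managed"
    # Assumption: no key and no managed support is treated as a configuration error.
    raise ValueError(f"{vendor} model {model!r} needs an API key")


print(request_mode("openai_llm", "gpt-4o-mini"))              # managed
print(request_mode("openai_llm", "gpt-4o-mini", "your-key"))  # byok
```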
Converts the agent configuration into a StartAgentsRequestProperties object for the Agora API. Called internally by AgentSession.start().
to_properties(
    channel: str,
    agent_uid: str,
    remote_uids: List[str],
    idle_timeout: Optional[int] = None,
    enable_string_uid: Optional[bool] = None,
    token: Optional[str] = None,
    app_id: Optional[str] = None,
    app_certificate: Optional[str] = None,
    expires_in: Optional[int] = None,
) -> StartAgentsRequestProperties

Raises: ValueError if neither token nor app_id+app_certificate is provided, or if required vendors (LLM, TTS) are missing in cascading mode.
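The two failure conditions for to_properties() can be sketched as a standalone pre-flight check. validate_start_inputs is a hypothetical function and its error messages are invented for illustration. Skipping the vendor check when MLLM is enabled follows the statement on this page that MLLM sessions do not require TTS, STT, or LLM vendors.

```python
from typing import Optional


def validate_start_inputs(
    token: Optional[str],
    app_id: Optional[str],
    app_certificate: Optional[str],
    has_llm: bool,
    has_tts: bool,
    mllm_enabled: bool = False,
) -> None:
    """Hypothetical pre-flight check mirroring the documented ValueError conditions."""
    # Either a pre-built token or both credentials (to auto-generate one) is needed.
    if token is None and not (app_id and app_certificate):
        raise ValueError("Provide either token or both app_id and app_certificate.")
    if not mllm_enabled:
        # Cascading mode: LLM and TTS are the vendors named as required.
        missing = [name for name, present in (("LLM", has_llm), ("TTS", has_tts)) if not present]
        if missing:
            raise ValueError(f"Cascading mode requires: {', '.join(missing)}")


validate_start_inputs("prebuilt-token", None, None, has_llm=True, has_tts=True)  # passes

try:
    validate_start_inputs(None, "your-app-id", "your-app-certificate", has_llm=True, has_tts=False)
except ValueError as exc:
    print(exc)  # Cascading mode requires: TTS
```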
| Property | Type | Description |
|---|---|---|
| instructions | Optional[str] | Deprecated Agent-level system prompt |
| greeting | Optional[str] | Deprecated Agent-level greeting message |
| failure_message | Optional[str] | Deprecated Agent-level failure message |
| max_history | Optional[int] | Deprecated Agent-level max history |
| llm | Optional[Dict[str, Any]] | LLM config dict (from to_config()) |
| tts | Optional[Dict[str, Any]] | TTS config dict |
| stt | Optional[Dict[str, Any]] | STT config dict |
| mllm | Optional[Dict[str, Any]] | MLLM config dict |
| avatar | Optional[Dict[str, Any]] | Avatar config dict |
| turn_detection | Optional[TurnDetectionConfig] | Interaction language and turn detection settings |
| sal | Optional[SalConfig] | SAL configuration |
| advanced_features | Optional[Dict[str, Any]] | Advanced features |
| parameters | Optional[SessionParams] | Session parameters |
| geofence | Optional[GeofenceConfig] | Geofence configuration |
| labels | Optional[Dict[str, str]] | Custom labels |
| rtc | Optional[RtcConfig] | RTC configuration |
| filler_words | Optional[FillerWordsConfig] | Filler words configuration |
| config | Dict[str, Any] | Full configuration dict |
Public aliases over Fern-generated types: LlmConfig, SttConfig, AsrConfig (= SttConfig), MllmConfig, AvatarConfig, session/conversation types, and think types (ThinkOnListeningAction, etc.).
Think value constants: ThinkOnListeningActionInject, ThinkOnListeningActionInterrupt, ThinkOnListeningActionIgnore, ThinkOnThinkingActionInterrupt, ThinkOnThinkingActionIgnore, ThinkOnSpeakingActionInterrupt, ThinkOnSpeakingActionIgnore.