README.md
66 additions & 0 deletions
@@ -14,6 +14,7 @@ and multimodal flows (MLLM) for real-time audio processing.
 - [Installation](#installation)
 - [Reference](#reference)
 - [Mllm Flow Multimodal](#mllm-flow-multimodal)
+- [Mllm Flow Multimodal](#mllm-flow-multimodal)
 - [Usage](#usage)
 - [Async Client](#async-client)
 - [Exception Handling](#exception-handling)
@@ -104,6 +105,71 @@ client.agents.start(
 ```


+## MLLM Flow (Multimodal)
+
+For real-time audio processing using OpenAI's Realtime API or Google Gemini Live, use the MLLM (Multimodal Large Language Model) flow instead of the cascading ASR -> LLM -> TTS flow. See the [MLLM Overview](https://docs.agora.io/en/conversational-ai/models/mllm/overview) for more details.
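
As a rough illustration of the difference between the two flows named in the added paragraph, the sketch below models each flow as a list of stages. Every name in it (`Stage`, `build_pipeline`, `describe`, and the `"cascading"` / `"mllm"` strings) is hypothetical and is not part of the SDK. It only shows the shape of the two pipelines, not how the client is configured.

```python
from dataclasses import dataclass


# Hypothetical types for illustration only; these are NOT the SDK's API.
@dataclass(frozen=True)
class Stage:
    name: str  # short label, e.g. "ASR"
    role: str  # what the stage converts


def build_pipeline(flow: str) -> list[Stage]:
    """Return the stages audio passes through for the given flow."""
    if flow == "cascading":
        # Three separate models chained together.
        return [
            Stage("ASR", "speech to text"),
            Stage("LLM", "text to text"),
            Stage("TTS", "text to speech"),
        ]
    if flow == "mllm":
        # One multimodal model takes audio in and produces audio out.
        return [Stage("MLLM", "speech to speech")]
    raise ValueError(f"unknown flow: {flow!r}")


def describe(flow: str) -> str:
    """Render the pipeline as an arrow-separated string of stage names."""
    return " -> ".join(stage.name for stage in build_pipeline(flow))
```

Calling `describe("cascading")` yields the three-stage chain from the paragraph above, while `describe("mllm")` yields a single stage, which is the substitution the MLLM flow makes.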