Commit b23e6ce

Update starter to use new turn detection model
1 parent f31b78f commit b23e6ce

4 files changed

Lines changed: 17 additions & 19 deletions

README.md

Lines changed: 5 additions & 3 deletions
@@ -12,7 +12,7 @@ The starter project includes:
 - A voice AI pipeline built on [LiveKit Inference](https://docs.livekit.io/agents/models/inference)
 with [models](https://docs.livekit.io/agents/models) from OpenAI, Cartesia, and Deepgram. More than 50 other model providers are supported, including [Realtime models](https://docs.livekit.io/agents/models/realtime)
 - Eval suite based on the LiveKit Agents [testing & evaluation framework](https://docs.livekit.io/agents/start/testing/)
-- [LiveKit Turn Detector](https://docs.livekit.io/agents/logic/turns/turn-detector/) for contextually-aware speaker detection, with multilingual support
+- [LiveKit Turn Detector](https://docs.livekit.io/agents/logic/turns/turn-detector/), a multimodal end-of-turn model that listens to the user's audio directly, combining semantic understanding with acoustic cues for state-of-the-art accuracy across 14 languages
 - [Background voice cancellation](https://docs.livekit.io/transport/media/noise-cancellation/)
 - Deep session insights from LiveKit [Agent Observability](https://docs.livekit.io/deploy/observability/)
 - A Dockerfile ready for [production deployment to LiveKit Cloud](https://docs.livekit.io/deploy/agents/)
@@ -92,12 +92,14 @@ lk app env -w -d .env.local

 ## Run the agent

-Before your first run, you must download certain models such as [Silero VAD](https://docs.livekit.io/agents/logic/turns/vad/) and the [LiveKit turn detector](https://docs.livekit.io/agents/logic/turns/turn-detector/):
+Before your first run, download the [ai-coustics noise cancellation](https://docs.livekit.io/transport/media/noise-cancellation/) model used by the agent:

 ```console
-uv run python src/agent.py download-files
+uv run --module livekit.agents download-files
 ```

+The [LiveKit turn detector](https://docs.livekit.io/agents/logic/turns/turn-detector/) and the agent's voice activity detection both run on [LiveKit Inference](https://docs.livekit.io/agents/models/inference) and are built into the Agents SDK, so they don't require a separate download.
+
 Next, run this command to speak to your agent directly in your terminal:

 ```console

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ description = "Simple voice AI assistant built with LiveKit Agents for Python"
 requires-python = ">=3.10, <3.15"

 dependencies = [
-    "livekit-agents[silero,turn-detector]==1.5.17",
+    "livekit-agents==1.6.0",
     "livekit-plugins-ai-coustics~=0.2",
     "python-dotenv",
 ]
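
The dependency change above does two things at once: it bumps the pinned version and drops the `silero` and `turn-detector` extras. The following is a minimal sketch that makes the difference explicit. It assumes a simplified requirement syntax (a name, optional extras, and a single `==` pin) and does not implement the full PEP 508 grammar; `parse_pin` is a hypothetical helper written for illustration, not part of any LiveKit or packaging tool.

```python
import re

# Simplified pattern: package name, optional [extras], and an exact "==" pin.
# Illustrative subset only; real requirement strings can be far more complex.
_PIN = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)(?:\[(?P<extras>[^\]]*)\])?==(?P<version>\S+)$"
)


def parse_pin(requirement: str) -> tuple[str, frozenset[str], str]:
    """Split a pinned requirement into (name, extras, version)."""
    match = _PIN.match(requirement.strip())
    if match is None:
        raise ValueError(f"not a simple pinned requirement: {requirement!r}")
    raw_extras = match["extras"] or ""
    extras = frozenset(e.strip() for e in raw_extras.split(",") if e.strip())
    return match["name"], extras, match["version"]


# The two requirement strings are copied from the pyproject.toml hunk.
old = parse_pin("livekit-agents[silero,turn-detector]==1.5.17")
new = parse_pin("livekit-agents==1.6.0")

dropped_extras = sorted(old[1] - new[1])
print(dropped_extras)  # ['silero', 'turn-detector']
```

After editing `pyproject.toml` this way, `uv sync` (already listed in the taskfile steps) re-resolves the environment against the new pin.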

src/agent.py

Lines changed: 9 additions & 13 deletions
@@ -7,13 +7,12 @@
     AgentServer,
     AgentSession,
     JobContext,
-    JobProcess,
+    TurnHandlingOptions,
     cli,
     inference,
     room_io,
 )
-from livekit.plugins import ai_coustics, silero
-from livekit.plugins.turn_detector.multilingual import MultilingualModel
+from livekit.plugins import ai_coustics

 logger = logging.getLogger("agent")

@@ -92,13 +91,6 @@ def __init__(self) -> None:
 server = AgentServer()


-def prewarm(proc: JobProcess):
-    proc.userdata["vad"] = silero.VAD.load()
-
-
-server.setup_fnc = prewarm
-
-
 @server.rtc_session(agent_name="my-agent")
 async def my_agent(ctx: JobContext):
     # Logging setup
@@ -117,10 +109,14 @@ async def my_agent(ctx: JobContext):
         tts=inference.TTS(
             model="cartesia/sonic-3", voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc"
         ),
-        # VAD and turn detection are used to determine when the user is speaking and when the agent should respond
+        # The LiveKit turn detector determines when the user is done speaking and the agent should respond.
+        # AudioTurnDetector is a multimodal model that listens to the user's audio directly, combining
+        # semantic understanding with acoustic cues (intonation, pitch, rhythm) for state-of-the-art accuracy.
+        # AgentSession supplies the required VAD automatically.
         # See more at https://docs.livekit.io/agents/build/turns
-        turn_detection=MultilingualModel(),
-        vad=ctx.proc.userdata["vad"],
+        turn_handling=TurnHandlingOptions(
+            turn_detection=inference.AudioTurnDetector(),
+        ),
         # allow the LLM to generate a response while waiting for the end of turn
         # See more at https://docs.livekit.io/agents/build/audio/#preemptive-generation
         preemptive_generation=True,
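
When applying the same migration to another agent, leftover references to the removed pieces (the `silero` plugin, the `turn_detector` plugin module, `JobProcess`, and the `vad=` argument to `AgentSession`) are easy to miss. The following is a minimal sketch of a static check using only the standard library. `find_legacy_usage` and the name lists are hypothetical, derived solely from the lines this diff removes; this is not a LiveKit tool, and it only parses the source without importing it.

```python
import ast

# Names taken from the lines this commit removes; adjust for your own code.
LEGACY_MODULES = ("livekit.plugins.turn_detector",)
LEGACY_IMPORTED_NAMES = {"silero", "JobProcess"}
LEGACY_SESSION_KEYWORDS = {"vad"}  # removed from the AgentSession(...) call


def find_legacy_usage(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, description) pairs for constructs the commit removed."""
    findings: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            module = node.module or ""
            if module.startswith(LEGACY_MODULES):
                findings.append((node.lineno, f"import from {module}"))
            for alias in node.names:
                if alias.name in LEGACY_IMPORTED_NAMES:
                    findings.append((node.lineno, f"import of {alias.name}"))
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "AgentSession"
        ):
            # Only direct AgentSession(...) calls are inspected here.
            for kw in node.keywords:
                if kw.arg in LEGACY_SESSION_KEYWORDS:
                    findings.append((kw.value.lineno, f"keyword argument {kw.arg}="))
    return sorted(findings)


# A made-up pre-migration snippet, modeled on the removed lines.
sample = '''\
from livekit.agents import AgentSession, JobProcess
from livekit.plugins import ai_coustics, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

session = AgentSession(
    turn_detection=MultilingualModel(),
    vad=silero.VAD.load(),
)
'''

findings = find_legacy_usage(sample)
for line, what in findings:
    print(f"line {line}: {what}")
```

The check deliberately does not flag `turn_detection=`, because the new code still uses that keyword inside `TurnHandlingOptions`.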

taskfile.yaml

Lines changed: 2 additions & 2 deletions
@@ -43,7 +43,7 @@ tasks:
 - echo ''
 - echo '{{ indent .INDENT "cd" }} {{ .REL_PATH }}'
 - echo '{{ indent .INDENT "uv sync" }}'
-- echo '{{ indent .INDENT "uv run" }} {{ .PYTHON_MAIN }} download-files'
+- echo '{{ indent .INDENT "uv run --module livekit.agents download-files" }}'
 - echo '{{ indent .INDENT "uv run" }} {{ .PYTHON_MAIN }} console'

 help_open_web_console:
@@ -57,7 +57,7 @@ tasks:
 - echo ''
 - echo '{{ indent .INDENT "cd" }} {{ .REL_PATH }}'
 - echo '{{ indent .INDENT "uv sync" }}'
-- echo '{{ indent .INDENT "uv run" }} {{ .PYTHON_MAIN }} download-files'
+- echo '{{ indent .INDENT "uv run --module livekit.agents download-files" }}'
 - echo '{{ indent .INDENT "uv run" }} {{ .PYTHON_MAIN }} dev'
 - echo ''
 - echo 'Then visit:'
