Commit 423d4a2

Inference gateway integration (#22)
1 parent f998059 commit 423d4a2

7 files changed

Lines changed: 57 additions & 177 deletions

.env.example

Lines changed: 0 additions & 4 deletions
@@ -1,7 +1,3 @@
 LIVEKIT_URL=
 LIVEKIT_API_KEY=
 LIVEKIT_API_SECRET=
-
-OPENAI_API_KEY=
-DEEPGRAM_API_KEY=
-CARTESIA_API_KEY=

.github/workflows/tests.yml

Lines changed: 3 additions & 1 deletion
@@ -28,5 +28,7 @@ jobs:
 
       - name: Run tests
         env:
-          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+          LIVEKIT_URL: ${{ secrets.LIVEKIT_URL }}
+          LIVEKIT_API_KEY: ${{ secrets.LIVEKIT_API_KEY }}
+          LIVEKIT_API_SECRET: ${{ secrets.LIVEKIT_API_SECRET }}
         run: uv run pytest -v
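
With this change, the test job depends on the three LiveKit secrets instead of `OPENAI_API_KEY`. A fail-fast check can make a missing secret easier to spot than a failed model call. The following is a minimal sketch only: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names that are not part of this commit or of LiveKit.

```python
import os
from typing import List, Mapping, Optional, Sequence

# The three variables the updated workflow passes to pytest.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_env_vars(
    required: Sequence[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`.

    `env` defaults to the process environment; pass a dict to check
    other sources, such as values parsed from `.env.local`.
    """
    source = os.environ if env is None else env
    return [name for name in required if not source.get(name, "").strip()]
```

One possible use is to call `missing_env_vars()` at the top of a `conftest.py` and stop the run with a clear message when the returned list is non-empty.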

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -9,4 +9,4 @@ KMS
 .vscode
 *.egg-info
 .pytest_cache
-.ruff_cache
+.ruff_cache

README.md

Lines changed: 15 additions & 12 deletions
@@ -4,17 +4,18 @@
 
 # LiveKit Agents Starter - Python
 
-A complete starter project for building voice AI apps with [LiveKit Agents for Python](https://github.com/livekit/agents).
+A complete starter project for building voice AI apps with [LiveKit Agents for Python](https://github.com/livekit/agents) and [LiveKit Cloud](https://cloud.livekit.io/).
 
 The starter project includes:
 
-- A simple voice AI assistant based on the [Voice AI quickstart](https://docs.livekit.io/agents/start/voice-ai/)
-- Voice AI pipeline based on [OpenAI](https://docs.livekit.io/agents/integrations/llm/openai/), [Cartesia](https://docs.livekit.io/agents/integrations/tts/cartesia/), and [Deepgram](https://docs.livekit.io/agents/integrations/llm/deepgram/)
-- Easily integrate your preferred [LLM](https://docs.livekit.io/agents/integrations/llm/), [STT](https://docs.livekit.io/agents/integrations/stt/), and [TTS](https://docs.livekit.io/agents/integrations/tts/) instead, or swap to a realtime model like the [OpenAI Realtime API](https://docs.livekit.io/agents/integrations/realtime/openai)
+- A simple voice AI assistant, ready for extension and customization
+- A voice AI pipeline with [models](https://docs.livekit.io/agents/models) from OpenAI, Cartesia, and AssemblyAI served through LiveKit Cloud
+- Easily integrate your preferred [LLM](https://docs.livekit.io/agents/models/llm/), [STT](https://docs.livekit.io/agents/models/stt/), and [TTS](https://docs.livekit.io/agents/models/tts/) instead, or swap to a realtime model like the [OpenAI Realtime API](https://docs.livekit.io/agents/models/realtime/openai)
 - Eval suite based on the LiveKit Agents [testing & evaluation framework](https://docs.livekit.io/agents/build/testing/)
 - [LiveKit Turn Detector](https://docs.livekit.io/agents/build/turns/turn-detector/) for contextually-aware speaker detection, with multilingual support
-- [LiveKit Cloud enhanced noise cancellation](https://docs.livekit.io/home/cloud/noise-cancellation/)
+- [Background voice cancellation](https://docs.livekit.io/home/cloud/noise-cancellation/)
 - Integrated [metrics and logging](https://docs.livekit.io/agents/build/metrics/)
+- A Dockerfile ready for [production deployment](https://docs.livekit.io/agents/ops/deployment/)
 
 This starter app is compatible with any [custom web/mobile frontend](https://docs.livekit.io/agents/start/frontend/) or [SIP-based telephony](https://docs.livekit.io/agents/start/telephony/).
 
@@ -27,19 +28,17 @@ cd agent-starter-python
 uv sync
 ```
 
-Set up the environment by copying `.env.example` to `.env.local` and filling in the required values:
+Sign up for [LiveKit Cloud](https://cloud.livekit.io/) then set up the environment by copying `.env.example` to `.env.local` and filling in the required keys:
 
-- `LIVEKIT_URL`: Use [LiveKit Cloud](https://cloud.livekit.io/) or [run your own](https://docs.livekit.io/home/self-hosting/)
+- `LIVEKIT_URL`
 - `LIVEKIT_API_KEY`
 - `LIVEKIT_API_SECRET`
-- `OPENAI_API_KEY`: [Get a key](https://platform.openai.com/api-keys) or use your [preferred LLM provider](https://docs.livekit.io/agents/integrations/llm/)
-- `DEEPGRAM_API_KEY`: [Get a key](https://console.deepgram.com/) or use your [preferred STT provider](https://docs.livekit.io/agents/integrations/stt/)
-- `CARTESIA_API_KEY`: [Get a key](https://play.cartesia.ai/keys) or use your [preferred TTS provider](https://docs.livekit.io/agents/integrations/tts/)
 
 You can load the LiveKit environment automatically using the [LiveKit CLI](https://docs.livekit.io/home/cli/cli-setup):
 
 ```bash
-lk app env -w .env.local
+lk cloud auth
+lk app env -w -d .env.local
 ```
 
 ## Run the agent
@@ -100,12 +99,16 @@ Once you've started your own project based on this repo, you should:
 
 2. **Remove the git tracking test**: Delete the "Check files not tracked in git" step from `.github/workflows/tests.yml` since you'll now want this file to be tracked. These are just there for development purposes in the template repo itself.
 
-3. **Add your own repository secrets**: You must [add secrets](https://docs.github.com/en/actions/how-tos/writing-workflows/choosing-what-your-workflow-does/using-secrets-in-github-actions) for `OPENAI_API_KEY` or your other LLM provider so that the tests can run in CI.
+3. **Add your own repository secrets**: You must [add secrets](https://docs.github.com/en/actions/how-tos/writing-workflows/choosing-what-your-workflow-does/using-secrets-in-github-actions) for `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET` so that the tests can run in CI.
 
 ## Deploying to production
 
 This project is production-ready and includes a working `Dockerfile`. To deploy it to LiveKit Cloud or another environment, see the [deploying to production](https://docs.livekit.io/agents/ops/deployment/) guide.
 
+## Self-hosted LiveKit
+
+You can also self-host LiveKit instead of using LiveKit Cloud. See the [self-hosting](https://docs.livekit.io/home/self-hosting/) guide for more information. If you choose to self-host, you'll need to also use [model plugins](https://docs.livekit.io/agents/models/#plugins) instead of LiveKit Inference and will need to remove the [LiveKit Cloud noise cancellation](https://docs.livekit.io/home/cloud/noise-cancellation/) plugin.
+
 ## License
 
 This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ description = "Simple voice AI assistant built with LiveKit Agents for Python"
 requires-python = ">=3.9"
 
 dependencies = [
-    "livekit-agents[openai,turn-detector,silero,cartesia,deepgram]~=1.2",
+    "livekit-agents[silero,turn-detector]~=1.2",
     "livekit-plugins-noise-cancellation~=0.2",
     "python-dotenv",
 ]

src/agent.py

Lines changed: 35 additions & 43 deletions
@@ -2,21 +2,17 @@
 
 from dotenv import load_dotenv
 from livekit.agents import (
-    NOT_GIVEN,
     Agent,
-    AgentFalseInterruptionEvent,
     AgentSession,
     JobContext,
     JobProcess,
     MetricsCollectedEvent,
     RoomInputOptions,
-    RunContext,
     WorkerOptions,
     cli,
     metrics,
 )
-from livekit.agents.llm import function_tool
-from livekit.plugins import cartesia, deepgram, noise_cancellation, openai, silero
+from livekit.plugins import noise_cancellation, silero
 from livekit.plugins.turn_detector.multilingual import MultilingualModel
 
 logger = logging.getLogger("agent")
@@ -27,27 +23,28 @@
 class Assistant(Agent):
     def __init__(self) -> None:
         super().__init__(
-            instructions="""You are a helpful voice AI assistant.
+            instructions="""You are a helpful voice AI assistant. The user is interacting with you via voice, even if you perceive the conversation as text.
             You eagerly assist users with their questions by providing information from your extensive knowledge.
             Your responses are concise, to the point, and without any complex formatting or punctuation including emojis, asterisks, or other symbols.
             You are curious, friendly, and have a sense of humor.""",
         )
 
-    # all functions annotated with @function_tool will be passed to the LLM when this
-    # agent is active
-    @function_tool
-    async def lookup_weather(self, context: RunContext, location: str):
-        """Use this tool to look up current weather information in the given location.
-
-        If the location is not supported by the weather service, the tool will indicate this. You must tell the user the location's weather is unavailable.
-
-        Args:
-            location: The location to look up weather information for (e.g. city name)
-        """
-
-        logger.info(f"Looking up weather for {location}")
-
-        return "sunny with a temperature of 70 degrees."
+    # To add tools, use the @function_tool decorator.
+    # Here's an example that adds a simple weather tool.
+    # You also have to add `from livekit.agents.llm import function_tool, RunContext` to the top of this file
+    # @function_tool
+    # async def lookup_weather(self, context: RunContext, location: str):
+    #     """Use this tool to look up current weather information in the given location.
+    #
+    #     If the location is not supported by the weather service, the tool will indicate this. You must tell the user the location's weather is unavailable.
+    #
+    #     Args:
+    #         location: The location to look up weather information for (e.g. city name)
+    #     """
+    #
+    #     logger.info(f"Looking up weather for {location}")
+    #
+    #     return "sunny with a temperature of 70 degrees."
 
 
 def prewarm(proc: JobProcess):
@@ -61,17 +58,17 @@ async def entrypoint(ctx: JobContext):
         "room": ctx.room.name,
     }
 
-    # Set up a voice AI pipeline using OpenAI, Cartesia, Deepgram, and the LiveKit turn detector
+    # Set up a voice AI pipeline using OpenAI, Cartesia, AssemblyAI, and the LiveKit turn detector
     session = AgentSession(
-        # A Large Language Model (LLM) is your agent's brain, processing user input and generating a response
-        # See all providers at https://docs.livekit.io/agents/integrations/llm/
-        llm=openai.LLM(model="gpt-4o-mini"),
         # Speech-to-text (STT) is your agent's ears, turning the user's speech into text that the LLM can understand
-        # See all providers at https://docs.livekit.io/agents/integrations/stt/
-        stt=deepgram.STT(model="nova-3", language="multi"),
+        # See all available models at https://docs.livekit.io/agents/models/stt/
+        stt="assemblyai/universal-streaming:en",
+        # A Large Language Model (LLM) is your agent's brain, processing user input and generating a response
+        # See all available models at https://docs.livekit.io/agents/models/llm/
+        llm="openai/gpt-4.1-mini",
         # Text-to-speech (TTS) is your agent's voice, turning the LLM's text into speech that the user can hear
-        # See all providers at https://docs.livekit.io/agents/integrations/tts/
-        tts=cartesia.TTS(voice="6f84f4b8-58a2-430c-8c79-688dad597532"),
+        # See all available models as well as voice selections at https://docs.livekit.io/agents/models/tts/
+        tts="cartesia/sonic-2:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
         # VAD and turn detection are used to determine when the user is speaking and when the agent should respond
         # See more at https://docs.livekit.io/agents/build/turns
         turn_detection=MultilingualModel(),
@@ -81,19 +78,16 @@ async def entrypoint(ctx: JobContext):
         preemptive_generation=True,
     )
 
-    # To use a realtime model instead of a voice pipeline, use the following session setup instead:
+    # To use a realtime model instead of a voice pipeline, use the following session setup instead.
+    # (Note: This is for the OpenAI Realtime API. For other providers, see https://docs.livekit.io/agents/models/realtime/))
+    # 1. Install livekit-agents[openai]
+    # 2. Set OPENAI_API_KEY in .env.local
+    # 3. Add `from livekit.plugins import openai` to the top of this file
+    # 4. Use the following session setup instead of the version above
     # session = AgentSession(
-    #     # See all providers at https://docs.livekit.io/agents/integrations/realtime/
     #     llm=openai.realtime.RealtimeModel(voice="marin")
     # )
 
-    # sometimes background noise could interrupt the agent session, these are considered false positive interruptions
-    # when it's detected, you may resume the agent's speech
-    @session.on("agent_false_interruption")
-    def _on_agent_false_interruption(ev: AgentFalseInterruptionEvent):
-        logger.info("false positive interruption, resuming")
-        session.generate_reply(instructions=ev.extra_instructions or NOT_GIVEN)
-
     # Metrics collection, to measure pipeline performance
     # For more information, see https://docs.livekit.io/agents/build/metrics/
     usage_collector = metrics.UsageCollector()
@@ -110,9 +104,9 @@ async def log_usage():
     ctx.add_shutdown_callback(log_usage)
 
     # # Add a virtual avatar to the session, if desired
-    # # For other providers, see https://docs.livekit.io/agents/integrations/avatar/
+    # # For other providers, see https://docs.livekit.io/agents/models/avatar/
     # avatar = hedra.AvatarSession(
-    #   avatar_id="...",  # See https://docs.livekit.io/agents/integrations/avatar/hedra
+    #   avatar_id="...",  # See https://docs.livekit.io/agents/models/avatar/plugins/hedra
     # )
     # # Start the avatar and wait for it to join
     # await avatar.start(session, room=ctx.room)
@@ -122,9 +116,7 @@ async def log_usage():
         agent=Assistant(),
         room=ctx.room,
         room_input_options=RoomInputOptions(
-            # LiveKit Cloud enhanced noise cancellation
-            # - If self-hosting, omit this parameter
-            # - For telephony applications, use `BVCTelephony` for best results
+            # For telephony applications, use `BVCTelephony` for best results
             noise_cancellation=noise_cancellation.BVC(),
         ),
     )
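
The `src/agent.py` diff above replaces plugin objects with plain model strings: `assemblyai/universal-streaming:en`, `openai/gpt-4.1-mini`, and `cartesia/sonic-2:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc`. Judging only from those three values, the shape appears to be `provider/model` with an optional `:suffix` (a language code for STT, a voice ID for TTS). The sketch below only illustrates that shape. `ModelDescriptor`, `parse_model_descriptor`, and the field name `variant` are hypothetical, and LiveKit's own handling of these strings may differ.

```python
from typing import NamedTuple, Optional


class ModelDescriptor(NamedTuple):
    """Hypothetical container for the parts of a model string."""

    provider: str
    model: str
    variant: Optional[str]  # e.g. a language code or voice ID, if present


def parse_model_descriptor(descriptor: str) -> ModelDescriptor:
    """Split 'provider/model[:variant]' into its parts.

    Assumption: the layout is inferred from the strings in this commit,
    not taken from LiveKit documentation.
    """
    provider, slash, rest = descriptor.partition("/")
    if not slash or not provider or not rest:
        raise ValueError(f"expected 'provider/model[:variant]', got {descriptor!r}")
    # Everything after the first ':' is treated as the optional variant.
    model, _, variant = rest.partition(":")
    if not model:
        raise ValueError(f"missing model name in {descriptor!r}")
    return ModelDescriptor(provider, model, variant or None)
```

A helper like this could be useful for logging which provider a session is configured with, without importing any provider plugin.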

tests/test_agent.py

Lines changed: 2 additions & 115 deletions
@@ -1,12 +1,11 @@
 import pytest
-from livekit.agents import AgentSession, llm, mock_tools
-from livekit.plugins import openai
+from livekit.agents import AgentSession, inference, llm
 
 from agent import Assistant
 
 
 def _llm() -> llm.LLM:
-    return openai.LLM(model="gpt-4o-mini")
+    return inference.LLM(model="openai/gpt-4.1-mini")
 
 
 @pytest.mark.asyncio
@@ -41,118 +40,6 @@ async def test_offers_assistance() -> None:
         result.expect.no_more_events()
 
 
-@pytest.mark.asyncio
-async def test_weather_tool() -> None:
-    """Unit test for the weather tool combined with an evaluation of the agent's ability to incorporate its results."""
-    async with (
-        _llm() as llm,
-        AgentSession(llm=llm) as session,
-    ):
-        await session.start(Assistant())
-
-        # Run an agent turn following the user's request for weather information
-        result = await session.run(user_input="What's the weather in Tokyo?")
-
-        # Test that the agent calls the weather tool with the correct arguments
-        result.expect.next_event().is_function_call(
-            name="lookup_weather", arguments={"location": "Tokyo"}
-        )
-
-        # Test that the tool invocation works and returns the correct output
-        # To mock the tool output instead, see https://docs.livekit.io/agents/build/testing/#mock-tools
-        result.expect.next_event().is_function_call_output(
-            output="sunny with a temperature of 70 degrees."
-        )
-
-        # Evaluate the agent's response for accurate weather information
-        await (
-            result.expect.next_event()
-            .is_message(role="assistant")
-            .judge(
-                llm,
-                intent="""
-                Informs the user that the weather is sunny with a temperature of 70 degrees.
-
-                Optional context that may or may not be included (but the response must not contradict these facts)
-                - The location for the weather report is Tokyo
-                """,
-            )
-        )
-
-        # Ensures there are no function calls or other unexpected events
-        result.expect.no_more_events()
-
-
-@pytest.mark.asyncio
-async def test_weather_unavailable() -> None:
-    """Evaluation of the agent's ability to handle tool errors."""
-    async with (
-        _llm() as llm,
-        AgentSession(llm=llm) as sess,
-    ):
-        await sess.start(Assistant())
-
-        # Simulate a tool error
-        with mock_tools(
-            Assistant,
-            {"lookup_weather": lambda: RuntimeError("Weather service is unavailable")},
-        ):
-            result = await sess.run(user_input="What's the weather in Tokyo?")
-            result.expect.skip_next_event_if(type="message", role="assistant")
-            result.expect.next_event().is_function_call(
-                name="lookup_weather", arguments={"location": "Tokyo"}
-            )
-            result.expect.next_event().is_function_call_output()
-            await result.expect.next_event(type="message").judge(
-                llm,
-                intent="""
-                Acknowledges that the weather request could not be fulfilled and communicates this to the user.
-
-                The response should convey that there was a problem getting the weather information, but can be expressed in various ways such as:
-                - Mentioning an error, service issue, or that it couldn't be retrieved
-                - Suggesting alternatives or asking what else they can help with
-                - Being apologetic or explaining the situation
-
-                The response does not need to use specific technical terms like "weather service error" or "temporary".
-                """,
-            )
-
-            # leaving this commented, some LLMs may occasionally try to retry.
-            # result.expect.no_more_events()
-
-
-@pytest.mark.asyncio
-async def test_unsupported_location() -> None:
-    """Evaluation of the agent's ability to handle a weather response with an unsupported location."""
-    async with (
-        _llm() as llm,
-        AgentSession(llm=llm) as sess,
-    ):
-        await sess.start(Assistant())
-
-        with mock_tools(Assistant, {"lookup_weather": lambda: "UNSUPPORTED_LOCATION"}):
-            result = await sess.run(user_input="What's the weather in Tokyo?")
-
-            # Evaluate the agent's response for an unsupported location
-            await result.expect.next_event(type="message").judge(
-                llm,
-                intent="""
-                Communicates that the weather request for the specific location could not be fulfilled.
-
-                The response should indicate that weather information is not available for the requested location, but can be expressed in various ways such as:
-                - Saying they can't get weather for that location
-                - Explaining the location isn't supported or available
-                - Suggesting alternatives or asking what else they can help with
-                - Being apologetic about the limitation
-
-                The response does not need to explicitly state "unsupported" or discourage retrying.
-                """,
-            )
-
-        # Ensures there are no function calls or other unexpected events
-        result.expect.no_more_events()
-
-
 @pytest.mark.asyncio
 async def test_grounding() -> None:
     """Evaluation of the agent's ability to refuse to answer when it doesn't know something."""
