Supporting async function calls.#4217
Conversation
583a045 to
dfdb929
Compare
| await self._cancel_clock_task() | ||
| await self._cancel_video_task() | ||
|
|
||
| if self._audio_queue.has_uninterruptible: |
There was a problem hiding this comment.
The same applies, if an UninterruptibleFrame was in the _audio_queue, it would be discarded.
|
@aconchillo, @markbackman, @kompfner, I believe I’ve addressed everything we discussed during our meeting. It would be great if you could take another look. |
…nterruptible frames.
…erruptibleFrames.
6f0ec3a to
bbb605a
Compare
| await params.result_callback(f"The weather in {location} is currently 72 degrees and sunny.") | ||
| async def fetch_weather_from_api(params: FunctionCallParams): | ||
| # Simulate a long-running API call, so we can test async function calls. | ||
| await asyncio.sleep(20) |
There was a problem hiding this comment.
I think we should create a new example. Maybe function-calling-anthropic-delayed.py, or maybe not even change the example. This will run in evalss, we can't wait for 20 seconds.
There was a problem hiding this comment.
Makes sense. I’ll create just one new example for OpenAI and revert the changes here.
|
|
||
| async def fetch_weather_from_api(params: FunctionCallParams): | ||
| # Simulate a long-running API call, so we can test async function calls. | ||
| await asyncio.sleep(20) |
| self._context.add_message( | ||
| { | ||
| "role": "tool", | ||
| "content": json.dumps({"type": "tool", "status": "started"}), |
There was a problem hiding this comment.
@aconchillo , @kompfner , I think this is the last thing we need to decide, whether we are going to call it "tool" or "async_tool".
There was a problem hiding this comment.
Or distinguish even further from the language of "tool calling" or "function calling", and go with "task" or "job" or something like that—in my understanding of things, the more distinct we make this mechanism from "vanilla" tool calling, the easier it will be for LLMs (and honestly for human readers of context) to understand.
There's no downside, in my mind, to underscoring in bold that when you (the LLM) are calling get_weather, you're not getting the weather, you're actually kicking off a job to eventually get the weather.
There was a problem hiding this comment.
I have a suspicion that if we tried to maximize clarity (and honesty about what's going on) then we may be able to fix issues like the LLM trying to re-invoke an already-running async tool on a subsequent turn. We need more natural language. This is a prompt engineering issue, I think.
For example, Claude goes pretty far and suggests messages that look like this:
"content": "Task dispatched asynchronously. job_id: job_xyz789. Result not yet available — it will be provided in a subsequent user message. Do not speculate about or act on the result until then."
I don't think we need to abandon the result being JSON (which has its benefits, as @aconchillo pointed out a while back, and which is leveraged in the changes in this PR), but the above is a lot closer to what I've had in mind. Maybe something like:
"content": json.dumps({
"type": "async_task",
"status": "dispatched",
"task_id": frame.tool_call_id,
"message": "Task dispatched asynchronously. Result not yet available. It will be provided in a subsequent message with matching task ID. Do not speculate about or act on the result until then."
})
(And of course we can adjust the "message" to add support for multi-result streaming tasks)
There was a problem hiding this comment.
"content": json.dumps({
"type": "async_tool",
"status": "started",
"tool_call_id": frame.tool_call_id,
"description": "The tool with this tool_call_id is in progress and the result is not yet available. It will be provided in a subsequent message with matching tool_call_id."
})
There was a problem hiding this comment.
Try with the CerebrasLLMService to see if it works fine.
There was a problem hiding this comment.
I have applied the changes we discussed. 👍
Do we need this parameter? Based on a bit of research, to me it seems like opting out would mean going against best practice and violating LLM expectations. We could choose to not expose this, and add it if/when we have a clear need for it. |
There was a problem hiding this comment.
Nit: do we want to name these files *-async.py, in keeping with the terminology "async function call" that you use elsewhere?
There was a problem hiding this comment.
Yep, I think it makes sense.
I have just renamed the OpenAI and Anthropic examples to “async” instead of “delayed.” 👍
There was a problem hiding this comment.
You mentioned that other LLMs also handle async functioning well, too. Should we get at least Google in this set of examplees, too?
There was a problem hiding this comment.
Also...I know that OpenAIResponsesLLMService is still fairly new, but we should eventually start treating it as the "main" or "recommended" OpenAI service. We're maybe not quite there yet, but it might be worth testing it and including it as an example this group of example. WDYT?
There was a problem hiding this comment.
Done. I have added two new examples, one for Google and the other one using OpenAI Response. 👍
| tool_call_id: Unique identifier for this function call. | ||
| arguments: Arguments passed to the function. | ||
| cancel_on_interruption: Whether to cancel this call if interrupted. | ||
| When ``False`` the call is treated as asynchronous: the LLM |
There was a problem hiding this comment.
👍 Nice clear and concise description here
| tool_call_id="1", | ||
| arguments={"location": "Los Angeles"}, | ||
| cancel_on_interruption=False, | ||
| cancel_on_interruption=True, |
There was a problem hiding this comment.
Because the test was expecting the old syntax, it was trying to read the context from the replaced result message.
assert json.loads(context.messages[-1]["content"]) == {"conditions": "Sunny"}
| function_name: str | ||
| tool_call_id: str | ||
| arguments: Any | ||
| cancel_on_interruption: bool = False |
There was a problem hiding this comment.
Not to revisit an old decision, but: why does this field have a default value? Won't we always specify it?
There was a problem hiding this comment.
Yeah, as far as I can tell, we always do.
kompfner
left a comment
There was a problem hiding this comment.
Had a few minor questions/suggestions. Pre-approving, given all your testing. Nice work!
Co-authored-by: kompfner <paul@daily.co>
Summary
register_function()andregister_direct_function()viacancel_on_interruption=False. When set toFalse, the LLM continues the conversation immediately without waiting for the function result. The result is injected back into the context as adevelopermessage once available, triggering a new LLM inference at that point.group_idto function calls so all calls originating from the same LLM response share an identifier. The LLM is triggered exactly once when the last call in the group completes.MediaSender.handle_interruptionsto preserve and deliver pendingUninterruptibleFrameitems instead of discarding them on interruptionIN_PROGRESStool messages as unresolved, preventing async function calls from being summarized away before their results arrive.group_parallel_toolsparameter toLLMService(defaultTrue). WhenTrue, all function calls from the same LLM response batch share a group ID and the LLM is triggered exactly once after the last call completes. Set toFalseto trigger inference independently for each function call result as it arrives.