Skip to content

Supporting async function calls.#4217

Merged
filipi87 merged 13 commits into
mainfrom
filipi/async_tools
Apr 7, 2026
Merged

Supporting async function calls.#4217
filipi87 merged 13 commits into
mainfrom
filipi/async_tools

Conversation

@filipi87
Copy link
Copy Markdown
Contributor

@filipi87 filipi87 commented Mar 31, 2026

Summary

  • Added async function call support to register_function() and register_direct_function() via cancel_on_interruption=False. When set to False, the LLM continues the conversation immediately without waiting for the function result. The result is injected back into the context as a developer message once available, triggering a new LLM inference at that point.
  • Added group_id to function calls so all calls originating from the same LLM response share an identifier. The LLM is triggered exactly once when the last call in the group completes.
  • Fixed MediaSender.handle_interruptions to preserve and deliver pending UninterruptibleFrame items instead of discarding them on interruption
  • Prevented spurious LLM inference: context frame is not pushed when a function call result arrives while the user is actively speaking
  • Updated context summarization to treat IN_PROGRESS tool messages as unresolved, preventing async function calls from being summarized away before their results arrive.
  • Added group_parallel_tools parameter to LLMService (default True). When True, all function calls from the same LLM response batch share a group ID and the LLM is triggered exactly once after the last call completes. Set to False to trigger inference independently for each function call result as it arrives.

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 73.78641% with 27 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...t/processors/aggregators/llm_response_universal.py 59.37% 13 Missing ⚠️
...pipecat/utils/context/llm_context_summarization.py 66.66% 8 Missing ⚠️
src/pipecat/transports/base_output.py 16.66% 5 Missing ⚠️
src/pipecat/services/llm_service.py 75.00% 1 Missing ⚠️
Files with missing lines Coverage Δ
src/pipecat/frames/frames.py 96.27% <100.00%> (+1.09%) ⬆️
src/pipecat/processors/frame_processor.py 88.15% <100.00%> (-0.25%) ⬇️
src/pipecat/utils/frame_queue.py 100.00% <100.00%> (ø)
src/pipecat/services/llm_service.py 54.51% <75.00%> (+0.10%) ⬆️
src/pipecat/transports/base_output.py 18.91% <16.66%> (+0.10%) ⬆️
...pipecat/utils/context/llm_context_summarization.py 81.22% <66.66%> (-3.08%) ⬇️
...t/processors/aggregators/llm_response_universal.py 83.82% <59.37%> (+3.75%) ⬆️

... and 74 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@filipi87 filipi87 force-pushed the filipi/async_tools branch from 583a045 to dfdb929 Compare March 31, 2026 20:26
Comment thread src/pipecat/processors/frame_processor.py
await self._cancel_clock_task()
await self._cancel_video_task()

if self._audio_queue.has_uninterruptible:
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same applies, if an UninterruptibleFrame was in the _audio_queue, it would be discarded.

@filipi87 filipi87 changed the title Fix async tool handling for compatibility with all LLMs. Supporting async function calls. Apr 1, 2026
@filipi87 filipi87 marked this pull request as ready for review April 1, 2026 16:57
@filipi87
Copy link
Copy Markdown
Contributor Author

filipi87 commented Apr 2, 2026

@aconchillo, @markbackman, @kompfner, I believe I’ve addressed everything we discussed during our meeting. It would be great if you could take another look.

@filipi87 filipi87 force-pushed the filipi/async_tools branch from 6f0ec3a to bbb605a Compare April 2, 2026 19:58
await params.result_callback(f"The weather in {location} is currently 72 degrees and sunny.")
async def fetch_weather_from_api(params: FunctionCallParams):
# Simulate a long-running API call, so we can test async function calls.
await asyncio.sleep(20)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should create a new example. Maybe function-calling-anthropic-delayed.py, or maybe not even change the example. This will run in evalss, we can't wait for 20 seconds.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I’ll create just one new example for OpenAI and revert the changes here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!


async def fetch_weather_from_api(params: FunctionCallParams):
# Simulate a long-running API call, so we can test async function calls.
await asyncio.sleep(20)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@filipi87 filipi87 requested a review from aconchillo April 2, 2026 21:41
self._context.add_message(
{
"role": "tool",
"content": json.dumps({"type": "tool", "status": "started"}),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aconchillo , @kompfner , I think this is the last thing we need to decide, whether we are going to call it "tool" or "async_tool".

Copy link
Copy Markdown
Contributor

@kompfner kompfner Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or distinguish even further from the language of "tool calling" or "function calling", and go with "task" or "job" or something like that—in my understanding of things, the more distinct we make this mechanism from "vanilla" tool calling, the easier it will be for LLMs (and honestly for human readers of context) to understand.

There's no downside, in my mind, to underscoring in bold that when you (the LLM) are calling get_weather, you're not getting the weather, you're actually kicking off a job to eventually get the weather.

Copy link
Copy Markdown
Contributor

@kompfner kompfner Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a suspicion that if we tried to maximize clarity (and honesty about what's going on) then we may be able to fix issues like the LLM trying to re-invoke an already-running async tool on a subsequent turn. We need more natural language. This is a prompt engineering issue, I think.

For example, Claude goes pretty far and suggests messages that look like this:

"content": "Task dispatched asynchronously. job_id: job_xyz789. Result not yet available — it will be provided in a subsequent user message. Do not speculate about or act on the result until then."

I don't think we need to abandon the result being JSON (which has its benefits, as @aconchillo pointed out a while back, and which is leveraged in the changes in this PR), but the above is a lot closer to what I've had in mind. Maybe something like:

"content": json.dumps({ 
  "type": "async_task",
  "status": "dispatched",
  "task_id": frame.tool_call_id,
  "message": "Task dispatched asynchronously. Result not yet available. It will be provided in a subsequent message with matching task ID. Do not speculate about or act on the result until then."
})

(And of course we can adjust the "message" to add support for multi-result streaming tasks)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"content": json.dumps({ 
  "type": "async_tool",
  "status": "started",
  "tool_call_id": frame.tool_call_id,
  "description": "The tool with this tool_call_id is in progress and the result is not yet available. It will be provided in a subsequent message with matching tool_call_id."
})

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try with the CerebrasLLMService to see if it works fine.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have applied the changes we discussed. 👍

@kompfner
Copy link
Copy Markdown
Contributor

kompfner commented Apr 3, 2026

Added group_parallel_tools parameter to LLMService (default True). When True, all function calls from the same LLM response batch share a group ID and the LLM is triggered exactly once after the last call completes. Set to False to trigger inference independently for each function call result as it arrives.

Do we need this parameter? Based on a bit of research, to me it seems like opting out would mean going against best practice and violating LLM expectations. We could choose to not expose this, and add it if/when we have a clear need for it.

Comment thread changelog/4217.changed.md Outdated
Comment thread examples/function-calling/function-calling-anthropic-delayed.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: do we want to name these files *-async.py, in keeping with the terminology "async function call" that you use elsewhere?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think it makes sense.
I have just renamed the OpenAI and Anthropic examples to “async” instead of “delayed.” 👍

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned that other LLMs also handle async functioning well, too. Should we get at least Google in this set of examplees, too?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also...I know that OpenAIResponsesLLMService is still fairly new, but we should eventually start treating it as the "main" or "recommended" OpenAI service. We're maybe not quite there yet, but it might be worth testing it and including it as an example this group of example. WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I have added two new examples, one for Google and the other one using OpenAI Response. 👍

tool_call_id: Unique identifier for this function call.
arguments: Arguments passed to the function.
cancel_on_interruption: Whether to cancel this call if interrupted.
When ``False`` the call is treated as asynchronous: the LLM
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice clear and concise description here

tool_call_id="1",
arguments={"location": "Los Angeles"},
cancel_on_interruption=False,
cancel_on_interruption=True,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the test was expecting the old syntax, it was trying to read the context from the replaced result message.
assert json.loads(context.messages[-1]["content"]) == {"conditions": "Sunny"}

function_name: str
tool_call_id: str
arguments: Any
cancel_on_interruption: bool = False
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to revisit an old decision, but: why does this field have a default value? Won't we always specify it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as far as I can tell, we always do.

Copy link
Copy Markdown
Contributor

@kompfner kompfner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a few minor questions/suggestions. Pre-approving, given all your testing. Nice work!

@filipi87 filipi87 merged commit 6eccd16 into main Apr 7, 2026
6 checks passed
@filipi87 filipi87 deleted the filipi/async_tools branch April 7, 2026 12:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants