fix(google): reject tool calls when toolChoice is none in realtime #1848
rosetta-livekit-bot[bot] wants to merge 1 commit into
}

private startNewGeneration(): void {
  this.rejectedToolCalls = 0;
🚩 rejectedToolCalls counter not reset when toolChoice changes away from 'none'
The rejectedToolCalls counter is reset only in startNewGeneration() at line 1525. If toolChoice changes from 'none' to 'auto' via updateOptions() mid-turn (e.g. after agent_activity resets it at agents/src/voice/agent_activity.ts:3922-3923) and a server message arrives before a new generation starts, the stale rejectedToolCalls > 0 could cause handleServerContent or handleUsageMetadata to suppress content incorrectly. In practice, agent_activity resets toolChoice in a finally block after the generation task completes, and the next generation resets the counter. The race window is very narrow: the model would have to send content between the toolChoice reset and the next generation start. This is worth noting but unlikely to be hit in practice.
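One way to close the window described above is to clear the counter whenever toolChoice leaves 'none'. The following is a minimal sketch, not the PR's code: SessionSketch is a hypothetical stand-in for the session, and the updateOptions() signature shown here is an assumption that models only the toolChoice option.

```typescript
type ToolChoice = 'auto' | 'required' | 'none';

// Hypothetical stand-in for the realtime session; only the fields
// discussed in the comment above are modeled.
class SessionSketch {
  toolChoice: ToolChoice = 'auto';
  rejectedToolCalls = 0;

  // Mirrors the diff: starting a new generation clears the counter.
  startNewGeneration(): void {
    this.rejectedToolCalls = 0;
  }

  // Assumed shape of updateOptions(); the real method accepts more options.
  updateOptions(opts: { toolChoice?: ToolChoice }): void {
    if (opts.toolChoice === undefined) return;
    const leavingNone = this.toolChoice === 'none' && opts.toolChoice !== 'none';
    this.toolChoice = opts.toolChoice;
    // Drop stale rejections so later server content is not suppressed.
    if (leavingNone) this.rejectedToolCalls = 0;
  }
}
```

With this guard, the counter no longer depends on startNewGeneration() running before the next server message arrives. Whether the extra reset is worth it depends on how much the narrow race matters in practice.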
if (response.toolCall && this.options.toolChoice === 'none') {
  // Reject without opening a generation, so a pending generateReply stays bound to the
  // model's eventual reply and tools stay suppressed for the whole turn.
  this.rejectToolCalls(response.toolCall.functionCalls ?? []);
  return;
}
🚩 Early return in onReceiveMessage drops non-toolCall fields when toolChoice='none'
At lines 1245-1250, when response.toolCall is present and toolChoice === 'none', onReceiveMessage returns early after calling rejectToolCalls. Any other fields on the same LiveServerMessage (such as serverContent, sessionResumptionUpdate, usageMetadata, goAway, or toolCallCancellation) are silently dropped. In the Gemini Live protocol, toolCall messages typically don't carry other significant fields alongside them, so this is unlikely to cause issues. However, sessionResumptionUpdate can theoretically accompany any message, and a lost session resumption handle could affect reconnection reliability. This is a minor concern given the protocol's typical behavior.
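If the dropped resumption handle ever becomes a concern, one option is to record it before the early return. This is a rough sketch under stated assumptions: ReceiverSketch and the two interfaces are hypothetical, trimmed-down stand-ins, and the newHandle and resumable field names are assumed rather than taken from the PR.

```typescript
// Assumed, minimal subset of a LiveServerMessage for illustration only.
interface FunctionCallSketch {
  id?: string;
  name?: string;
}
interface ServerMessageSketch {
  toolCall?: { functionCalls?: FunctionCallSketch[] };
  sessionResumptionUpdate?: { newHandle?: string; resumable?: boolean };
  serverContent?: unknown;
}

// Hypothetical receiver; the real session does far more per message.
class ReceiverSketch {
  toolChoice: 'auto' | 'none' = 'none';
  resumptionHandle: string | undefined;
  rejected: FunctionCallSketch[] = [];
  handledContent = 0;

  onReceiveMessage(response: ServerMessageSketch): void {
    // Record the handle first so the early return below cannot drop it.
    const update = response.sessionResumptionUpdate;
    if (update?.resumable && update.newHandle) {
      this.resumptionHandle = update.newHandle;
    }
    if (response.toolCall && this.toolChoice === 'none') {
      // Stand-in for rejectToolCalls(): just remember what was rejected.
      this.rejected.push(...(response.toolCall.functionCalls ?? []));
      return;
    }
    if (response.serverContent !== undefined) this.handledContent += 1;
  }
}
```

The rest of the message is still skipped on rejection, which matches the PR's intent of not opening a generation for a rejected tool call.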
Summary
toolChoice updates and emulate toolChoice: 'none' by rejecting emitted Gemini tool calls with error function responses.
Testing
pnpm prettier --check plugins/google/src/realtime/realtime_api.ts
pnpm lint --filter @livekit/agents-plugin-google (via pnpm --filter @livekit/agents-plugin-google lint; completed with existing no-explicit-any warnings)
pnpm build --filter @livekit/agents-plugin-google...
pnpm test plugins/google/src/realtime/realtime_api.test.ts
Ported from livekit/agents#6166
Original PR description
Closes #6002
The Google Realtime API has no per-response tool_choice. When core requests tool_choice="none" (e.g. generate_reply() inside a tool, or the final post-tool reply), Gemini may still emit a tool call. With the default blocking tool behavior the turn then stalls waiting for a tool response that core drops (received a tool call with tool_choice set to 'none', ignoring), so the model never speaks its follow-up.

This handles the case inside the plugin: the requested tool_choice is stored on the session, and when it is "none" any tool call the model emits during that turn is answered with an error response. That unblocks the session and lets it reply to the user directly, instead of hanging.

It also unifies FunctionResponse construction into a single create_function_response, used by both get_tool_results_for_realtime and the rejection path, and honors is_error so error tool outputs are sent as {"error": ...} instead of {"output": ...}.
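
The unified response construction described above might look roughly like this in the TypeScript port. This is a sketch only: createFunctionResponse, rejectToolCalls, the FunctionResponseSketch shape, and the rejection message text are illustrative assumptions, not the code in this PR or the Gemini SDK's types.

```typescript
// Assumed minimal shape; the real FunctionResponse type comes from the SDK.
interface FunctionResponseSketch {
  id?: string;
  name: string;
  response: Record<string, unknown>;
}

// Single construction point: error outputs go under "error",
// everything else under "output".
function createFunctionResponse(
  call: { id?: string; name: string },
  output: string,
  isError = false,
): FunctionResponseSketch {
  return {
    id: call.id,
    name: call.name,
    response: isError ? { error: output } : { output },
  };
}

// Rejection path: answer every emitted call with an error response so the
// model is unblocked and can reply to the user directly.
function rejectToolCalls(
  calls: { id?: string; name: string }[],
): FunctionResponseSketch[] {
  // Message text is a placeholder chosen for this sketch.
  return calls.map((call) =>
    createFunctionResponse(call, 'Tool calls are disabled for this turn.', true),
  );
}
```

Routing both the normal tool-result path and the rejection path through one helper keeps the {"error": ...} versus {"output": ...} decision in a single place.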