fix(google): reject tool calls when tool_choice="none" in realtime #6166
    ) -> types.FunctionResponse:
        res = types.FunctionResponse(
            name=output.name,
            response={"error": output.output} if output.is_error else {"output": output.output},
🚩 Behavioral change in error response format for get_tool_results_for_realtime
The refactoring of create_function_response introduces a behavioral change: when is_error=True, the function response dict key changes from {"output": msg} (old behavior in get_tool_results_for_realtime) to {"error": msg} (new behavior). This affects all tool execution failures sent via update_chat_ctx → get_tool_results_for_realtime. While likely intentional (and arguably more correct since it signals errors differently to the model), this is a semantic change to an existing code path that could subtly affect model behavior for error-case tool responses. The Gemini API's FunctionResponse.response field is a generic dict, so the key name is what the model "sees" — changing it from "output" to "error" may change how the model interprets failed tool calls.
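To make the key change concrete, here is a minimal sketch of the two payload shapes described above. `ToolOutput`, `old_response_payload`, and `new_response_payload` are hypothetical stand-ins written for illustration; they are not the plugin's actual types or functions, and the real code wraps the dict in `types.FunctionResponse`.

```python
from dataclasses import dataclass


@dataclass
class ToolOutput:
    """Hypothetical stand-in for the plugin's function-call output object."""

    name: str
    output: str
    is_error: bool = False


def old_response_payload(out: ToolOutput) -> dict[str, str]:
    # Behavior the comment describes as "old": the message always sits
    # under "output", whether or not the tool failed.
    return {"output": out.output}


def new_response_payload(out: ToolOutput) -> dict[str, str]:
    # Behavior in the diff above: failures move to the "error" key.
    return {"error": out.output} if out.is_error else {"output": out.output}


failed = ToolOutput(name="get_weather", output="timeout", is_error=True)
ok = ToolOutput(name="get_weather", output="sunny")

print(old_response_payload(failed))
print(new_response_payload(failed))
print(new_response_payload(ok))
```

Only the failing case differs between the two functions, which is the narrow semantic change the comment flags.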
The Google Realtime API has no per-response tool_choice. When core requests tool_choice="none" (e.g. generate_reply() inside a tool, or the final post-tool reply), Gemini may still emit a tool call, and with the default blocking tool behavior the turn stalls waiting for a response that core drops, so the model never speaks its follow-up.
Handle this in the plugin: store the requested tool_choice and, when it is "none", reject any tool call the model emits with an error response, without opening a generation. Keeping the pending generate_reply unresolved binds the model's eventual reply to it and keeps tools suppressed for the whole turn. The trailing server content / usage metadata of the rejected turn is dropped with a debug log instead of a warning.
Also unify FunctionResponse construction into create_function_response, used by both get_tool_results_for_realtime and the rejection path, and honor is_error so error tool outputs are sent as {"error": ...} instead of {"output": ...}.
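The rejection path described in this message could be sketched roughly as follows. Everything here is an assumption made for illustration: `SessionSketch`, `FunctionCall`, `on_tool_call`, and `REJECTION_MESSAGE` are hypothetical names, and the real plugin sends `types.FunctionResponse` objects over the live session rather than appending dicts to a list.

```python
from dataclasses import dataclass, field

# Hypothetical wording; the actual error text sent by the plugin is not shown here.
REJECTION_MESSAGE = "tool calls are disabled for this turn"


@dataclass
class FunctionCall:
    """Hypothetical stand-in for a model-emitted function call."""

    id: str
    name: str


@dataclass
class SessionSketch:
    """Toy session that remembers the requested tool_choice."""

    tool_choice: str = "auto"
    sent: list = field(default_factory=list)
    opened_generations: int = 0

    def on_tool_call(self, calls: list[FunctionCall]) -> None:
        if self.tool_choice == "none":
            # Answer every call with an error so the blocking tool call is
            # resolved, but do not open a generation for it.
            for call in calls:
                self.sent.append(
                    {"id": call.id, "name": call.name, "response": {"error": REJECTION_MESSAGE}}
                )
            return
        # Normal path: a generation is opened to carry the tool call.
        self.opened_generations += 1


session = SessionSketch(tool_choice="none")
session.on_tool_call([FunctionCall(id="1", name="lookup")])
print(session.sent, session.opened_generations)
```

Because no generation is opened on the rejection path, whatever is waiting on the pending reply stays attached to the model's next real response.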
    if response.tool_call and self._opts.tool_choice == "none":
        # reject without opening a generation, so the pending generate_reply
        # stays bound to the model's eventual reply and tools stay suppressed
        # for the whole turn.
        self._reject_tool_calls(response.tool_call.function_calls or [])
        continue
🚩 The continue on rejected tool calls skips processing of co-occurring response fields
When tool_choice="none" and a response.tool_call is present, the continue at line 1044 skips ALL other processing for that response — including session_resumption_update, tool_call_cancellation, usage_metadata, and critically go_away (which signals an upcoming server disconnection). If any of these fields co-occur with tool_call in the same LiveServerMessage, they would be silently dropped. In practice, the Gemini API likely sends these as separate messages, but if go_away ever accompanies a tool_call, the session wouldn't prepare for disconnection. The usage_metadata case is partially mitigated by the _rejected_tool_calls guard in _handle_usage_metadata.
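One possible way to avoid the concern raised here is to handle session-level signals before the early `continue`. The sketch below is not the PR's code: `FakeServerMessage` and `process` are hypothetical, and only the fields named in this comment are modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeServerMessage:
    """Hypothetical stand-in for LiveServerMessage with a few fields only."""

    tool_call: Optional[list[str]] = None
    go_away: Optional[str] = None
    session_resumption_update: Optional[str] = None


def process(messages: list[FakeServerMessage], tool_choice: str) -> list[tuple]:
    handled: list[tuple] = []
    for msg in messages:
        # Session-level signals are handled first, so a rejected tool call
        # in the same message cannot hide them.
        if msg.go_away is not None:
            handled.append(("go_away", msg.go_away))
        if msg.session_resumption_update is not None:
            handled.append(("resumption", msg.session_resumption_update))

        if msg.tool_call and tool_choice == "none":
            handled.append(("rejected", tuple(msg.tool_call)))
            continue  # skips only the generation-related handling below

        if msg.tool_call:
            handled.append(("tool_call", tuple(msg.tool_call)))
    return handled


events = process([FakeServerMessage(tool_call=["lookup"], go_away="5s")], "none")
print(events)
```

Whether this reordering is worthwhile depends on whether the API ever combines these fields in one message, which the comment itself treats as unlikely.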
* upstream/main: (382 commits)
  chore(dep): update local-inference dep (livekit#6214)
  fix(agents): keep STT input anchor on the pipeline across handoff (livekit#6207)
  fix(google): clean up STT input frame task (livekit#6193)
  (xai realtime): set text modality (livekit#6198)
  chore(example): update otel trace example (livekit#6178)
  chore(elevenlabs): update `language_code` doc to remove error case (livekit#6197)
  disable eot connection error retry too (livekit#6196)
  chore(deps): update livekit dependency (livekit#6194)
  disable retry for eot errors (livekit#6195)
  livekit-agents@1.6.3 (livekit#6189)
  (assembly ai): add inference params (livekit#6185)
  fix(eot): restore timeout behavior for eot inference (livekit#6188)
  fix(google): reject tool calls when tool_choice="none" in realtime (livekit#6166)
  fix(slng): expose speed in update_options (livekit#6175)
  livekit-agents@1.6.2 (livekit#6170)
  feat(phonic): handle user text input (livekit#6169)
  docs(phonic): add forbid_speech_after_tool_call to README config table (livekit#6168)
  (inference): add assembly ai model (livekit#6162)
  (google): surface context exhaustion error (livekit#6144)
  (phonic): reuse ws connection across handoffs (livekit#6163)
  ...
Closes #6002
The Google Realtime API has no per-response `tool_choice`. When core requests `tool_choice="none"` (e.g. `generate_reply()` inside a tool, or the final post-tool reply), Gemini may still emit a tool call. With the default blocking tool behavior the turn then stalls waiting for a tool response that core drops (`received a tool call with tool_choice set to 'none', ignoring`), so the model never speaks its follow-up.

This handles the case inside the plugin: the requested `tool_choice` is stored on the session, and when it is `"none"` any tool call the model emits during that turn is answered with an error response. That unblocks the session and lets it reply to the user directly, instead of hanging.

It also unifies `FunctionResponse` construction into a single `create_function_response`, used by both `get_tool_results_for_realtime` and the rejection path, and honors `is_error` so error tool outputs are sent as `{"error": ...}` instead of `{"output": ...}`.