fix(google): reject tool calls when tool_choice="none" in realtime #6166
longcw wants to merge 1 commit into
if self._opts.tool_choice == "none":
    responses = [
        create_function_response(
            llm.FunctionCallOutput(
                name=fnc_call.name or "",
                call_id=fnc_call.id or "",
                output="Tool calls are disabled for this turn, respond to the user directly.",
                is_error=True,
            ),
            vertexai=self._opts.vertexai,
            tool_response_scheduling=self._opts.tool_response_scheduling,
        )
        for fnc_call in tool_call.function_calls or []
    ]
    if responses:
        self._send_client_event(types.LiveClientToolResponse(function_responses=responses))
    self._mark_current_generation_done()
    return
🚩 No infinite-loop guard when model repeatedly calls tools despite rejection
When tool_choice="none", each tool call is rejected and a new generation starts. If the model persistently calls tools after receiving error responses (unlikely but possible with certain prompts or model behaviors), this creates a loop: reject → new generation → tool call → reject → ... There's no max-retry or circuit-breaker mechanism. In practice, models stop after receiving error responses, but a pathological case could stall the session indefinitely. Consider adding a counter to break the loop after N rejections.
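A minimal sketch of the counter suggested above. RejectionGuard, its method names, and the limit are hypothetical illustrations, not part of this PR or of the plugin's API; how the session should react once the limit is hit is left open.

```python
from dataclasses import dataclass


@dataclass
class RejectionGuard:
    """Counts consecutive tool-call rejections in one turn (hypothetical helper)."""

    max_rejections: int = 3  # assumed limit, chosen only for illustration
    count: int = 0

    def record_rejection(self) -> bool:
        """Record one rejection; return True while the limit has not been exceeded."""
        self.count += 1
        return self.count <= self.max_rejections

    def reset(self) -> None:
        """Call once the model produces a normal (non-tool) reply."""
        self.count = 0


# With a limit of 2, the third consecutive rejection trips the guard.
guard = RejectionGuard(max_rejections=2)
results = [guard.record_rejection() for _ in range(3)]
```

In the rejection path, a False return could be used to stop sending error responses and end the turn instead of rejecting again; that follow-up behavior is an assumption, not something the PR defines.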
The Google Realtime API has no per-response tool_choice. When core requests tool_choice="none" (e.g. generate_reply() inside a tool, or the final post-tool reply), Gemini may still emit a tool call, and with the default blocking tool behavior the turn stalls waiting for a response that core drops, so the model never speaks its follow-up.
Handle this in the plugin: store the requested tool_choice and, when it is "none", reject any tool call the model emits with an error response, without opening a generation. Keeping the pending generate_reply unresolved binds the model's eventual reply to it and keeps tools suppressed for the whole turn; the trailing server content / usage metadata of the rejected turn is logged at debug level instead of warning.
Also unify FunctionResponse construction into create_function_response, used by both get_tool_results_for_realtime and the rejection path, and honor is_error so error tool outputs are sent as {"error": ...} instead of {"output": ...}.
7461128 to 8175882
) -> types.FunctionResponse:
    res = types.FunctionResponse(
        name=output.name,
        response={"error": output.output} if output.is_error else {"output": output.output},
🚩 Behavioral change in error response format for get_tool_results_for_realtime
The refactoring of create_function_response introduces a behavioral change: when is_error=True, the function response dict key changes from {"output": msg} (old behavior in get_tool_results_for_realtime) to {"error": msg} (new behavior). This affects all tool execution failures sent via update_chat_ctx → get_tool_results_for_realtime. While likely intentional (and arguably more correct since it signals errors differently to the model), this is a semantic change to an existing code path that could subtly affect model behavior for error-case tool responses. The Gemini API's FunctionResponse.response field is a generic dict, so the key name is what the model "sees"; changing it from "output" to "error" may change how the model interprets failed tool calls.
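To make the key change concrete, here is a small standalone sketch of the selection logic. ToolOutput and build_response_payload are hypothetical stand-ins for llm.FunctionCallOutput and create_function_response, reduced to plain dicts so it runs without the plugin or the Gemini SDK; the real function returns a types.FunctionResponse and takes further arguments.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolOutput:
    # Stand-in for llm.FunctionCallOutput, limited to the fields used here.
    name: str
    call_id: str
    output: str
    is_error: bool = False


def build_response_payload(out: ToolOutput) -> dict[str, Any]:
    # Failed tool calls are keyed "error"; successful ones stay under "output".
    key = "error" if out.is_error else "output"
    return {"id": out.call_id, "name": out.name, "response": {key: out.output}}


ok = build_response_payload(ToolOutput("get_weather", "call_1", "sunny"))
failed = build_response_payload(
    ToolOutput("get_weather", "call_2", "service unavailable", is_error=True)
)
```

Before this change, both cases would have been sent under "output"; only the is_error=True case differs now.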
Closes #6002
The Google Realtime API has no per-response tool_choice. When core requests tool_choice="none" (e.g. generate_reply() inside a tool, or the final post-tool reply), Gemini may still emit a tool call. With the default blocking tool behavior the turn then stalls waiting for a tool response that core drops (received a tool call with tool_choice set to 'none', ignoring), so the model never speaks its follow-up.
This handles the case inside the plugin: the requested tool_choice is stored on the session, and when it is "none" any tool call the model emits during that turn is answered with an error response. That unblocks the session and lets it reply to the user directly, instead of hanging.
It also unifies FunctionResponse construction into a single create_function_response, used by both get_tool_results_for_realtime and the rejection path, and honors is_error so error tool outputs are sent as {"error": ...} instead of {"output": ...}.