Commit c62ff7f

feat(agent): recover from unknown tool calls via opt-in handler
Previously, when the LLM hallucinated a tool name that was not registered on the agent, turn_resolution raised ModelBehaviorError and crashed the entire run. This commit adds an opt-in Agent.unknown_tool_behavior field with two values: "raise" (the default, which preserves the existing behavior) and "respond" (which appends a synthetic tool-call output naming the available tools and lets the run continue so the model can recover). Refs #325.
1 parent bdd228b commit c62ff7f

5 files changed

Lines changed: 233 additions & 6 deletions

docs/agents.md

Lines changed: 16 additions & 0 deletions
    @@ -43,6 +43,7 @@ The most common properties of an agent are:
     | `hooks` | no | Agent-scoped lifecycle callbacks. See [Lifecycle events (hooks)](#lifecycle-events-hooks). |
     | `tool_use_behavior` | no | Control whether tool results loop back to the model or end the run. See [Tool use behavior](#tool-use-behavior). |
     | `reset_tool_choice` | no | Reset `tool_choice` after a tool call (default: `True`) to avoid tool-use loops. See [Forcing tool use](#forcing-tool-use). |
    +| `unknown_tool_behavior` | no | What to do when the model calls a tool that is not registered (default: `"raise"`). Set to `"respond"` to feed an error tool output back to the LLM and let the run continue. See [Recovering from unknown tool calls](#recovering-from-unknown-tool-calls). |
     
     ```python
     from agents import Agent, ModelSettings, function_tool
    @@ -423,3 +424,18 @@ agent = Agent(
     !!! note
     
         To prevent infinite loops, the framework automatically resets `tool_choice` to "auto" after a tool call. This behavior is configurable via [`agent.reset_tool_choice`][agents.agent.Agent.reset_tool_choice]. The infinite loop is because tool results are sent to the LLM, which then generates another tool call because of `tool_choice`, ad infinitum.
    +
    +## Recovering from unknown tool calls
    +
    +By default, the SDK raises [`ModelBehaviorError`][agents.exceptions.ModelBehaviorError] if the model hallucinates a tool that the agent does not expose. This is the safest behavior for development, but it can crash a long-running agent run when the model occasionally invents tool names.
    +
    +Set `unknown_tool_behavior="respond"` on the agent to recover instead. When the model calls an unknown tool, the SDK appends a synthetic tool output describing the error and the list of available tools, and lets the agent continue. The LLM sees the error on the next turn and can pick a real tool.
    +
    +```python
    +agent = Agent(
    +    name="Weather Agent",
    +    instructions="Retrieve weather details.",
    +    tools=[get_weather],
    +    unknown_tool_behavior="respond",
    +)
    +```
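
To make the "respond" path concrete, the standalone sketch below reproduces the message format that this commit's `_build_unknown_tool_recovery_message` helper produces. `FakeTool` and `build_recovery_message` are hypothetical stand-ins written for illustration, not SDK names, and the misspelled tool name is an invented example of a hallucinated call.

```python
from dataclasses import dataclass


@dataclass
class FakeTool:
    """Hypothetical stand-in for an SDK tool; only the name matters here."""

    name: str


def build_recovery_message(tool_name: str, agent_name: str, tools: list[FakeTool]) -> str:
    """Mirror the message format used by the commit's recovery helper."""
    # Keep the first occurrence of each non-empty name, preserving order.
    names = list(dict.fromkeys(t.name for t in tools if t.name))
    prefix = f"Tool '{tool_name}' is not available on agent '{agent_name}'. "
    if names:
        return prefix + f"Available tools: {', '.join(names)}."
    return prefix + "No tools are currently available."


# The model asked for a misspelled tool; this is the text it would get back.
message = build_recovery_message("get_wether", "Weather Agent", [FakeTool("get_weather")])
print(message)
```

Because the message lists the registered tool names, the model has enough information on the next turn to retry with a name that exists.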

src/agents/agent.py

Lines changed: 15 additions & 0 deletions
    @@ -368,6 +368,15 @@ class Agent(AgentBase, Generic[TContext]):
         """Whether to reset the tool choice to the default value after a tool has been called. Defaults
         to True. This ensures that the agent doesn't enter an infinite loop of tool usage."""
     
    +    unknown_tool_behavior: Literal["raise", "respond"] = "raise"
    +    """Controls what happens when the model invokes a tool the agent does not expose.
    +
    +    - ``"raise"`` (default): A `ModelBehaviorError` is raised, matching prior behavior.
    +    - ``"respond"``: A synthetic tool output is appended describing the error along with the list
    +      of currently available tool names, and the agent continues running so the LLM can recover
    +      on the next turn instead of crashing the run.
    +    """
    +
         def __post_init__(self):
             from typing import get_origin
     
    @@ -484,6 +493,12 @@ def __post_init__(self):
                     f"got {type(self.reset_tool_choice).__name__}"
                 )
     
    +        if self.unknown_tool_behavior not in ("raise", "respond"):
    +            raise TypeError(
    +                f"Agent unknown_tool_behavior must be 'raise' or 'respond', "
    +                f"got {self.unknown_tool_behavior!r}"
    +            )
    +
         def clone(self, **kwargs: Any) -> Agent[TContext]:
             """Make a copy of the agent, with the given arguments changed.
             Notes:
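
A `Literal` annotation is not enforced at runtime, which is presumably why the hunk above adds an explicit check in `__post_init__`. The sketch below shows the same pattern on a hypothetical, stripped-down `MiniAgent` dataclass. The class and the `UnknownToolBehavior` alias are illustrative assumptions, not part of the SDK.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Hypothetical alias; the commit spells the Literal inline on the field.
UnknownToolBehavior = Literal["raise", "respond"]


@dataclass
class MiniAgent:
    """Hypothetical stand-in for Agent with only the fields needed here."""

    name: str
    unknown_tool_behavior: UnknownToolBehavior = "raise"

    def __post_init__(self) -> None:
        # Type checkers catch bad literals statically; this guards dynamic input.
        allowed = get_args(UnknownToolBehavior)
        if self.unknown_tool_behavior not in allowed:
            raise TypeError(
                "Agent unknown_tool_behavior must be 'raise' or 'respond', "
                f"got {self.unknown_tool_behavior!r}"
            )


error_text = ""
try:
    MiniAgent(name="test", unknown_tool_behavior="ignore")  # type: ignore[arg-type]
except TypeError as exc:
    error_text = str(exc)
print(error_text)
```

Deriving the allowed values with `get_args` keeps the check in sync with the annotation. The commit instead repeats the tuple `("raise", "respond")`, which is equally valid for a two-value field.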

src/agents/run_internal/turn_resolution.py

Lines changed: 96 additions & 4 deletions
    @@ -1468,6 +1468,81 @@ def _add_unmatched_pending(approval: ToolApprovalItem) -> None:
         )
     
     
    +def _available_tool_names_for_recovery(all_tools: list[Tool]) -> list[str]:
    +    """Collect tool names suitable for inclusion in an unknown-tool recovery message."""
    +    seen: set[str] = set()
    +    names: list[str] = []
    +    for tool in all_tools:
    +        name = getattr(tool, "name", None)
    +        if not isinstance(name, str) or not name or name in seen:
    +            continue
    +        seen.add(name)
    +        names.append(name)
    +    return names
    +
    +
    +def _build_unknown_tool_recovery_message(
    +    tool_name: str,
    +    agent_name: str,
    +    all_tools: list[Tool],
    +) -> str:
    +    """Build the synthetic tool output sent back to the model after an unknown tool call."""
    +    available = _available_tool_names_for_recovery(all_tools)
    +    if available:
    +        return (
    +            f"Tool '{tool_name}' is not available on agent '{agent_name}'. "
    +            f"Available tools: {', '.join(available)}."
    +        )
    +    return (
    +        f"Tool '{tool_name}' is not available on agent '{agent_name}'. "
    +        "No tools are currently available."
    +    )
    +
    +
    +def _append_unknown_function_tool_recovery(
    +    *,
    +    agent: Agent[Any],
    +    tool_call: ResponseFunctionToolCall,
    +    items: list[RunItem],
    +    all_tools: list[Tool],
    +    display_name: str,
    +) -> None:
    +    """Emit a synthetic function-call output so the LLM can retry instead of crashing."""
    +    message = _build_unknown_tool_recovery_message(display_name, agent.name, all_tools)
    +    items.append(ToolCallItem(raw_item=tool_call, agent=agent))
    +    items.append(
    +        ToolCallOutputItem(
    +            output=message,
    +            raw_item=ItemHelpers.tool_call_output_item(tool_call, message),
    +            agent=agent,
    +        )
    +    )
    +
    +
    +def _append_unknown_custom_tool_recovery(
    +    *,
    +    agent: Agent[Any],
    +    tool_call: ResponseCustomToolCall,
    +    items: list[RunItem],
    +    all_tools: list[Tool],
    +) -> None:
    +    """Emit a synthetic custom_tool output so the LLM can retry instead of crashing."""
    +    message = _build_unknown_tool_recovery_message(tool_call.name, agent.name, all_tools)
    +    items.append(ToolCallItem(raw_item=cast(Any, tool_call), agent=agent))
    +    output_raw: dict[str, Any] = {
    +        "type": "custom_tool_call_output",
    +        "call_id": tool_call.call_id,
    +        "output": message,
    +    }
    +    items.append(
    +        ToolCallOutputItem(
    +            output=message,
    +            raw_item=cast(Any, output_raw),
    +            agent=agent,
    +        )
    +    )
    +
    +
     def process_model_response(
         *,
         agent: Agent[Any],
    @@ -1791,13 +1866,22 @@ def _dump_output_item(raw_item: Any) -> dict[str, Any]:
                         "Model produced apply_patch call without an apply_patch tool."
                     )
                 else:
    -                items.append(ToolCallItem(raw_item=cast(Any, output), agent=agent))
                     _error_tracing.attach_error_to_current_span(
                         SpanError(
                             message="Custom tool not found",
                             data={"tool_name": output.name},
                         )
                     )
    +                if agent.unknown_tool_behavior == "respond":
    +                    tools_used.append(output.name)
    +                    _append_unknown_custom_tool_recovery(
    +                        agent=agent,
    +                        tool_call=output,
    +                        items=items,
    +                        all_tools=all_tools,
    +                    )
    +                    continue
    +                items.append(ToolCallItem(raw_item=cast(Any, output), agent=agent))
                     raise ModelBehaviorError(f"Tool {output.name} not found in agent {agent.name}")
             elif (
                 isinstance(output, ResponseFunctionToolCall)
    @@ -1873,9 +1957,17 @@ def _dump_output_item(raw_item: Any) -> dict[str, Any]:
                             data={"tool_name": qualified_output_name or output.name},
                         )
                     )
    -                error = (
    -                    f"Tool {qualified_output_name or output.name} not found in agent {agent.name}"
    -                )
    +                display_name = qualified_output_name or output.name
    +                if agent.unknown_tool_behavior == "respond":
    +                    _append_unknown_function_tool_recovery(
    +                        agent=agent,
    +                        tool_call=output,
    +                        items=items,
    +                        all_tools=all_tools,
    +                        display_name=display_name,
    +                    )
    +                    continue
    +                error = f"Tool {display_name} not found in agent {agent.name}"
                     raise ModelBehaviorError(error)
     
                 items.append(
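
The branching added in these hunks can be summarized with a simplified, self-contained sketch. Everything here is a hypothetical reduction for illustration: `Call`, `process_calls`, and the local `ModelBehaviorError` class only imitate the shape of the real code, and the message text is shortened compared with the SDK helper.

```python
from dataclasses import dataclass
from typing import Literal


class ModelBehaviorError(Exception):
    """Local stand-in for the SDK exception of the same name."""


@dataclass
class Call:
    """Hypothetical minimal tool call: a name plus the id used to pair outputs."""

    name: str
    call_id: str


def process_calls(
    calls: list[Call],
    known_tools: dict[str, object],
    behavior: Literal["raise", "respond"] = "raise",
) -> tuple[list[Call], list[dict[str, str]]]:
    """Split calls into runnable ones and synthetic outputs for unknown names."""
    runnable: list[Call] = []
    synthetic_outputs: list[dict[str, str]] = []
    for call in calls:
        if call.name in known_tools:
            runnable.append(call)
            continue
        if behavior == "respond":
            # Answer the bad call with an error output and keep processing.
            synthetic_outputs.append(
                {
                    "type": "function_call_output",
                    "call_id": call.call_id,
                    "output": f"Tool '{call.name}' is not available. "
                    f"Available tools: {', '.join(known_tools)}.",
                }
            )
            continue
        # Default mode: fail fast, as before the commit.
        raise ModelBehaviorError(f"Tool {call.name} not found")
    return runnable, synthetic_outputs


calls = [Call("known", "call_1"), Call("bogus", "call_2")]
runnable, outputs = process_calls(calls, {"known": object()}, behavior="respond")
print([c.name for c in runnable], outputs[0]["call_id"])
```

The synthetic output reuses the `call_id` of the unknown call, so the offending call is answered rather than left dangling when the conversation is sent back to the model.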

tests/test_run.py

Lines changed: 58 additions & 2 deletions
    @@ -4,11 +4,16 @@
     
     import pytest
     
    -from agents import Agent, Runner
    +from agents import Agent, ModelBehaviorError, Runner
     from agents.run import AgentRunner, set_default_agent_runner
     
     from .fake_model import FakeModel
    -from .test_responses import get_text_input_item, get_text_message
    +from .test_responses import (
    +    get_function_tool,
    +    get_function_tool_call,
    +    get_text_input_item,
    +    get_text_message,
    +)
     
     
     @pytest.mark.asyncio
    @@ -42,3 +47,54 @@ async def test_run_preserves_duplicate_user_messages() -> None:
         assert len(sent_input) == 2
         assert sent_input[0]["content"] == "repeat"
         assert sent_input[1]["content"] == "repeat"
    +
    +
    +@pytest.mark.asyncio
    +async def test_unknown_tool_default_raises_model_behavior_error() -> None:
    +    """Default Agent still raises ModelBehaviorError when the model calls a missing tool."""
    +    model = FakeModel()
    +    model.add_multiple_turn_outputs(
    +        [
    +            [get_function_tool_call("does_not_exist", "")],
    +            [get_text_message("unreachable")],
    +        ]
    +    )
    +    agent = Agent(name="test", model=model, tools=[get_function_tool("known", "ok")])
    +
    +    with pytest.raises(ModelBehaviorError, match="does_not_exist"):
    +        await Runner.run(agent, input="hello")
    +
    +
    +@pytest.mark.asyncio
    +async def test_unknown_tool_respond_lets_run_continue() -> None:
    +    """With unknown_tool_behavior='respond', the run continues and the model can recover."""
    +    model = FakeModel()
    +    model.add_multiple_turn_outputs(
    +        [
    +            [get_function_tool_call("does_not_exist", "")],
    +            [get_text_message("recovered")],
    +        ]
    +    )
    +    agent = Agent(
    +        name="test",
    +        model=model,
    +        tools=[get_function_tool("known", "ok")],
    +        unknown_tool_behavior="respond",
    +    )
    +
    +    result = await Runner.run(agent, input="hello")
    +
    +    assert result.final_output == "recovered"
    +    # The second model turn must have been fed the synthetic recovery tool output.
    +    sent_input = model.last_turn_args["input"]
    +    assert isinstance(sent_input, list)
    +    function_call_outputs = [
    +        item
    +        for item in sent_input
    +        if isinstance(item, dict) and item.get("type") == "function_call_output"
    +    ]
    +    assert function_call_outputs, "expected a synthetic function_call_output for the unknown tool"
    +    output_text = function_call_outputs[-1].get("output")
    +    assert isinstance(output_text, str)
    +    assert "does_not_exist" in output_text
    +    assert "known" in output_text

tests/test_run_step_processing.py

Lines changed: 48 additions & 0 deletions
    @@ -28,6 +28,7 @@
         RunHooks,
         RunItem,
         ToolCallItem,
    +    ToolCallOutputItem,
         Usage,
         handoff,
     )
    @@ -135,6 +136,53 @@ async def test_missing_tool_call_raises_error():
             await process_response(agent=agent, response=response)
     
     
    +@pytest.mark.asyncio
    +async def test_unknown_function_tool_respond_appends_recovery_output():
    +    """With unknown_tool_behavior='respond', an unknown function tool yields a tool output
    +    describing the error and the run continues instead of raising."""
    +    agent = Agent(
    +        name="test",
    +        tools=[get_function_tool(name="known_tool")],
    +        unknown_tool_behavior="respond",
    +    )
    +    response = ModelResponse(
    +        output=[get_function_tool_call("bogus_tool", "")],
    +        usage=Usage(),
    +        response_id=None,
    +    )
    +
    +    result = await process_response(agent=agent, response=response)
    +
    +    # No real function run scheduled; the loop should continue and let the LLM retry.
    +    assert not result.functions
    +    assert not result.handoffs
    +    # The unknown tool name is still recorded in tools_used (added before the lookup).
    +    assert "bogus_tool" in result.tools_used
    +    # The new items should contain a ToolCallItem for the unknown call followed by a
    +    # ToolCallOutputItem containing the recovery message that names available tools.
    +    tool_calls = [item for item in result.new_items if isinstance(item, ToolCallItem)]
    +    tool_outputs = [item for item in result.new_items if isinstance(item, ToolCallOutputItem)]
    +    assert len(tool_calls) == 1
    +    assert len(tool_outputs) == 1
    +    message = tool_outputs[0].output
    +    assert "bogus_tool" in message
    +    assert "known_tool" in message
    +
    +
    +@pytest.mark.asyncio
    +async def test_unknown_function_tool_default_still_raises():
    +    """The default Agent behavior must continue to raise so existing users aren't broken."""
    +    agent = Agent(name="test", tools=[get_function_tool(name="known_tool")])
    +    response = ModelResponse(
    +        output=[get_function_tool_call("bogus_tool", "")],
    +        usage=Usage(),
    +        response_id=None,
    +    )
    +
    +    with pytest.raises(ModelBehaviorError, match="bogus_tool"):
    +        await process_response(agent=agent, response=response)
    +
    +
     @pytest.mark.asyncio
     async def test_multiple_tool_calls():
         agent = Agent(
