Correction (updated after deeper investigation):
My initial diagnosis blamed `background=True` leaking into tool-result submissions. That was wrong. The actual culprit is `continuation_token` leaking into `mutable_options` across tool-loop iterations, causing the SDK to repeatedly GET the same completed response instead of POSTing tool results.
The HTTP log is the smoking gun: one POST followed by roughly 40 GETs, all to the same `response_id`, and no tool-result POST ever happens. See the updated Description below. A runtime monkeypatch that fixes it is verified working and included at the bottom.
`background=True` + function tools: SDK loops on `responses.retrieve()` of the same response, never POSTs tool results, returns empty text
Description
SDK version: `agent-framework-core==1.1.0`, `agent-framework-openai==1.1.0`
Python version: 3.14
When an agent runs with `background=True` and has local function tools, the SDK gets stuck retrieving the same completed background response on every tool-loop iteration. Tool results are never POSTed back. After `max_iterations` (default 40) the loop exits and `response.text` is empty.
What actually happens
Verified from `httpx` INFO logs: the traffic is 1 POST followed by roughly 40 GETs, and every GET hits the same `response_id`.

- Caller does `agent.run(messages, options={background: True})`. This POSTs and gets back a `continuation_token`.
- Caller polls with `agent.run(session=session, options={continuation_token: X})` in a loop.
- Eventually `RawOpenAIChatClient._inner_get_response()` sees `continuation_token` in options and calls `client.responses.retrieve(response_id)`. The retrieve returns a completed response with `function_call` output items.
- `FunctionInvocationLayer` executes the tools locally and enters the next tool-loop iteration.
- `mutable_options` (the shared dict inside `FunctionInvocationLayer.get_response()`) still contains `continuation_token`, so the next `super_get_response()` call does another `responses.retrieve()` to the same `response_id`.
- Same completed response, same `function_call` items. Tools run again.
- Repeat up to 40 times. No tool result is ever POSTed. The model never gets the tool outputs. Final response text is empty.
Root cause
`FunctionInvocationLayer.get_response()` copies all options into a `mutable_options` dict (around line 59 of `_tools.py`) and reuses that same dict for every `super_get_response()` call in the tool loop. When the caller's polling code passes `continuation_token` in options, it leaks into every iteration. `_inner_get_response()` sees it and keeps doing GET retrieves instead of POSTing tool results.
`background=True` has the same problem. If only `continuation_token` were fixed, tool-result POSTs would still start new background jobs instead of running synchronously.
For comparison, `conversation_id` is handled correctly: it gets updated on `mutable_options` by `_update_continuation_state` after each response. `continuation_token` needs the inverse treatment: it should be removed from `mutable_options` once the background job returns a completed response.
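The leak is easy to demonstrate in isolation. The sketch below is illustrative only (plain Python, no SDK types): `fake_get_response` stands in for `super_get_response()`, returning the same completed response's `function_call` items for as long as `continuation_token` is present, and `tool_loop` mimics the shared `mutable_options` dict. Stripping the token after a completed retrieve is the inverse treatment described above.

```python
# Illustrative sketch of the bug, NOT the SDK's actual code: a single
# mutable_options dict reused across tool-loop iterations keeps the
# continuation_token alive, so every iteration "retrieves" instead of POSTing.

def fake_get_response(options):
    """Stand-in for super_get_response(): a GET retrieve while the token is
    present (always the same completed response with the same function_call
    items), a tool-result POST that yields final text once it is gone."""
    if options.get("continuation_token") is not None:
        return {"function_calls": ["search"], "text": ""}
    return {"function_calls": [], "text": '{"answer": "done"}'}

def tool_loop(options, strip_token, max_iterations=5):
    mutable_options = dict(options)  # copied once, shared by all iterations
    for iteration in range(1, max_iterations + 1):
        response = fake_get_response(mutable_options)
        if strip_token:
            # The fix: drop background-polling state once the retrieve
            # has returned a completed response.
            mutable_options.pop("continuation_token", None)
            mutable_options.pop("background", None)
        if not response["function_calls"]:
            return response["text"], iteration
    return "", max_iterations  # loop exhausted, empty text

opts = {"continuation_token": {"response_id": "resp_123"}, "background": True}
print(tool_loop(opts, strip_token=False))  # ('', 5) — stuck retrieving
print(tool_loop(opts, strip_token=True))   # ('{"answer": "done"}', 2)
```

Because `mutable_options` is a copy, mutating it never disturbs the caller's own dict; only the loop-internal state changes.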
Code Sample
Minimal reproduction. One tool, background mode, structured output.
```python
import asyncio

from pydantic import BaseModel

from agent_framework import AgentSession, Message, tool
from agent_framework.openai import OpenAIChatClient, OpenAIChatOptions


@tool(name="search", description="Search documents for information")
def search(query: str) -> str:
    return f"Document contains info about: {query}"


class Output(BaseModel):
    answer: str


client = OpenAIChatClient(
    api_key="YOUR_KEY",
    model="gpt-4.1",
    azure_endpoint="https://YOUR_ENDPOINT",
    api_version="preview",
)

agent = client.as_agent(
    instructions="Use the search tool to find info, then answer.",
    name="repro_agent",
    tools=[search],
    default_options=OpenAIChatOptions(response_format=Output),
)


async def reproduce():
    session = AgentSession()
    messages = [Message(role="user", contents=["Search for 'project name'"])]
    response = await agent.run(
        messages, session=session, options=OpenAIChatOptions(background=True)
    )
    poll_count = 0
    while response.continuation_token is not None:
        poll_count += 1
        await asyncio.sleep(2.0)
        response = await agent.run(
            session=session,
            options=OpenAIChatOptions(continuation_token=response.continuation_token),
        )
    print(f"Polls: {poll_count}")
    print(f"response.text: '{response.text}'")  # empty


asyncio.run(reproduce())
```
Expected: `response.text` contains valid JSON matching `Output`. Tool called roughly once.
Actual: `response.text` is empty. Tool called 40+ times with identical arguments.
Error Messages / Stack Traces
No exception is raised. The bug is silent. The SDK exhausts all 40 iterations and returns an empty response.
HTTP traffic (before fix)
From `httpx` INFO logs. Note the identical `response_id` on every GET.

```
POST .../openai/v1/responses?api-version=preview "HTTP/1.1 200 OK"
Poll 1...
GET .../openai/v1/responses/resp_058e072487625bc8...?api-version=preview "HTTP/1.1 200 OK"
Poll 2...
GET .../openai/v1/responses/resp_058e072487625bc8...?api-version=preview "HTTP/1.1 200 OK"
Poll 3...
GET .../openai/v1/responses/resp_058e072487625bc8...?api-version=preview "HTTP/1.1 200 OK"
(~40 more GETs to the SAME resp_058e072487625bc8... id)
GET .../openai/v1/responses/resp_058e072487625bc8...?api-version=preview "HTTP/1.1 200 OK"
```

Exactly one POST. Zero tool-result POSTs. Every other request is a GET retrieve of the same `response_id`. This is what proves the SDK is stuck retrieving the same completed response instead of looping through fresh tool-result submissions.
SDK log when the loop exhausts
```
Maximum iterations reached (40). Requesting final response without tools.
```
Observed output
```
--- Result (polls=3) ---
Text length: 0
Text (raw):
Tool calls made: 120
{'tool': 'search_documents', 'query': 'prosjektnavn adresse byggherre...', 'hits': 0}
{'tool': 'search_documents', 'query': 'tilbudsinnbydelse prosjekt byggherre adresse', 'hits': 0}
{'tool': 'search_documents', 'query': 'konkurransegrunnlag prosjektbeskrivelse...', 'hits': 0}
(same 3 queries repeated 40 times = 120 calls)
*** EMPTY RESPONSE TEXT ***
```
The same tool calls repeat identically on every iteration because the SDK retrieves the same completed background response every time. Tool results are never POSTed, so the model never produces a final answer.
Additional Context
Affected code paths
| Location | What it does |
| --- | --- |
| `agent_framework._tools.FunctionInvocationLayer.get_response()` around line 59 | `mutable_options = dict(options)`. Copies `continuation_token` and `background` into the dict shared across all loop iterations. |
| `agent_framework._tools.FunctionInvocationLayer.get_response()` around line 189 | `super_get_response(options=mutable_options)`. Reuses the same dict on every iteration, including after a retrieve has already returned a completed response. |
| `agent_framework_openai._chat_client.RawOpenAIChatClient._inner_get_response()` | Reads `continuation_token` from options. If set, does `client.responses.retrieve(response_id)` instead of POSTing. Never strips it after a completed retrieve. |
Workaround
Do not combine background mode with tools:

```python
use_background = tools is None
```
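As a caller-side guard, this can be wrapped in a small helper. The sketch below is illustrative (the `pick_background` name is mine, not the SDK's): it only requests background mode when the agent has no local function tools, so the tool loop never runs against a leaked `continuation_token`.

```python
# Hedged caller-side workaround sketch (helper name is hypothetical):
# background polling is only safe when no local function tools exist.

def pick_background(tools) -> bool:
    """Return True only when background mode cannot collide with a tool loop."""
    return tools is None or len(tools) == 0

print(pick_background(None))         # True  — background mode is safe
print(pick_background(["search"]))   # False — fall back to a synchronous run
```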
Runtime monkeypatch (verified)
Patch `RawOpenAIChatClient._inner_get_response` so that once a `responses.retrieve()` returns a non-in-progress response, it removes `continuation_token` and `background` from the shared options dict. The `FunctionInvocationLayer` passes the same dict reference across iterations, so mutating it in place is enough to make the next iteration POST tool results properly.
```python
from agent_framework_openai._chat_client import RawOpenAIChatClient

if not hasattr(RawOpenAIChatClient, "_true_original_inner_get_response"):
    RawOpenAIChatClient._true_original_inner_get_response = (
        RawOpenAIChatClient._inner_get_response
    )


def _patched_inner_get_response(self, *, messages, options, stream=False, **kwargs):
    continuation_token = options.get("continuation_token")
    if not stream and continuation_token is not None:

        async def _run():
            validated_options = await self._validate_options(options)
            try:
                raw_response = await self.client.responses.retrieve(
                    continuation_token["response_id"]
                )
            except Exception as ex:
                self._handle_request_error(ex)
            chat_response = self._parse_response_from_openai(
                raw_response, options=validated_options
            )
            # Once the background job is no longer in progress, strip the
            # polling state in place so the next tool-loop iteration POSTs
            # tool results instead of retrieving again.
            if chat_response.continuation_token is None and isinstance(options, dict):
                options.pop("continuation_token", None)
                options.pop("background", None)
            return chat_response

        return _run()
    return RawOpenAIChatClient._true_original_inner_get_response(
        self, messages=messages, options=options, stream=stream, **kwargs
    )


RawOpenAIChatClient._inner_get_response = _patched_inner_get_response
```
HTTP traffic with the monkeypatch applied
```
POST .../openai/v1/responses?api-version=preview "HTTP/1.1 200 OK"
Poll 1...
GET .../openai/v1/responses/resp_01a797ec4c3e877b...?api-version=preview "HTTP/1.1 200 OK"
Poll 2...
GET .../openai/v1/responses/resp_01a797ec4c3e877b...?api-version=preview "HTTP/1.1 200 OK"
Poll 3...
GET .../openai/v1/responses/resp_01a797ec4c3e877b...?api-version=preview "HTTP/1.1 200 OK"
[MONKEYPATCH] stripped continuation_token+background from mutable_options
POST .../openai/v1/responses?api-version=preview "HTTP/1.1 200 OK"
POST .../openai/v1/responses?api-version=preview "HTTP/1.1 200 OK"
(roughly 16 tool-result POSTs)
POST .../openai/v1/responses?api-version=preview "HTTP/1.1 200 OK"
```
Result: valid JSON response matching the schema, tools called with varied queries (normal agent behavior) rather than the same 3 queries repeated 40 times.