streaming=True in /run_sse returns empty text after AgentTool calls (works with streaming=False) #3754
Description
GitHub Issue: Empty Text Response After AgentTool Calls in SSE Streaming Mode
Bug Description
When using the /run_sse endpoint with streaming: true, the final text response after an AgentTool call (e.g., WebSearch wrapped in AgentTool) is empty. The tool executes successfully and returns results, but the agent's synthesized response based on those results is not included in the SSE stream.
This works correctly when using streaming: false - the agent properly synthesizes and returns a final text response after the tool call.
Expected Behavior
After an AgentTool executes (e.g., WebSearch), the agent should synthesize the tool results and return a final text response in the SSE stream.
Actual Behavior
- Tool calls execute correctly
- Tool responses are received
- Final text response is missing/empty
- The stream ends without the agent's synthesized answer
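A rough way to check for this symptom from a captured stream is sketched below. It assumes each SSE `data:` line carries one JSON event with a `content.parts` list, an optional `partial` flag, and an optional `thought` flag on parts; those field names are assumptions about the serialized event shape, and `parse_sse_events` and `final_text` are hypothetical helper names, not ADK functions.

```python
import json


def parse_sse_events(lines):
    """Yield decoded JSON payloads from SSE `data:` lines, skipping other lines."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


def final_text(lines):
    """Return the last non-empty answer text from complete events, or ""."""
    text = ""
    for event in parse_sse_events(lines):
        if event.get("partial"):
            continue  # assumed: partial chunks are repeated in a later complete event
        parts = (event.get("content") or {}).get("parts") or []
        joined = "".join(
            (part.get("text") or "")
            for part in parts
            if isinstance(part, dict) and not part.get("thought")
        )
        if joined:
            text = joined
    return text
```

An empty return value for a stream that contains tool call and tool response events would match the behavior described above.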
Reproduction Steps
Server Setup (Missing from original issue)
Here's the minimal server setup using get_fast_api_app():
# server.py
from google.adk.cli.fast_api import get_fast_api_app

app = get_fast_api_app(
    agents_dir="./agents",
    session_service_uri="agentengine://YOUR_AGENT_ENGINE_ID",
    memory_service_uri="agentengine://YOUR_AGENT_ENGINE_ID",
    web=True,
)
# Run with: uvicorn server:app --port 8000
Agent structure (agents/root_agent/agent.py):
from google.adk.agents import LlmAgent
from google.adk.tools.agent_tool import AgentTool
from google.adk.tools import google_search

# Sub-agent for search
search_agent = LlmAgent(
    name="web_search_agent",
    model="gemini-2.5-flash",
    tools=[google_search],
)

# Root agent using AgentTool
root_agent = LlmAgent(
    name="root_agent",
    model="gemini-2.5-pro",
    tools=[AgentTool(agent=search_agent)],
    instruction="You are a helpful assistant. Use web_search_agent to find information.",
)
Complete Reproduction Steps
- Set up the server as shown above
- Start server:
uvicorn server:app --port 8000
- Create a session:
curl -s -X POST "http://localhost:8000/apps/root_agent/users/test_user/sessions" \
  -H "Content-Type: application/json" | jq -r '.id'
# Returns: SESSION_ID
- Bug case (streaming: true):
curl -s -X POST "http://localhost:8000/run_sse" \
-H "Content-Type: application/json" \
-d '{
"app_name": "root_agent",
"user_id": "test_user",
"session_id": "SESSION_ID",
"new_message": {
"role": "user",
"parts": [{"text": "Search for the fastest opamp"}]
},
"streaming": true
}'
Result: Tool executes, but final text response is empty/missing.
- Working case (streaming: false):
curl -s -X POST "http://localhost:8000/run_sse" \
-H "Content-Type: application/json" \
-d '{
"app_name": "root_agent",
"user_id": "test_user",
"session_id": "SESSION_ID",
"new_message": {
"role": "user",
"parts": [{"text": "Search for the fastest opamp"}]
},
"streaming": false
}'
Result: Complete response including synthesized answer like "Based on my search, the fastest opamp is..."
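The two curl calls above differ only in the `streaming` flag. For programmatic clients, the same request body can be built in Python as in this minimal sketch. The field names are taken from the curl payloads; `build_run_request` is a hypothetical helper name, not part of ADK.

```python
import json


def build_run_request(app_name, user_id, session_id, text, streaming=False):
    """Build the JSON-serializable body sent to /run_sse in the steps above."""
    return {
        "app_name": app_name,
        "user_id": user_id,
        "session_id": session_id,
        "new_message": {"role": "user", "parts": [{"text": text}]},
        "streaming": streaming,
    }


# Same payload as the bug case; flip streaming to False for the working case.
bug_body = build_run_request(
    "root_agent", "test_user", "SESSION_ID",
    "Search for the fastest opamp", streaming=True,
)
payload = json.dumps(bug_body)  # string to send as the POST body
```

Keeping the flag as a single parameter makes it easy to run both cases against the same session and compare the resulting event streams.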
Key Observation
The streaming parameter maps to StreamingMode:
- streaming: false → StreamingMode.NONE → Works
- streaming: true → StreamingMode.SSE → Bug: empty response after AgentTool
This is why adk web works - it uses streaming: false by default.
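The mapping described above can be illustrated with a small sketch. The `StreamingMode` enum here is a local stand-in that only mirrors the two member names mentioned in this issue; its values are placeholders, and `streaming_mode_for` is a hypothetical function, not the server's actual code.

```python
from enum import Enum


class StreamingMode(Enum):
    """Local stand-in for illustration; not imported from google.adk."""
    NONE = "none"  # placeholder value
    SSE = "sse"    # placeholder value


def streaming_mode_for(streaming: bool) -> StreamingMode:
    """Map the request's boolean `streaming` field to a streaming mode."""
    return StreamingMode.SSE if streaming else StreamingMode.NONE
```

Under this reading, only requests that resolve to `StreamingMode.SSE` hit the empty-response path.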
- Create an agent with an AgentTool (e.g., WebSearch):
from google.adk.agents import Agent, LlmAgent
from google.adk.tools import google_search
from google.adk.tools.agent_tool import AgentTool

# Create a search sub-agent
search_agent = Agent(
    name="web_search_agent",
    model="gemini-2.5-flash",
    tools=[google_search],  # Built-in search tool
)

# Root agent with AgentTool
root_agent = LlmAgent(
    name="root_agent",
    model="gemini-2.5-pro",
    tools=[
        AgentTool(agent=search_agent),
    ],
    instruction="You are a helpful assistant. Use WebSearch to find information.",
)
- Start the ADK server:
poetry run uvicorn server:app --port 8000
- Create a session:
curl -s -X POST "http://localhost:8000/apps/root_agent/users/test_user/sessions" \
  -H "Content-Type: application/json"
# Returns: {"id": "SESSION_ID", ...}
- With streaming: true (BUG):
curl -s -X POST "http://localhost:8000/run_sse" \
-H "Content-Type: application/json" \
-d '{
"app_name": "root_agent",
"user_id": "test_user",
"session_id": "SESSION_ID",
"new_message": {
"role": "user",
"parts": [{"text": "Find me the fastest opamp on the market"}]
},
"streaming": true
}'
Result: SSE events include:
- Agent thought (thinking about using WebSearch)
- Tool call (WebSearch invocation)
- Tool response (search results)
- Missing: Final text response synthesizing the results
- With streaming: false (WORKS):
curl -s -X POST "http://localhost:8000/run_sse" \
-H "Content-Type: application/json" \
-d '{
"app_name": "root_agent",
"user_id": "test_user",
"session_id": "SESSION_ID",
"new_message": {
"role": "user",
"parts": [{"text": "Find me the fastest opamp on the market"}]
},
"streaming": false
}'
Result: All events received correctly, including the final text response like:
"Based on my search, the fastest opamp is the TI OPA855 with 8GHz bandwidth..."
Key Observation
The adk web command works correctly because it internally uses streaming: false by default. This is why testing via adk web shows the complete response, but programmatic access with streaming: true fails.
Environment
- google-adk version: 1.19.0
- Python version: 3.12
- Server: get_fast_api_app() with default VertexAiSessionService
- Model: gemini-2.5-pro (also reproducible with gemini-2.5-flash)
- Tool: Any AgentTool wrapping a sub-agent (WebSearch, custom agents, etc.)
Workaround
Use streaming: false in API requests. The /run_sse endpoint still returns SSE-formatted events, but with complete (non-partial) events only.
# In client code
body = {
    "streaming": False,  # Workaround for empty response bug
    # ... other request fields unchanged
}
Related Issues
- stream_query and HTTP Streaming to Vertex AI Agent Return Empty/Exhausted Responses (Works via ADK Web) #1830 - Similar symptoms (empty responses) but different root cause
- skip_summarization does not work as intended when using AgentTool within nested agent setup #1103 - Related to AgentTool and skip_summarization but different issue
- Database UniqueViolation on events table during streaming with multi-part LLM responses (text + function call) #297 - Streaming issues with multi-part responses (database-related)
Additional Context
This issue specifically affects the text response after tool execution. The streaming works fine for:
- Initial agent thoughts
- Tool call events
- Tool response events
Only the final synthesized text response is missing when streaming: true.
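To confirm which event kinds actually arrive, already-decoded events can be tallied as in the sketch below. It assumes parts expose `functionCall`/`functionResponse` (or the snake_case equivalents), `thought`, and `text` keys; these key names are assumptions about the serialized event shape, and `classify_event` and `tally_event_kinds` are hypothetical helpers.

```python
from collections import Counter


def classify_event(event):
    """Label one decoded event by the first recognizable part it carries."""
    parts = (event.get("content") or {}).get("parts") or []
    for part in parts:
        if part.get("functionCall") or part.get("function_call"):
            return "tool_call"
        if part.get("functionResponse") or part.get("function_response"):
            return "tool_response"
        if part.get("thought"):
            return "thought"
        if part.get("text"):
            return "text"
    return "empty"  # no parts, or no recognizable content


def tally_event_kinds(events):
    """Count how many events of each kind appear in a stream."""
    return Counter(classify_event(event) for event in events)
```

A tally with thought, tool call, and tool response entries but no `text` entry would correspond to the failure described here.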
Impact
This is a blocking issue for any application that:
- Uses AgentTool for sub-agent orchestration
- Requires real-time streaming for UX
- Expects the agent to synthesize results from tool calls
Labels to Add
bug, streaming, AgentTool, run_sse