Self Checks
Dify version
1.14.2
Plugin version
0.0.39 (Dify Agent Strategies)
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
## PluginDaemonInternalServerError: killed by timeout on agent_strategy/invoke despite successful LLM/tool execution
### Problem
I intermittently get the following error from `dify-plugin-daemon` when invoking an Agent Strategy:
```text
req_id: 9095f1ca5e
PluginDaemonInternalServerError: killed by timeout
The timeout occurs after approximately 120 seconds, which matches the current plugin timeout configuration:
PLUGIN_DAEMON_TIMEOUT=120.0
PLUGIN_MAX_EXECUTION_TIMEOUT=120
The problematic endpoint is:
POST /plugin/<tenant_id>/dispatch/agent_strategy/invoke
Relevant plugin_daemon log
2026-06-10T11:40:11.849682907Z ERROR dify-plugin-daemon factory.go:28 PluginDaemonInternalServerError error="killed by timeout"
github.com/langgenius/dify-plugin-daemon/internal/types/exception.InternalServerError
/app/internal/types/exception/factory.go:28
github.com/langgenius/dify-plugin-daemon/internal/service.baseSSEService[...]
/app/internal/service/base_sse.go:103
github.com/langgenius/dify-plugin-daemon/internal/service.baseSSEWithSession[...]
/app/internal/service/base_sse.go:142
github.com/langgenius/dify-plugin-daemon/internal/service.InvokeAgentStrategy
/app/internal/service/invoke_agent.go:19
2026-06-10T11:40:11.850575176Z INFO dify-plugin-daemon middleware.go:83
trace_id=ceb3a5a74a815f077aed88eafbb509dd
tenant_id=<redacted>
HTTP request status=200 latency_ms=120004
client_ip=172.22.0.12
method=POST
path=/plugin/<tenant_id>/dispatch/agent_strategy/invoke
Important observation
The lower-level LLM and tool calls seem to complete successfully before the global agent_strategy/invoke request times out.
From the API container logs, around the same period:
[on_llm_after_invoke]
Model: gpt-4o-mini-2024-07-18
Usage: prompt_tokens=6902 completion_tokens=79 total_tokens=6981
latency=1.9661424160003662
Then the tool call is executed successfully:
[on_tool_start] ToolCall:post_query_v1_query_post
{'query': 'PREFIX bot: <https://w3id.org/bot#>
SELECT DISTINCT (COUNT(?space) AS ?count)
WHERE {
GRAPH <http://zaventem/graph/iot/> {
?space a bot:Space .
}
}'}
Tool response:
{
"head": {
"vars": ["count"]
},
"results": {
"bindings": [
{
"count": {
"datatype": "http://www.w3.org/2001/XMLSchema#integer",
"type": "literal",
"value": "63"
}
}
]
}
}
Tool invocation completed successfully:
2026-06-10 12:34:35.506 INFO [MainThread] [handler.py:242]
881dbde8c62a5955bb4a9c7faaebd59b
"POST /inner/api/invoke/tool HTTP/1.1" 200 717 0.082712
A second LLM call then also completed successfully:
[on_llm_after_invoke]
Content: "Le nombre total de pièces est de **63**."
Model: gpt-4o-mini-2024-07-18
Usage: prompt_tokens=7086 completion_tokens=13 total_tokens=7099
latency=1.079631193075329
API response:
2026-06-10 12:34:36.858 INFO [MainThread] [handler.py:242]
01a7da653a5e578c9ff6e7521472b66e
"POST /inner/api/invoke/llm HTTP/1.1" 200 3931 1.301323
Additional diagnostics
I inspected plugin processes inside the plugin_daemon container.
The agent plugin process looked idle and healthy:
PID 1104
/app/storage/cwd/langgenius/agent-0.0.39.../.venv/bin/python -m main
State: S (sleeping)
Threads: 2
WCHAN: ep_poll
CPU: 0.0%
TCP: no direct connection
FD:
0 -> pipe
1 -> pipe
2 -> pipe
3 -> eventpoll
4 -> eventfd
The azure_openai plugin process was occasionally active via stdio/pipe, but no direct TCP connection was observed from that process:
PID 805
/app/storage/cwd/langgenius/azure_openai-0.0.56.../.venv/bin/python -m main
State: S (sleeping)
Threads: 2
WCHAN: ep_poll
FD:
0 -> pipe
1 -> pipe
2 -> pipe
5 -> socket
6 -> socket
During execution, some I/O activity was observed:
azure_openai:
rchar: 19966418 -> 20216326 -> 20251312
wchar: 392792 -> 423374 -> 424748
However, no TCP connection was visible from the plugin container during the monitored period.
What seems suspicious
The useful work appears to complete successfully:
- First LLM call succeeds.
- Tool call succeeds.
- Final LLM answer succeeds.
- The final answer is produced.
But the global request:
/dispatch/agent_strategy/invoke
still remains open until it is killed by the plugin daemon timeout after ~120 seconds.
This looks like the Agent Strategy SSE/session may not be finalized correctly after the final answer is produced.
### Questions
Could this be related to:
* Agent Strategy SSE stream not being closed correctly?
* `baseSSEService` / `baseSSEWithSession` waiting for a final event?
* Plugin daemon session lifecycle issue?
* A mismatch between successful `/inner/api/invoke/llm` completion and global `/dispatch/agent_strategy/invoke` completion?
Any guidance on additional logs or diagnostics to collect would be appreciated.
✔️ Expected Behavior
Once the agent has completed the final response, the agent_strategy/invoke SSE/session should close normally and the request should not remain pending until PLUGIN_DAEMON_TIMEOUT.
Actual behavior
The useful work completes, but the global plugin daemon request remains pending and is killed after 120 seconds:
PluginDaemonInternalServerError error="killed by timeout"
latency_ms=120004
❌ Actual Behavior
No response
✔️ Error log
No response
Self Checks
Dify version
1.14.2
Plugin version
0.0.39 (Dify Agent Strategies)
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
The timeout occurs after approximately 120 seconds, which matches the current plugin timeout configuration:
The problematic endpoint is:
Relevant plugin_daemon log
Important observation
The lower-level LLM and tool calls seem to complete successfully before the global
agent_strategy/invokerequest times out.From the API container logs, around the same period:
Then the tool call is executed successfully:
Tool response:
{ "head": { "vars": ["count"] }, "results": { "bindings": [ { "count": { "datatype": "http://www.w3.org/2001/XMLSchema#integer", "type": "literal", "value": "63" } } ] } }Tool invocation completed successfully:
A second LLM call then also completed successfully:
API response:
Additional diagnostics
I inspected plugin processes inside the
plugin_daemoncontainer.The
agentplugin process looked idle and healthy:The
azure_openaiplugin process was occasionally active via stdio/pipe, but no direct TCP connection was observed from that process:During execution, some I/O activity was observed:
However, no TCP connection was visible from the plugin container during the monitored period.
What seems suspicious
The useful work appears to complete successfully:
But the global request:
still remains open until it is killed by the plugin daemon timeout after ~120 seconds.
This looks like the Agent Strategy SSE/session may not be finalized correctly after the final answer is produced.
✔️ Expected Behavior
Once the agent has completed the final response, the
agent_strategy/invokeSSE/session should close normally and the request should not remain pending untilPLUGIN_DAEMON_TIMEOUT.Actual behavior
The useful work completes, but the global plugin daemon request remains pending and is killed after 120 seconds:
❌ Actual Behavior
No response
✔️ Error log
No response