
PluginDaemonInternalServerError #760

Description

@nicho2

Self Checks

  • This is only for bug reports. If you would like to ask a question, please head to Discussions.
  • I have searched for existing issues Dify issues & Dify Official Plugins, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

1.14.2

Plugin version

0.0.39 (Dify Agent Strategies)

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce


## PluginDaemonInternalServerError: killed by timeout on agent_strategy/invoke despite successful LLM/tool execution



### Problem

I intermittently get the following error from `dify-plugin-daemon` when invoking an Agent Strategy:

```text
req_id: 9095f1ca5e
PluginDaemonInternalServerError: killed by timeout
```

The timeout occurs after approximately 120 seconds, which matches the current plugin timeout configuration:

PLUGIN_DAEMON_TIMEOUT=120.0
PLUGIN_MAX_EXECUTION_TIMEOUT=120

The problematic endpoint is:

POST /plugin/<tenant_id>/dispatch/agent_strategy/invoke

### Relevant plugin_daemon log

2026-06-10T11:40:11.849682907Z ERROR dify-plugin-daemon factory.go:28 PluginDaemonInternalServerError error="killed by timeout"

github.com/langgenius/dify-plugin-daemon/internal/types/exception.InternalServerError
    /app/internal/types/exception/factory.go:28

github.com/langgenius/dify-plugin-daemon/internal/service.baseSSEService[...]
    /app/internal/service/base_sse.go:103

github.com/langgenius/dify-plugin-daemon/internal/service.baseSSEWithSession[...]
    /app/internal/service/base_sse.go:142

github.com/langgenius/dify-plugin-daemon/internal/service.InvokeAgentStrategy
    /app/internal/service/invoke_agent.go:19

2026-06-10T11:40:11.850575176Z INFO dify-plugin-daemon middleware.go:83
trace_id=ceb3a5a74a815f077aed88eafbb509dd
tenant_id=<redacted>
HTTP request status=200 latency_ms=120004
client_ip=172.22.0.12
method=POST
path=/plugin/<tenant_id>/dispatch/agent_strategy/invoke
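
To check other requests for the same pattern, the latency reported by the daemon can be compared with the configured limit. The following is a minimal Python sketch, not part of Dify. The function name `near_timeout` and the 500 ms tolerance are assumptions made for illustration; the code relies only on the `latency_ms=<n>` field visible in the log above.

```python
import re

# Matches the latency field as it appears in the daemon's middleware log line.
LATENCY_RE = re.compile(r"latency_ms=(\d+)")


def near_timeout(log_text: str, timeout_s: float = 120.0, tolerance_ms: int = 500) -> list[int]:
    """Return every logged latency (ms) within tolerance_ms of the timeout.

    timeout_s mirrors PLUGIN_MAX_EXECUTION_TIMEOUT; tolerance_ms is an
    arbitrary illustrative margin, not a Dify setting.
    """
    limit_ms = int(timeout_s * 1000)
    latencies = (int(value) for value in LATENCY_RE.findall(log_text))
    return [ms for ms in latencies if abs(ms - limit_ms) <= tolerance_ms]
```

Applied to the log excerpt above, the only match is the `agent_strategy/invoke` request at 120004 ms, i.e. 4 ms past the 120 s limit.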

### Important observation

The lower-level LLM and tool calls seem to complete successfully before the global agent_strategy/invoke request times out.

From the API container logs, around the same period:

[on_llm_after_invoke]

Model: gpt-4o-mini-2024-07-18
Usage: prompt_tokens=6902 completion_tokens=79 total_tokens=6981
latency=1.9661424160003662

Then the tool call is executed successfully:

[on_tool_start] ToolCall:post_query_v1_query_post

{'query': 'PREFIX bot:  <https://w3id.org/bot#>

SELECT DISTINCT (COUNT(?space) AS ?count)
WHERE {
  GRAPH <http://zaventem/graph/iot/> {
    ?space a bot:Space .
  }
}'}

Tool response:

{
  "head": {
    "vars": ["count"]
  },
  "results": {
    "bindings": [
      {
        "count": {
          "datatype": "http://www.w3.org/2001/XMLSchema#integer",
          "type": "literal",
          "value": "63"
        }
      }
    ]
  }
}

Tool invocation completed successfully:

2026-06-10 12:34:35.506 INFO [MainThread] [handler.py:242]
881dbde8c62a5955bb4a9c7faaebd59b
"POST /inner/api/invoke/tool HTTP/1.1" 200 717 0.082712

A second LLM call then also completed successfully:

[on_llm_after_invoke]

Content: "Le nombre total de pièces est de **63**."

Model: gpt-4o-mini-2024-07-18
Usage: prompt_tokens=7086 completion_tokens=13 total_tokens=7099
latency=1.079631193075329

API response:

2026-06-10 12:34:36.858 INFO [MainThread] [handler.py:242]
01a7da653a5e578c9ff6e7521472b66e
"POST /inner/api/invoke/llm HTTP/1.1" 200 3931 1.301323

### Additional diagnostics

I inspected plugin processes inside the plugin_daemon container.

The agent plugin process looked idle and healthy:

PID 1104
/app/storage/cwd/langgenius/agent-0.0.39.../.venv/bin/python -m main

State: S (sleeping)
Threads: 2
WCHAN: ep_poll
CPU: 0.0%
TCP: no direct connection
FD:
0 -> pipe
1 -> pipe
2 -> pipe
3 -> eventpoll
4 -> eventfd

The azure_openai plugin process was occasionally active via stdio/pipe, but no direct TCP connection was observed from that process:

PID 805
/app/storage/cwd/langgenius/azure_openai-0.0.56.../.venv/bin/python -m main

State: S (sleeping)
Threads: 2
WCHAN: ep_poll
FD:
0 -> pipe
1 -> pipe
2 -> pipe
5 -> socket
6 -> socket

During execution, some I/O activity was observed:

azure_openai:
rchar: 19966418 -> 20216326 -> 20251312
wchar: 392792 -> 423374 -> 424748

However, no TCP connection was visible from the plugin container during the monitored period.
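
The per-process checks above were done by hand; they could be scripted roughly as follows. This is a minimal sketch assuming a Linux `/proc` filesystem inside the container. The helper names (`parse_proc_kv`, `io_delta`, `snapshot`) are hypothetical and not part of any Dify tooling.

```python
from pathlib import Path


def parse_proc_kv(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines as found in /proc/<pid>/status and /proc/<pid>/io."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def io_delta(before: dict[str, str], after: dict[str, str]) -> dict[str, int]:
    """Bytes read (rchar) and written (wchar) between two /proc/<pid>/io samples."""
    return {key: int(after[key]) - int(before[key]) for key in ("rchar", "wchar")}


def snapshot(pid: int) -> dict[str, dict[str, str]]:
    """Read the live status and io files for a PID (Linux only, needs permission)."""
    proc = Path("/proc") / str(pid)
    return {
        "status": parse_proc_kv((proc / "status").read_text()),
        "io": parse_proc_kv((proc / "io").read_text()),
    }
```

With the first two `azure_openai` samples above, `io_delta` reports 249908 bytes read and 30582 bytes written between samples, which is consistent with the plugin exchanging data over its stdio pipes.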

### What seems suspicious

The useful work appears to complete successfully:

  1. First LLM call succeeds.
  2. Tool call succeeds.
  3. Final LLM answer succeeds.
  4. The final answer is produced.

But the global request:

/dispatch/agent_strategy/invoke

remains open until the plugin daemon kills it after ~120 seconds.

This suggests that the Agent Strategy SSE stream or session is not being finalized correctly after the final answer is produced.
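
To illustrate the suspected failure mode, here is a toy Python model. It is not the daemon's actual implementation (which is written in Go). It assumes a consumer that only returns once it sees an explicit end-of-stream marker; the `END` sentinel, `consume`, and `make_stream` are hypothetical names.

```python
import queue
import time

END = object()  # hypothetical end-of-stream marker


def consume(stream: queue.Queue, timeout_s: float) -> tuple[list[str], str]:
    """Collect messages until END arrives or the overall deadline passes."""
    deadline = time.monotonic() + timeout_s
    messages: list[str] = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return messages, "killed by timeout"
        try:
            item = stream.get(timeout=remaining)
        except queue.Empty:
            # Nothing more arrived before the deadline.
            return messages, "killed by timeout"
        if item is END:
            return messages, "closed"
        messages.append(item)


def make_stream(items: list[str], finalize: bool) -> queue.Queue:
    """Build a stream; when finalize is False the END marker is never sent."""
    stream: queue.Queue = queue.Queue()
    for item in items:
        stream.put(item)
    if finalize:
        stream.put(END)
    return stream
```

In this model, a stream that delivers the final answer but never sends `END` still ends in "killed by timeout" with all messages received, which is the same shape as the behavior reported here: the useful work is done, yet the request stays open until the deadline.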


### Questions

Could this be related to:

* Agent Strategy SSE stream not being closed correctly?
* `baseSSEService` / `baseSSEWithSession` waiting for a final event?
* Plugin daemon session lifecycle issue?
* A mismatch between successful `/inner/api/invoke/llm` completion and global `/dispatch/agent_strategy/invoke` completion?

Any guidance on additional logs or diagnostics to collect would be appreciated.

✔️ Expected Behavior

Once the agent has completed the final response, the agent_strategy/invoke SSE/session should close normally, and the request should not remain pending until PLUGIN_DAEMON_TIMEOUT expires.

Actual behavior

The useful work completes, but the global plugin daemon request remains pending and is killed after 120 seconds:

PluginDaemonInternalServerError error="killed by timeout"
latency_ms=120004

❌ Actual Behavior

No response

✔️ Error log

No response
