Background
0.9.0 still depends on gevent and applies its monkey patching at import time. For 0.10.0, we should make this an explicit breaking cleanup: remove the greenlet-based runtime path, keep synchronous plugin execution simple, and prepare the SDK for an asyncio-first direction.
This should be a root fix, not a compatibility shim around the current global patching model.
Current state
- pyproject.toml declares direct runtime dependencies on gevent>=26.4.0 and requests>=2.33.1.
- src/dify_plugin/__init__.py imports src/dify_plugin/_gevent.py, which runs monkey.patch_all(sys=True) as a package import side effect.
- Runtime readers still contain gevent-specific paths:
  - src/dify_plugin/core/server/stdio/request_reader.py uses gevent.os.tp_read.
  - src/dify_plugin/core/server/tcp/request_reader.py uses gevent.socket, gevent.select.select, and gevent.sleep.
  - src/dify_plugin/core/server/serverless/request_reader.py uses gevent.pywsgi.WSGIServer when sockets are patched.
- src/dify_plugin/interfaces/model/ai_model.py detects patched sockets and uses gevent.threadpool.ThreadPool for token estimation.
- src/dify_plugin/core/server/io_server.py already dispatches sync plugin work through ThreadPoolExecutor(max_workers=MAX_WORKER), which is the simpler concurrency primitive for IO-heavy plugins.
- SDK-owned HTTP paths still use requests, including file invocation and OpenAI-compatible model implementations. Examples also use requests, but those are plugin-local dependencies and can be handled as a separate migration surface if needed.
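The thread-pool dispatch mentioned for io_server.py can be illustrated with a minimal sketch. It is not the SDK's actual code: the MAX_WORKER value, fake_io_handler, and dispatch_all are hypothetical names chosen for illustration, and the blocking IO is simulated with a short sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

MAX_WORKER = 8  # assumed value; the SDK reads its own setting from config


def fake_io_handler(request_id: int) -> int:
    """Hypothetical sync plugin handler that blocks on IO (simulated with sleep)."""
    time.sleep(0.05)
    return request_id * 2


def dispatch_all(
    handler: Callable[[int], int],
    request_ids: Iterable[int],
    max_workers: int = MAX_WORKER,
) -> List[int]:
    """Run blocking handlers on worker threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handler, request_ids))
```

Because the handlers spend their time waiting on IO, the worker threads overlap without any global patching of the socket or os modules.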
Target for v0.10.0
- Remove gevent and greenlet from SDK runtime dependencies and lock data.
- Remove package import-time patching entirely.
- Keep sync plugin execution on ThreadPoolExecutor; tune defaults/documentation around IO-heavy workloads instead of greenlet concurrency.
- Replace SDK-owned requests usage with a small urllib.request based helper for JSON requests, streaming responses, headers, timeouts, upload payloads, and HTTP error mapping.
- Keep the existing protocol surfaces working:
  - local stdio
  - remote TCP
  - serverless HTTP streaming
- Shape the runtime boundaries so later asyncio-native readers/writers can be introduced without another global monkey-patching layer.
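As a rough sketch of what the urllib.request based helper could look like for the JSON and error-mapping parts of that target, consider the following. The names HTTPRequestError, build_json_request, and request_json are hypothetical and do not exist in the SDK today; streaming responses and upload payloads are left out.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Dict, Optional


class HTTPRequestError(Exception):
    """Hypothetical SDK-level error carrying the HTTP status and response body."""

    def __init__(self, status: int, body: str) -> None:
        super().__init__(f"HTTP {status}: {body[:200]}")
        self.status = status
        self.body = body


def build_json_request(
    url: str,
    payload: Any,
    headers: Optional[Dict[str, str]] = None,
    method: str = "POST",
) -> urllib.request.Request:
    """Encode the payload as JSON and merge caller headers over the defaults."""
    merged = {"Content-Type": "application/json", "Accept": "application/json"}
    merged.update(headers or {})
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=merged, method=method)


def request_json(
    url: str,
    payload: Any,
    headers: Optional[Dict[str, str]] = None,
    timeout: float = 30.0,
) -> Any:
    """Send a JSON request and decode the JSON response, mapping HTTP errors."""
    req = build_json_request(url, payload, headers)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # HTTPError doubles as a response object, so the error body can be read.
        body = exc.read().decode("utf-8", errors="replace")
        raise HTTPRequestError(exc.code, body) from exc
```

Keeping request construction separate from sending makes header and payload handling testable without a network connection.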
Proposed work
- Delete the import side-effect module and remove the package-level import.
- Rewrite stdio reading around standard library primitives.
- Rewrite TCP reading/writing around native sockets and selectors or select, with explicit timeout/reconnect behavior.
- Replace the serverless gevent WSGI branch with the simplest supported non-gevent server path for this SDK release.
- Replace model token-estimation isolation with either direct execution or a standard ThreadPoolExecutor, depending on blocking impact.
- Add a small internal HTTP helper built on urllib.request; migrate SDK runtime call sites away from requests.
- Update config defaults/descriptions that currently mention gevent worker behavior.
- Update docs and examples that describe the runtime model.
- Add regression tests for import side effects, stdio/TCP/serverless readers, HTTP error mapping, streaming, and thread-pool execution.
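For the TCP item, a reader built on the standard selectors module might look roughly like the sketch below. It assumes newline-delimited messages and a simple idle limit, neither of which is confirmed here; read_lines and max_idle_polls are hypothetical names, and reconnect handling is omitted.

```python
import selectors
import socket
from typing import Iterator


def read_lines(
    sock: socket.socket,
    timeout: float = 1.0,
    max_idle_polls: int = 3,
) -> Iterator[bytes]:
    """Yield newline-delimited messages until the peer closes or goes idle."""
    sock.setblocking(False)
    buffer = b""
    idle = 0
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_READ)
        while idle < max_idle_polls:
            if not sel.select(timeout):
                idle += 1  # nothing readable within the timeout window
                continue
            idle = 0
            try:
                chunk = sock.recv(4096)
            except BlockingIOError:
                continue  # spurious wakeup; wait for the next readiness event
            if not chunk:
                break  # peer closed the connection
            buffer += chunk
            # Emit every complete line; keep any trailing partial line buffered.
            while b"\n" in buffer:
                line, buffer = buffer.split(b"\n", 1)
                if line:
                    yield line
```

The explicit timeout and idle counter stand in for what gevent.sleep and the patched select did implicitly, which keeps the blocking behavior visible at the call site.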
Acceptance criteria
- rg -n "gevent|greenlet" src pyproject.toml returns no runtime references.
- rg -n "requests" src pyproject.toml returns no SDK runtime references.
- just check passes.
- just test passes.
- Local, remote TCP, and serverless smoke paths still run without global monkey patching.
- The release notes for 0.10.0 call this out as a breaking runtime migration.
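One way to back the "no global monkey patching" criterion with a regression test is to import the package in a fresh interpreter and check which modules it pulled in. The sketch below is an assumption about how such a check could be written; modules_loaded_by_import is a hypothetical helper, not an existing SDK or test utility.

```python
import subprocess
import sys
from typing import List, Tuple


def modules_loaded_by_import(package: str, watched: Tuple[str, ...]) -> List[str]:
    """Import `package` in a clean subprocess and list loaded modules whose
    top-level name is in `watched`."""
    probe = (
        "import importlib, sys\n"
        f"importlib.import_module({package!r})\n"
        f"watched = {watched!r}\n"
        "hits = sorted(m for m in sys.modules if m.split('.')[0] in watched)\n"
        "print('\\n'.join(hits))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

With the SDK installed, a test could assert that modules_loaded_by_import("dify_plugin", ("gevent", "greenlet")) returns an empty list. Running the probe in a subprocess keeps the result independent of whatever the test runner itself has already imported.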