Self Checks
Dify version
1.14.1 (self-hosted, Docker)
Plugin version
langgenius/notion_datasource@0.1.18
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
- Have a Notion workspace where many pages / databases are shared with the integration (in our environment: several thousand items).
- In Dify, go to Settings → Data Source → Notion, add an Integration Token (API key).
- Go to Create Knowledge Base → Data Source → Notion → Sync.
- The page list never appears. The plugin-daemon kills the request after exactly 600 seconds (the SSE deadline), the API raises httpx.ReadTimeout, and the UI eventually falls back to the “Notion is not connected” screen.
The same workspace works fine on smaller integrations (a few hundred pages), so this is a scale problem, not a credentials problem.
Why it scales badly
Looking at datasources/notion_datasource/datasources/utils/notion_client.py, get_authorized_pages() does the following serially:
- Loop /v1/search with filter=page until has_more is false.
- Loop /v1/search with filter=database until has_more is false.
- For every result, call /v1/blocks/{id} once to resolve the parent page id.
- If the parent is itself a block_id, recurse into another /v1/blocks/{id} call (no memoization).
There is no concurrency, no caching, and the three call sites do not go through the retry path in _make_request, so any transient 429 / 5xx fails the entire enumeration.
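One possible mitigation for the missing retry path, sketched under assumptions: wrap each call in a small retry helper with exponential backoff. The names call_with_retry, send, and RETRYABLE are hypothetical and are not taken from the plugin; this does not reproduce what _make_request actually does, and it ignores any Retry-After header the server may send.

```python
import time
from typing import Any, Callable, Tuple

# Status codes treated as transient in this sketch (an assumption, not the plugin's list).
RETRYABLE = {429, 500, 502, 503, 504}


def call_with_retry(
    send: Callable[[], Tuple[int, Any]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical zero-argument callable returning (status_code, body).
    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status not in RETRYABLE:
            break
        # Back off before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = send()
    return status, body
```

With a helper of this shape, a single 429 during the search or block lookups would delay the enumeration by one backoff interval instead of aborting it.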
For an N-item workspace with average parent depth K, the parent-resolution phase alone issues ~N × K serial HTTP calls. At a few thousand items this exceeds the plugin-daemon's 600-second SSE deadline and the request is killed.
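To illustrate how memoization would cut the N × K cost, here is a minimal sketch, not the plugin's code. It assumes a hypothetical fetch_parent(block_id) callable that performs one /v1/blocks/{id} request and returns a simplified (parent_type, parent_id) pair; the real response shape is richer. Every block visited while walking up a chain is cached, so items that share ancestors do not repeat the same lookups.

```python
from typing import Callable, Dict, Tuple

# Simplified stand-in for the "parent" field of a block: (parent_type, parent_id).
Parent = Tuple[str, str]


def make_parent_resolver(fetch_parent: Callable[[str], Parent]) -> Callable[[str], str]:
    """Return a resolve(block_id) function that memoizes resolved ancestors.

    fetch_parent is hypothetical: one call stands for one /v1/blocks/{id} request.
    """
    cache: Dict[str, str] = {}

    def resolve(block_id: str) -> str:
        chain = []  # blocks visited on this walk that were not yet cached
        current = block_id
        while True:
            if current in cache:
                root = cache[current]
                break
            chain.append(current)
            parent_type, parent_id = fetch_parent(current)
            if parent_type != "block_id":
                # Reached a non-block parent (for example a page); stop here.
                root = parent_id
                break
            current = parent_id
        # Remember the answer for every block on the path.
        for visited in chain:
            cache[visited] = root
        return root

    return resolve
```

With this approach each distinct block is fetched at most once, so the number of requests is bounded by the number of distinct blocks on the parent chains rather than by N × K. The calls are still serial; combining the cache with bounded concurrency would be a separate change.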
✔️ Error log
# plugin_daemon
ERROR dify-plugin-daemon factory.go:28 PluginDaemonInternalServerError
error="killed by timeout"
service.baseSSEService(... 0x258 ...) # 0x258 == 600 seconds
service.DatasourceGetOnlineDocumentPages(...)
HTTP request method=POST
path=/plugin/<tenant>/dispatch/datasource/get_online_document_pages
status=200 latency_ms=600003 # ← killed exactly at the SSE deadline
# api
ERROR app.py:875 Exception on /console/api/notion/pre-import/pages [GET]
httpcore.ReadTimeout: timed out
httpx.ReadTimeout: timed out
core.plugin.entities.plugin_daemon.PluginDaemonInnerError:
Request to Plugin Daemon Service failed