
Commit 68879a7

docs: HITL Service API polish and 1.14 release follow-ups (#760)
* feat: add env var ignore-list consumed by the verifier
* docs: add 1.14 env vars and correct drifted defaults
* fix: correct knowledge pipeline datasource-plugins response in api references
* chore: record 1.14 env var traces in deep-dive
* docs: update HITL Service API documentation
* docs: clarify /api-reference/ link convention in formatting guide
* docs: update marketplace publish flow
* fix: drop duplicate workflow_run_id from HITL event data schemas

  The three HITL event schemas (StreamEventHumanInputRequired, StreamEventHumanInputFormFilled, StreamEventHumanInputFormTimeout) declared `workflow_run_id` inside `data.properties`, but `HumanInputRequiredResponse` in `api/core/app/entities/task_entities.py` defines `workflow_run_id` only as a top-level field on the response (alongside `event` and `task_id`); its inner `Data` class doesn't carry one. The OpenAPI spec already provides top-level `workflow_run_id` via the `$ref: StreamEventBase` in the `allOf` composition, so the inline duplicate was a phantom field that doesn't exist in the actual payload.

  Remove the inline `workflow_run_id` from `data.properties` in all three HITL event schemas across all six spec files. This relies on `StreamEventBase` to provide `workflow_run_id` at the top level via composition, matching how every other event schema in this spec handles it (e.g., `StreamEventWorkflowStarted`). Reported by Copilot on PR #756.

* fix: address Copilot's review comments
* feat: skip api-reference paths in internal link checker

  The /api-reference/... URL convention (no language prefix, derived from OpenAPI tag/summary) generates pages that Mintlify auto-builds rather than from filesystem MDX files, so the existing filesystem resolution logic flags every such link as broken. Skip both filesystem-existence and anchor validation for any URL containing /api-reference/, mirroring the existing anchor_check_skipped behaviour. The convention is documented in writing-guides/formatting-guide.md.

  Caveat: typos in tag or summary slugs will now pass silently. A follow-up could parse the OpenAPI specs to validate against the real tag/summary kebab pairs if hand-written api-reference links become common enough to warrant it.

* fix: correct CORS anchor in ja environments page

  Inline link [CORS設定](#cors 設定) used a literal space, but the heading "CORS 設定" slugifies to "cors-設定" (whitespace replaced with hyphen). Update the anchor to #cors-設定.

* fix: align datasource-plugins is_published and 200 description with node framing
* fix: clarify chatflow stream workflow events task_id origin
* docs: address HITL Service API reader-test feedback
1 parent 856b5bc commit 68879a7

24 files changed

Lines changed: 4255 additions & 735 deletions


.claude/skills/dify-docs-env-vars/SKILL.md

Lines changed: 10 additions & 0 deletions
@@ -137,6 +137,16 @@ The script reports:
137137

138138
Use `.env.example` defaults (what Docker Compose users actually get), not Pydantic code defaults.
139139

140+
### Intentionally ignored variables
141+
142+
Some variables in `.env.example` are deliberately not documented (Cloud-only, experimental, or verifier false positives). The verifier reads these from `ignored-vars.md` (same directory) and filters them out. When you:
143+
144+
- Remove a variable from the docs as Cloud-only → add it under **Cloud-only (SaaS)** in `ignored-vars.md`.
145+
- Skip documenting an experimental or internal flag → add it under **Experimental / internal**.
146+
- Document a supported variable whose `.env.example` entry is commented out → add it under **Verifier false positives**.
147+
148+
Every entry must include a source reference (PR, commit, or audit date).
149+
140150
## Translation
141151

142152
The automated translation pipeline does not cover `en/self-host/configuration/environments.mdx`. After editing that English file, manually update `zh/self-host/configuration/environments.mdx` and `ja/self-host/configuration/environments.mdx` to match.

.claude/skills/dify-docs-env-vars/deep-dive.md

Lines changed: 202 additions & 0 deletions
@@ -1277,3 +1277,205 @@ Two modes: `basic` (username/password via http_auth) and `aws_managed_iam` (SigV
12771277
### OCEANBASE_ENABLE_HYBRID_SEARCH
12781278

12791279
Similar to Milvus—enables fulltext index creation for BM25 queries alongside vector search. Requires OceanBase >= 4.3.5.1. Collections must be recreated after enabling.
1280+
1281+
---
1282+
1283+
## 1.14 Additions (traced 2026-04-22)
1284+
1285+
### REDIS_KEY_PREFIX
1286+
1287+
**Default:** `""` (empty)
1288+
1289+
**What it actually does:** Prepends a string namespace to every Redis key that Dify writes, so multiple Dify deployments can safely share one Redis server. When set to `staging:`, a `get("session_token:abc")` call becomes `GET staging:session_token:abc` on the wire.
1290+
1291+
The prefix is threaded through `RedisClientWrapper` in `api/extensions/ext_redis.py` via helpers in `api/extensions/redis_names.py` (`serialize_redis_name`, `serialize_redis_name_arg`, `serialize_redis_name_args`, `normalize_redis_key_prefix`). Every wrapper method — `get`, `set`, `setex`, `delete`, `incr`, `expire`, `exists`, `ttl`, `lock`, `hset`, `zadd`, and so on — prefixes its name argument before forwarding. `delete(*names)` and `exists(*names)` prefix every name.
1292+
1293+
Beyond direct key operations, the prefix is also applied to:
1294+
1295+
- **Pub/Sub channels**: `libs/broadcast_channel/redis/channel.py`, `sharded_channel.py`
- **Redis Streams**: `libs/broadcast_channel/redis/streams_channel.py`
- **Celery Redis transport**: applied as Celery's `global_keyprefix` transport option in `api/extensions/ext_celery.py`, so broker queues and result-backend keys follow the same namespace
- **DB migration locks**: `libs/db_migration_lock.py`
1299+
1300+
`normalize_redis_key_prefix()` strips whitespace; whitespace-only values are treated as empty (no prefixing).
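The prefixing behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Dify's implementation: `normalize_prefix` and `prefixed` are hypothetical stand-ins for the helpers in `api/extensions/redis_names.py`, whose actual signatures are not shown in this trace.

```python
from typing import List, Optional


def normalize_prefix(raw: Optional[str]) -> str:
    """Strip whitespace; missing or whitespace-only values mean 'no prefix'."""
    return (raw or "").strip()


def prefixed(prefix: str, *names: str) -> List[str]:
    """Prepend the namespace to every key name (hypothetical helper)."""
    return [f"{prefix}{name}" for name in names]


# A configured prefix namespaces the key; surrounding whitespace is ignored.
print(prefixed(normalize_prefix("  staging:  "), "session_token:abc"))
# ['staging:session_token:abc']

# A whitespace-only prefix behaves like an empty one: keys stay unprefixed.
print(prefixed(normalize_prefix("   "), "session_token:abc"))
# ['session_token:abc']
```

Multi-name calls such as `delete(*names)` would map over every name in the same way, which is why `prefixed` accepts a variable number of names.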
1301+
1302+
**If left empty:** Keys are written unprefixed (backward-compatible with existing deployments). Correct choice when Dify has Redis to itself.
1303+
1304+
**If set:** Every key, channel, stream, and Celery artifact is namespaced. Existing data written without the prefix becomes invisible to the new client — plan a wipe or dual-run when switching.
1305+
1306+
**Key code locations:**
1307+
- Definition: `api/configs/middleware/cache/redis_config.py`
1308+
- Wrapper plumbing: `api/extensions/ext_redis.py`, `api/extensions/redis_names.py`
1309+
- Celery: `api/extensions/ext_celery.py`
1310+
- Broadcast channels: `api/libs/broadcast_channel/redis/{channel,sharded_channel,streams_channel}.py`
1311+
- Migration lock: `api/libs/db_migration_lock.py`
1312+
1313+
**Source:** PR #35139 (issue #35138), merged 2026-04-14.
1314+
1315+
---
1316+
1317+
### REDIS_RETRY_RETRIES / REDIS_RETRY_BACKOFF_BASE / REDIS_RETRY_BACKOFF_CAP
1318+
1319+
**Defaults:** `3`, `1.0`, `10.0`
1320+
1321+
**What they actually do:** `_get_retry_policy()` in `api/extensions/ext_redis.py` constructs a shared `redis.retry.Retry` object with `ExponentialWithJitterBackoff(base=BACKOFF_BASE, cap=BACKOFF_CAP)` and `retries=RETRIES`. The policy is attached to every standalone, Sentinel, and Cluster client (via `_get_connection_health_params()` / `_get_cluster_connection_health_params()`), and also to pub/sub clients built by `_create_pubsub_client()`.
1322+
1323+
When `redis-py` encounters transient failures (`ConnectionError`, `TimeoutError`, `socket.timeout`), it calls `Retry.call_with_retry()`, which sleeps `min(base * (2^attempt) + jitter, cap)` seconds between attempts, up to `retries` attempts. With the defaults, worst-case wait before surfacing the error is roughly `1s + 2s + 4s = 7s` plus jitter, capped at 10s per sleep.
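To make the arithmetic concrete, the sketch below evaluates the formula exactly as stated above, with jitter set to zero. It is not `redis-py` code: `sleep_schedule` is a hypothetical helper, and starting the attempt index at 0 is an assumption taken from the `1s + 2s + 4s` example.

```python
from typing import List


def sleep_schedule(retries: int, base: float, cap: float) -> List[float]:
    """Jitter-free sleeps: min(base * 2**attempt, cap) for each retry."""
    return [min(base * 2**attempt, cap) for attempt in range(retries)]


# Defaults: REDIS_RETRY_RETRIES=3, BACKOFF_BASE=1.0, BACKOFF_CAP=10.0
defaults = sleep_schedule(retries=3, base=1.0, cap=10.0)
print(defaults, sum(defaults))  # [1.0, 2.0, 4.0] 7.0

# With more retries the cap starts to matter: the 5th sleep would be 16s uncapped.
print(sleep_schedule(retries=5, base=1.0, cap=10.0))  # [1.0, 2.0, 4.0, 8.0, 10.0]
```

With the default three retries the cap is never reached; it only bounds individual sleeps once `REDIS_RETRY_RETRIES` or `REDIS_RETRY_BACKOFF_BASE` is raised.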
1324+
1325+
**If left at default:** Most transient hiccups (master failover, brief DNS blip, half-open socket) are invisible to callers. Worst-case latency cost on a bad command is bounded.
1326+
1327+
**If `REDIS_RETRY_RETRIES=0`:** No retry; every transient error propagates immediately. Matches pre-1.14 behavior.
1328+
1329+
**If backoff values are raised:** Longer tail latency on failing commands, but more patience for slow failovers. **If lowered:** Faster failure, but less resilience.
1330+
1331+
**Key code locations:**
1332+
- Definition: `api/configs/middleware/cache/redis_config.py`
1333+
- Policy construction: `_get_retry_policy()` in `api/extensions/ext_redis.py`
1334+
- Applied via: `_get_connection_health_params()`, `_get_cluster_connection_health_params()`, `_get_base_redis_params()`, `_create_pubsub_client()`
1335+
1336+
**Source:** PR #34566 (issue #34557), merged 2026-04-09.
1337+
1338+
---
1339+
1340+
### REDIS_SOCKET_TIMEOUT / REDIS_SOCKET_CONNECT_TIMEOUT
1341+
1342+
**Defaults:** `5.0`, `5.0`
1343+
1344+
**What they actually do:** `socket_timeout` bounds how long each Redis command waits on a read/write on an already-established connection; `socket_connect_timeout` bounds how long the TCP handshake phase can take. Both are part of `RedisBaseParamsDict` in `_get_base_redis_params()` and flow into every client type — `redis.ConnectionPool`, `Sentinel.master_for()`, `RedisCluster`, and pub/sub clients all receive them.
1345+
1346+
Before PR #34566, the main backend clients built through `ConnectionPool(**redis_params)` / `sentinel.master_for(...)` / `RedisCluster.from_url(...)` used `redis-py`'s internal default (no socket timeout on standalone), which meant commands could block indefinitely on a silently-dropped connection.
1347+
1348+
**If left at default:** Stuck connections surface as timeouts after 5 seconds. Appropriate for most local or same-region deployments.
1349+
1350+
**If increased:** Necessary for cloud or WAN deployments where p99 network latency exceeds 5s under load. The existing `REDIS_SENTINEL_SOCKET_TIMEOUT` doc already notes this pattern for Sentinel; the same reasoning applies to the main client.
1351+
1352+
**Key code locations:**
1353+
- Definition: `api/configs/middleware/cache/redis_config.py`
1354+
- Used in: `_get_connection_health_params()`, `_get_cluster_connection_health_params()` in `api/extensions/ext_redis.py`
1355+
1356+
**Source:** PR #34566, merged 2026-04-09.
1357+
1358+
---
1359+
1360+
### REDIS_HEALTH_CHECK_INTERVAL
1361+
1362+
**Default:** `30` (seconds)
1363+
1364+
**What it actually does:** `redis-py`'s `Connection` class sends a PING on a connection if it has been idle longer than this many seconds before reusing it. Catches half-open sockets that the kernel hasn't noticed yet (e.g., after a NAT rebind or a silent LB timeout). Set to `0` to disable.
1365+
1366+
**Important asymmetry:** The parameter is passed only in `_get_connection_health_params()` (standalone + Sentinel). `_get_cluster_connection_health_params()` explicitly drops it — see the inline comment in `ext_redis.py`:
1367+
1368+
> "RedisCluster does not support `health_check_interval` as a constructor keyword (it is silently stripped by `cleanup_kwargs`), so it is excluded here. Only `retry`, `socket_timeout`, and `socket_connect_timeout` are passed through."
1369+
1370+
This is a known `redis-py` quirk. The doc row explicitly flags it so cluster users don't waste time tuning a no-op.
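The asymmetry can be pictured with the sketch below. It is illustrative only: the two functions are hypothetical and do not reproduce the bodies of `_get_connection_health_params()` or `_get_cluster_connection_health_params()`, and the `retry` object is left out to keep the example dependency-free.

```python
from typing import Any, Dict, Mapping


def standalone_health_params(env: Mapping[str, str]) -> Dict[str, Any]:
    """Keyword arguments for standalone/Sentinel clients (illustrative only)."""
    return {
        "socket_timeout": float(env.get("REDIS_SOCKET_TIMEOUT", "5.0")),
        "socket_connect_timeout": float(env.get("REDIS_SOCKET_CONNECT_TIMEOUT", "5.0")),
        "health_check_interval": int(env.get("REDIS_HEALTH_CHECK_INTERVAL", "30")),
    }


def cluster_health_params(env: Mapping[str, str]) -> Dict[str, Any]:
    """Same values, minus health_check_interval, which cluster clients drop."""
    params = standalone_health_params(env)
    params.pop("health_check_interval")
    return params


env = {"REDIS_HEALTH_CHECK_INTERVAL": "15"}
print(standalone_health_params(env)["health_check_interval"])  # 15
print("health_check_interval" in cluster_health_params(env))   # False
```

In this sketch, changing `REDIS_HEALTH_CHECK_INTERVAL` alters only the standalone/Sentinel parameters; the cluster parameters are identical whatever value is set.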
1371+
1372+
**If left at default:** A connection that has been idle for more than 30s is PINGed before it is reused, which prevents stale-connection errors.
1373+
1374+
**If set to 0:** No health-check PINGs before reuse. Saves a tiny bit of traffic; acceptable if load is high enough that every connection is used constantly.
1375+
1376+
**Key code locations:**
1377+
- Definition: `api/configs/middleware/cache/redis_config.py`
1378+
- Application: `_get_connection_health_params()` in `api/extensions/ext_redis.py`
1379+
- Cluster exclusion: `_get_cluster_connection_health_params()` in the same file
1380+
1381+
**Source:** PR #34566, merged 2026-04-09.
1382+
1383+
---
1384+
1385+
### BAIDU_VECTOR_DB_AUTO_BUILD_ROW_COUNT_INCREMENT / BAIDU_VECTOR_DB_AUTO_BUILD_ROW_COUNT_INCREMENT_RATIO
1386+
1387+
**Defaults:** `500`, `0.05`
1388+
1389+
**What they actually do:** Control when the Baidu Vector DB backend rebuilds its ANN index automatically. The Baidu SDK treats them as the "absolute row increase" and "relative row increase" thresholds; when either is exceeded, the index is rebuilt in the background.
1390+
1391+
Defined in `api/configs/middleware/vdb/baidu_vector_config.py` on `BaiduVectorDBConfig`; passed to the Baidu backend factory when initializing a collection. Only meaningful when `VECTOR_STORE=baidu`.
1392+
1393+
**If left at default:** Index rebuilds are triggered by 500 new rows OR a 5% increase, whichever happens first. Keeps search quality high for typical workloads.
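The "whichever happens first" rule can be sketched as a simple predicate. This only illustrates the description above and is not the Baidu SDK's logic: `rebuild_due` is a hypothetical function, and treating the thresholds as inclusive (`>=`) and the ratio as relative to the row count at the last build are both assumptions.

```python
def rebuild_due(rows_added: int, rows_at_last_build: int,
                increment: int = 500, ratio: float = 0.05) -> bool:
    """True when either the absolute or the relative growth threshold is reached."""
    if rows_added >= increment:
        return True
    # Guard against division by zero for a brand-new, empty collection.
    return rows_at_last_build > 0 and rows_added / rows_at_last_build >= ratio


print(rebuild_due(100, 1_000))      # True: 10% growth reaches the 5% ratio
print(rebuild_due(100, 10_000))     # False: 1% growth and fewer than 500 rows
print(rebuild_due(500, 1_000_000))  # True: absolute threshold reached
```

Under these assumptions the ratio threshold fires first on small collections, while the absolute threshold dominates on large ones.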
1394+
1395+
**If raised:** Fewer rebuilds, lower CPU churn, but search quality degrades between rebuilds.
1396+
1397+
**If lowered:** More frequent rebuilds, higher background load, freshest index.
1398+
1399+
**Key code locations:**
1400+
- Definition: `api/configs/middleware/vdb/baidu_vector_config.py`
1401+
- Factory: `api/providers/vdb/vdb-baidu/src/dify_vdb_baidu/baidu_vector.py` (post 1.14 workspace refactor)
1402+
1403+
---
1404+
1405+
### BAIDU_VECTOR_DB_REBUILD_INDEX_TIMEOUT_IN_SECONDS
1406+
1407+
**Default:** `300`
1408+
1409+
**Code inconsistency to flag:** The Pydantic `Field` description in `baidu_vector_config.py` reads "default is 3600 seconds" but the actual `default=300`. `docker/.env.example` also uses 300. Document 300 (what users actually get); the description string is stale and should be flagged upstream.
1410+
1411+
**What it actually does:** Maximum wall-clock time the client waits for a Baidu VDB index rebuild before raising a timeout. 300 seconds (5 minutes) is adequate for small-to-medium collections; large collections (millions of rows) may need more.
1412+
1413+
**If it times out:** The client-side call fails, but the rebuild may still complete on the server. Re-querying after the rebuild succeeds typically resolves the error.
1414+
1415+
**Key code locations:**
1416+
- Definition: `api/configs/middleware/vdb/baidu_vector_config.py`
1417+
1418+
---
1419+
1420+
### COMPOSE_WORKER_HEALTHCHECK_DISABLED / _INTERVAL / _TIMEOUT
1421+
1422+
**Defaults:** `true`, `30s`, `30s`
1423+
1424+
**What they actually do:** Purely Docker Compose concerns. In `docker/docker-compose.yaml`, the `worker` service's `healthcheck:` block resolves to:
1425+
1426+
```yaml
test: ["CMD-SHELL", "celery -A celery_healthcheck.celery inspect ping"]
interval: ${COMPOSE_WORKER_HEALTHCHECK_INTERVAL:-30s}
timeout: ${COMPOSE_WORKER_HEALTHCHECK_TIMEOUT:-30s}
retries: 3
start_period: 60s
disable: ${COMPOSE_WORKER_HEALTHCHECK_DISABLED:-true}
```
1434+
1435+
`celery inspect ping` is a synchronous command that round-trips through the broker to ask every worker "are you alive?" and waits for replies. Under heavy load it can itself take significant time and contribute to broker contention, which is why the health check is **disabled by default**.
1436+
1437+
**If disabled (default):** Compose runs no health check, so the worker container reports no health status and counts as up as long as its main process (PID 1) is running. Lighter, but it won't detect a hung worker whose process is still alive.
1438+
1439+
**If enabled (`COMPOSE_WORKER_HEALTHCHECK_DISABLED=false`):** Compose runs `celery inspect ping` every `INTERVAL` with a `TIMEOUT` per attempt. Three consecutive failures mark the container unhealthy, which dependent services and external orchestration or monitoring can react to. Docker's `restart:` policies act on container exit rather than on health status, so an unhealthy worker is not restarted by Compose alone. Useful when operators have observed hung-worker incidents and the added broker traffic is acceptable.
1440+
1441+
`INTERVAL` and `TIMEOUT` accept Docker Compose duration strings (`30s`, `1m`, `1m30s`).
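As a rough aid for sanity-checking values before putting them in `docker/.env`, the sketch below converts such duration strings to seconds. It is a simplified, hypothetical helper (`duration_seconds`), not Compose's own parser: it accepts only whole-number components with the units `us`, `ms`, `s`, `m`, and `h`.

```python
import re

_UNITS = {"us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
# "us" and "ms" are listed before "s" and "m" so two-letter units match first.
_PART = re.compile(r"(\d+)(us|ms|s|m|h)")


def duration_seconds(text: str) -> float:
    """Convert a Compose-style duration such as '1m30s' to seconds."""
    parts = _PART.findall(text)
    # Reject strings with leftover characters that the pattern did not consume.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"not a valid duration: {text!r}")
    return sum(int(n) * _UNITS[u] for n, u in parts)


print(duration_seconds("30s"))    # 30.0
print(duration_seconds("1m"))     # 60.0
print(duration_seconds("1m30s"))  # 90.0
```

One possible use is checking that `COMPOSE_WORKER_HEALTHCHECK_TIMEOUT` is not longer than the rest of the deployment can tolerate before changing it from the `30s` default.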
1442+
1443+
**Key code locations:**
1444+
- Definition: `docker/.env.example`, `docker/docker-compose.yaml`
1445+
- No Pydantic config; these are Compose-only, not read by Python code.
1446+
1447+
---
1448+
1449+
### ALLOW_INLINE_STYLES
1450+
1451+
**Default:** `false`
1452+
1453+
**What it actually does:** Frontend-only security toggle. `web/docker/entrypoint.sh` maps the operator-facing `ALLOW_INLINE_STYLES` (set in `docker/.env`) to `NEXT_PUBLIC_ALLOW_INLINE_STYLES` for the Next.js runtime:
1454+
1455+
```bash
export NEXT_PUBLIC_ALLOW_INLINE_STYLES=${ALLOW_INLINE_STYLES:-false}
```
1458+
1459+
The frontend's Markdown sanitizer reads `NEXT_PUBLIC_ALLOW_INLINE_STYLES` to decide whether to allow inline `style="..."` attributes and `<style>` tags in user-generated Markdown (chat responses, knowledge base content, and so on). Disabled by default because inline styles can be abused for phishing (e.g., hiding a malicious link behind a styled block that overlays trusted UI).
1460+
1461+
**If disabled (default):** Markdown rendering strips inline styles. User-authored content still renders, just without custom styling.
1462+
1463+
**If enabled:** Inline styles pass through. Enable only if your content pipeline is trusted and you need rich visual control from Markdown authors.
1464+
1465+
**Key code locations:**
1466+
- Mapping: `web/docker/entrypoint.sh`
1467+
- Default: `docker/.env.example` (root) and `web/.env.example` (source-code deployments)
1468+
1469+
---
1470+
1471+
### CELERY_WORKER_AMOUNT — default correction
1472+
1473+
The existing entry near line 937 describes behavior correctly, but the stated default ("1") no longer matches `docker/.env.example`, which sets `CELERY_WORKER_AMOUNT=4` (consumed by `docker-compose.yaml` via `${CELERY_WORKER_AMOUNT:-4}`). Docs updated to `4`.
1474+
1475+
**Why the change matters:** 4 is a better out-of-the-box baseline for a machine with a few cores; users with lighter workloads can still set it lower, and `CELERY_AUTO_SCALE=true` overrides it entirely.
1476+
1477+
---
1478+
1479+
### POSTGRES_MAX_CONNECTIONS — default correction
1480+
1481+
Covered in the "PostgreSQL / MySQL Performance Tuning Variables" section. `docker/.env.example` bumped the default from `100` to `200` upstream (`docker-compose.yaml` passes it as `-c max_connections=${POSTGRES_MAX_CONNECTIONS:-200}` to the Postgres container). The higher default is safer for Dify's multi-worker + Celery + async-task traffic shape; operators can still lower it on constrained hosts.
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
1+
# Intentionally Ignored Environment Variables
2+
3+
Variables listed here appear in Dify's `docker/.env.example` or `api/configs/`, but are deliberately **not** documented in `en/self-host/configuration/environments.mdx`. The verifier script reads this file and skips matching variables when comparing docs against `.env.example`.
4+
5+
## When to update this list
6+
7+
Add an entry when you:
8+
9+
- Remove a variable from the docs because it only applies to Dify Cloud.
10+
- Skip documenting a new variable because it's experimental, internal, or not user-tunable.
11+
- Identify a verifier false positive (e.g., the variable is commented-out in `.env.example` but documented because the code supports it).
12+
13+
Remove an entry when the reason no longer holds (e.g., an experimental flag graduates to a stable, user-facing feature).
14+
15+
Every entry requires: variable name, category, reason, and a source reference (commit, PR, or issue). This enforces traceability so later maintainers can audit the decision.
16+
17+
## Format
18+
19+
The verifier parses the tables below. A line is treated as an ignore entry when it matches `| \`VARIABLE_NAME\` | ...`. Additional columns are informational.
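A minimal sketch of how such rows could be recognised is shown below. The regular expression and the `ignored_variables` function are hypothetical; the real verifier script may match rows differently.

```python
import re
from typing import Set

# A table row whose first cell is a backticked, upper-case variable name.
IGNORE_ROW = re.compile(r"^\|\s*`([A-Z][A-Z0-9_]*)`\s*\|")


def ignored_variables(markdown: str) -> Set[str]:
    """Collect variable names from rows shaped like | `NAME` | ... |."""
    names = set()
    for line in markdown.splitlines():
        match = IGNORE_ROW.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


sample = """
| Variable | Reason | Source |
|---|---|---|
| `AMPLITUDE_API_KEY` | Cloud product analytics integration. | PR #721, commit 9248032 |
| `TIDB_REGION` | TiDB Cloud region. | PR #721, commit 9248032 |
"""
print(sorted(ignored_variables(sample)))  # ['AMPLITUDE_API_KEY', 'TIDB_REGION']
```

The header and separator rows are skipped because their first cell is not a backticked name, which matches the note that only the first column drives the ignore list.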
20+
21+
---
22+
23+
## Cloud-only (SaaS)
24+
25+
Meaningful only on the hosted Dify Cloud deployment; self-hosted users cannot use or benefit from them. Removing these from the self-host docs prevents confusion.
26+
27+
| Variable | Reason | Source |
|---|---|---|
| `ENABLE_WEBSITE_JINAREADER` | Cloud UI feature flag for Jina Reader crawler. | PR #721, commit 9248032 |
| `ENABLE_WEBSITE_FIRECRAWL` | Cloud UI feature flag for Firecrawl. | PR #721, commit 9248032 |
| `ENABLE_WEBSITE_WATERCRAWL` | Cloud UI feature flag for WaterCrawl. | PR #721, commit 9248032 |
| `NEXT_PUBLIC_ENABLE_SINGLE_DOLLAR_LATEX` | Cloud-specific UI toggle. | PR #721, commit 9248032 |
| `TIDB_API_URL` | TiDB Cloud control plane. | PR #721, commit 9248032 |
| `TIDB_IAM_API_URL` | TiDB Cloud IAM control plane. | PR #721, commit 9248032 |
| `TIDB_PRIVATE_KEY` | TiDB Cloud credential. | PR #721, commit 9248032 |
| `TIDB_PUBLIC_KEY` | TiDB Cloud credential. | PR #721, commit 9248032 |
| `TIDB_PROJECT_ID` | TiDB Cloud project reference. | PR #721, commit 9248032 |
| `TIDB_REGION` | TiDB Cloud region. | PR #721, commit 9248032 |
| `TIDB_SPEND_LIMIT` | TiDB Cloud billing guard. | PR #721, commit 9248032 |
| `TIDB_ON_QDRANT_URL` | Hybrid TiDB-Qdrant Cloud-only backend. | PR #721, commit 9248032 |
| `TIDB_ON_QDRANT_API_KEY` | Hybrid TiDB-Qdrant Cloud-only backend. | PR #721, commit 9248032 |
| `TIDB_ON_QDRANT_CLIENT_TIMEOUT` | Hybrid TiDB-Qdrant Cloud-only backend. | PR #721, commit 9248032 |
| `TIDB_ON_QDRANT_GRPC_ENABLED` | Hybrid TiDB-Qdrant Cloud-only backend. | PR #721, commit 9248032 |
| `TIDB_ON_QDRANT_GRPC_PORT` | Hybrid TiDB-Qdrant Cloud-only backend. | PR #721, commit 9248032 |
| `CREATE_TIDB_SERVICE_JOB_ENABLED` | Cloud-side TiDB pre-provisioning job. | PR #721, commit 9248032 |
| `AMPLITUDE_API_KEY` | Cloud product analytics integration. | PR #721, commit 9248032 |
47+
48+
## Experimental / internal
49+
50+
Feature flags for unfinished or staff-only features. Not yet meant for self-hosted tuning.
51+
52+
| Variable | Reason | Source |
|---|---|---|
| `EXPERIMENTAL_ENABLE_VINEXT` | Switches the web container to an experimental Vite-based server (`web/docker/entrypoint.sh`). Not a supported user-facing knob. | 1.14 sync audit, 2026-04-22 |
55+
56+
## Verifier false positives
57+
58+
The variable is documented in `environments.mdx` and supported by the backend, but the verifier reports it as missing from `.env.example` because the example entry is commented out.
59+
60+
| Variable | Reason | Source |
|---|---|---|
| `ALIYUN_CLOUDBOX_ID` | Commented-out `#ALIYUN_CLOUDBOX_ID=your-cloudbox-id` in `docker/.env.example`; backend field exists in `api/configs/middleware/storage/aliyun_oss_storage_config.py`. | 1.14 sync audit, 2026-04-22 |
