Commit af83fa1

docs(enterprise): clarify Processing Engine in a multi-node cluster (#7185)
* docs(enterprise): clarify Processing Engine in a multi-node cluster

  Verified against InfluxDB 3 Enterprise 3.9.1 and the influxdb3-ref-network-telemetry reference architecture.

  - Reframe "process nodes" as "process-capable nodes" in clustering.md and document that `--plugin-dir` + `--node-spec` — not `--mode` — gate trigger execution.
  - Replace the "Dedicated process-only node" example with a `--mode=process,query` example so plugins can call `influxdb3_local.query()` locally; link the cross-node write-back pattern.
  - Document that every cluster node needs `--plugin-dir` configured because the catalog validates registered triggers cluster-wide.
  - Expand the `--mode=process` bullet in config-options.md and influxdb3-processing-engine.md to describe its actual semantics (no API surface; activates the engine; combine with another mode).
  - Add a "Pin a trigger to specific nodes in a cluster" section to the create trigger reference, including the verified error surfaces: invalid-node-name (HTTP 500) at create time, mode-mismatch failures at execution time, request-trigger 404 from unpinned nodes.
  - Rewrite the "Distributed cluster considerations" section in shared/influxdb3-plugins/_index.md with the verified WAL fan-out, schedule write-back, and request routing semantics.
  - Add a new admin page covering how to start a cluster and troubleshoot common Processing Engine misconfigurations.

  Closes influxdata/DAR#685

* fix(links): correct --plugin-dir and WAL anchor targets

  The `--plugin-dir` option is documented at config-options/#plugin-dir, not at cli/influxdb3/serve/#plugin-dir. The WAL section anchor is #write-ahead-log-wal-persistence, not #write-ahead-log-wal.
1 parent e9498d9 commit af83fa1

6 files changed

Lines changed: 348 additions & 35 deletions

content/influxdb3/enterprise/admin/clustering.md

Lines changed: 31 additions & 13 deletions
@@ -28,7 +28,7 @@ cluster efficiency.
 - [Configure ingest nodes](#configure-ingest-nodes)
 - [Configure query nodes](#configure-query-nodes)
 - [Configure compactor nodes](#configure-compactor-nodes)
-- [Configure process nodes](#configure-process-nodes)
+- [Configure process-capable nodes](#configure-process-capable-nodes)
 - [Multi-mode configurations](#multi-mode-configurations)
 - [Cluster architecture examples](#cluster-architecture-examples)
 - [Scale your cluster](#scale-your-cluster)
@@ -45,8 +45,8 @@ In an {{% product-name %}} cluster, you can dedicate nodes to specific tasks:
 - **Ingest nodes**: Optimized for high-throughput data ingestion
 - **Query nodes**: Maximized for complex analytical queries
 - **Compactor nodes**: Dedicated to data compaction and optimization
-- **Process nodes**: Focused on data processing and transformations
-- **All-in-one nodes**: Balanced for mixed workloads
+- **Process-capable nodes**: Any node with `--plugin-dir` configured can execute Processing Engine plugins. Use [`--node-spec`](/influxdb3/enterprise/reference/cli/influxdb3/create/trigger/#options) when creating a trigger to pin its execution to specific nodes.
+- **All-in-one nodes**: Balanced for mixed workloads (single-node deployments only)
 
 ## Configure node modes
 
@@ -69,7 +69,7 @@ Available modes:
 - `ingest`: Data ingestion and line protocol parsing
 - `query`: Query execution and data retrieval
 - `compact`: Background compaction and optimization
-- `process`: Data processing and transformations
+- `process`: Activates the Processing Engine. `process` has no API surface of its own — it activates the Python virtual machine that runs trigger plugins. Setting [`--plugin-dir`](/influxdb3/enterprise/reference/config-options/#plugin-dir) implies `process` mode, so you rarely need to set `process` explicitly. In a multi-node cluster, combine `process` with another mode (typically `query`, so plugins can call `influxdb3_local.query()` against the local engine) — see [Configure process-capable nodes](#configure-process-capable-nodes).
 
 > [!Warning]
 > #### Don't use all mode in a multi-node cluster
@@ -87,11 +87,11 @@ Available modes:
 Every node has two thread pools that must be properly configured:
 
 1. **IO threads**: Parse line protocol, handle HTTP requests
-2. **DataFusion threads**: Execute queries, create data snapshots (convert [WAL data](/influxdb3/enterprise/reference/internals/durability/#write-ahead-log-wal) to Parquet files), perform compaction
+2. **DataFusion threads**: Execute queries, create data snapshots (convert [WAL data](/influxdb3/enterprise/reference/internals/durability/#write-ahead-log-wal-persistence) to Parquet files), perform compaction
 
 > [!Note]
 > Even specialized nodes need both thread types. Ingest nodes use DataFusion threads
-> for creating data snapshots that convert [WAL data](/influxdb3/enterprise/reference/internals/durability/#write-ahead-log-wal) to Parquet files, and query nodes use IO threads for handling requests.
+> for creating data snapshots that convert [WAL data](/influxdb3/enterprise/reference/internals/durability/#write-ahead-log-wal-persistence) to Parquet files, and query nodes use IO threads for handling requests.
 
 ## Configure ingest nodes
 
@@ -244,11 +244,20 @@ You can adjust compaction strategies to balance performance and resource usage:
   --compaction-cleanup-wait=10m
 ```
 
-## Configure process nodes
+## Configure process-capable nodes
 
-Process nodes handle data transformations and processing plugins.
-Setting `--plugin-dir` automatically adds `process` mode to any node, so you don't need to explicitly set `--mode=process`.
-If you do set `--mode=process`, you must also set `--plugin-dir`.
+Any node with [`--plugin-dir`](/influxdb3/enterprise/reference/config-options/#plugin-dir) configured can execute Processing Engine plugins.
+Setting `--plugin-dir` implicitly adds `process` mode regardless of the node's other modes; explicit `--mode=process` requires `--plugin-dir` to be set.
+
+> [!Important]
+> #### Configure `--plugin-dir` on every cluster node
+>
+> The Enterprise catalog registers triggers cluster-wide.
+> Every node validates the registered triggers at startup, even nodes that don't execute them — for example, ingest-only and compact-only nodes.
+> If a plugin file referenced by a registered trigger is missing on a node, the engine panics on startup.
+>
+> Configure `--plugin-dir` on every node and make the same plugin files available to each one (for example, by mounting a shared directory in your container or pod spec).
+> Use [`--node-spec`](/influxdb3/enterprise/reference/cli/influxdb3/create/trigger/#options) on each trigger to control which nodes actually execute it.
 
 ### Enable the Processing Engine on any node
 
@@ -263,9 +272,10 @@ influxdb3 \
   --cluster-id=prod-cluster
 ```
 
-### Dedicated process-only node (16 cores)
+### Process + query node (16 cores)
 
-To create a node that only handles processing (no ingest, query, or compaction), set `--mode=process`:
+This is the recommended pattern for a node that hosts schedule plugins.
+Combining `process` with `query` lets plugins call `influxdb3_local.query()` against the local engine without an extra network hop:
 
 ```bash
 influxdb3 \
@@ -274,11 +284,19 @@ influxdb3 \
   --num-cores=16 \
   --datafusion-num-threads=12 \
   --plugin-dir=/path/to/plugins \
-  --mode=process \
+  --mode=process,query \
   --node-id=processor-01 \
   --cluster-id=prod-cluster
 ```
 
+A node in `process,query` mode doesn't accept writes locally.
+Schedule plugins running on it that need to write results back to the cluster must POST line protocol to an ingest node.
+
+> [!Note]
+> #### Cross-node write-back example
+>
+> The [`influxdb3-ref-network-telemetry`](https://github.com/influxdata/influxdb3-ref-network-telemetry) reference architecture's [`plugins/_writeback.py`](https://github.com/influxdata/influxdb3-ref-network-telemetry/blob/main/plugins/_writeback.py) helper round-robins writes across configured ingest URLs with one fallback hop on connection error.
+
 ## Multi-mode configurations
 
 Some deployments benefit from nodes handling multiple responsibilities.
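The round-robin write-back pattern mentioned in the note above can be sketched in a few lines of plugin-side Python. This is a hypothetical illustration, not the actual `plugins/_writeback.py` helper: the `RoundRobinWriter` class and the injected `post` callable are invented names, and the transport is abstracted so the sketch runs without a cluster.

```python
import itertools

class RoundRobinWriter:
    """Round-robin line-protocol writes across ingest URLs,
    with one fallback hop to the next URL on connection error.

    Hypothetical sketch modeled on the behavior described above;
    not the reference implementation.
    """

    def __init__(self, ingest_urls, post):
        # post(url, payload) performs the HTTP write; injected so the
        # helper stays transport-agnostic and testable.
        self._urls = list(ingest_urls)
        self._cycle = itertools.cycle(range(len(self._urls)))
        self._post = post

    def write(self, payload):
        first = next(self._cycle)
        fallback = (first + 1) % len(self._urls)
        for attempt, idx in enumerate((first, fallback)):
            try:
                return self._post(self._urls[idx], payload)
            except ConnectionError:
                # One fallback hop only; re-raise after the second failure.
                if attempt == 1:
                    raise
```

A schedule plugin would call `write()` with the line protocol it produces; each call starts from the next ingest URL in rotation.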
Lines changed: 206 additions & 0 deletions
@@ -0,0 +1,206 @@
+---
+title: Run the Processing Engine in a cluster
+description: >
+  Configure, start, and troubleshoot Processing Engine plugins in a multi-node
+  InfluxDB 3 Enterprise cluster — including how triggers fan out across nodes,
+  how to pin triggers, and how to recognize common misconfiguration errors.
+menu:
+  influxdb3_enterprise:
+    name: Run the Processing Engine in a cluster
+    parent: Administer InfluxDB
+weight: 102
+related:
+  - /influxdb3/enterprise/admin/clustering/
+  - /influxdb3/enterprise/plugins/
+  - /influxdb3/enterprise/reference/processing-engine/
+  - /influxdb3/enterprise/reference/cli/influxdb3/create/trigger/
+  - /influxdb3/enterprise/reference/cli/influxdb3/serve/
+influxdb3/enterprise/tags: [processing engine, plugins, clustering, triggers, troubleshooting]
+---
+
+This guide covers how the Processing Engine behaves in a multi-node {{% product-name %}} cluster and how to troubleshoot common misconfigurations.
+
+For single-node deployments, defaults work as documented in [Set up the Processing Engine](/influxdb3/enterprise/plugins/#set-up-the-processing-engine).
+The cluster-specific behavior described here applies when you run more than one `influxdb3 serve` process against a shared catalog and object store.
+
+- [How trigger execution works in a cluster](#how-trigger-execution-works-in-a-cluster)
+- [Start the cluster](#start-the-cluster)
+- [Worked example: 5-node reference architecture](#worked-example-5-node-reference-architecture)
+- [Troubleshoot misconfigurations](#troubleshoot-misconfigurations)
+
+## How trigger execution works in a cluster
+
+Three independent factors determine whether a Processing Engine trigger runs on a given node:
+
+1. The node has [`--plugin-dir`](/influxdb3/enterprise/reference/config-options/#plugin-dir) configured.
+2. The trigger's [`--node-spec`](/influxdb3/enterprise/reference/cli/influxdb3/create/trigger/#options) includes the node — by default, `all` (every node with `--plugin-dir`).
+3. The trigger's plugin imports modules that are available in that node's per-node Python virtual environment.
+
+`--mode` controls which APIs the node serves (writes, queries, compaction).
+**`--mode` does not gate trigger execution.**
+A trigger pinned to a `compact`-only node still fires on that node — it just fails per tick if the plugin needs APIs the node doesn't serve.
+
+| Trigger type                   | Pin to                                           | Why                                                                                                   |
+|--------------------------------|--------------------------------------------------|-------------------------------------------------------------------------------------------------------|
+| WAL (`table:`)                 | An ingest-capable node                           | Each ingester owns its own WAL; a trigger pinned to one ingester sees only that node's writes.        |
+| Schedule (`every:` or `cron:`) | A node with `process,query` mode                 | The plugin reads via `influxdb3_local.query()` locally; results write back to an ingester via HTTP.   |
+| Request (`request:`)           | A node with `query` mode (the host-exposed port) | The HTTP route exists only on pinned nodes; unpinned nodes return `404 not found`.                    |
+
+## Start the cluster
+
+Each cluster node runs `influxdb3 serve` with a unique `--node-id`, the same `--cluster-id`, and a shared object store and catalog.
+Configure `--plugin-dir` on **every** node, even nodes that don't execute plugins — see [Configure `--plugin-dir` on every cluster node](/influxdb3/enterprise/admin/clustering/#configure-process-capable-nodes).
+
+```bash { placeholders="CLUSTER_ID|DATA_DIR|PLUGINS_DIR|NODE_ID" }
+# Ingest node
+influxdb3 serve \
+  --cluster-id CLUSTER_ID \
+  --node-id NODE_ID \
+  --mode ingest \
+  --object-store file \
+  --data-dir DATA_DIR \
+  --plugin-dir PLUGINS_DIR
+
+# Query node (host-exposed)
+influxdb3 serve \
+  --cluster-id CLUSTER_ID \
+  --node-id NODE_ID \
+  --mode query \
+  --object-store file \
+  --data-dir DATA_DIR \
+  --plugin-dir PLUGINS_DIR
+
+# Compact node (one per cluster)
+influxdb3 serve \
+  --cluster-id CLUSTER_ID \
+  --node-id NODE_ID \
+  --mode compact \
+  --object-store file \
+  --data-dir DATA_DIR \
+  --plugin-dir PLUGINS_DIR
+
+# Process,query node (hosts schedule plugins)
+influxdb3 serve \
+  --cluster-id CLUSTER_ID \
+  --node-id NODE_ID \
+  --mode process,query \
+  --object-store file \
+  --data-dir DATA_DIR \
+  --plugin-dir PLUGINS_DIR
+```
+
+After all nodes are up, register triggers from any node and pin them with `--node-spec`:
+
+```bash { placeholders="AUTH_TOKEN|DATABASE_NAME|NODE_ID" }
+# Schedule trigger pinned to the process,query node
+influxdb3 create trigger \
+  --database DATABASE_NAME \
+  --token AUTH_TOKEN \
+  --path schedule_rollup.py \
+  --trigger-spec "every:5s" \
+  --node-spec "nodes:NODE_ID" \
+  hourly_rollup
+```
+
+## Worked example: 5-node reference architecture
+
+The [`influxdata/influxdb3-ref-network-telemetry`](https://github.com/influxdata/influxdb3-ref-network-telemetry) repo provides a complete 5-node {{% product-name %}} cluster you can run locally with `docker compose`:
+
+- 2 ingest nodes (`--mode=ingest`)
+- 1 query node (`--mode=query`, host-exposed on port 8181)
+- 1 compact node (`--mode=compact`)
+- 1 process,query node (`--mode=process,query`, hosts schedule plugins)
+
+The repo demonstrates:
+
+- Pinning schedule triggers to the process node and request triggers to the query node with `--node-spec nodes:<id>`.
+- Cross-node write-back from schedule plugins via HTTP — see [`plugins/_writeback.py`](https://github.com/influxdata/influxdb3-ref-network-telemetry/blob/main/plugins/_writeback.py).
+- Mounting the same plugin directory on every node (including ingest and compact) for catalog validation at startup.
+
+Use this repo as a template when designing your own cluster.
+
+## Troubleshoot misconfigurations
+
+### `invalid node name (<id>)` when creating a trigger
+
+The cluster validates `--node-spec nodes:<id>` against current cluster membership at create time.
+A typo or unknown node ID returns `HTTP 500: invalid node name (<id>)`.
+
+To fix:
+
+1. List current cluster members and their node IDs:
+
+   ```bash { placeholders="AUTH_TOKEN" }
+   influxdb3 show nodes --token AUTH_TOKEN
+   ```
+
+   The `mode` column shows the node's runtime modes — `process` is included automatically on any node that has `--plugin-dir` configured.
+
+2. Reissue `influxdb3 create trigger` with the correct `--node-spec`.
+
+### `HTTP 404 {error: "not found"}` when calling a request trigger
+
+The `/api/v3/engine/<trigger_name>` route exists only on the node(s) the trigger is pinned to.
+There is no internal cross-node routing for request triggers.
+
+To fix:
+
+- Verify the node-spec on the trigger:
+
+  ```bash { placeholders="AUTH_TOKEN|DATABASE_NAME" }
+  influxdb3 query \
+    --database DATABASE_NAME \
+    --token AUTH_TOKEN \
+    "SELECT trigger_name, trigger_specification FROM system.processing_engine_triggers"
+  ```
+
+- Either pin the trigger to the node receiving the HTTP request (typically a `query`-mode node), or route the request to a node where the trigger is already pinned.
+
+### Schedule trigger logs `ModuleNotFoundError` per tick
+
+The trigger fired on its pinned node, but the plugin imports a module that's not in that node's per-node Python virtual environment.
+
+To fix:
+
+- Install the missing package on the pinned node:
+
+  ```bash { placeholders="PACKAGE_NAME" }
+  influxdb3 install package PACKAGE_NAME
+  ```
+
+- Or pin the trigger to a node that has the required module already installed.
+
+### Engine panics on startup with a missing plugin file
+
+The Enterprise catalog registers triggers cluster-wide.
+Every node validates the registered triggers at startup, including nodes that don't execute them.
+If a plugin file referenced by a registered trigger is missing on a node, the engine panics on startup.
+
+To fix:
+
+- Configure `--plugin-dir` on every cluster node and copy or mount the same plugin files to each one.
+- If a plugin was deleted but the trigger still references it, drop the orphaned trigger:
+
+  ```bash { placeholders="AUTH_TOKEN|DATABASE_NAME|TRIGGER_NAME" }
+  influxdb3 delete trigger \
+    --database DATABASE_NAME \
+    --token AUTH_TOKEN \
+    --force TRIGGER_NAME
+  ```
+
+### Plugin operations fail in administrative tools
+
+If an administrative tool reports a generic plugin error against your cluster, check whether any node satisfies the request:
+
+1. Confirm at least one node has `--plugin-dir` configured and runs the plugin's required mode (typically `process,query` for schedule plugins, `query` for request plugins).
+2. Confirm the trigger's `--node-spec` includes a running, healthy node.
+3. Inspect the `system.processing_engine_logs` table on the pinned node for execution errors:
+
+   ```bash { placeholders="AUTH_TOKEN|DATABASE_NAME" }
+   influxdb3 query \
+     --database DATABASE_NAME \
+     --token AUTH_TOKEN \
+     "SELECT event_time, trigger_name, log_level, log_text \
+      FROM system.processing_engine_logs \
+      ORDER BY event_time DESC LIMIT 20"
+   ```
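The three execution factors described in the new page above can be condensed into a small predicate. The sketch below is an illustration, not influxdb3 source code: the dict-based `node` shape and the `trigger_fires_on` name are invented. The key point it encodes is that `--mode` is deliberately absent, because mode does not gate trigger execution.

```python
def trigger_fires_on(node, node_spec="all"):
    """Return True if a trigger with the given --node-spec would fire on node.

    Hypothetical model of the documented rules:
    - the node must have --plugin-dir configured;
    - the node-spec must include the node (default "all");
    - --mode is never consulted.
    """
    if not node.get("plugin_dir"):
        return False  # no Processing Engine without --plugin-dir
    if node_spec == "all":
        return True   # default: every plugin-capable node
    # "nodes:<id>[,<id>...]" pins the trigger to an explicit list
    return node["node_id"] in node_spec.removeprefix("nodes:").split(",")
```

For example, a `compact`-only node with `--plugin-dir` set still satisfies the predicate, matching the doc's warning that such a trigger fires there and only fails per tick if it needs APIs the node doesn't serve.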

content/shared/influxdb3-cli/config-options.md

Lines changed: 1 addition & 1 deletion
@@ -199,7 +199,7 @@ This option supports the following values:
 - `ingest`: Enables only data ingest capabilities
 - `query`: Enables only query capabilities
 - `compact`: Enables only compaction processes
-- `process`: Enables only data processing capabilities
+- `process`: Activates the [Processing Engine](/influxdb3/enterprise/reference/processing-engine/) so the node can execute trigger plugins. `process` has no API surface of its own — it doesn't accept writes or serve queries. Setting [`--plugin-dir`](#plugin-dir) implicitly adds `process` mode regardless of `--mode`. Conversely, `--mode=process` requires `--plugin-dir`. In a multi-node cluster, combine `process` with another mode (typically `query`) so plugins can call `influxdb3_local.query()` locally.
 
 You can specify multiple modes using a comma-delimited list (for example, `ingest,query`).
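The mode-implication rules in the updated `process` bullet can be modeled as a tiny function. This is a sketch with invented names (`effective_modes`), not influxdb3 source: it only encodes the two documented rules, that `--plugin-dir` implicitly adds `process`, and that a bare `--mode=process` without `--plugin-dir` is invalid.

```python
def effective_modes(modes, plugin_dir=None):
    """Model of the documented rules for --mode and --plugin-dir.

    Hypothetical sketch:
    - --plugin-dir implicitly adds `process` to whatever modes are set;
    - --mode=process without --plugin-dir is a configuration error.
    """
    modes = set(modes)
    if "process" in modes and plugin_dir is None:
        raise ValueError("--mode=process requires --plugin-dir")
    if plugin_dir is not None:
        modes.add("process")  # --plugin-dir implies process mode
    return sorted(modes)
```

So a node started with `--mode query --plugin-dir /plugins` effectively runs in `process,query` mode, which is why the docs say you rarely need to set `process` explicitly.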

content/shared/influxdb3-cli/create/trigger.md

Lines changed: 42 additions & 0 deletions
@@ -303,3 +303,45 @@ influxdb3 create trigger \
   --error-behavior disable \
   TRIGGER_NAME
 ```
+
+{{% show-in "enterprise" %}}
+
+### Pin a trigger to specific nodes in a cluster
+
+In a multi-node {{% product-name %}} cluster, the default `--node-spec all` makes every node with [`--plugin-dir`](/influxdb3/enterprise/reference/config-options/#plugin-dir) configured try to execute the trigger.
+For schedule triggers, this causes duplicate execution on every plugin-capable node.
+For request triggers, the route exists only on the node receiving the HTTP request, and other nodes return `404 not found` — there's no internal cross-node routing.
+
+To pin a trigger to specific node(s), pass `--node-spec nodes:<node-id>[,<node-id>...]`:
+
+```bash { placeholders="AUTH_TOKEN|DATABASE_NAME|NODE_ID" }
+# Pin a schedule trigger to a process-capable node
+influxdb3 create trigger \
+  --database DATABASE_NAME \
+  --token AUTH_TOKEN \
+  --path schedule_rollup.py \
+  --trigger-spec "every:5s" \
+  --node-spec "nodes:NODE_ID" \
+  hourly_rollup
+
+# Pin a request trigger to a query-capable node (the node that serves HTTP)
+influxdb3 create trigger \
+  --database DATABASE_NAME \
+  --token AUTH_TOKEN \
+  --path request_top_n.py \
+  --trigger-spec "request:top_n" \
+  --node-spec "nodes:NODE_ID" \
+  top_n
+```
+
+The cluster validates the node IDs in `--node-spec` against current cluster membership at create time.
+A typo or unknown node ID is rejected with `HTTP 500: invalid node name (<id>)`.
+
+The cluster doesn't validate the trigger type against the pinned node's mode at create time.
+Pinning a schedule trigger to a `compact`-only node, or a request trigger to an `ingest`-only node, succeeds — but the trigger fails or returns `404` at execution time.
+Choose the pinned node by what the trigger needs at execution:
+
+- **Schedule trigger** — pin to a node with `process,query` mode if the plugin reads with `influxdb3_local.query()`; otherwise the call HTTP-hops to another query node.
+- **Request trigger** — pin to the node(s) you want to serve external HTTP traffic. The `/api/v3/engine/<trigger_name>` route only exists on pinned nodes; clients hitting any other node receive `404 not found`.
+
+{{% /show-in %}}
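The create-time versus execution-time validation split documented in this new section can be sketched as a small check. The function name `validate_create` and the membership-set shape are invented for illustration; where this sketch raises `ValueError`, the real server surfaces `HTTP 500: invalid node name (<id>)`. Note what the sketch does not check: trigger type against node mode, which the docs say is only caught at execution time.

```python
def validate_create(node_spec, cluster_members):
    """Create-time validation as documented: only cluster membership
    is checked. Trigger type vs. pinned-node mode is NOT validated here.

    Hypothetical sketch; the real server returns HTTP 500 where
    this raises ValueError.
    """
    if node_spec == "all":
        return  # default spec is always valid
    for node_id in node_spec.removeprefix("nodes:").split(","):
        if node_id not in cluster_members:
            raise ValueError(f"invalid node name ({node_id})")
```

Under this model, pinning a schedule trigger to a known `compact`-only node passes validation, consistent with the documented behavior that the misconfiguration only shows up when the trigger fires.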
