Commit bbf911f

Author: Ignacio Van Droogenbroeck
docs: add enterprise deployment patterns (shared vs local storage)
New page documenting the two Arc Enterprise cluster topologies:
- Shared object storage (S3/MinIO/Azure) for cloud-native deployments
- Local storage with peer replication for bare metal, VMs, and edge

Side-by-side comparison, decision guide, and full Docker Compose examples for both patterns.

Related updates:
- Refreshed the Arc Enterprise architecture hero diagram
- Added two new pattern-specific diagrams
- Updated clustering.md to cross-link the new page and removed the outdated "local filesystem not suitable" best practice
- Bumped sidebar positions so deployment patterns appears first in the Configuration section
1 parent bb01ce1 commit bbf911f

7 files changed

Lines changed: 270 additions & 8 deletions


docs-arc-enterprise/configuration/clustering.md

Lines changed: 11 additions & 5 deletions
@@ -1,13 +1,17 @@
11
---
2-
sidebar_position: 2
2+
sidebar_position: 3
33
---
44

55
import Tabs from '@theme/Tabs';
66
import TabItem from '@theme/TabItem';
77

88
# Clustering & High Availability
99

10-
Scale Arc horizontally with multi-node clusters. Separate write, read, and compaction workloads across dedicated nodes with automatic failover and shared storage.
10+
Scale Arc horizontally with multi-node clusters. Separate write, read, and compaction workloads across dedicated nodes with automatic failover.
11+
12+
:::tip Choose a deployment pattern first
13+
Arc Enterprise supports two cluster topologies: **shared object storage** and **local storage with peer replication**. See [Deployment Patterns](/arc-enterprise/deployment-patterns) to choose the right one for your environment before configuring a cluster.
14+
:::
1115

1216
## Overview
1317

@@ -271,17 +275,19 @@ curl -H "Authorization: Bearer $TOKEN" \
271275

272276
## Best Practices
273277

274-
1. **Use shared object storage** — All nodes must share the same storage backend (S3, MinIO, Azure Blob). Local filesystem is not suitable for multi-node clusters.
278+
1. **Pick a deployment pattern**: Use [shared object storage](/arc-enterprise/deployment-patterns) (S3, MinIO, Azure Blob) for cloud-native deployments, or [local storage with peer replication](/arc-enterprise/deployment-patterns) for bare metal, VMs, and edge. Don't mix the two in the same cluster.
275279

276280
2. **Deploy at least 2 writers** — For automatic failover, run one primary and one standby writer.
277281

278282
3. **Scale readers independently** — Add reader nodes to handle increased query load without affecting write performance.
279283

280-
4. **Use dedicated compactors**: For high-throughput deployments, run compaction on dedicated nodes to avoid impacting read/write performance.
284+
4. **Use one dedicated compactor**: Run compaction on a single dedicated node to avoid duplicate outputs. Enable `ARC_CLUSTER_FAILOVER_ENABLED=true` for automatic compactor failover.
281285

282286
5. **Configure seed nodes** — Reader and compactor nodes should list writer nodes as seeds for cluster discovery.
283287

284-
6. **Monitor cluster health** — Use the `/api/v1/cluster/health` endpoint with your monitoring system (Prometheus, Grafana) to detect issues early.
288+
6. **Always set a shared secret**: `ARC_CLUSTER_SHARED_SECRET` is required for peer authentication. Arc refuses to start replication without it.
289+
290+
7. **Monitor cluster health** — Use the `/api/v1/cluster/health` endpoint with your monitoring system (Prometheus, Grafana) to detect issues early.
285291

286292
## Next Steps
287293

Lines changed: 256 additions & 0 deletions
@@ -0,0 +1,256 @@
1+
---
2+
sidebar_position: 1
3+
---
4+
5+
# Deployment Patterns
6+
7+
Arc Enterprise supports two clustering topologies, each optimized for a different operational environment. The choice is about **where the Parquet files live** — and that decision shapes durability, cost, and the operational model of your cluster.
8+
9+
## The Two Patterns
10+
11+
### Pattern A: Shared Object Storage
12+
13+
![Shared storage deployment](/img/arc-enterprise-shared-storage.jpg)
14+
15+
All nodes read and write to the **same object store** — S3, MinIO, or Azure Blob. The bucket is the source of truth for Parquet files. Nodes are stateless from a data perspective: any reader can serve any query because every file is one API call away.
16+
17+
**Best for:**
18+
- Cloud deployments (AWS, GCP, Azure)
19+
- Teams that already operate object storage
20+
- Workloads where scaling readers elastically matters more than query latency
21+
- Kubernetes-native deployments with object storage
22+
23+
### Pattern B: Local Storage with Peer Replication
24+
25+
![Local-storage deployment](/img/arc-enterprise-local-storage.jpg)
26+
27+
Each node has its own **local disks** (NVMe, SSD, or attached block storage). Parquet files are replicated peer-to-peer over the cluster protocol, verified via SHA-256, and kept on every node that needs them. A Raft-backed file manifest is the cluster-wide source of truth for which files exist.
28+
29+
**Best for:**
30+
- Bare metal and virtual machine deployments
31+
- Edge, on-premises, and air-gapped environments
32+
- Defense, aerospace, industrial, and regulated workloads where shared object storage is not available
33+
- Deployments that need the lowest possible query latency (a local NVMe read avoids the network round trip that an object-store fetch requires)
34+
35+
## Side-by-Side Comparison
36+
37+
| Aspect | Shared Object Storage | Local Storage + Peer Replication |
|--------|----------------------|----------------------------------|
| **Storage layout** | Single bucket, all nodes read/write | Per-node local disks, replicated peer-to-peer |
| **Source of truth** | The bucket itself | Raft-backed file manifest (FSM) |
| **Durability** | Relies on S3/MinIO/Azure replication | Replicated across N cluster nodes |
| **Query latency** | Network fetch from object store | Local disk I/O |
| **New-node bootstrap** | Instant (no data transfer needed) | Startup catch-up pulls bytes from peers |
| **Compactor outputs** | Written once to bucket, visible to all | Compactor writes locally, Raft announces, peers pull |
| **Compactor failover** | Any healthy node can take over | Any healthy node can take over |
| **Best deployment** | Kubernetes, cloud-native | Bare metal, VMs, edge |
| **Cost model** | Object storage API calls + egress | Local disk capacity × nodes |
| **Network requirements** | Reliable path to object store | Reliable path between cluster nodes |
49+
50+
## Choosing a Pattern
51+
52+
Start here:
53+
54+
1. **Do you already run S3/MinIO/Azure in production?** → Pattern A (shared).
55+
2. **Do your nodes have fast local disks and you want minimum query latency?** → Pattern B (local).
56+
3. **Is shared object storage unavailable (edge, air-gap, defense)?** → Pattern B (local).
57+
4. **Do you expect to scale readers elastically based on demand?** → Pattern A (shared).
58+
5. **Do you need a single-digit-ms query path?** → Pattern B (local).
59+
60+
Tiered storage is a separate concern from the cluster's primary storage model: a cluster that keeps hot data on local disks can still move cold data to object storage (for example, S3 Glacier). See [Tiered Storage](/arc-enterprise/tiered-storage).
61+
62+
## Pattern A — Shared Storage Setup
63+
64+
### Minimal 3-node cluster (1 writer, 1 reader, 1 compactor) on MinIO
65+
66+
```yaml
# docker-compose.yml
services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin123
    ports: ["9001:9001"]

  arc-writer:
    image: basekick/arc:latest
    environment:
      ARC_LICENSE_KEY: "ARC-XXXX-XXXX-XXXX-XXXX"
      ARC_STORAGE_BACKEND: minio
      ARC_STORAGE_S3_BUCKET: arc-data
      ARC_STORAGE_S3_ENDPOINT: minio:9000
      ARC_STORAGE_S3_ACCESS_KEY: minioadmin
      ARC_STORAGE_S3_SECRET_KEY: minioadmin123
      ARC_STORAGE_S3_USE_SSL: "false"
      ARC_STORAGE_S3_PATH_STYLE: "true"
      ARC_CLUSTER_ENABLED: "true"
      ARC_CLUSTER_NODE_ID: writer-01
      ARC_CLUSTER_ROLE: writer
      ARC_CLUSTER_CLUSTER_NAME: production
      ARC_CLUSTER_RAFT_BOOTSTRAP: "true"
      ARC_CLUSTER_SHARED_SECRET: "your-cluster-secret"
      ARC_CLUSTER_REPLICATION_ENABLED: "false" # not needed on shared storage
    ports: ["8001:8000"]

  arc-reader:
    image: basekick/arc:latest
    environment:
      ARC_LICENSE_KEY: "ARC-XXXX-XXXX-XXXX-XXXX"
      ARC_STORAGE_BACKEND: minio
      ARC_STORAGE_S3_BUCKET: arc-data
      ARC_STORAGE_S3_ENDPOINT: minio:9000
      ARC_STORAGE_S3_ACCESS_KEY: minioadmin
      ARC_STORAGE_S3_SECRET_KEY: minioadmin123
      ARC_STORAGE_S3_USE_SSL: "false"
      ARC_STORAGE_S3_PATH_STYLE: "true"
      ARC_CLUSTER_ENABLED: "true"
      ARC_CLUSTER_NODE_ID: reader-01
      ARC_CLUSTER_ROLE: reader
      ARC_CLUSTER_CLUSTER_NAME: production
      ARC_CLUSTER_SEEDS: arc-writer:9200
      ARC_CLUSTER_SHARED_SECRET: "your-cluster-secret"
    ports: ["8002:8000"]

  arc-compactor:
    image: basekick/arc:latest
    environment:
      ARC_LICENSE_KEY: "ARC-XXXX-XXXX-XXXX-XXXX"
      ARC_STORAGE_BACKEND: minio
      ARC_STORAGE_S3_BUCKET: arc-data
      ARC_STORAGE_S3_ENDPOINT: minio:9000
      ARC_STORAGE_S3_ACCESS_KEY: minioadmin
      ARC_STORAGE_S3_SECRET_KEY: minioadmin123
      ARC_STORAGE_S3_USE_SSL: "false"
      ARC_STORAGE_S3_PATH_STYLE: "true"
      ARC_CLUSTER_ENABLED: "true"
      ARC_CLUSTER_NODE_ID: compactor-01
      ARC_CLUSTER_ROLE: compactor
      ARC_CLUSTER_CLUSTER_NAME: production
      ARC_CLUSTER_SEEDS: arc-writer:9200
      ARC_CLUSTER_SHARED_SECRET: "your-cluster-secret"
      ARC_CLUSTER_FAILOVER_ENABLED: "true"
      ARC_COMPACTION_ENABLED: "true"
    ports: ["8003:8000"]
```
137+
138+
### Key points
139+
140+
- **All nodes point to the same bucket.** The writer flushes to the bucket; readers query directly from it; the compactor reads source files, writes compacted outputs back, and deletes the sources.
141+
- **`ARC_CLUSTER_REPLICATION_ENABLED=false`** is the right choice on shared storage — there's no peer-to-peer file transfer needed because the bucket is already shared.
142+
- **Exactly one compactor node.** Multiple compactors against a shared bucket produce duplicate outputs. Arc warns you via the cluster health check if it sees more than one.
143+
- **Compactor failover** (`ARC_CLUSTER_FAILOVER_ENABLED=true`) lets the Raft leader automatically reassign the compactor lease to another healthy node if the current compactor dies. No restart required.
144+
145+
## Pattern B — Local Storage Setup
146+
147+
### Minimal 3-node cluster (1 writer, 1 reader, 1 compactor) on local disks
148+
149+
```yaml
# docker-compose.yml
services:
  arc-writer:
    image: basekick/arc:latest
    environment:
      ARC_LICENSE_KEY: "ARC-XXXX-XXXX-XXXX-XXXX"
      ARC_STORAGE_BACKEND: local
      ARC_STORAGE_LOCAL_PATH: /app/data
      ARC_CLUSTER_ENABLED: "true"
      ARC_CLUSTER_NODE_ID: writer-01
      ARC_CLUSTER_ROLE: writer
      ARC_CLUSTER_CLUSTER_NAME: production
      ARC_CLUSTER_RAFT_BOOTSTRAP: "true"
      ARC_CLUSTER_SHARED_SECRET: "your-cluster-secret"
      ARC_CLUSTER_REPLICATION_ENABLED: "true" # CRITICAL for local storage
    volumes:
      - writer-data:/app/data
    ports: ["8001:8000"]

  arc-reader:
    image: basekick/arc:latest
    environment:
      ARC_LICENSE_KEY: "ARC-XXXX-XXXX-XXXX-XXXX"
      ARC_STORAGE_BACKEND: local
      ARC_STORAGE_LOCAL_PATH: /app/data
      ARC_CLUSTER_ENABLED: "true"
      ARC_CLUSTER_NODE_ID: reader-01
      ARC_CLUSTER_ROLE: reader
      ARC_CLUSTER_CLUSTER_NAME: production
      ARC_CLUSTER_SEEDS: arc-writer:9200
      ARC_CLUSTER_SHARED_SECRET: "your-cluster-secret"
      ARC_CLUSTER_REPLICATION_ENABLED: "true"
    volumes:
      - reader-data:/app/data
    ports: ["8002:8000"]

  arc-compactor:
    image: basekick/arc:latest
    environment:
      ARC_LICENSE_KEY: "ARC-XXXX-XXXX-XXXX-XXXX"
      ARC_STORAGE_BACKEND: local
      ARC_STORAGE_LOCAL_PATH: /app/data
      ARC_CLUSTER_ENABLED: "true"
      ARC_CLUSTER_NODE_ID: compactor-01
      ARC_CLUSTER_ROLE: compactor
      ARC_CLUSTER_CLUSTER_NAME: production
      ARC_CLUSTER_SEEDS: arc-writer:9200
      ARC_CLUSTER_SHARED_SECRET: "your-cluster-secret"
      ARC_CLUSTER_REPLICATION_ENABLED: "true"
      ARC_CLUSTER_FAILOVER_ENABLED: "true"
      ARC_COMPACTION_ENABLED: "true"
    volumes:
      - compactor-data:/app/data
    ports: ["8003:8000"]

volumes:
  writer-data:
  reader-data:
  compactor-data:
```
210+
211+
### How peer replication works
212+
213+
1. **Writer flushes a Parquet file locally.** The file hash (SHA-256) is computed and included in the flush.
214+
2. **The writer registers the file in the Raft manifest** via a `CommandRegisterFile` entry. This commits cluster-wide — every node now knows the file exists and where to find it.
215+
3. **Readers and compactors observe the FSM callback.** A background puller enqueues a byte-level pull from the origin peer (or any healthy peer that has a copy).
216+
4. **The puller fetches over the cluster protocol**, streams bytes, verifies the SHA-256 against the manifest, and writes to local storage. Checksum mismatches trigger retries; failed pulls fall back to other peers.
217+
5. **On node restart**, a startup catch-up walker reconciles the local manifest against the Raft FSM and pulls any files the node missed.
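
The pull-and-verify part of the steps above can be sketched in a few lines of Python. This is an illustration only, not Arc's implementation: `pull_file` and the `fetch(peer, path)` callback are hypothetical names standing in for the cluster protocol, and the retry behavior is simplified to "move on to the next peer" on either a failed fetch or a checksum mismatch.

```python
import hashlib
from typing import Callable, Iterable, Optional


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def pull_file(
    path: str,
    expected_sha256: str,
    peers: Iterable[str],
    fetch: Callable[[str, str], bytes],
) -> Optional[bytes]:
    """Try each peer in order; return the first payload whose hash matches.

    fetch(peer, path) is a hypothetical transport callback. It returns the
    file bytes or raises OSError when the peer cannot serve the file.
    """
    for peer in peers:
        try:
            data = fetch(peer, path)
        except OSError:
            continue  # failed pull: fall back to the next peer
        if sha256_hex(data) == expected_sha256:
            return data  # verified against the hash recorded in the manifest
        # checksum mismatch: discard the bytes and try another peer
    return None


# Usage with an in-memory stand-in for the network.
good = b"parquet-bytes"
store = {"writer-01": b"corrupted", "compactor-01": good}


def fake_fetch(peer: str, path: str) -> bytes:
    if peer not in store:
        raise OSError(f"{peer} unreachable")
    return store[peer]


result = pull_file(
    "db/cpu/file.parquet",
    sha256_hex(good),
    ["reader-02", "writer-01", "compactor-01"],
    fake_fetch,
)
```

Here `reader-02` is unreachable and `writer-01` returns corrupted bytes, so the verified copy comes from `compactor-01`. If no peer can supply bytes that match the manifest hash, the function returns `None` rather than writing unverified data.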
218+
219+
### Key points
220+
221+
- **`ARC_CLUSTER_REPLICATION_ENABLED=true`** is required — this enables the file manifest and peer puller.
222+
- **Each node has its own volume.** No shared volume, no NFS, no clustered filesystem — the replication is the primary data-plane mechanism.
223+
- **Shared secret is mandatory.** Peer fetch requests are HMAC-authenticated with the shared secret; Arc refuses to start replication without one.
224+
- **Raft leader is the writer by default.** `ARC_CLUSTER_RAFT_BOOTSTRAP=true` on the writer makes it bootstrap Raft; other nodes join via the seed. Non-leader nodes forward manifest commands to the leader transparently.
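
To make the HMAC point concrete, here is a minimal Python sketch of signing and verifying a peer fetch request with a shared secret. The message layout (node ID, path, timestamp), the function names, and the choice of HMAC-SHA256 are all assumptions for illustration; the source only states that fetch requests are HMAC-authenticated, not how Arc builds the signed message.

```python
import hashlib
import hmac


def sign_fetch(secret: str, node_id: str, path: str, timestamp: int) -> str:
    """Return a hex HMAC-SHA256 tag over a hypothetical canonical message."""
    message = f"{node_id}\n{path}\n{timestamp}".encode()
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def verify_fetch(secret: str, node_id: str, path: str, timestamp: int, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign_fetch(secret, node_id, path, timestamp)
    return hmac.compare_digest(expected, tag)


# Usage: a request signed with the cluster secret verifies; any other secret fails.
tag = sign_fetch("your-cluster-secret", "reader-01", "db/cpu/file.parquet", 1700000000)
accepted = verify_fetch("your-cluster-secret", "reader-01", "db/cpu/file.parquet", 1700000000, tag)
rejected = verify_fetch("wrong-secret", "reader-01", "db/cpu/file.parquet", 1700000000, tag)
```

A node that does not know the shared secret cannot produce a valid tag, which is why replication cannot run safely without one.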
225+
226+
### Compacted file distribution
227+
228+
Compaction on local storage works the same way as ingest:
229+
230+
1. The compactor reads source Parquet files (from local storage, pulling from peers if missing).
231+
2. It produces a compacted output, writes it to its own local disk.
232+
3. It registers the new file in the Raft manifest and marks the source files as deleted.
233+
4. Every other node sees the manifest change: readers pull the compacted bytes from the compactor, and delete their local copies of the source files.
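
From a reader's point of view, the last step reduces to a set difference between the manifest and the local disk. The sketch below shows that idea in Python under simplifying assumptions: the manifest is modeled as a plain mapping of live file paths to SHA-256 hashes, and `reconcile` is a hypothetical helper, not an Arc API.

```python
from typing import Dict, Set, Tuple


def reconcile(manifest: Dict[str, str], local_files: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Compare live manifest entries against local files.

    manifest maps each live file path to its SHA-256 hash.
    Returns (to_pull, to_delete).
    """
    live = set(manifest)
    to_pull = live - local_files    # in the manifest but missing locally
    to_delete = local_files - live  # on disk but no longer in the manifest
    return to_pull, to_delete


# Usage: the compactor merged a.parquet and b.parquet into c.parquet.
manifest_after = {"cpu/c.parquet": "hash-of-c"}  # placeholder hash value
reader_disk = {"cpu/a.parquet", "cpu/b.parquet"}

to_pull, to_delete = reconcile(manifest_after, reader_disk)
```

The same comparison also describes the startup catch-up case: a node that was offline during compaction ends up with the compacted file to pull and the stale sources to delete.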
234+
235+
## Security Notes
236+
237+
Both patterns share the same security posture:
238+
239+
- **Shared secret authentication** (`cluster.shared_secret`) — required for peer discovery and, in Pattern B, for all peer file fetches. Arc refuses to boot if replication is enabled without a shared secret.
240+
- **TLS encryption** (`cluster.tls_enabled`) — optional but recommended. Encrypts the inter-node coordinator protocol, Raft transport, and peer file transfers.
241+
- **Role-based authorization on manifest mutations** — only nodes with `CanIngest` (writers) or `CanCompact` (compactors) can forward `RegisterFile` / `DeleteFile` commands to the leader. Reader nodes are rejected.
242+
243+
See [Cluster Security](/arc-enterprise/security) for full details.
244+
245+
## Common Mistakes
246+
247+
- **Multiple compactor nodes on shared storage.** This produces duplicate compacted outputs and double-counted query results. Use exactly one `ARC_CLUSTER_ROLE=compactor` and enable `ARC_CLUSTER_FAILOVER_ENABLED=true` for automatic failover.
248+
- **Mixing shared and local storage in the same cluster.** All nodes must agree on the storage model. Pick one per cluster.
249+
- **Forgetting `ARC_CLUSTER_REPLICATION_ENABLED=true` on local storage.** Without it, readers will query empty local directories.
250+
- **Using a shared volume (NFS, EFS) as "local" storage.** Don't — the concurrent-write semantics of a shared POSIX filesystem aren't what Arc expects, and you lose the durability guarantees of either pattern. Either go full shared object storage or full per-node local disks.
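
Several of these mistakes can be caught before deployment with a small pre-flight check. The Python sketch below is a hypothetical helper, not part of Arc: it takes the per-service `environment` maps already extracted from a Compose file (parsing is left out) and uses only the variable names shown on this page. It flags more than one compactor regardless of storage model, which is stricter than the first item above.

```python
from typing import Dict, List

Env = Dict[str, str]


def check_cluster(services: Dict[str, Env]) -> List[str]:
    """Return human-readable problems found in per-service environment maps."""
    problems: List[str] = []

    # All nodes must agree on one storage model.
    backends = sorted({env.get("ARC_STORAGE_BACKEND", "<unset>") for env in services.values()})
    if len(backends) > 1:
        problems.append("mixed storage backends: " + ", ".join(backends))

    # Exactly one node should hold the compactor role.
    compactors = sorted(
        name for name, env in services.items() if env.get("ARC_CLUSTER_ROLE") == "compactor"
    )
    if len(compactors) > 1:
        problems.append("multiple compactors: " + ", ".join(compactors))

    for name, env in sorted(services.items()):
        replicating = env.get("ARC_CLUSTER_REPLICATION_ENABLED") == "true"
        if env.get("ARC_STORAGE_BACKEND") == "local" and not replicating:
            problems.append(f"{name}: local storage without peer replication")
        if replicating and not env.get("ARC_CLUSTER_SHARED_SECRET"):
            problems.append(f"{name}: replication enabled without a shared secret")
    return problems


# Usage: a reader on local storage that forgot the replication flag.
cluster = {
    "arc-writer": {
        "ARC_STORAGE_BACKEND": "local",
        "ARC_CLUSTER_ROLE": "writer",
        "ARC_CLUSTER_REPLICATION_ENABLED": "true",
        "ARC_CLUSTER_SHARED_SECRET": "your-cluster-secret",
    },
    "arc-reader": {
        "ARC_STORAGE_BACKEND": "local",
        "ARC_CLUSTER_ROLE": "reader",
        "ARC_CLUSTER_SHARED_SECRET": "your-cluster-secret",
    },
}
for problem in check_cluster(cluster):
    print(problem)
```

The check cannot detect a shared volume (NFS, EFS) mounted as "local" storage, since that is a property of the volume driver rather than of the environment variables.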
251+
252+
## Next Steps
253+
254+
- [Clustering Configuration Reference](/arc-enterprise/clustering) — full list of cluster config options
255+
- [Tiered Storage](/arc-enterprise/tiered-storage) — combine local hot storage with cold object storage
256+
- [Cluster Security](/arc-enterprise/security) — TLS and shared secret configuration

docs-arc-enterprise/configuration/overview.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
---
2-
sidebar_position: 1
2+
sidebar_position: 2
33
---
44

55
import Tabs from '@theme/Tabs';

docs-arc-enterprise/overview.md

Lines changed: 2 additions & 2 deletions
@@ -65,10 +65,10 @@ Upgrading from Arc OSS to Enterprise requires no data migration. Add your licens
6565
<h3>Clustering & High Availability</h3>
6666
</div>
6767
<div className="card__body">
68-
<p>Multi-node clusters with dedicated writer, reader, and compactor roles. Automatic writer failover with sub-30-second recovery.</p>
68+
<p>Multi-node clusters with dedicated writer, reader, and compactor roles. Deploy on shared object storage (S3/MinIO/Azure) or local disks with peer replication. Automatic writer and compactor failover.</p>
6969
</div>
7070
<div className="card__footer">
71-
<a className="button button--primary button--block" href="/arc-enterprise/clustering">Learn more</a>
71+
<a className="button button--primary button--block" href="/arc-enterprise/deployment-patterns">Learn more</a>
7272
</div>
7373
</div>
7474
</div>