Skip to content

Commit 9e05891

Browse files
committed
docs: node placement
1 parent 744e08c commit 9e05891

1 file changed

Lines changed: 14 additions & 0 deletions

File tree

docs/guides/ha/overview.md

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -53,6 +53,20 @@ Raft transport is **plain TCP** with no built-in encryption. Before deploying:
5353
- **Never expose the Raft port to the public internet.** An attacker with access to the Raft port can send forged messages that disrupt or hijack cluster consensus.
5454
- Ensure low-latency connectivity between nodes. Timeouts must be sized larger than the worst-case round-trip time (RTT) between any two nodes in the cluster.
5555

56+
### Node Placement
57+
58+
**Run all nodes in the same region, spread across different availability zones.**
59+
60+
This is the single most important infrastructure decision for cluster stability. All nodes must have roughly the same RTT to each other. The timing parameters (heartbeat timeout, election timeout) are sized for a single `RTT_MAX` value — if one node has materially higher latency than its peers, it degrades the entire cluster's ability to detect failures and elect leaders reliably.
61+
62+
Specifically:
63+
- **Same region, different AZs** gives uniform 5–30ms RTT and is the validated production topology. Nodes are isolated from AZ-level failures while keeping latency uniform.
64+
- **Cross-region nodes** introduce higher and asymmetric RTT (100ms+). Even a single high-latency node can destabilize the cluster under network stress.
65+
66+
This was observed directly in load testing: a 3-node cluster where one node averaged 99ms RTT (2× higher than its peers at 45–49ms) showed election times up to 284 seconds, three undetected leader elections, and one skipped cycle when 200–500ms of additional latency was injected — the same disruption level where the two lower-latency nodes recovered in under 55 seconds. Moving to a 5-node cluster with uniform ~45ms RTT across all nodes eliminated all undetected elections, reduced the worst-case election time from 284s to 66s, and reduced cascade risk from 10% of cycles to 3%.
67+
68+
If your deployment requires nodes in different regions, increase `heartbeat_timeout` and `election_timeout` to at least 4–5× the worst-case inter-node RTT, and expect slower failover. See the [timing parameters](#timing-parameters) section for tuning formulas.
69+
5670
---
5771

5872
## Configuration Reference

0 commit comments

Comments
 (0)