docs/guides/databases/cassandra/deploy-cassandra-multi-datacenters/index.md
Lines changed: 48 additions & 36 deletions
@@ -2,8 +2,8 @@
2
2
slug: deploy-cassandra-multi-datacenters
3
3
title: "Deploying Apache Cassandra Across Multiple Data Centers"
4
4
description: 'Deploy Apache Cassandra across multiple data centers (MDC) on Akamai cloud computing services. This guide covers VM based, containerized, and Kubernetes (LKE) deployment models; multi data center replication; monitoring with Prometheus and Grafana; cluster security; backup and recovery; and operational best practices for scaling and long term maintenance.'
Apache Cassandra is a distributed NoSQL database designed for low-latency replication across geographically separated data centers. Multi-data center deployments reduce latency for your global users, support regional failover, and help you meet data residency requirements.
24
24
25
-
This guide shows you how to deploy Apache Cassandra across multiple data centers on Akamai cloud computing services. You'll configure multi-data center replication, implement production-ready monitoring and security, and learn operational procedures for scaling and maintenance.
25
+
This guide shows you how to deploy Apache Cassandra across multiple data centers on Akamai cloud computing services. You'll configure multi-data center replication, integrate monitoring with Prometheus and Grafana, apply essential security settings, and learn operational procedures for scaling and maintenance.
26
26
27
27
## Why Multi-Data Center Cassandra on Akamai Cloud Computing Services
28
28
@@ -41,7 +41,7 @@ This guide covers three deployment approaches:
You'll configure multi-datacenter replication, implement monitoring with Prometheus and Grafana, secure your cluster with encryption and authentication, and establish backup procedures using Akamai Object Storage.
44
+
Your final deployment will span multiple data centers, surface metrics to Prometheus and Grafana, enforce essential security controls, and maintain backups in Akamai Object Storage.
- Familiarity with database concepts and networking
51
51
- Working knowledge of Apache Cassandra architecture (nodes, datacenters, replication, and gossip)
52
52
53
-
```command
54
53
{{< note >}}
55
54
**Choosing a database**
56
55
Akamai supports multiple database engines. This guide includes embedded Cassandra installation and configuration steps to support users who choose Cassandra, as it requires more setup than other options. Before proceeding, take a moment to confirm that Cassandra aligns with your application’s requirements and your operational experience.
57
56
{{</ note>}}
58
-
```
59
57
60
58
## Architecture and Planning
61
59
@@ -81,11 +79,13 @@ For optimal performance, attach NVMe-backed Block Storage volumes to Dedicated o
81
79
82
80
Cluster Topology:
83
81
84
-
Deploy a minimum of 3 nodes per data center (odd numbers preferred to prevent split-brain scenarios) with a replication factor of 3 for production environments. Designate 2+ stable nodes per data center as seed nodes. For capacity planning guidelines and topology best practices, see the [Cassandra Planning](https://cassandra.apache.org/doc/latest/cassandra/managing/operating/hardware.html) documentation.
82
+
Use three nodes per data center (odd numbers reduce split-brain risk) with a replication factor of three. Assign at least two stable nodes in each data center as seed nodes. For capacity planning guidelines and topology best practices, see the [Cassandra Planning](https://cassandra.apache.org/doc/latest/cassandra/managing/operating/hardware.html) documentation.
85
83
86
84
### Multi-Region Network Planning
87
85
88
-
**Akamai Data Center Selection**
86
+
#### Akamai Data Center Selection
87
+
88
+
Before configuring multi-datacenter replication, choose Akamai data centers that meet Cassandra’s latency expectations for cross-region communication.
89
89
90
90
Choose data center pairs based on your latency requirements.
91
91
@@ -102,7 +102,7 @@ Choose data center pairs based on your latency requirements.
102
102
- **Asia-Pacific**: Singapore and Tokyo
103
103
- **Global**: Newark, London, Singapore
104
104
105
-
Network Architecture
105
+
#### Network Architecture
106
106
107
107
Configure the following Akamai networking features for cluster communication:
108
108
@@ -111,7 +111,7 @@ Configure the following Akamai networking features for cluster communication:
111
111
- **Private IP addresses**: Use private IPs for all node-to-node communication.
112
112
- **Network interfaces**: 1Gbps+ interfaces (standard on Dedicated and Premium CPU instances).
113
113
114
-
For complete port requirements and security configuration, see the [Cassandra Security](https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html) documentation.
114
+
Review the [Cassandra Security](https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html) documentation for complete port requirements and security configuration details, because these settings must be in place for your cluster to operate correctly.
115
115
116
116
### Software Versions
117
117
@@ -140,7 +140,7 @@ This section covers provisioning the underlying infrastructure for your multi-da
140
140
141
141
### Compute Instances
142
142
143
-
Provision compute instances in each selected Akamai data center according to your planning worksheet.
143
+
Provision compute instances according to the specifications in your planning worksheet. If you have not yet finalized instance types, OS selection, or deployment order, review the considerations below before proceeding.
144
144
145
145
#### Instance Selection Guidelines
146
146
@@ -159,13 +159,15 @@ Use **Ubuntu 22.04 LTS** across all nodes for consistency in maintenance and tro
159
159
160
160
#### Example topology
161
161
162
+
Use the example below to validate your instance counts and regional layout while you visualize and finalize your planning worksheet.
163
+
162
164
- Newark: 3 nodes
163
165
- London: 3 nodes
164
166
- **Total**: 6 nodes across 2 data centers
165
167
166
168
### VLAN Architecture
167
169
168
-
VLANs provide private, low‑latency connectivity between Cassandra nodes.
170
+
VLANs provide private, low-latency connectivity between Cassandra nodes, and the VLAN design captured in your planning worksheet defines how nodes communicate within and across regions.
169
171
170
172
#### Within a data center
171
173
@@ -184,12 +186,16 @@ VLANs provide private, low‑latency connectivity between Cassandra nodes.
184
186
185
187
#### Example VLAN scheme
186
188
189
+
Use the example below to validate your VLAN assignments as you document your network design in the planning worksheet.
190
+
187
191
- Newark VLAN: 10.0.1.0/24
188
192
- London VLAN: 10.0.2.0/24
189
193
194
+
Cross-data-center VLAN behavior was not tested in this environment but reflects required Akamai Cloud architecture.
195
+
190
196
### Block Storage
191
197
192
-
Cassandra requires dedicated storage for data and commit logs.
198
+
Cassandra requires dedicated storage for data and commit logs, and documenting your block storage plan in the planning worksheet helps ensure each node is provisioned with the correct capacity and performance.
193
199
194
200
#### Volume specifications
195
201
@@ -207,14 +213,14 @@ Cassandra requires dedicated storage for data and commit logs.
207
213
208
214
#### Example mount points
209
215
216
+
Use the example below to validate your directory layout as you document your block storage configuration in the planning worksheet.
217
+
210
218
- Data directory: `/var/lib/cassandra/data`
211
219
- Commit log directory: `/var/lib/cassandra/commitlog` (if using separate volume)
212
220
213
-
Cross-data-center VLAN behavior was not tested in this environment but reflects required Akamai Cloud architecture.
214
-
215
221
### Cloud Firewall Strategy
216
222
217
-
Cassandra requires specific ports for internode communication, client access, and monitoring. For complete port reference and security considerations, see the [Cassandra Security documentation](https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html).
223
+
Cassandra requires specific ports for internode communication, client access, and monitoring. Use the following firewall rules and connectivity requirements to validate your network design and document the necessary settings in your planning worksheet before deployment. For complete port reference and security considerations, see the [Cassandra Security documentation](https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html).
218
224
219
225
#### Cluster communication
220
226
@@ -261,8 +267,6 @@ Confirm network connectivity meets your requirements before installing Cassandra
261
267
- Same‑region multi‑DC: < 50ms
262
268
- Cross‑region: < 300ms
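As a rough sketch of how the latency targets above might be checked, the helpers below parse the summary line printed by `ping` and compare the average round-trip time against a limit. The function names `avg_rtt_ms` and `within_threshold` are hypothetical, not part of Cassandra or Akamai tooling. The parsing assumes the `min/avg/max` summary format printed by common Linux and macOS `ping` builds.

```shell
# Hypothetical helpers for illustration only.

# Read ping output on stdin and print the average round-trip time in ms.
# Assumes a summary line such as:
#   rtt min/avg/max/mdev = 70.1/72.5/80.2/3.1 ms
avg_rtt_ms() {
  # Splitting on "/" puts the average value in the fifth field.
  awk -F'/' '/^(rtt|round-trip) / { print $5 }'
}

# Exit 0 when the measured average ($1) is below the limit ($2), both in ms.
within_threshold() {
  awk -v avg="$1" -v max="$2" 'BEGIN { exit !(avg + 0 < max + 0) }'
}
```

For example, `avg=$(ping -c 5 10.0.2.11 | avg_rtt_ms)` followed by `within_threshold "$avg" 300` would check one cross-region peer. The address `10.0.2.11` is a placeholder, so substitute the private IP of a node in the remote data center.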
263
269
264
-
Cross-data-center firewall behavior and latency validation were not tested in this environment but reflect Akamai Cloud architecture.
265
-
266
270
### Infrastructure Readiness Checklist
267
271
268
272
Before proceeding to Cassandra installation, confirm:
@@ -289,6 +293,8 @@ Begin by installing and validating Cassandra on a single node first. Once the in
289
293
290
294
### Add the Apache Cassandra Repository
291
295
296
+
Add the Apache Cassandra repository as part of your installation preparation so each node can retrieve the correct 4.1.x packages; document this step in your planning worksheet to ensure consistency across regions.
{{< note>}}: Debian and Ubuntu are transitioning away from apt-key. For background about how APT verifies repository signatures or how to manage repository keys using modern gpg‑based methods, refer to the Debian documentation topic [SecureApt](https://wiki.debian.org/SecureApt). If this link becomes unavailable, search the Debian documentation for “SecureApt” or “APT repository key management”.
308
+
{{< note>}}
309
+
Debian and Ubuntu are transitioning away from apt-key. For background about how APT verifies repository signatures or how to manage repository keys using modern gpg‑based methods, refer to the Debian documentation topic [SecureApt](https://wiki.debian.org/SecureApt). If this link becomes unavailable, search the Debian documentation for “SecureApt” or “APT repository key management”.
303
310
{{< /note >}}
304
311
305
312
#### Update Package Lists
@@ -320,7 +327,7 @@ Cassandra automatically installs OpenJDK 11 as a dependency, so no separate Java
320
327
321
328
### Optional: Environment Checks
322
329
323
-
After installation, confirm your kernel and package versions match expected compatibility. Ubuntu 22.04.5 LTS is generally safe, but minimal images or custom build may vary.
330
+
After installation, confirm your kernel and package versions match expected compatibility. Ubuntu 22.04.5 LTS is generally safe, but minimal images or custom builds may vary.
324
331
```command
325
332
sudo uname -r # Check kernel version
326
333
sudo java -version # Confirm Java runtime
@@ -380,11 +387,9 @@ sudo nodetool status
380
387
```
381
388
What you should see:
382
389
383
-
-**Single-node environment:
384
-
One node listed in the **UN (Up/Normal)** state.
390
+
- **Single-node environment**: One node listed in the **UN (Up/Normal)** state.
385
391
386
-
-**Multi-node environment:
387
-
You should see each node listed with its IP address and state Nodes may briefly appear as UJ (Up/Joining) while they bootstrap
392
+
- **Multi-node environment**: Each node is listed with its IP address and state. Nodes may briefly appear as UJ (Up/Joining) while they bootstrap.
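If you want to script this check, the sketch below filters `nodetool status` output for nodes that are not yet in the UN state. The function name `nodes_not_un` is hypothetical. It assumes node rows begin with the two-letter status/state code (for example `UN`, `UJ`, or `DN`) followed by the node address, as in typical `nodetool status` output.

```shell
# Hypothetical helper: read `nodetool status` output on stdin and print
# the address and state of every node that is not UN (Up/Normal).
# The exit status is 0 only when every listed node is UN.
nodes_not_un() {
  awk '
    /^[UD][NLJM] / {                 # node rows start with a two-letter status/state code
      if ($1 != "UN") { print $2, $1; bad = 1 }
    }
    END { exit bad + 0 }
  '
}
```

A possible invocation is `sudo nodetool status | nodes_not_un`. An empty result with exit status 0 suggests that all listed nodes are up and normal. A line such as `10.0.1.12 UJ` indicates a node that is still joining.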
388
393
389
394
#### If you expect multiple nodes but only see one
390
395
@@ -412,17 +417,19 @@ This section focuses on Akamai‑specific requirements, and cloud networking con
412
417
413
418
### Upstream Cassandra Documentation
414
419
415
-
Cassandra maintains version‑specific documentation under `/doc/<version>/`. These links point to the Apache Cassandra 4.1 documentation. Content may evolve over time, but the URLs remain stable.
420
+
Cassandra maintains version‑specific documentation under `/doc/<version>/`. These links point to the Apache Cassandra 4.1 documentation.
Cassandra no longer provides a dedicated Multi-DC initialization walkthrough. Multi-datacenter behavior is not a separate procedure: it emerges from the replication model and the configuration of snitches and `NetworkTopologyStrategy`. New users should rely on the upstream conceptual and configuration documentation rather than expecting a step-by-step Multi-DC guide.
420
423
421
-
You can deploy Cassandra nodes within the same region or across multiple regions. Multi‑region deployments are typically used for geographic redundancy, global applications, or regulatory data‑residency requirements. For guidance on designing multi‑datacenter topologies and understanding how distance affects replication and consistency, refer to the official [Cassandra documentation](https://cassandra.apache.org/doc/4.1/cassandra/architecture/dynamo.html).
424
+
1. [Multi-master Replication: Versioned Data and Tunable Consistency](https://cassandra.apache.org/doc/4.1/cassandra/architecture/dynamo.html#multi-master-replication:-versioned-data-and-tunable-consistency) explains the replication model that underpins multi-DC behavior.
425
+
2. [`cassandra.yaml` Configuration Reference](https://cassandra.apache.org/doc/4.1/cassandra/configuration/cass_yaml_file.html) covers the settings used when configuring datacenter and rack placement.
426
+
3. [Snitch Architecture](https://cassandra.apache.org/doc/4.1/cassandra/architecture/snitch.html) describes how Cassandra identifies datacenters and routes requests.
427
+
428
+
You can deploy Cassandra nodes within a single region or across multiple regions. Multi‑region deployments are typically used for geographic redundancy, global applications, or regulatory data‑residency requirements. For guidance on designing multi‑datacenter topologies and understanding how distance affects replication and consistency, refer to the official Cassandra documentation.
422
429
423
430
### Akamai‑Specific Multi‑DC Requirements
424
431
425
-
When deploying Cassandra on Akamai Cloud Computing Services:
432
+
These requirements supplement the upstream Cassandra documentation and describe only the Akamai-specific networking and instance-level considerations for multi-datacenter deployments.
426
433
427
434
- Use each node’s private IP address from the Akamai instance details page.
428
435
- Ensure nodes are reachable across regions using VPC peering or shared private networks.
@@ -532,13 +539,13 @@ These are the correct defaults.
532
539
533
540
If they are missing or incorrect, Cassandra may start but clients (including `cqlsh`) will not be able to connect to the node.
534
541
535
-
#### Save the File
542
+
Save the file (Ctrl+X, then Y, then Enter).
536
543
537
544
### Setting Data Center and Rack Topology
538
545
539
546
Each node must declare its datacenter and rack so Cassandra can place replicas correctly in a multi-DC cluster. These values must match the snitch configuration you set earlier.
540
547
541
-
{{< note>}}: The `dc` and `rack` values in this file are your choice. They don't need to match your VM names or cloud regions-–they only need to be consistent across nodes in the same datacenter.
548
+
{{< note>}} The `dc` and `rack` values in this file are your choice. They don't need to match your VM names or cloud regions; they only need to be **consistent across nodes** in the same datacenter.
Cassandra reads the datacenter and rack labels from `/etc/cassandra/cassandra-rackdc.properties`. Edit this file and ensure the following entries are present or updated to match your deployment:
560
+
552
561
```properties
553
562
dc=datacenter_name
554
563
rack=rack_name
555
564
```
565
+
These values must align with the datacenter and rack names you use in your `NetworkTopologyStrategy` configuration.
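To illustrate how the `dc` label carries through to a keyspace definition, the sketch below reads the `dc=` value from a properties file and prints a `CREATE KEYSPACE` statement that uses `NetworkTopologyStrategy`. The function names `dc_name` and `keyspace_cql`, the keyspace name `demo`, and the datacenter names `newark` and `london` are all hypothetical placeholders. The parsing assumes a plain `dc=value` line with no surrounding whitespace.

```shell
# Hypothetical helper: print the dc= value from a cassandra-rackdc.properties file.
dc_name() {
  sed -n 's/^dc=//p' "$1"
}

# Hypothetical helper: print a CREATE KEYSPACE statement for the given
# keyspace name ($1) and one or more "datacenter:replication_factor" pairs.
keyspace_cql() {
  ks="$1"; shift
  opts="'class': 'NetworkTopologyStrategy'"
  for pair in "$@"; do
    # ${pair%%:*} is the datacenter name, ${pair##*:} the replication factor.
    opts="$opts, '${pair%%:*}': ${pair##*:}"
  done
  printf "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {%s};\n" "$ks" "$opts"
}
```

For instance, `keyspace_cql demo newark:3 london:3` prints a statement that requests three replicas in each of two datacenters. The datacenter names passed here should match the `dc` values configured on the nodes exactly, which `dc_name /etc/cassandra/cassandra-rackdc.properties` can help confirm on each node.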
556
566
557
-
{{< note>}}:
567
+
{{< note>}}
558
568
If Cassandra has ever been started on this node before you changed the datacenter name, the node will not start because the stored datacenter value won't match the new one.
559
569
{{< /note >}}
560
570
@@ -616,10 +626,10 @@ These values are read at startup, so any changes require restarting the node.
616
626
617
627
Start nodes in the correct order to ensure proper cluster formation.
618
628
619
-
{{< note >}}: The "primary seed" is the first IP address in your seed list. Start that node (whose IP appears first in the seed list) first before starting the other nodes. This ensures clean cluster formation.
629
+
{{< note >}} The "primary seed" is the first IP address in your seed list. Start that node before starting any other nodes. This ensures clean cluster formation.
620
630
{{< /note >}}
621
631
622
-
1.Start the primary seed in the first data center:
632
+
1. Start the primary seed in the first data center:
623
633
624
634
```command
625
635
sudo systemctl start cassandra
@@ -646,7 +656,7 @@ From any node:
646
656
sudo nodetool status
647
657
```
648
658
649
-
You should see:
659
+
In the output you should see:
650
660
651
661
- each datacenter listed
652
662
- nodes marked **UN (Up / Normal)**
@@ -697,3 +707,5 @@ For guidance on creating keyspaces, tables, and working with data, refer to the