Commit fc52384

committed
Last pass review edits
1 parent f6a35e1 commit fc52384

1 file changed

Lines changed: 48 additions & 36 deletions

docs/guides/databases/cassandra/deploy-cassandra-multi-datacenters/index.md

@@ -2,8 +2,8 @@
22
slug: deploy-cassandra-multi-datacenters
33
title: "Deploying Apache Cassandra Across Multiple Data Centers"
44
description: 'Deploy Apache Cassandra across multiple data centers (MDC) on Akamai cloud computing services. This guide covers VM-based, containerized, and Kubernetes (LKE) deployment models; multi-data center replication; monitoring with Prometheus and Grafana; cluster security; backup and recovery; and operational best practices for scaling and long-term maintenance.'
5-
authors: ["Diana Hoober"]
6-
contributors: ["Diana Hoober"]
5+
authors: ["Akamai"]
6+
contributors: ["Akamai"]
77
published: 2026-01-31
88
keywords: ['cassandra', 'apache cassandra', 'nosql', 'database', 'distributed database', 'multi-datacenter', 'replication', 'high availability', 'docker', 'kubernetes', 'k8ssandra', 'containers', 'prometheus', 'grafana', 'monitoring', 'security', 'encryption', 'authentication', 'backup', 'disaster recovery', 'scaling', 'block storage', 'object storage', 'vlan', 'lke', 'linode kubernetes engine']
99
license: '[CC BY-ND 4.0](https://creativecommons.org/licenses/by-nd/4.0)'
@@ -22,7 +22,7 @@ external_resources:
2222

2323
Apache Cassandra is a distributed NoSQL database designed for low-latency replication across geographically separated data centers. Multi-data center deployments reduce latency for your global users, support regional failover, and help you meet data residency requirements.
2424

25-
This guide shows you how to deploy Apache Cassandra across multiple data centers on Akamai cloud computing services. You'll configure multi-data center replication, implement production-ready monitoring and security, and learn operational procedures for scaling and maintenance.
25+
This guide shows you how to deploy Apache Cassandra across multiple data centers on Akamai cloud computing services. You'll configure multi-data center replication, integrate monitoring with Prometheus and Grafana, apply essential security settings, and learn operational procedures for scaling and maintenance.
2626

2727
## Why Multi-Data Center Cassandra on Akamai Cloud Computing Services
2828

@@ -41,7 +41,7 @@ This guide covers three deployment approaches:
4141
- Container-based deployment - Docker containers
4242
- Kubernetes orchestration - Linode Kubernetes Engine (LKE)
4343

44-
You'll configure multi-datacenter replication, implement monitoring with Prometheus and Grafana, secure your cluster with encryption and authentication, and establish backup procedures using Akamai Object Storage.
44+
Your final deployment will span multiple data centers, surface metrics to Prometheus and Grafana, enforce essential security controls, and maintain backups in Akamai Object Storage.
4545

4646
### Prerequisites and Preparation
4747

@@ -50,12 +50,10 @@ You'll configure multi-datacenter replication, implement monitoring with Prometh
5050
- Familiarity with database concepts and networking
5151
- Working knowledge of Apache Cassandra architecture (nodes, datacenters, replication, and gossip)
5252

53-
```command
5453
{{< note >}}
5554
**Choosing a database**
5655
Akamai supports multiple database engines. This guide includes embedded Cassandra installation and configuration steps to support users who choose Cassandra, as it requires more setup than other options. Before proceeding, take a moment to confirm that Cassandra aligns with your application’s requirements and your operational experience.
5756
{{</ note>}}
58-
```
5957

6058
## Architecture and Planning
6159

@@ -81,11 +79,13 @@ For optimal performance, attach NVMe-backed Block Storage volumes to Dedicated o
8179

8280
Cluster Topology:
8381

84-
Deploy a minimum of 3 nodes per data center (odd numbers preferred to prevent split-brain scenarios) with a replication factor of 3 for production environments. Designate 2+ stable nodes per data center as seed nodes. For capacity planning guidelines and topology best practices, see the [Cassandra Planning](https://cassandra.apache.org/doc/latest/cassandra/managing/operating/hardware.html) documentation.
82+
Use three nodes per data center (odd numbers reduce split-brain risk) with a replication factor of three. Assign at least two stable nodes in each data center as seed nodes. For capacity planning guidelines and topology best practices, see the [Cassandra Planning](https://cassandra.apache.org/doc/latest/cassandra/managing/operating/hardware.html) documentation.
8583

8684
### Multi-Region Network Planning
8785

88-
**Akamai Data Center Selection**
86+
#### Akamai Data Center Selection
87+
88+
Before configuring multi-datacenter replication, you need to choose Akamai data centers that meet Cassandra’s latency expectations for cross-region communication.
8989

9090
Choose data center pairs based on your latency requirements.
9191

@@ -102,7 +102,7 @@ Choose data center pairs based on your latency requirements.
102102
- **Asia-Pacific**: Singapore and Tokyo
103103
- **Global**: Newark, London, Singapore
104104

105-
Network Architecture
105+
#### Network Architecture
106106

107107
Configure the following Akamai networking features for cluster communication:
108108

@@ -111,7 +111,7 @@ Configure the following Akamai networking features for cluster communication:
111111
- **Private IP addresses**: Use private IPs for all node-to-node communication.
112112
- **Network interfaces**: 1Gbps+ interfaces (standard on Dedicated and Premium CPU instances).
113113

114-
For complete port requirements and security configuration, see the [Cassandra Security](https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html) documentation.
114+
Review the [Cassandra Security](https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html) documentation for complete port requirements and security configuration details, because these settings must be in place for your cluster to operate correctly.
115115

116116
### Software Versions
117117

@@ -140,7 +140,7 @@ This section covers provisioning the underlying infrastructure for your multi-da
140140

141141
### Compute Instances
142142

143-
Provision compute instances in each selected Akamai data center according to your planning worksheet.
143+
Provision compute instances according to the specifications in your planning worksheet. If you have not yet finalized instance types, OS selection, or deployment order, review the considerations below before proceeding.
144144

145145
#### Instance Selection Guidelines
146146

@@ -159,13 +159,15 @@ Use **Ubuntu 22.04 LTS** across all nodes for consistency in maintenance and tro
159159

160160
#### Example topology
161161

162+
Use the example below to validate your instance counts and regional layout while you visualize and finalize your planning worksheet.
163+
162164
- Newark: 3 nodes
163165
- London: 3 nodes
164166
- **Total**: 6 nodes across 2 data centers
165167

166168
### VLAN Architecture
167169

168-
VLANs provide private, lowlatency connectivity between Cassandra nodes.
170+
VLANs provide private, low-latency connectivity between Cassandra nodes, and the VLAN design captured in your planning worksheet defines how nodes communicate within and across regions.
169171

170172
#### Within a data center
171173

@@ -184,12 +186,16 @@ VLANs provide private, low‑latency connectivity between Cassandra nodes.
184186

185187
#### Example VLAN scheme
186188

189+
Use the example below to validate your VLAN assignments as you document your network design in the planning worksheet.
190+
187191
- Newark VLAN: 10.0.1.0/24
188192
- London VLAN: 10.0.2.0/24
189193

194+
Cross-data-center VLAN behavior was not tested in this environment but reflects required Akamai Cloud architecture.
195+
190196
### Block Storage
191197

192-
Cassandra requires dedicated storage for data and commit logs.
198+
Cassandra requires dedicated storage for data and commit logs, and documenting your block storage plan in the planning worksheet helps ensure each node is provisioned with the correct capacity and performance.
193199

194200
#### Volume specifications
195201

@@ -207,14 +213,14 @@ Cassandra requires dedicated storage for data and commit logs.
207213

208214
#### Example mount points
209215

216+
Use the example below to validate your directory layout as you document your block storage configuration in the planning worksheet.
217+
210218
- Data directory: `/var/lib/cassandra/data`
211219
- Commit log directory: `/var/lib/cassandra/commitlog` (if using separate volume)
212220

213-
Cross-data-center VLAN behavior was not tested in this environment but reflects required Akamai Cloud architecture.
214-
215221
### Cloud Firewall Strategy
216222

217-
Cassandra requires specific ports for internode communication, client access, and monitoring. For complete port reference and security considerations, see the [Cassandra Security documentation](https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html).
223+
Cassandra requires specific ports for internode communication, client access, and monitoring. Use the following firewall rules and connectivity requirements to validate your network design and document the necessary settings in your planning worksheet before deployment. For complete port reference and security considerations, see the [Cassandra Security documentation](https://cassandra.apache.org/doc/4.1/cassandra/operating/security.html).
218224

219225
#### Cluster communication
220226

@@ -261,8 +267,6 @@ Confirm network connectivity meets your requirements before installing Cassandra
261267
- Same‑region multi‑DC: < 50ms
262268
- Cross‑region: < 300ms
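The thresholds above can be checked from the command line. The following is a minimal sketch, not part of the tested procedure: the helper names `avg_rtt_ms` and `rtt_within_limit` are illustrative, and the parsing assumes the summary line printed by the Linux (iputils) `ping` utility.

```shell
# Extract the average RTT from the ping summary line, which has the form
# "rtt min/avg/max/mdev = a/b/c/d ms" (assumed iputils format).
avg_rtt_ms() {
    awk -F'/' '/^rtt/ { print $5 }'
}

# Return success when a measured round-trip time (ms) is below a limit (ms).
# awk performs the comparison because shell arithmetic is integer-only.
rtt_within_limit() {
    awk -v rtt="$1" -v limit="$2" 'BEGIN { exit !(rtt + 0 < limit + 0) }'
}
```

For example, `avg=$(ping -c 5 <peer-private-ip> | avg_rtt_ms)` followed by `rtt_within_limit "$avg" 300` succeeds only when the measured average is under the cross-region limit. Replace `<peer-private-ip>` with the private address of a node in the other data center.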
263269

264-
Cross-data-center firewall behavior and latency validation were not tested in this environment but reflect Akamai Cloud architecture.
265-
266270
### Infrastructure Readiness Checklist
267271

268272
Before proceeding to Cassandra installation, confirm:
@@ -289,6 +293,8 @@ Begin by installing and validating Cassandra on a single node first. Once the in
289293

290294
### Add the Apache Cassandra Repository
291295

296+
Add the Apache Cassandra repository as part of your installation preparation so each node can retrieve the correct 4.1.x packages; document this step in your planning worksheet to ensure consistency across regions.
297+
292298
#### Create a Repository Definition
293299

294300
```command
@@ -299,7 +305,8 @@ echo "deb https://debian.cassandra.apache.org 41x main" | sudo tee /etc/apt/sour
299305
```command
300306
sudo curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add -
301307
```
302-
{{< note>}}: Debian and Ubuntu are transitioning away from apt-key. For background about how APT verifies repository signatures or how to manage repository keys using modern gpg‑based methods, refer to the Debian documentation topic [SecureApt](https://wiki.debian.org/SecureApt). If this link becomes unavailable, search the Debian documentation for “SecureApt” or “APT repository key management”.
308+
{{< note>}}
309+
Debian and Ubuntu are transitioning away from apt-key. For background about how APT verifies repository signatures or how to manage repository keys using modern gpg‑based methods, refer to the Debian documentation topic [SecureApt](https://wiki.debian.org/SecureApt). If this link becomes unavailable, search the Debian documentation for “SecureApt” or “APT repository key management”.
303310
{{< /note >}}
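One possible shape of the gpg-based approach, sketched under stated assumptions: the keyring path, the source list path, and the helper names `cassandra_repo_line` and `install_cassandra_repo_key` are illustrative choices, not values taken from this guide. The `signed-by` option scopes the Apache key to the Cassandra repository instead of trusting it system-wide.

```shell
# Assumed locations; adjust both to match your own conventions.
CASSANDRA_KEYRING="/etc/apt/keyrings/apache-cassandra.asc"
CASSANDRA_LIST="/etc/apt/sources.list.d/cassandra.sources.list"

# Print an APT source line that pins the repository to a specific key file.
cassandra_repo_line() {
    local keyring="${1:-$CASSANDRA_KEYRING}"
    printf 'deb [signed-by=%s] https://debian.cassandra.apache.org 41x main\n' "$keyring"
}

# Download the Apache Cassandra KEYS file and write the repository definition.
# Hypothetical helper: review it before running, because it modifies system files.
install_cassandra_repo_key() {
    sudo mkdir -p "$(dirname "$CASSANDRA_KEYRING")" &&
    sudo curl -fsSL -o "$CASSANDRA_KEYRING" https://downloads.apache.org/cassandra/KEYS &&
    cassandra_repo_line | sudo tee "$CASSANDRA_LIST" > /dev/null
}
```

Running `cassandra_repo_line` on its own only prints the source line, which makes it easy to inspect before calling `install_cassandra_repo_key`.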
304311

305312
#### Update Package Lists
@@ -320,7 +327,7 @@ Cassandra automatically installs OpenJDK 11 as a dependency, so no separate Java
320327

321328
### Optional: Environment Checks
322329

323-
After installation, confirm your kernel and package versions match expected compatibility. Ubuntu 22.04.5 LTS is generally safe, but minimal images or custom build may vary.
330+
After installation, confirm your kernel and package versions match expected compatibility. Ubuntu 22.04.5 LTS is generally safe, but minimal images or custom builds may vary.
324331
```command
325332
sudo uname -r # Check kernel version
326333
sudo java -version # Confirm Java runtime
@@ -380,11 +387,9 @@ sudo nodetool status
380387
```
381388
What you should see:
382389

383-
- **Single-node environment:
384-
One node listed in the **UN (Up/Normal)** state.
390+
- **Single-node environment**: One node listed in the **UN (Up/Normal)** state.
385391

386-
- **Multi-node environment:
387-
You should see each node listed with its IP address and state Nodes may briefly appear as UJ (Up/Joining) while they bootstrap
392+
- **Multi-node environment**: Each node is listed with its IP address and state. Nodes may briefly appear as UJ (Up/Joining) while they bootstrap.
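
To turn the check above into a number you can compare against your planning worksheet, a small filter can count nodes per state. This is a sketch only: `count_nodes_in_state` is a hypothetical helper, and it assumes each node row in the `nodetool status` output begins with the two-letter status and state code (such as `UN` or `UJ`).

```shell
# Count nodes reported in a given two-letter state (for example UN or UJ)
# in `nodetool status` output read from standard input.
count_nodes_in_state() {
    awk -v state="$1" '$1 == state { n++ } END { print n + 0 }'
}
```

For example, `sudo nodetool status | count_nodes_in_state UN` prints the number of nodes that are up and in the normal state.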
388393

389394
#### If you expect multiple nodes but only see one
390395

@@ -412,17 +417,19 @@ This section focuses on Akamai‑specific requirements, and cloud networking con
412417

413418
### Upstream Cassandra Documentation
414419

415-
Cassandra maintains version‑specific documentation under `/doc/<version>/`. These links point to the Apache Cassandra 4.1 documentation. Content may evolve over time, but the URLs remain stable.
420+
Cassandra maintains version‑specific documentation under `/doc/<version>/`. These links point to the Apache Cassandra 4.1 documentation.
416421

417-
1. [Multi‑DC Cluster Initialization](https://cassandra.apache.org/doc/latest/cassandra/getting_started/initialize_cluster_multi_dc.html)
418-
2. [`cassandra.yaml` Configuration Reference](https://cassandra.apache.org/doc/4.1/cassandra/configuration/cass_yaml_file.html)
419-
3. [Snitch Architecture](https://cassandra.apache.org/doc/4.1/cassandra/architecture/snitch.html)
422+
Cassandra no longer provides a dedicated Multi-DC initialization walkthrough. Multi-datacenter behavior is not a separate procedure: it emerges from the replication model and the configuration of snitches and `NetworkTopologyStrategy`. New users should rely on the upstream conceptual and configuration documentation rather than expecting a step-by-step Multi-DC guide.
420423

421-
You can deploy Cassandra nodes within the same region or across multiple regions. Multi‑region deployments are typically used for geographic redundancy, global applications, or regulatory data‑residency requirements. For guidance on designing multi‑datacenter topologies and understanding how distance affects replication and consistency, refer to the official [Cassandra documentation](https://cassandra.apache.org/doc/4.1/cassandra/architecture/dynamo.html).
424+
1. [Multi-master Replication: Versioned Data and Tunable Consistency](https://cassandra.apache.org/doc/4.1/cassandra/architecture/dynamo.html#multi-master-replication:-versioned-data-and-tunable-consistency) explains the replication model that underpins multi-DC behavior.
425+
2. [`cassandra.yaml` Configuration Reference](https://cassandra.apache.org/doc/4.1/cassandra/configuration/cass_yaml_file.html) covers the settings used when configuring datacenter and rack placement.
426+
3. [Snitch Architecture](https://cassandra.apache.org/doc/4.1/cassandra/architecture/snitch.html) describes how Cassandra identifies datacenters and routes requests.
427+
428+
You can deploy Cassandra nodes within a single region or across multiple regions. Multi‑region deployments are typically used for geographic redundancy, global applications, or regulatory data‑residency requirements. For guidance on designing multi‑datacenter topologies and understanding how distance affects replication and consistency, refer to the official Cassandra documentation.
422429

423430
### Akamai‑Specific Multi‑DC Requirements
424431

425-
When deploying Cassandra on Akamai Cloud Computing Services:
432+
These requirements supplement the upstream Cassandra documentation and describe only the Akamai-specific networking and instance-level considerations for multi-datacenter deployments.
426433

427434
- Use each node’s private IP address from the Akamai instance details page.
428435
- Ensure nodes are reachable across regions using VPC peering or shared private networks.
@@ -532,13 +539,13 @@ These are the correct defaults.
532539

533540
If they are missing or incorrect, Cassandra may start but clients (including `cqlsh`) will not be able to connect to the node.
534541

535-
#### Save the File
542+
Save the file and exit the editor (in `nano`: Ctrl+X, then Y, then Enter).
536543

537544
### Setting Data Center and Rack Topology
538545

539546
Each node must declare its datacenter and rack so Cassandra can place replicas correctly in a multi-DC cluster. These values must match the snitch configuration you set earlier.
540547

541-
{{< note>}}: The `dc` and `rack` values in this file are your choice. They don't need to match your VM names or cloud regions-–they only need to be consistent across nodes in the same datacenter.
548+
{{< note>}} The `dc` and `rack` values in this file are your choice. They don't need to match your VM names or cloud regions; they only need to be **consistent across nodes** in the same datacenter.
542549
{{< /note >}}
543550

544551
#### Edit the Topology file on each Node
@@ -549,12 +556,15 @@ sudo nano /etc/cassandra/cassandra-rackdc.properties
549556

550557
#### Specify the Data Center and Rack
551558

559+
Cassandra reads the datacenter and rack labels from `/etc/cassandra/cassandra-rackdc.properties`. Edit this file and ensure the following entries are present or updated to match your deployment:
560+
552561
```properties
553562
dc=datacenter_name
554563
rack=rack_name
555564
```
565+
These values must align with the datacenter and rack names you use in your `NetworkTopologyStrategy` configuration.
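
To illustrate that alignment, the sketch below builds a `CREATE KEYSPACE` statement from datacenter names and replication factors. It is an illustration under assumptions, not a step from this guide: the helper names `keyspace_cql` and `rackdc_value`, the keyspace name `app_data`, and the datacenter names `newark` and `london` are all hypothetical, and the properties parser assumes plain `key=value` lines without surrounding spaces.

```shell
# Build a CREATE KEYSPACE statement for NetworkTopologyStrategy.
# Usage: keyspace_cql <keyspace> <dc>:<rf> [<dc>:<rf> ...]
keyspace_cql() {
    local keyspace="$1" options="'class': 'NetworkTopologyStrategy'" pair
    shift
    for pair in "$@"; do
        # Text before the colon is the datacenter name, text after it the RF.
        options="$options, '${pair%%:*}': ${pair##*:}"
    done
    printf 'CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {%s};\n' "$keyspace" "$options"
}

# Read a single value (for example dc or rack) from a rackdc properties file.
rackdc_value() {
    awk -F'=' -v key="$1" '$1 == key { print $2 }' "$2"
}
```

For example, `rackdc_value dc /etc/cassandra/cassandra-rackdc.properties` prints the datacenter name a node declares, and `keyspace_cql app_data newark:3 london:3` prints a statement you could pass to `cqlsh -e`. Datacenter names are case-sensitive, so the names in the statement must match the `dc` values exactly.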
556566

557-
{{< note>}}:
567+
{{< note>}}
558568
If Cassandra has ever been started on this node before you changed the datacenter name, the node will not start because the stored datacenter value won't match the new one.
559569
{{< /note >}}
560570

@@ -616,10 +626,10 @@ These values are read at startup, so any changes require restarting the node.
616626

617627
Start nodes in the correct order to ensure proper cluster formation.
618628

619-
{{< note >}}: The "primary seed" is the first IP address in your seed list. Start that node (whose IP appears first in the seed list) first before starting the other nodes. This ensures clean cluster formation.
629+
{{< note >}} The "primary seed" is the node whose IP address appears first in your seed list. Start that node before any other node to ensure clean cluster formation.
620630
{{< /note >}}
621631

622-
1.Start the primary seed in the first data center:
632+
1. Start the primary seed in the first data center:
623633

624634
```command
625635
sudo systemctl start cassandra
@@ -646,7 +656,7 @@ From any node:
646656
sudo nodetool status
647657
```
648658

649-
You should see:
659+
In the output you should see:
650660

651661
- each datacenter listed
652662
- nodes marked **UN (Up / Normal)**
@@ -697,3 +707,5 @@ For guidance on creating keyspaces, tables, and working with data, refer to the
697707
- [CQL Reference](https://cassandra.apache.org/doc/latest/cassandra/developing/cql/cql_singlefile.html)
698708
- [Data Definition (DDL) - includes replication strategies](https://cassandra.apache.org/doc/3.11.13/cassandra/cql/ddl.html)
699709
- [Schema Design](https://cassandra.apache.org/doc/latest/cassandra/developing/data-modeling/data-modeling_schema.html)
710+
711+
The [Cassandra documentation](https://cassandra.apache.org/_/index.html) covers these topics in depth.
