Migrate data with {sstable-sideloader}

You can use {sstable-sideloader} to migrate data to {astra-db} from {cass-reg}, {dse}, or {hcd}.

Prerequisites

Before you use {sstable-sideloader} for a migration, learn about the {sstable-sideloader} process and prepare your environments for {sstable-sideloader}.

Create snapshots

On each node in your origin cluster, use nodetool to create a snapshot of every keyspace and CQL table that you want to migrate.

Prepare to create snapshots

Due to {sstable-sideloader} limitations related to materialized views, secondary indexes, and encrypted data, you might need to modify the data model on your origin cluster to prepare for the migration. For more information, see Origin cluster requirements.
Optional: Before you create snapshots, consider running nodetool cleanup to remove data that no longer belongs to your nodes. This command is particularly useful after adding more nodes to a cluster because it helps ensure that each node only contains the data that it is responsible for, according to the current cluster configuration and partitioning scheme.

If you run nodetool cleanup before you take a snapshot, you can ensure that the snapshot only includes relevant data, potentially reducing the size of the snapshot. Smaller snapshots can lead to lower overall migration times and lower network transfer costs.

However, take adequate precautions before you run this command because the cleanup operations can introduce additional load on your origin cluster.

Run nodetool snapshot

Use nodetool snapshot to create snapshots for the tables that you want to migrate.

Don’t create snapshots of system tables or tables that you don’t want to migrate. The migration can fail if you attempt to migrate snapshots that don’t have a matching schema in the target database. {sstable-sideloader} ignores system keyspaces.

The structure of the nodetool snapshot command depends on the keyspaces and tables that you want to migrate.

Snapshot all keyspaces

Create a snapshot of all tables in all keyspaces:

nodetool snapshot -t SNAPSHOT_NAME

Replace the following:

SNAPSHOT_NAME: A descriptive name for the snapshot. Use the same snapshot name for each node’s snapshot; this makes it easier to programmatically upload the snapshots to the migration directory.

Snapshot specific keyspaces

Create a snapshot of all tables in one or more specified keyspaces:

Snapshot one keyspace

nodetool snapshot -t SNAPSHOT_NAME KEYSPACE_NAME

Snapshot multiple keyspaces

nodetool snapshot -t SNAPSHOT_NAME KEYSPACE_NAME_1 KEYSPACE_NAME_2

Replace the following:

SNAPSHOT_NAME: A descriptive name for the snapshot. Use the same snapshot name for each node’s snapshot; this makes it easier to programmatically upload the snapshots to the migration directory.
KEYSPACE_NAME: The name of the keyspace that you want to migrate.

To snapshot multiple keyspaces, pass a space-separated list of keyspace names. For example, customer_data product_data purchase_history specifies three keyspaces.

Snapshot specific tables

Create a snapshot of one or more specified tables:

Snapshot one table

nodetool snapshot -kt KEYSPACE_NAME.TABLE_NAME -t SNAPSHOT_NAME

Snapshot multiple tables

nodetool snapshot -kt KEYSPACE_NAME_1.TABLE_NAME_A KEYSPACE_NAME_1.TABLE_NAME_B KEYSPACE_NAME_2.TABLE_NAME_X -t SNAPSHOT_NAME

Replace the following:

KEYSPACE_NAME.TABLE_NAME: The name of the table that you want to migrate and the keyspace that it belongs to, separated by a period. For example, product_data.appliances specifies the appliances table in the product_data keyspace.

To snapshot multiple tables, pass a space-separated list of keyspace-table pairs. For example, product_data.appliances purchase_history.nevada purchase_history.wisconsin specifies the appliances table in the product_data keyspace and the nevada and wisconsin tables in the purchase_history keyspace.
SNAPSHOT_NAME: A descriptive name for the snapshot. Use the same snapshot name for each node’s snapshot; this makes it easier to programmatically upload the snapshots to the migration directory.

Verify snapshot creation with nodetool listsnapshots

Use nodetool listsnapshots to verify that the snapshots were created:

nodetool listsnapshots

Important

Snapshots have a specific directory structure, such as KEYSPACE_NAME/TABLE_NAME/snapshots/SNAPSHOT_NAME/…. {sstable-sideloader} relies on this fixed structure to properly interpret the SSTable components. Don’t modify the snapshot’s directory structure; this can cause your migration to fail.
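One way to confirm that a snapshot follows this layout before you upload it is to list the snapshot directories under the data directory. The following is a minimal sketch, not part of {sstable-sideloader}: the function name list_snapshot_dirs is hypothetical, and it assumes the layout described above, with the snapshot directory named exactly as the snapshot.

```shell
# Hypothetical helper: print every directory that belongs to a named snapshot.
# $1: Cassandra data directory without a trailing slash, such as /var/lib/cassandra/data
# $2: snapshot name that you passed to nodetool snapshot -t
list_snapshot_dirs() {
  # -path matches against the whole path, so * can span directory levels.
  find "$1" -type d -path "*/snapshots/$2" | sort
}
```

For example, list_snapshot_dirs /var/lib/cassandra/data SNAPSHOT_NAME prints one line for each table directory that contains the snapshot. A table that you expected to see but that is missing from the output was not included in the snapshot.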

Optional: Use `for` loops for snapshot creation and validation

If the nodes in your origin cluster are named in a predictable way (for example, dse0, dse1, dse2, etc.), you can use a for loop to simplify snapshot creation. For example:

Use a for loop to snapshot all keyspaces

To snapshot all keyspaces on each node, append the nodetool command to your for loop:

for i in 0 1 2; do ssh dse${i} nodetool snapshot -t SNAPSHOT_NAME; done

Use a for loop to snapshot specific keyspaces

To snapshot one keyspace on each node, append the nodetool command to your for loop:

for i in 0 1 2; do ssh dse${i} nodetool snapshot -t SNAPSHOT_NAME KEYSPACE_NAME; done

To snapshot multiple specific keyspaces on each node, use commas (not spaces) to separate the keyspace names:

for i in 0 1 2; do ssh dse${i} nodetool snapshot -t SNAPSHOT_NAME KEYSPACE_NAME_1,KEYSPACE_NAME_2; done

Use a for loop to snapshot specific tables

To snapshot one table on each node, append the nodetool command to your for loop:

for i in 0 1 2; do ssh dse${i} nodetool snapshot -kt KEYSPACE_NAME.TABLE_NAME -t SNAPSHOT_NAME; done

To snapshot multiple specific tables on each node, use commas (not spaces) to separate the keyspace-table pairs:

for i in 0 1 2; do ssh dse${i} nodetool snapshot -kt KEYSPACE_NAME_1.TABLE_NAME_A,KEYSPACE_NAME_1.TABLE_NAME_B -t SNAPSHOT_NAME; done

You can use the same for loop structure to verify that each snapshot was successfully created:

for i in 0 1 2; do ssh dse${i} nodetool listsnapshots; done
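The one-line loops above keep going if a command fails on one node, so a failed snapshot can go unnoticed. The following sketch wraps the same pattern in a function that reports which nodes failed. The function name run_on_nodes and the RUN_REMOTE variable are hypothetical; RUN_REMOTE defaults to ssh and exists only so that you can try the function without real hosts.

```shell
# Hypothetical helper: run one command on every node and report failures.
# $1: command to run remotely, quoted as a single string
# Remaining arguments: node host names
run_on_nodes() {
  cmd="$1"
  shift
  failed=""
  for node in "$@"; do
    # RUN_REMOTE defaults to ssh; override it to dry-run the loop.
    if ! ${RUN_REMOTE:-ssh} "$node" "$cmd"; then
      failed="$failed $node"
    fi
  done
  if [ -n "$failed" ]; then
    echo "Command failed on:$failed" >&2
    return 1
  fi
}
```

For example, run_on_nodes "nodetool snapshot -t SNAPSHOT_NAME" dse0 dse1 dse2 takes the snapshot on all three nodes and returns a non-zero status if any node fails.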

Configure the target database

To prepare your target database for the migration, you must record the schema for each table in your origin cluster that you want to migrate, re-create these schemas in your target database, and then set environment variables required to connect to your database.

Warning

For the migration to succeed, your target database must meet the schema requirements described in this section. Additionally, your snapshots must contain compatible data and directories, as described in Origin cluster requirements and Create snapshots. For example, {astra-db} doesn’t support materialized views, and {sstable-sideloader} cannot migrate encrypted data.

However, indexes don’t need to match. You can define indexes in your target database independently from the origin cluster because {sstable-sideloader} ignores Storage Attached Indexes (SAI) defined on the origin cluster. During the migration, {sstable-sideloader} automatically populates any SAI defined in your target database, even if those SAI weren’t present in your origin cluster.

Get the following schema properties for each table that you want to migrate:
- Exact keyspace name.
- Exact table name.
- Exact column names, data types, and the order in which they appear in the table creation DDL.
- Exact primary key definition as defined in your origin cluster, including the partition key, clustering columns, and ascending/descending ordering clauses. You must define partition key columns and clustering columns in the exact order that they are defined on your origin cluster.
  
  To retrieve schema properties, you can run the DESCRIBE KEYSPACE command on your origin cluster:
  DESCRIBE KEYSPACE_NAME;
  Replace KEYSPACE_NAME with the name of the keyspace that contains the tables you want to migrate, such as DESCRIBE smart_home;.
  
  Then, get the schema properties from the result:
  CREATE TABLE smart_home.sensor_readings ( device_id UUID, room_id UUID, reading_type TEXT, reading_value DOUBLE, reading_timestamp TIMESTAMP, PRIMARY KEY (device_id, room_id, reading_timestamp) ) WITH CLUSTERING ORDER BY (room_id ASC, reading_timestamp DESC);
Re-create the schemas in your target database:
1. In the {astra-ui-link} navigation menu, click Databases, and then click the name of your {astra-db} database.
2. Create a keyspace with the exact same name as your origin cluster’s keyspace.
3. In your database’s {cql-console}, create tables with the exact same names and schemas as your origin cluster.
  
  {astra-db} rejects or ignores some table properties, such as compaction strategy. See astra-db-serverless:databases:database-limits.adoc for more information.

In your terminal, set environment variables for your target database:

export dbID=DATABASE_ID
export token=APPLICATION_TOKEN

Replace the following:

DATABASE_ID: The database ID of your target {astra-db} database.
APPLICATION_TOKEN: An application token with a role that has the required permissions for {sstable-sideloader}, which are {create-db-permission} and {view-db-permission}. You can use a built-in role, such as the {database-administrator-role} role, or a custom role with the required permissions.

Tip

Later, you will add another environment variable for the migration ID.

The curl commands in this guide assume that you have set environment variables for token, database ID, and migration ID. Running the commands without these environment variables causes error messages like <a href="/v2/databases/migrations/">Moved Permanently</a> and 404 page not found.

Additionally, the curl commands use jq to format the JSON responses. If you don’t have jq installed, remove | jq . from the end of each command.
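Because the curl commands fail with misleading errors when a variable is unset, it can help to check the variables before each call. The following is a minimal sketch; the function name require_env is hypothetical and is not part of any DataStax tooling.

```shell
# Hypothetical guard: fail early if any named variable is unset or empty.
# Arguments: names of the variables to check, such as token dbID migrationID
require_env() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, run require_env token dbID before you initialize the migration, and require_env token dbID migrationID before the later calls.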

Initialize the migration

Use the {devops-api} to initialize the migration and get your migration directory path and credentials.

To learn more about the initialization process, see About {sstable-sideloader}: Initialize a migration.

The initialization process can take several minutes to complete, especially if the migration bucket doesn’t already exist.

Get a migration ID

In your terminal, use the {devops-api} to initialize the data migration:

curl -X POST \
    -H "Authorization: Bearer ${token}" \
    https://api.astra.datastax.com/v2/databases/${dbID}/migrations/initialize \
    | jq .

Get the migrationID from the response:

{
  "migrationID": "272eac1d-df8e-4d1b-a7c6-71d5af232182",
  "dbID": "b7e7761f-6f7f-4116-81a5-e8eefcf0cc1d",
  "status": "Initializing",
  "progressInfo": "",
  "uploadBucketDir": "",
  "uploadCredentials": {
    "name": "",
    "keys": null,
    "credentialExpiration": null
  },
  "expectedCleanupTime": "2025-03-04T15:14:38Z"
}

The migrationID is a unique identifier (UUID) for the migration.

The response also includes the migration status. You will refer to this status multiple times throughout the migration process.

Assign the migration ID to an environment variable:
```
export migrationID=MIGRATION_ID
```
Replace MIGRATION_ID with the migrationID returned by the initialize endpoint.

Check the migration status to verify initialization

Check the migration status:

sideloader:partial$check-status.adoc
Check the status field in the response:
- "status": "ReceivingFiles": Initialization is complete and your upload credentials are available. Proceed to the next step.
- "status": "Initializing": The migration is still initializing. Wait a few minutes before you check the status again.
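If you prefer not to repeat the status check by hand, you can poll for the status. The following sketch assumes that token, dbID, and migrationID are set, that curl and jq are installed, and that the status is read from the same migration endpoint that this guide uses elsewhere. The function names fetch_status and wait_for_status are hypothetical. The loop only waits for one target status, so it runs until the attempts are exhausted if the migration ends in any other state.

```shell
# Fetch the current migration status string from the DevOps API.
fetch_status() {
  curl -s -X GET \
    -H "Authorization: Bearer ${token}" \
    "https://api.astra.datastax.com/v2/databases/${dbID}/migrations/${migrationID}" \
    | jq -r '.status'
}

# Hypothetical helper: poll until the status equals the target value.
# $1: target status, such as ReceivingFiles
# $2: seconds between checks (default 60)
# $3: maximum number of checks (default 30)
wait_for_status() {
  target="$1"
  interval="${2:-60}"
  attempts="${3:-30}"
  while [ "$attempts" -gt 0 ]; do
    status=$(fetch_status)
    echo "status: $status"
    if [ "$status" = "$target" ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  return 1
}
```

For example, wait_for_status ReceivingFiles checks once a minute for up to 30 minutes and returns as soon as initialization is complete.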

Get migration directory path and upload credentials

Get your migration directory path and upload credentials from the response. You need these values to upload snapshots to the migration directory.

Get AWS credentials from MigrationStatus

Securely store the uploadBucketDir, accessKeyID, secretAccessKey, and sessionToken from the response:

{
  "migrationID": "272eac1d-df8e-4d1b-a7c6-71d5af232182",
  "dbID": "b7e7761f-6f7f-4116-81a5-e8eefcf0cc1d",
  "status": "ReceivingFiles",
  "progressInfo": "",
  "uploadBucketDir": "s3://ds-mig-b7e7761f-6f7f-4116-81a5-e8eefcf0cc1d/272eac1d-df8e-4d1b-a7c6-71d5af232182/sstables/",
  "uploadCredentials": {
    "name": "sessionToken",
    "keys": {
      "accessKeyID": "ASXXXXXXXXXXXXXXXXXX",
      "secretAccessKey": "2XXXXXXXXXXXXXXXWqcdV519ZubYbyfuNxbZg1Rw",
      "sessionToken": "XXXXXXXXXX"
    },
    "credentialExpiration": "2024-01-18T19:45:09Z",
    "hint": "\nexport AWS_ACCESS_KEY_ID=ASXXXXXXXXXXXXXXXXXX\nexport AWS_SECRET_ACCESS_KEY=2XXXXXXXXXXXXXXXWqcdV519ZubYbyfuNxbZg1Rw\nexport AWS_SESSION_TOKEN=XXXXXXXXXXXXXX\n"
  },
  "expectedCleanupTime": "2024-01-25T15:14:38Z"
}

uploadBucketDir is the migration directory URL. Note the trailing slash.

uploadCredentials contains the AWS credentials that authorize uploads to the migration directory, namely accessKeyID, secretAccessKey, and sessionToken.

Important

The sessionToken expires after one hour. If your total migration takes longer than one hour, generate new credentials, and then resume the migration with the fresh credentials.

If you use automation to handle {sstable-sideloader} migrations, you might need to script a pause every hour so you can generate new credentials without unexpectedly interrupting the migration.
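For scripted uploads, one option is to compare the credentialExpiration value from the response against the current time before you start each upload. The following is a sketch only: the function name seconds_until_expiry is hypothetical, and it assumes GNU date, which accepts the -d option. BSD and macOS date use different options.

```shell
# Hypothetical helper: print the seconds remaining before the credentials expire.
# $1: credentialExpiration value from the response, such as 2024-01-18T19:45:09Z
# $2: optional current time in epoch seconds (defaults to now)
seconds_until_expiry() {
  # Assumes GNU date; -d parses the ISO 8601 timestamp.
  expiry=$(date -u -d "$1" +%s) || return 1
  now="${2:-$(date -u +%s)}"
  echo $((expiry - now))
}
```

A script can then refresh the credentials when the returned value drops below a threshold that you choose, such as 600 seconds.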

Get Google Cloud credentials from MigrationStatus

Find the uploadBucketDir and the uploadCredentials in the response:

{
  "migrationID": "272eac1d-df8e-4d1b-a7c6-71d5af232182",
  "dbID": "b7e7761f-6f7f-4116-81a5-e8eefcf0cc1d",
  "status": "ReceivingFiles",
  "progressInfo": "",
  "uploadBucketDir": "gs://ds-mig-b7e7761f-6f7f-4116-81a5-e8eefcf0cc1d/272eac1d-df8e-4d1b-a7c6-71d5af232182/sstables/",
  "uploadCredentials": {
    "name": "TYPE_GOOGLE_CREDENTIALS_FILE",
    "keys": {
      "file": "CREDENTIALS_FILE"
    },
    "credentialExpiration": "2024-08-07T18:51:39Z"
  },
  "expectedCleanupTime": "2024-08-14T15:14:38Z"
}

uploadBucketDir is the migration directory URL. Note the trailing slash.

uploadCredentials contains a base64-encoded file containing Google Cloud credentials that authorize uploads to the migration directory.

Pipe the Google Cloud credentials file to a creds.json file:

curl -X GET \
    -H "Authorization: Bearer ${token}" \
    https://api.astra.datastax.com/v2/databases/${dbID}/migrations/${migrationID} \
    | jq -r '.uploadCredentials.keys.file' \
    | base64 -d > creds.json

Securely store the uploadBucketDir and creds.json.

Get Azure credentials from MigrationStatus

Securely store the uploadBucketDir and urlSignature from the response:

{
  "migrationID": "456ca4a9-0551-46c4-b8bb-90fcd136a0c3",
  "dbID": "ccefd141-8fda-4e4d-a746-a102a96657bc",
  "status": "ReceivingFiles",
  "progressInfo": "",
  "uploadBucketDir": "https://muztx5cqmp3jhe3j2guebksz.blob.core.windows.net/mig-upload-456ca4a9-0551-46c4-b8bb-90fcd136a0c3/sstables/",
  "uploadCredentials": {
    "name": "URL signature",
    "keys": {
      "url": "https://UPLOAD_BUCKET_DIR/?si=AZURE_SAS_TOKEN",
      "urlSignature": "si=AZURE_SAS_TOKEN"
    },
    "credentialExpiration": "2025-04-02T15:14:31Z"
  },
  "expectedCleanupTime": "2025-03-04T15:14:38Z"
}

uploadBucketDir is the migration directory URL. Note the trailing slash.

uploadCredentials contains url and urlSignature keys that represent an Azure Shared Access Signature (SAS) token. You need the urlSignature to upload snapshots to the migration directory. In the preceding example, these strings are truncated for readability.

Upload snapshots to the migration directory

Use your cloud provider’s CLI and your upload credentials to upload snapshots for each origin node into the migration directory.

Important

Be aware of the following requirements for the upload commands:

- You must include the asterisk (*) character as shown in the commands, otherwise the commands won’t work properly.
- With the exception of the leading :// in the migration directory path, your paths must not include double slashes (//).
- Use the CLI that corresponds with your target database’s cloud provider. For more information, see sideloader:prepare-sideloader.adoc.
- These commands assume that you installed the cloud provider’s CLI on the nodes in your origin cluster. For more information, see sideloader:prepare-sideloader.adoc.
- You might need to modify these commands depending on your environment, node names, directory structures, and other variables.
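The double-slash requirement is easy to violate when you join the migration directory, which ends with a slash, to a node name. The following sketch builds the destination path for one node and rejects any double slash other than the one in the scheme. The function name node_upload_path is hypothetical.

```shell
# Hypothetical helper: join the migration directory and a node name,
# then reject any double slash other than the one in the scheme (://).
# $1: migration directory (uploadBucketDir), with or without a trailing slash
# $2: node name
node_upload_path() {
  dir="${1%/}"          # drop one trailing slash, if present
  path="$dir/$2"
  rest="${path#*://}"   # everything after the scheme
  case "$rest" in
    *//*)
      echo "Double slash in path: $path" >&2
      return 1
      ;;
  esac
  echo "$path"
}
```

For example, node_upload_path "MIGRATION_DIR" dse0 prints the destination to pass to your provider’s upload command, or returns a non-zero status if the joined path is invalid.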

Upload snapshots to AWS

Set environment variables for the AWS credentials that were generated when you initialized the migration:

export AWS_ACCESS_KEY_ID=ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY=SECRET_ACCESS_KEY
export AWS_SESSION_TOKEN=SESSION_TOKEN

Use the AWS CLI to upload one snapshot from one node into the migration directory:

du -sh CASSANDRA_DATA_DIR/KEYSPACE_NAME/*/snapshots/*SNAPSHOT_NAME*; \
aws s3 sync --only-show-errors --exclude '*' --include '*/snapshots/SNAPSHOT_NAME*' CASSANDRA_DATA_DIR/ MIGRATION_DIRNODE_NAME

Replace the following:

sideloader:partial$command-placeholders-common.adoc

Example: Upload a snapshot with AWS CLI

# Set environment variables
export AWS_ACCESS_KEY_ID=XXXXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXXXXXX
export AWS_SESSION_TOKEN=XXXXXXXXXX

# Upload "sensor_readings" snapshot from "dse0" node
du -sh /var/lib/cassandra/data/smart_home/*/snapshots/*sensor_readings*; \
aws s3 sync --only-show-errors --exclude '*' --include '*/snapshots/sensor_readings*' /var/lib/cassandra/data/ s3://ds-mig-b7e7761f-6f7f-4116-81a5-e8eefcf0cc1d/272eac1d-df8e-4d1b-a7c6-71d5af232182/sstables/dse0

Monitor upload progress:
1. Use the AWS CLI to get a list of cloud storage keys for the files that have been successfully uploaded to the migration directory:
  aws s3 ls --human-readable --summarize --recursive MIGRATION_DIR
  Replace MIGRATION_DIR with the uploadBucketDir that was generated when you initialized the migration.
2. Compare the returned list against the files in your snapshot directory. When the lists match, the upload is complete.
  
  You can potentially increase upload speeds by adjusting the max_concurrent_requests, multipart_threshold, and multipart_chunksize parameters in your AWS CLI S3 configuration. However, upload time primarily depends on the snapshot size, network throughput from your origin cluster to the migration bucket, and whether the origin cluster and migration bucket are in the same region.
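Comparing the two lists by eye is error-prone for large snapshots. The following sketch assumes that you have already saved both lists to text files with one relative path per line, in the same form on both sides. The function name missing_uploads is hypothetical.

```shell
# Hypothetical helper: print paths that are in the local list but not in the uploaded list.
# $1: file with one local snapshot file path per line
# $2: file with one uploaded file path per line
# Prints nothing when every local path has a matching upload.
missing_uploads() {
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from a file.
  # grep exits non-zero when it prints nothing, which is the success case here.
  grep -Fxv -f "$2" "$1" || true
}
```

How you build the two lists depends on your provider’s listing output. For example, the keys returned by aws s3 ls --recursive include the migration ID and node name prefix, which you would strip before comparing them to paths relative to your data directory.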

Repeat the upload process for each snapshot (SNAPSHOT_NAME) and node (NODE_NAME) in your origin cluster.

If your credentials expire, see Get new upload credentials.

Tip

Use a for loop to simplify snapshot uploads

If the nodes in your origin cluster have predictable names (for example, dse0, dse1, and dse2), then you can use a for loop to streamline the execution of the upload commands. For example:

# Set environment variables
export AWS_ACCESS_KEY_ID=ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY=SECRET_ACCESS_KEY
export AWS_SESSION_TOKEN=SESSION_TOKEN

# Loop over the sync command for all nodes
for i in 0 1 2; do ssh dse${i} \
"du -sh CASSANDRA_DATA_DIR/KEYSPACE_NAME/*/snapshots/*SNAPSHOT_NAME*; \
aws s3 sync --only-show-errors --exclude '*' --include '*/snapshots/SNAPSHOT_NAME*' CASSANDRA_DATA_DIR/ MIGRATION_DIRdse${i}" & done

sideloader:partial$staged-snapshots-need-import-ph.adoc

sideloader:partial$idle-migration-directories-note.adoc

Upload snapshots to Google Cloud Storage

Authenticate to Google Cloud with the creds.json file that you created when you initialized the migration:
```
gcloud auth activate-service-account --key-file=creds.json
```
If necessary, modify the --key-file path to match the location of your creds.json file, such as --key-file=~/.gcloud_credentials/creds.json.

You can also use gcloud auth login --cred-file creds.json.

Use gsutil to upload one snapshot from one node into the migration directory:

gsutil -m rsync -r -d CASSANDRA_DATA_DIR/KEYSPACE_NAME/**/snapshots/SNAPSHOT_NAME/ MIGRATION_DIRNODE_NAME/

Replace the following:

sideloader:partial$command-placeholders-common.adoc

Example: Upload a snapshot with gcloud and gsutil

# Authenticate
gcloud auth activate-service-account --key-file=creds.json

# Upload "sensor_readings" snapshot from "dse0" node
gsutil -m rsync -r -d /var/lib/cassandra/data/smart_home/**/snapshots/sensor_readings/ gs://ds-mig-b7e7761f-6f7f-4116-81a5-e8eefcf0cc1d/272eac1d-df8e-4d1b-a7c6-71d5af232182/sstables/dse0

Monitor upload progress:
1. Use gsutil to get a list of objects that have been successfully uploaded to the migration directory:
  gsutil ls -r MIGRATION_DIR
  Replace MIGRATION_DIR with the uploadBucketDir that was generated when you initialized the migration.
2. Compare the returned list against the files in your snapshot directory. When the lists match, the upload is complete.
  
  The -m flag in gsutil -m rsync enables parallel synchronization, which can improve upload speed. However, upload time primarily depends on the snapshot size, network throughput from your origin cluster to the migration bucket, and whether the origin cluster and migration bucket are in the same region.

Repeat the upload process for each snapshot (SNAPSHOT_NAME) and node (NODE_NAME) in your origin cluster.

Tip

Use a for loop to simplify snapshot uploads

If the nodes in your origin cluster have predictable names (for example, dse0, dse1, and dse2), then you can use a for loop to streamline the execution of the gsutil rsync commands. For example:

for i in 0 1 2; do ssh dse${i} \
"du -sh CASSANDRA_DATA_DIR/KEYSPACE_NAME/*/snapshots/*SNAPSHOT_NAME*; \
gsutil -m rsync -r -d CASSANDRA_DATA_DIR/KEYSPACE_NAME/**/snapshots/SNAPSHOT_NAME/ MIGRATION_DIRdse${i}" & done

sideloader:partial$staged-snapshots-need-import-ph.adoc

sideloader:partial$idle-migration-directories-note.adoc

Upload snapshots to Azure

Set environment variables for the following values:
- AZURE_SAS_TOKEN: The urlSignature key that was generated when you initialized the migration.
- CASSANDRA_DATA_DIR: The absolute file system path to where {cass-short} data is stored on the node, including the trailing slash. For example, /var/lib/cassandra/data/.
- SNAPSHOT_NAME: The name of the snapshot backup that you created with nodetool snapshot.
- MIGRATION_DIR: The entire uploadBucketDir value that was generated when you initialized the migration, including the trailing slash.
- NODE_NAME: The host name of the node that your snapshots are from. It is important to use the specific node name to ensure that each node has a unique directory in the migration bucket.
```
export AZURE_SAS_TOKEN="URL_SIGNATURE"
export CASSANDRA_DATA_DIR="CASSANDRA_DATA_DIR"
export SNAPSHOT_NAME="SNAPSHOT_NAME"
export MIGRATION_DIR="MIGRATION_DIR"
export NODE_NAME="NODE_NAME"
```

Use the Azure CLI to upload one snapshot from one node into the migration directory:

for dir in $(find "$CASSANDRA_DATA_DIR" -type d -path "*/snapshots/${SNAPSHOT_NAME}*"); do
    REL_PATH="${dir#"$CASSANDRA_DATA_DIR"}"  # Remove the base path
    DEST_PATH="${MIGRATION_DIR}${NODE_NAME}/${REL_PATH}/?${AZURE_SAS_TOKEN}"

    azcopy sync "$dir" "$DEST_PATH" --recursive
done

Monitor upload progress:
1. Use the Azure CLI to get the current contents of the migration directory:
  azcopy list "${MIGRATION_DIR}?${AZURE_SAS_TOKEN}"
2. Compare the returned list against the files in your snapshot directory. When the lists match, the upload is complete.
  
  Upload time primarily depends on the snapshot size, network throughput from your origin cluster to the migration bucket, and whether the origin cluster and migration bucket are in the same region.
Repeat the upload process for each snapshot and node in your origin cluster. Be sure to change the SNAPSHOT_NAME and NODE_NAME environment variables as needed.

sideloader:partial$staged-snapshots-need-import-ph.adoc

sideloader:partial$idle-migration-directories-note.adoc

Import data

After you completely upload snapshots for each origin node, import the data into your target database.

Data import is a multi-step operation that requires complete success. If one step fails, then the entire import operation stops and the migration fails.

To learn more about the data import process, see About {sstable-sideloader}: Import data.

Warning

Before you start the import process, make sure all snapshots are completely uploaded. For commands to monitor upload progress and compare uploaded data against the original snapshots, see Upload snapshots to the migration directory.
If necessary, you can pause or abort the migration during the import process. sideloader:partial$no-return.adoc

Use the {devops-api} to launch the data import:

curl -X POST \
    -H "Authorization: Bearer ${token}" \
    https://api.astra.datastax.com/v2/databases/${dbID}/migrations/${migrationID}/launch \
    | jq .

Although this call returns immediately, the import process takes time.

Check the migration status periodically:

sideloader:partial$check-status.adoc
Check the status field in the response:
- "status": "ImportInProgress": The data is still being imported. Wait a few minutes before you check the status again.
- "status": "MigrationDone": The import is complete, and you can proceed to Validate the migrated data.
If the migration takes more than a few days, manually reschedule the cleanup to avoid automatic cleanup.
If the migration fails, see sideloader:troubleshoot-sideloader.adoc.

Validate the migrated data

sideloader:partial$validate.adoc
