|
| 1 | +A [*Classic Bulk Load Migration*]({% link molt/migration-approach-classic-bulk-load.md %}) is the simplest way of [migrating data to CockroachDB]({% link molt/migration-overview.md %}). In this approach, you stop application traffic to the source database and migrate data to the target cluster using [MOLT Fetch]({% link molt/molt-fetch.md %}) during a **significant downtime window**. Application traffic is then cut over to the target after schema finalization and data verification. |
| 2 | + |
| 3 | +- All source data is migrated to the target [at once]({% link molt/migration-considerations-granularity.md %}). |
| 4 | + |
| 5 | +- This approach does not utilize [continuous replication]({% link molt/migration-considerations-replication.md %}). |
| 6 | + |
| 7 | +- [Rollback]({% link molt/migration-considerations-rollback.md %}) is manual, but in most cases it's simple, as the source database is preserved and write traffic begins on the target all at once. If you wish to roll back before the target has received any writes that are not present on the source database, nothing needs to be done. If you wish to roll back after the target has received writes that are not present on the source database, you must manually replicate these new rows on the source. |
| 8 | + |
| 9 | +This approach is best for small databases (<100 GB), internal tools, dev/staging environments, and production environments that can handle business disruption. It's a simple approach that guarantees full data consistency and is easy to execute with limited resources, but it can only be performed if your system can handle significant downtime. |
| 10 | + |
| 11 | +This page describes an example scenario. While the commands provided can be copied and pasted, they may need to be altered or reconsidered to suit the needs of your specific environment. |
| 12 | + |
| 13 | +<div style="text-align: center;"> |
| 14 | +<img src="{{ 'images/molt/molt_classic_bulk_load_flow.svg' | relative_url }}" alt="Classic Bulk Load Migration flow" style="max-width:100%" /> |
| 15 | +</div> |
| 16 | + |
| 17 | +## Example scenario |
| 18 | + |
| 19 | +You have a small (50 GB) database that provides the data store for a web application. You want to migrate the entirety of this database to a new CockroachDB cluster. You schedule a maintenance window for Saturday from 2 AM to 6 AM, and announce it to your users several weeks in advance. |
| 20 | + |
| 21 | +The application runs on a Kubernetes cluster. |
| 22 | + |
| 23 | +**Estimated system downtime:** 4 hours. |
| 24 | + |
| 25 | +## Before the migration |
| 26 | + |
| 27 | +- Install the [MOLT (Migrate Off Legacy Technology)]({% link molt/molt-fetch-installation.md %}#installation) tools. |
| 28 | +- Review the [MOLT Fetch]({% link molt/molt-fetch-best-practices.md %}) documentation. |
| 29 | +- [Develop a migration plan]({% link molt/migration-strategy.md %}#develop-a-migration-plan) and [prepare for the migration]({% link molt/migration-strategy.md %}#prepare-for-migration). |
| 30 | +- **Recommended:** Perform a dry run of this full set of instructions in a development environment that closely resembles your production environment. This can help you get a realistic sense of the time and complexity it requires. |
| 31 | +- Announce the maintenance window to your users. |
| 32 | +- Understand the prerequisites and limitations of the MOLT tools: |
| 33 | + |
| 34 | +<section class="filter-content" markdown="1" data-scope="oracle"> |
| 35 | +{% include molt/oracle-migration-prerequisites.md %} |
| 36 | +</section> |
| 37 | + |
| 38 | +{% include molt/molt-limitations.md %} |
| 39 | + |
| 40 | +## Step 1: Prepare the source database |
| 41 | + |
| 42 | +In this step, you will: |
| 43 | + |
| 44 | +- [Create a dedicated migration user on your source database](#create-migration-user-on-source-database). |
| 45 | + |
| 46 | +{% include molt/migration-prepare-database.md %} |
| 47 | + |
| 48 | +## Step 2: Prepare the target database |
| 49 | + |
| 50 | +In this step, you will: |
| 51 | + |
| 52 | +- [Provision and run a new CockroachDB cluster](#provision-a-cockroachdb-cluster). |
| 53 | +- [Define the tables on the target cluster](#define-the-target-tables) to match those on the source. |
| 54 | +- [Create a SQL user on the target cluster](#create-the-sql-user) with the necessary write permissions. |
| 55 | + |
| 56 | +### Provision a CockroachDB cluster |
| 57 | + |
| 58 | +Use one of the following options to create and run a new CockroachDB cluster. This is your migration **target**. |
| 59 | + |
| 60 | +#### Option 1: Create a secure cluster locally |
| 61 | + |
| 62 | +If you have the CockroachDB binary installed locally, you can manually deploy a multi-node, self-hosted CockroachDB cluster on your local machine. |
| 63 | + |
| 64 | +Learn how to [deploy a CockroachDB cluster locally]({% link {{ site.versions["stable"] }}/secure-a-cluster.md %}). |
| 65 | + |
| 66 | +#### Option 2: Create a CockroachDB Self-Hosted cluster on AWS |
| 67 | + |
| 68 | +You can manually deploy a multi-node, self-hosted CockroachDB cluster on Amazon's AWS EC2 platform, using AWS's managed load-balancing service to distribute client traffic. |
| 69 | + |
| 70 | +Learn how to [deploy a CockroachDB cluster on AWS]({% link {{ site.versions["stable"] }}/deploy-cockroachdb-on-aws.md %}). |
| 71 | + |
| 72 | +#### Option 3: Create a CockroachDB Cloud cluster |
| 73 | + |
| 74 | +CockroachDB Cloud is a fully-managed service run by Cockroach Labs, which simplifies the deployment and management of CockroachDB. |
| 75 | + |
| 76 | +[Sign up for a CockroachDB Cloud account](https://cockroachlabs.cloud) and [create a cluster]({% link cockroachcloud/create-your-cluster.md %}) using [trial credits]({% link cockroachcloud/free-trial.md %}). |
| 77 | + |
| 78 | +### Define the target tables |
| 79 | + |
| 80 | +{% include molt/migration-prepare-schema.md %} |
| 81 | + |
| 82 | +### Create the SQL user |
| 83 | + |
| 84 | +{% include molt/migration-create-sql-user.md %} |
| 85 | + |
| 86 | +## Step 3: Stop application traffic |
| 87 | + |
| 88 | +With both the source and target databases prepared for the data load, it's time to stop application traffic to the source. At the start of the maintenance window, scale the application's Kubernetes deployment down to zero replicas. |
| 89 | + |
| 90 | +{% include_cached copy-clipboard.html %} |
| 91 | +~~~shell |
| 92 | +kubectl scale deployment app --replicas=0 |
| 93 | +~~~ |
| 94 | + |
| 95 | +{{ site.data.alerts.callout_danger }} |
| 96 | +Application downtime begins now. |
| 97 | + |
| 98 | +It is strongly recommended that you perform a dry run of this migration in a test environment. This will allow you to practice using the MOLT tools in real time, and it will give you an accurate sense of how long application downtime might last. |
| 99 | +{{ site.data.alerts.end }} |
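If the deployment's original replica count is not recorded elsewhere, one option is to save it before scaling down so that it can be restored at cutover. The following is a minimal sketch and is not part of the MOLT tooling: the `stop_app` helper and the `REPLICA_FILE` path are hypothetical names, and the deployment name `app` is taken from the example above.

```shell
# Hypothetical location for the saved replica count; any durable path works.
REPLICA_FILE="${REPLICA_FILE:-./app-replicas.txt}"

# Record the deployment's current replica count, then scale it to zero.
# Argument: the name of the Kubernetes deployment.
stop_app() {
  kubectl get deployment "$1" -o jsonpath='{.spec.replicas}' > "$REPLICA_FILE" || return 1
  kubectl scale deployment "$1" --replicas=0
}
```

For example, `stop_app app` writes the current count to `./app-replicas.txt` before scaling the deployment to zero, so the same value can be read back when traffic resumes.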
| 100 | + |
| 101 | +## Step 4: Load data into CockroachDB |
| 102 | + |
| 103 | +In this step, you will: |
| 104 | + |
| 105 | +- [Configure MOLT Fetch with the flags needed for your migration](#configure-molt-fetch). |
| 106 | +- [Run MOLT Fetch](#run-molt-fetch). |
| 107 | +- [Understand how to continue a load after an interruption](#continue-molt-fetch-after-an-interruption). |
| 108 | + |
| 109 | +### Configure MOLT Fetch |
| 110 | + |
| 111 | +The [MOLT Fetch documentation]({% link molt/molt-fetch.md %}) includes detailed information about how to [configure MOLT Fetch]({% link molt/molt-fetch.md %}#run-molt-fetch), and how to [monitor MOLT Fetch metrics]({% link molt/molt-fetch-monitoring.md %}). |
| 112 | + |
| 113 | +When you run `molt fetch`, you can configure the following options for data load: |
| 114 | + |
| 115 | +<a id="schema-and-table-filtering"></a> |
| 116 | +<a id="source-connection-string"></a> |
| 117 | +<a id="table-handling-mode"></a> |
| 118 | +<a id="target-connection-string"></a> |
| 119 | +<a id="cloud-storage-authentication"></a> |
| 120 | +<a id="secure-connections"></a> |
| 121 | +<a id="intermediate-file-storage"></a> |
| 122 | +<a id="data-load-mode"></a> |
| 123 | +<a id="connection-strings"></a> |
| 124 | + |
| 125 | +- [Specify source and target databases]({% link molt/molt-fetch.md %}#specify-source-and-target-databases): Specify URL‑encoded source and target connections. |
| 126 | +- [Select data to migrate]({% link molt/molt-fetch.md %}#select-data-to-migrate): Specify schema and table names to migrate. |
| 127 | +- [Define intermediate file storage]({% link molt/molt-fetch.md %}#define-intermediate-storage): Export data to cloud storage or a local file server. |
| 128 | +- [Define fetch mode]({% link molt/molt-fetch.md %}#define-fetch-mode): Specify whether to only export data to, or only import data from, intermediate storage. |
| 129 | +- [Shard tables]({% link molt/molt-fetch.md %}#shard-tables-for-concurrent-export): Divide larger tables into multiple shards during data export. |
| 130 | +- [Data load mode]({% link molt/molt-fetch.md %}#import-into-vs-copy-from): Choose between `IMPORT INTO` and `COPY FROM`. |
| 131 | +- [Table handling mode]({% link molt/molt-fetch.md %}#handle-target-tables): Determine how existing target tables are initialized before load. |
| 132 | +- [Define data transformations]({% link molt/molt-fetch.md %}#define-transformations): Define any row-level transformations to apply to the data before it reaches the target. |
| 133 | +- [Monitor fetch metrics]({% link molt/molt-fetch-monitoring.md %}): Configure metrics collection during initial data load. |
| 134 | + |
| 135 | +Read through the documentation to understand how to configure your `molt fetch` command and its flags. Follow [best practices]({% link molt/molt-fetch-best-practices.md %}), especially those related to security. |
| 136 | + |
| 137 | +At minimum, the `molt fetch` command should include the source, target, data path, and [`--ignore-replication-check`]({% link molt/molt-fetch-commands-and-flags.md %}#ignore-replication-check) flags: |
| 138 | + |
| 139 | +{% include_cached copy-clipboard.html %} |
| 140 | +~~~ shell |
| 141 | +molt fetch \ |
| 142 | +--source $SOURCE \ |
| 143 | +--target $TARGET \ |
| 144 | +--bucket-path 's3://bucket/path' \ |
| 145 | +--ignore-replication-check |
| 146 | +~~~ |
| 147 | + |
| 148 | +However, depending on the needs of your migration, you may have many more flags set, and you may need to prepare some accompanying .json files. |
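The `$SOURCE` and `$TARGET` variables used in these commands can be set in the shell before running `molt fetch`. The following is a minimal sketch, assuming a PostgreSQL source. The host names, user names, database names, and password are placeholders chosen for illustration, not values from this scenario.

```shell
# Hypothetical hosts, users, and databases; replace with your own values.
# The example password 'a$52&' is percent-encoded as 'a%2452%26'
# ('$' becomes %24 and '&' becomes %26) so it is not misread as URL syntax.
export SOURCE='postgres://migration_user:a%2452%26@source-host:5432/migration_db?sslmode=verify-full'
export TARGET='postgres://crdb_user:a%2452%26@target-host:26257/defaultdb?sslmode=verify-full'
```

Keeping the connection strings in environment variables avoids repeating credentials in each command.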
| 149 | + |
| 150 | +### Run MOLT Fetch |
| 151 | + |
| 152 | +Perform the bulk load of the source data. |
| 153 | + |
| 154 | +1. Run the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data into CockroachDB. This example command passes the source and target connection strings [as environment variables](#secure-connections), writes [intermediate files](#intermediate-file-storage) to S3 storage, and uses the `truncate-if-exists` [table handling mode](#table-handling-mode) to truncate the target tables before loading data. It limits the migration to a single schema and filters for three specific tables. The [data load mode]({% link molt/molt-fetch.md %}#import-into-vs-copy-from) defaults to `IMPORT INTO`. Include the `--ignore-replication-check` flag to skip replication checkpoint queries, which eliminates the need to configure the source database for logical replication. |
| 155 | + |
| 156 | + <section class="filter-content" markdown="1" data-scope="postgres"> |
| 157 | + {% include_cached copy-clipboard.html %} |
| 158 | + ~~~ shell |
| 159 | + molt fetch \ |
| 160 | + --source $SOURCE \ |
| 161 | + --target $TARGET \ |
| 162 | + --schema-filter 'migration_schema' \ |
| 163 | + --table-filter 'employees|payments|orders' \ |
| 164 | + --bucket-path 's3://migration/data/cockroach' \ |
| 165 | + --table-handling truncate-if-exists \ |
| 166 | + --ignore-replication-check |
| 167 | + ~~~ |
| 168 | + </section> |
| 169 | + |
| 170 | + <section class="filter-content" markdown="1" data-scope="mysql"> |
| 171 | + {% include_cached copy-clipboard.html %} |
| 172 | + ~~~ shell |
| 173 | + molt fetch \ |
| 174 | + --source $SOURCE \ |
| 175 | + --target $TARGET \ |
| 176 | + --table-filter 'employees|payments|orders' \ |
| 177 | + --bucket-path 's3://migration/data/cockroach' \ |
| 178 | + --table-handling truncate-if-exists \ |
| 179 | + --ignore-replication-check |
| 180 | + ~~~ |
| 181 | + </section> |
| 182 | + |
| 183 | + <section class="filter-content" markdown="1" data-scope="oracle"> |
| 184 | + The command assumes an Oracle Multitenant (CDB/PDB) source. [`--source-cdb`]({% link molt/molt-fetch-commands-and-flags.md %}#source-cdb) specifies the container database (CDB) connection string. |
| 185 | + |
| 186 | + {% include_cached copy-clipboard.html %} |
| 187 | + ~~~ shell |
| 188 | + molt fetch \ |
| 189 | + --source $SOURCE \ |
| 190 | + --source-cdb $SOURCE_CDB \ |
| 191 | + --target $TARGET \ |
| 192 | + --schema-filter 'migration_schema' \ |
| 193 | + --table-filter 'employees|payments|orders' \ |
| 194 | + --bucket-path 's3://migration/data/cockroach' \ |
| 195 | + --table-handling truncate-if-exists \ |
| 196 | + --ignore-replication-check |
| 197 | + ~~~ |
| 198 | + </section> |
| 199 | + |
| 200 | +{% include molt/fetch-data-load-output.md %} |
| 201 | + |
| 202 | +### Continue MOLT Fetch after an interruption |
| 203 | + |
| 204 | +{% include molt/fetch-continue-after-interruption.md %} |
| 205 | + |
| 206 | +## Step 5: Verify the data |
| 207 | + |
| 208 | +In this step, you will use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the source and target data are consistent and that the data load was successful. |
| 209 | + |
| 210 | +### Run MOLT Verify |
| 211 | + |
| 212 | +{% include molt/verify-output.md %} |
| 213 | + |
| 214 | +## Step 6: Finalize the target schema |
| 215 | + |
| 216 | +### Add constraints and indexes |
| 217 | + |
| 218 | +{% include molt/migration-modify-target-schema.md %} |
| 219 | + |
| 220 | +## Step 7: Cut over application traffic |
| 221 | + |
| 222 | +With the target cluster verified and finalized, it's time to resume application traffic. |
| 223 | + |
| 224 | +### Modify application code |
| 225 | + |
| 226 | +In the application back end, make sure that the application now directs traffic to the CockroachDB cluster. For example: |
| 227 | + |
| 228 | +~~~yml |
| 229 | +env: |
| 230 | +  - name: DATABASE_URL |
| 231 | +    value: postgres://root@localhost:26257/defaultdb?sslmode=verify-full |
| 232 | +~~~ |
| 233 | + |
| 234 | +### Resume application traffic |
| 235 | + |
| 236 | +Scale up the Kubernetes deployment to the original number of replicas: |
| 237 | + |
| 238 | +{% include_cached copy-clipboard.html %} |
| 239 | +~~~shell |
| 240 | +kubectl scale deployment app --replicas=3 |
| 241 | +~~~ |
| 242 | + |
| 243 | +This ends downtime. |
| 244 | + |
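In practice, downtime ends once the new pods are ready to serve traffic. The cutover steps can be sketched as a single helper, assuming that `DATABASE_URL` is set directly on the deployment as in the example above. The `cut_over` function name and the 300-second timeout are hypothetical choices. If your manifests are managed declaratively, update them there instead, because a later apply would overwrite a value set with `kubectl set env`.

```shell
# Point the deployment at the target cluster, scale it back up, and wait
# until the rollout reports that the requested replicas are available.
# Arguments: deployment name, replica count, target connection string.
cut_over() {
  kubectl set env "deployment/$1" "DATABASE_URL=$3" &&
    kubectl scale deployment "$1" --replicas="$2" &&
    kubectl rollout status "deployment/$1" --timeout=300s
}
```

For example, `cut_over app 3 "$TARGET"` would apply the new connection string, restore three replicas, and return a non-zero status if the rollout does not complete within the timeout.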
| 245 | +## Troubleshooting |
| 246 | + |
| 247 | +{% include molt/molt-troubleshooting-fetch.md %} |
| 248 | + |
| 249 | +## See also |
| 250 | + |
| 251 | +- [MOLT Fetch]({% link molt/molt-fetch.md %}) |
| 252 | +- [MOLT Verify]({% link molt/molt-verify.md %}) |
| 253 | +- [Migration Overview]({% link molt/migration-overview.md %}) |
| 254 | +- [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) |