mysql_cdc: single-table snapshots are not parallelised, making very large tables the bottleneck #4341

@ankit481

Description

Problem

mysql_cdc reads each table's initial snapshot through a single goroutine iterating rows with keyset pagination. Even after #4320 adds inter-table parallelism, the snapshot of any individual table is still strictly sequential, so pipelines dominated by a single very large table see essentially no speedup.

Concretely, on a representative message-style table:

  • 80M rows; observed throughput: ~30M rows/hour
  • Extrapolated to the 400M-row production table: ~12 hours
  • AWS DMS completes the same 400M rows in ~45 minutes

The ~16x gap vs DMS is almost entirely down to missing intra-table parallelism. DMS and Debezium's incremental snapshot both split the primary-key range of a single table across multiple workers.

Root cause

readSnapshotTable does one keyset-paginated SELECT per batch:

SELECT * FROM t ORDER BY pk LIMIT N
SELECT * FROM t WHERE (pk) > (last) ORDER BY pk LIMIT N
...

Only one goroutine ever drives that loop per table. With #4320 a different table can run on a different worker, but inside any one table the reads are strictly serial.

For workloads where 90%+ of snapshot time is one table, the effective parallelism is 1 regardless of how snapshot_max_parallel_tables is set.
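For illustration, the per-batch query construction that this serial loop amounts to can be sketched in Go. The identifiers below are illustrative, not the connector's actual code; the point is that a single cursor is advanced by one goroutine, so every batch waits on the previous one:

```go
package main

import "fmt"

// nextBatchQuery builds the keyset-paginated SELECT for one snapshot batch.
// The cursor (lastKey) is advanced by a single goroutine, so batches are
// strictly sequential. Table and column names are illustrative.
func nextBatchQuery(table, pk string, lastKey *int64, limit int) string {
	if lastKey == nil {
		// First batch: no lower bound yet.
		return fmt.Sprintf("SELECT * FROM %s ORDER BY %s LIMIT %d", table, pk, limit)
	}
	// Subsequent batches resume strictly after the last key seen.
	return fmt.Sprintf("SELECT * FROM %s WHERE %s > %d ORDER BY %s LIMIT %d",
		table, pk, *lastKey, pk, limit)
}

func main() {
	fmt.Println(nextBatchQuery("t", "pk", nil, 1000))
	last := int64(1000)
	fmt.Println(nextBatchQuery("t", "pk", &last, 1000))
	// SELECT * FROM t ORDER BY pk LIMIT 1000
	// SELECT * FROM t WHERE pk > 1000 ORDER BY pk LIMIT 1000
}
```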

Desired behaviour

Operators should be able to partition a single table's primary-key space into N chunks and have the existing worker pool consume chunks in parallel. The consistency guarantee must be preserved: every worker reads at the same binlog position, established under a single FLUSH TABLES WITH READ LOCK window.
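The desired shape can be sketched with a plain channel-fed worker pool. This is a sketch with hypothetical names, not the connector's code; in practice each worker would hold its own MySQL session opened with START TRANSACTION WITH CONSISTENT SNAPSHOT while the read lock is held, so all sessions observe the same binlog position:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fanOut distributes chunk work units across a fixed pool of snapshot
// workers and returns how many chunks were read. Hypothetical sketch:
// the real worker would run the keyset loop against its own consistent-
// snapshot session instead of just counting.
func fanOut(chunks []string, workers int) int64 {
	work := make(chan string)
	var done int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range work {
				// readChunk(session, chunk) would go here.
				atomic.AddInt64(&done, 1)
			}
		}()
	}
	for _, c := range chunks {
		work <- c
	}
	close(work)
	wg.Wait()
	return done
}

func main() {
	n := fanOut([]string{"[1,26)", "[26,51)", "[51,76)", "[76,101)"}, 2)
	fmt.Println("chunks read:", n) // chunks read: 4
}
```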

Target throughput: match AWS DMS on the 400M-row reference workload — ~45 minutes, with 1 hour as an acceptable ceiling. That requires meaningful concurrency (~16 workers × a reasonable chunk count) on a single table.

Proposed solution

Add a new advanced field, snapshot_chunks_per_table. When set above 1, the connector:

  1. Under the existing consistent-snapshot window, probes MIN(pk) and MAX(pk) on the first primary-key column.
  2. Splits [MIN, MAX] into N half-open chunks.
  3. Emits one work unit per chunk and fans them out across the worker pool controlled by snapshot_max_parallel_tables.
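Step 2 might look like the following sketch (splitPKRange and the chunk type are hypothetical helpers; the integer arithmetic assumes a single-column integer PK as in the first-cut scope):

```go
package main

import "fmt"

// chunk is a half-open primary-key range [Lo, Hi).
type chunk struct {
	Lo, Hi int64
}

// splitPKRange divides the closed probe result [min, max] into n half-open
// chunks. The final chunk's Hi is max+1 so the whole key range is covered
// with no gaps or overlaps. Hypothetical helper, not the connector's API.
func splitPKRange(min, max int64, n int) []chunk {
	span := max - min + 1
	if n < 1 || span <= int64(n) {
		// Degenerate cases: fall back to a single whole-table chunk.
		return []chunk{{min, max + 1}}
	}
	size := span / int64(n)
	chunks := make([]chunk, 0, n)
	lo := min
	for i := 0; i < n; i++ {
		hi := lo + size
		if i == n-1 {
			hi = max + 1 // last chunk absorbs the remainder
		}
		chunks = append(chunks, chunk{lo, hi})
		lo = hi
	}
	return chunks
}

func main() {
	for _, c := range splitPKRange(1, 100, 4) {
		fmt.Printf("[%d,%d) ", c.Lo, c.Hi)
	}
	// prints [1,26) [26,51) [51,76) [76,101)
}
```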

Scope of the first cut:

  • Single-column integer PKs: fully supported.
  • Composite PKs: chunk on the leading PK column; per-chunk keyset pagination continues to respect the full PK ordering. This is the common shape for (tenant_id, id) / (shard, ts) composites.
  • Non-numeric first PK columns (VARCHAR, UUID, binary, etc.): fall back to a single whole-table read with an informational log line. Supporting these cleanly requires boundary discovery via sampling or OFFSET-based probing and is deliberately out of scope here.
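For the composite-PK case, a chunk worker's query would combine the leading-column chunk predicate with keyset pagination over the full key. A rough sketch (table and column names such as tenant_id are illustrative, not the connector's code):

```go
package main

import "fmt"

// chunkPredicate restricts a snapshot worker to one half-open chunk
// [lo, hi) of the leading PK column. Illustrative identifiers.
func chunkPredicate(leadCol string, lo, hi int64) string {
	return fmt.Sprintf("%s >= %d AND %s < %d", leadCol, lo, leadCol, hi)
}

// chunkQuery builds one batch's SELECT for a composite key (tenant_id, id):
// the chunk predicate bounds the leading column, while ORDER BY and the
// row-value comparison keep the full-PK keyset ordering within the chunk.
func chunkQuery(table string, lo, hi int64, lastTenant, lastID *int64, limit int) string {
	where := chunkPredicate("tenant_id", lo, hi)
	if lastTenant != nil {
		// Resume strictly after the last composite key seen in this chunk.
		where += fmt.Sprintf(" AND (tenant_id, id) > (%d, %d)", *lastTenant, *lastID)
	}
	return fmt.Sprintf("SELECT * FROM %s WHERE %s ORDER BY tenant_id, id LIMIT %d",
		table, where, limit)
}

func main() {
	fmt.Println(chunkQuery("msgs", 100, 200, nil, nil, 500))
	// SELECT * FROM msgs WHERE tenant_id >= 100 AND tenant_id < 200 ORDER BY tenant_id, id LIMIT 500
}
```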

Skewed leading-column distributions (e.g. 90% of rows under one tenant_id) will produce uneven chunks; operators hitting that can simply leave snapshot_chunks_per_table at 1 and rely on table-level parallelism alone. No correctness impact.

A PR implementing this will be linked below.
