ldr: require cursor parameter when starting transactional LDR #169338

@jeffswenson

Transactional LDR cannot safely perform an initial scan because it reorders rows by MVCC timestamp, which can violate foreign key constraints.

Consider the following writes:

INSERT INTO parent (id, number) VALUES ('parent-a', 1);
INSERT INTO children (id, parent_id) VALUES ('child', 'parent-a');
UPDATE parent SET number = 2 WHERE id = 'parent-a';

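Replaying these writes from MVCC history applies them in their original order, so replication succeeds. An initial scan, however, only sees the most recent MVCC version of each row. The parent row's latest version carries the timestamp of the UPDATE, which is later than the child's insert, so ordering by MVCC timestamp produces: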

INSERT INTO children (id, parent_id) VALUES ('child', 'parent-a'); -- fails: FK violation, the parent row does not exist yet
INSERT INTO parent (id, number) VALUES ('parent-a', 2);

Row-based LDR tolerates this because the failed child write triggers a backoff and is retried until the parent row has replicated. Transactional LDR expects every write to apply with no retries, so the initial scan fails.
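The ordering can be observed directly on a source cluster. The following is a minimal sketch, not a test from the codebase: the column types and the foreign key definition are assumptions (the issue only names the columns), and it relies on the hidden `crdb_internal_mvcc_timestamp` column that CockroachDB exposes on every table.

```sql
-- Assumed schema: the issue names the columns but not their types.
CREATE TABLE parent (
    id STRING PRIMARY KEY,
    number INT
);
CREATE TABLE children (
    id STRING PRIMARY KEY,
    parent_id STRING NOT NULL REFERENCES parent (id)
);

-- The three writes from the example, each in its own implicit transaction.
INSERT INTO parent (id, number) VALUES ('parent-a', 1);
INSERT INTO children (id, parent_id) VALUES ('child', 'parent-a');
UPDATE parent SET number = 2 WHERE id = 'parent-a';

-- Latest-version MVCC timestamps, which is all an initial scan can see.
SELECT 'children' AS table_name, id, crdb_internal_mvcc_timestamp AS mvcc_ts FROM children
UNION ALL
SELECT 'parent' AS table_name, id, crdb_internal_mvcc_timestamp AS mvcc_ts FROM parent
ORDER BY mvcc_ts;
```

The `children` row should sort before the `parent` row, because the UPDATE rewrote the parent at a later timestamp than the child insert. That is the order an MVCC-sorted initial scan would try to apply.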

We should require a cursor when starting transactional LDR and validate it against the GC history, so that every change after the cursor is still available as MVCC history. This avoids the initial scan code path entirely. For the initial backfill, customers would create logically replicated tables, and they would pause and resume replication instead of cancelling and restarting it. For repair workflows, they could use row-based LDR for the backfill and then switch to transactional LDR once it has caught up.
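As a rough illustration of what a cursor-based start could look like, the sketch below uses the existing `CREATE LOGICAL REPLICATION STREAM` statement with its `CURSOR` option. The database name, the external connection name `source_cluster`, and the timestamp value are placeholders. The option that selects transactional mode is deliberately left out, since this issue does not state its syntax.

```sql
-- On the source cluster: read a current HLC timestamp to use as the cursor.
-- The destination must already hold the source data as of this timestamp.
SELECT cluster_logical_timestamp();

-- On the destination cluster: start replication from the cursor instead of
-- running an initial scan. All names and the timestamp are placeholders.
CREATE LOGICAL REPLICATION STREAM
    FROM TABLES (defaultdb.public.parent, defaultdb.public.children)
    ON 'external://source_cluster'
    INTO TABLES (defaultdb.public.parent, defaultdb.public.children)
    WITH CURSOR = '1536242855577149065.0000000000';
```

Under the proposal, starting transactional LDR without a cursor, or with a cursor that fails the GC validation, would be rejected up front rather than failing partway through the stream.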

This keeps our options open to implement a smarter backfill in the future without needing to support broken backfill behavior now.

Epic: CRDB-60163
