ptstorage: lock the meta row during protect and release #170655

Merged
trunk-io[bot] merged 1 commit into cockroachdb:master from stevendanna:ptstorage-meta-for-update
May 22, 2026
@stevendanna stevendanna commented May 20, 2026

Collaborator

Both protect and release read the singleton system.protected_ts_meta
row via currentMetaCTE, then upsert a new version of it. Concurrent
callers all read the row at the same HLC, all try to write a new
version, and the losers retry on WriteTooOld. Beyond ~100 retries the
call errors out. Successful retries still leave MVCC versions piled up
on the meta row, so subsequent reads scan more history and slow down
over the life of the cluster.
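
The contention pattern described above can be illustrated with a toy in-memory model. This is only a sketch, not ptstorage code: a compare-and-swap on a version counter stands in for the read-then-upsert that fails with WriteTooOld, and a mutex stands in for an exclusive row lock. All names (`metaRow`, `optimisticBump`, `runLocked`, and so on) are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// metaRow is a toy stand-in for the singleton meta row: only a version counter.
type metaRow struct {
	version atomic.Int64
}

// optimisticBump reads the version and then tries to install version+1.
// A failed compare-and-swap models a writer that lost the race and must retry.
// It returns the number of failed attempts and whether the bump succeeded
// within the retry budget.
func optimisticBump(m *metaRow, maxRetries int) (retries int, ok bool) {
	for retries = 0; retries <= maxRetries; retries++ {
		cur := m.version.Load()
		if m.version.CompareAndSwap(cur, cur+1) {
			return retries, true
		}
	}
	return maxRetries, false
}

// runOptimistic runs the racing variant and reports the final version,
// the total number of retries, and the number of calls that gave up.
func runOptimistic(workers, opsPerWorker, maxRetries int) (finalVersion, totalRetries, failures int64) {
	var m metaRow
	var retriesSum, failed atomic.Int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < opsPerWorker; i++ {
				r, ok := optimisticBump(&m, maxRetries)
				retriesSum.Add(int64(r))
				if !ok {
					failed.Add(1)
				}
			}
		}()
	}
	wg.Wait()
	return m.version.Load(), retriesSum.Load(), failed.Load()
}

// runLocked runs the queuing variant: each writer takes the lock before
// reading, so no write is ever attempted against a stale version.
func runLocked(workers, opsPerWorker int) int64 {
	var mu sync.Mutex
	var version int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < opsPerWorker; i++ {
				mu.Lock()
				version++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return version
}

func main() {
	v, r, f := runOptimistic(128, 1000, 100)
	fmt.Printf("optimistic: version=%d retries=%d failures=%d\n", v, r, f)
	fmt.Printf("locked:     version=%d retries=0 failures=0\n", runLocked(128, 1000))
}
```

The retry and failure counts in the optimistic run vary from run to run with scheduling; the locked run never retries by construction. The model ignores MVCC history entirely, so it says nothing about the read slowdown from piled-up versions.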

This PR acquires an exclusive lock on the meta row during the CTE
read. Concurrent callers queue at the lock instead of colliding on the
upsert. FOR UPDATE is bound to the real-table leg of the CTE so the
synthetic zero-row fallback (used when no meta row exists yet) does
not lock anything. getMetadataQuery is read-only and keeps using the
non-locking CTE.

BenchmarkProtect and BenchmarkRelease, added alongside, measure
throughput at 128 concurrent writers on an in-process server:

    name                    old sec/op     new sec/op     delta
    Protect/workers=128-10  44.744m ± 25%   9.911m ± 42%  -77.85% (p=0.000 n=10)
    Release/workers=128-10  55.683m ± 41%   9.768m ± 52%  -82.46% (p=0.000 n=10)

The baseline runs also produced retry-budget-exhaustion errors during
ramp-up (errs/op 0.13–0.32 in the first trial); after the change
errs/op is 0 across all trials.
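
For readers checking the delta column, the percentages follow from the old and new sec/op values as (new - old) / old. A small sketch of that arithmetic, using the figures from the table above (the helper name `percentDelta` is hypothetical):

```go
package main

import "fmt"

// percentDelta returns the relative change from oldV to newV, in percent.
// Negative values mean the new measurement is smaller (faster, for sec/op).
func percentDelta(oldV, newV float64) float64 {
	return (newV - oldV) / oldV * 100
}

func main() {
	// Values are milliseconds per op, taken from the benchmark table.
	fmt.Printf("Protect: %.2f%%\n", percentDelta(44.744, 9.911))
	fmt.Printf("Release: %.2f%%\n", percentDelta(55.683, 9.768))
}
```

This reproduces the reported -77.85% and -82.46% from the rounded values shown; the benchmark tool itself works from the underlying samples, so this is a consistency check rather than a re-derivation.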

Epic: none

Release note (performance improvement): Concurrent protected timestamp
protect and release calls (used heavily by backup and changefeed) now
serialize on the meta row rather than racing into WriteTooOld retries.
Workloads that create or release many protected timestamp records at
once see substantially higher throughput.

@stevendanna stevendanna requested a review from a team as a code owner May 20, 2026 16:09
@trunk-io

trunk-io Bot commented May 20, 2026

Contributor

😎 Merged successfully - details.

@cockroach-teamcity

Member

This change is Reviewable

@stevendanna stevendanna force-pushed the ptstorage-meta-for-update branch from 6df8dad to 8fa9718 Compare May 20, 2026 16:11
@stevendanna stevendanna changed the title from "ptstorage: take FOR UPDATE lock on meta row in protect and release" to "ptstorage: lock the meta row during protect and release" May 20, 2026
@stevendanna stevendanna requested review from dt and tbg May 21, 2026 08:59
@stevendanna

Collaborator Author

I hope to follow up by ripping this write out entirely, but the FOR UPDATE does seem to help in high-concurrency situations.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@stevendanna stevendanna force-pushed the ptstorage-meta-for-update branch from 50ade81 to eae678c Compare May 22, 2026 09:40
@stevendanna

Collaborator Author

/trunk merge

TFTR!

@trunk-io trunk-io Bot merged commit 17f2d53 into cockroachdb:master May 22, 2026
25 checks passed