Skip to content

fix(core): Update CVCUponInsert#7068

Merged
dranikpg merged 15 commits into
dragonflydb:mainfrom
dranikpg:serializer-base-fix
Apr 8, 2026
Merged

fix(core): Update CVCUponInsert#7068
dranikpg merged 15 commits into
dragonflydb:mainfrom
dranikpg:serializer-base-fix

Conversation

@dranikpg
Copy link
Copy Markdown
Contributor

@dranikpg dranikpg commented Apr 4, 2026

Don't skip buckets with versions that are less than snapshort version in CVCUponInsert. This is because we set the bucket version just when we start serializing. As a result, a modification operation doesn't wait for serializatin to finish before proceeding - because the version is already equal to snapshot version. This bug would occur if the global mutex would be absent - so this PR reintroduces function scope locking for it in OnChange.

It also introduces BucketSet as a utility to get rid of the callback

@dranikpg dranikpg force-pushed the serializer-base-fix branch from 7ae7a06 to 420d9fb Compare April 4, 2026 15:30
@dranikpg
Copy link
Copy Markdown
Contributor Author

dranikpg commented Apr 4, 2026

augment review

@dranikpg dranikpg changed the title fix(server): Hold big_value_mu_ across new key cvc fix(core): Don't skip empty buckets with CVCOnInsert Apr 5, 2026
@dranikpg dranikpg force-pushed the serializer-base-fix branch from cf0c8a4 to 8ac61cd Compare April 6, 2026 16:28
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
@dranikpg dranikpg force-pushed the serializer-base-fix branch from 8ac61cd to 7642082 Compare April 6, 2026 21:27
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
@dranikpg dranikpg force-pushed the serializer-base-fix branch from 4bfea25 to 181f8e7 Compare April 7, 2026 07:18
@dranikpg dranikpg changed the title fix(core): Don't skip empty buckets with CVCOnInsert fix(core): Update CVCUponInsert Apr 7, 2026
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
@dranikpg dranikpg force-pushed the serializer-base-fix branch from c9c4155 to 58884f9 Compare April 7, 2026 08:09
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
@dranikpg dranikpg force-pushed the serializer-base-fix branch from 28b72b0 to a0c46f4 Compare April 7, 2026 11:34
@dranikpg
Copy link
Copy Markdown
Contributor Author

dranikpg commented Apr 7, 2026

augment review

@augmentcode
Copy link
Copy Markdown

augmentcode Bot commented Apr 7, 2026

🤖 Augment PR Summary

Summary: This PR refines CVC (Capture Version Change) handling for inserts/bump-ups to improve snapshot/serialization correctness and simplify callers.

Changes:

  • Refactors DashTable::CVCUponInsert to return a BucketSet (iterable view) instead of using a callback-based interface.
  • Updates Segment::CVCOnInsert to return std::optional (nullopt when segment is full) and to report potentially-touched buckets without version/emptiness filtering.
  • Updates Segment::CVCOnBump/CVCUponBump to ignore version thresholds and always report the involved buckets for stash bump-up scenarios.
  • Adjusts DbSlice::FlushSlotsFb to use the new CVCUponInsert iterable API.
  • Removes moved-item callback registration/invocation paths (RegisterOnMove/CallMovedCallbacks/UnregisterOnMoved).
  • Refactors serializer logic into ProcessIfNeeded + ProcessBucketInternal, and updates snapshot/restore traversal code to call the new entry point.
  • Adds additional DCHECKs/comments around snapshot versions and bucket processing invariants.

Technical Notes: Insert-triggered change handling now walks the bucket set returned by CVC to ensure relevant buckets are considered consistently across serializer and db-slice flows (including full-segment cases).

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 2 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

ProcessBucketInternal(db_index, it, true);
}

} // namespace dfly
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CVCUponInsert now returns empty buckets too, but ProcessBucketInternal still treats it.is_done() as an unconditional skip, and the delayed-entry flush in the skip-path is also gated on !it.is_done(). If delayed tiered reads are keyed by bucket identity even after the bucket becomes empty, this insert path may fail to wait/flush them despite the comment about ensuring delayed work is finished.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Comment thread src/server/serializer_base.cc
@dranikpg dranikpg marked this pull request as ready for review April 7, 2026 17:10
Copilot AI review requested due to automatic review settings April 7, 2026 17:10
Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. No suggestions at this time.

Comment augment review to trigger a new review at any time.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates Dragonfly’s snapshot/replication serialization flow to more conservatively process buckets affected by inserts/bumps (including refactoring the CVC APIs) in order to address replication discrepancies (issue #7006), especially around delayed (tiered) entries.

Changes:

  • Refactor DashTable::CVCUponInsert to return a BucketSet iterable (instead of callback-based) and adjust CVCUponBump to ignore version thresholds.
  • Refactor SerializerBase bucket processing into ProcessIfNeeded + ProcessBucketInternal, and update snapshot/restore codepaths to use the new API.
  • Remove unused “moved callbacks” plumbing from DbSlice, and update affected call sites and tests.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/server/snapshot.cc Switch snapshot traversal to call ProcessIfNeeded instead of the old ProcessBucket.
src/server/serializer_base.h Rename/refactor bucket processing API to ProcessIfNeeded + internal helper.
src/server/serializer_base.cc Implement the refactor; update insert-change handling to iterate buckets from CVCUponInsert.
src/server/journal/streamer.cc Update restore streamer loop to use ProcessIfNeeded.
src/server/db_slice.h Remove moved-callback API; add SnapshotVersions() view helper.
src/server/db_slice.cc Remove moved-callback invocations; update insert/bump CVC call sites for new APIs.
src/core/dash.h Change CVCUponInsert to return BucketSet; update CVCUponBump signature; add BucketSet type.
src/core/dash_internal.h Refactor segment-level CVC helpers (CVCOnInsert returns optional; CVCOnBump ignores version threshold).
src/core/dash_test.cc Update unit tests to match the new CVC APIs.

Comment thread src/core/dash.h Outdated
return std::views::iota(0u, limit) |
std::views::transform([=, ids = ids, owner = owner, seg_id = seg_id](uint8_t i) {
uint8_t index = is_all ? i : ids[i];
return bucket_iterator{owner, seg_id, index};
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BucketSet::buckets() constructs bucket_iterator{owner, seg_id, index} which runs Seek2Occupied() and sets owner_ = nullptr for empty buckets; this makes empty buckets indistinguishable from end-iterators and prevents callers (e.g. SerializerBase::OnChange(key)) from getting bucket identity/version for empty buckets as intended. Consider yielding a bucket iterator that does not Seek2Occupied (e.g. construct with slot=0) or returning a dedicated bucket reference type that remains valid even when empty.

Suggested change
return bucket_iterator{owner, seg_id, index};
return bucket_iterator{owner, seg_id, index, 0};

Copilot uses AI. Check for mistakes.
Comment thread src/server/serializer_base.cc Outdated
Comment on lines +172 to +176
// Assert the version is equal to a snapshot version so no other modifications have happened
DCHECK_GE(it.GetVersion(), snapshot_version_);
auto current_snapshots = db_slice_->SnapshotVersions();
DCHECK(std::ranges::find(current_snapshots, it.GetVersion()) != current_snapshots.end())
<< absl::StrJoin(current_snapshots, " ") << " does not contain " << it.GetVersion();
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::ranges::find is used here but this file does not include <algorithm>, which is where ranges algorithms are declared; this can break builds depending on transitive includes. Add the proper header (or switch to an absl helper already included) to make this TU self-contained.

Copilot uses AI. Check for mistakes.
Comment thread src/core/dash_internal.h Outdated
Comment thread src/server/db_slice.cc
Comment thread src/core/dash.h
Comment thread src/core/dash.h Outdated
Comment thread src/core/dash.h
Comment thread src/core/dash.h Outdated
Comment thread src/core/dash.h Outdated

// Assert the version is equal to a snapshot version so no other modifications have happened
DCHECK_GE(it.GetVersion(), snapshot_version_);
auto current_snapshots = db_slice_->SnapshotVersions();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, what is that? what scenario it guards against?

Please note that db_slice_->SnapshotVersions(); is called and executed in release mode as well.

Comment thread src/server/serializer_base.cc Outdated
it.SetVersion(snapshot_version_);
MarkBucketSerializing(it.bucket_address());

stats_.keys_serialized += SerializeBucket(db_index, it, on_update);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dranikpg dranikpg requested a review from romange April 8, 2026 07:58
Comment thread src/core/dash.h Outdated
#pragma once

#include <array>
#include <optional>
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not need optional now

Comment thread src/core/dash_internal.h Outdated
Comment thread src/core/dash_internal.h Outdated
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
@dranikpg dranikpg force-pushed the serializer-base-fix branch from 32b44ab to c612383 Compare April 8, 2026 08:35
@dranikpg dranikpg requested a review from romange April 8, 2026 08:36
Comment thread src/server/serializer_base.cc Outdated
Comment thread src/server/serializer_base.cc Outdated
if (ProcessBucket(db_index, it, true))
++stats_.buckets_on_change;

std::lock_guard guard(big_value_mu_);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be replaced with ProcessIfNeeded which I find a bad name as it always calls ProcessBucketInternal so there is no if needed part

@dranikpg dranikpg requested a review from romange April 8, 2026 15:11
@romange
Copy link
Copy Markdown
Collaborator

romange commented Apr 8, 2026

The PR description is outdated now

@dranikpg dranikpg merged commit 5c8a424 into dragonflydb:main Apr 8, 2026
12 checks passed
@dranikpg dranikpg deleted the serializer-base-fix branch April 8, 2026 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants