Skip to content

feat(server): Implement tagged chunk loading#7070

Merged
abhijat merged 7 commits into
mainfrom
abhijat/feat/implement-chunk-tag-load
Apr 30, 2026
Merged

feat(server): Implement tagged chunk loading#7070
abhijat merged 7 commits into
mainfrom
abhijat/feat/implement-chunk-tag-load

Conversation

@abhijat
Copy link
Copy Markdown
Contributor

@abhijat abhijat commented Apr 5, 2026

Changes to rdb loader to handle tagged chunks. A data stream read by the loader now looks like:

{chunk1:[stream-id]chunk-size][data...]}{untagged data...}{chunk2:[stream-id]chunk-size][data...]}

If the sender does not turn on tagging, then all the data is untagged.

where

  • stream-id is 4 byte id, same for all chunks of same db+key
  • chunk size is 4 byte size of data in chunk

Code changes:

  • parse RDB_OPCODE_TAGGED_CHUNK and track the active chunk payload budget
  • save and restore per-stream partial object state across continuation chunks
  • add loader-only tests that feed handcrafted tagged byte streams, including split SBF cases and interleaving

New state in loader:

  • current chunk: stream id and payload size
  • stream states: partially read objects which can be resumed when next chunk for them is seen

Some of the new tests can be removed once the saver PR is merged as they will become redundant. But for now these are necessary to test the loader code in isolation by feeding it artificial byte stream containing chunks.

Example of loading one set split across multiple tagged chunks

  • First chunk:
    • chunk budget is set from the tagged payload size
    • call chain: LoadKeyValPair -> ReadAndDispatchObject -> ReadObj -> ReadSet
    • reading stops when either the set is complete or the chunk budget is exhausted
  • If the chunk ends first (budget exhausted):
    • the partially built PrimeValue is kept in now_chunked_
    • parser continuation state, including remaining items, is saved in stream_states_
  • Continuation chunk:
    • chunk budget is set again from the new tagged payload size
    • call chain: LoadValueChunk -> ReadAndDispatchObject -> ReadObj -> ReadSet
    • parser state is restored from stream_states_
    • newly read elements are appended to the partial value in now_chunked_
  • This repeats until the object is fully read, at which point:
    • continuation state is removed
    • the final value is written to the DB

@abhijat abhijat force-pushed the abhijat/feat/implement-chunk-tag-load branch 9 times, most recently from afce4b9 to 7a1d609 Compare April 6, 2026 05:51
@abhijat abhijat marked this pull request as ready for review April 6, 2026 06:32
Copilot AI review requested due to automatic review settings April 6, 2026 06:32

This comment was marked as outdated.

@abhijat abhijat force-pushed the abhijat/feat/implement-chunk-tag-load branch from b0a1338 to 5d2fb32 Compare April 6, 2026 06:52
@abhijat abhijat changed the title [WIP] feat(server): Implement tagged chunk loading feat(server): Implement tagged chunk loading Apr 6, 2026
@abhijat
Copy link
Copy Markdown
Contributor Author

abhijat commented Apr 6, 2026

augment review

@augmentcode

This comment was marked as outdated.

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 3 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/server/rdb_load.cc
Comment thread src/server/rdb_load.cc
Comment thread src/server/rdb_load.cc Outdated
Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestion posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/server/rdb_extensions.h
Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestion posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/server/rdb_load.cc
Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestion posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/server/rdb_load.cc
Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestion posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/server/rdb_load.cc
Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestion posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/server/rdb_load.cc Outdated
Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 4 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/server/rdb_load.cc
Comment thread src/server/rdb_load.cc
Comment thread src/server/rdb_load.cc Outdated
Comment thread src/server/rdb_load.cc Outdated
@abhijat abhijat force-pushed the abhijat/feat/implement-chunk-tag-load branch from 98a4f93 to f7e973b Compare April 10, 2026 08:33
@abhijat abhijat force-pushed the abhijat/feat/implement-chunk-tag-load branch 3 times, most recently from b885e93 to 54114b7 Compare April 23, 2026 06:45
@abhijat abhijat requested a review from Copilot April 23, 2026 07:23
@romange
Copy link
Copy Markdown
Collaborator

romange commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/server/rdb_load.cc
@abhijat abhijat force-pushed the abhijat/feat/implement-chunk-tag-load branch from c7efb29 to 801814a Compare April 29, 2026 11:04
@abhijat abhijat requested a review from Copilot April 29, 2026 11:05
@romange
Copy link
Copy Markdown
Collaborator

romange commented Apr 29, 2026

@abhijat is it possible to catch in rdb_test.cc ?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread src/server/rdb_load.cc
Comment thread src/server/rdb_load.cc Outdated
.key = std::move(key),
.db_index = cur_db_index_,
.type = type,
.pending_read = pending_read_,
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saving StreamState copies pending_read_ (which can include large partially-filled buffers like SBF filter_data), causing an extra full-buffer copy and memory spike exactly on chunk boundaries. Prefer moving pending_read_ into StreamState (and then resetting pending_read_) to avoid duplicating potentially huge data.

Suggested change
.pending_read = pending_read_,
.pending_read = std::move(pending_read_),

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@abhijat
Copy link
Copy Markdown
Contributor Author

abhijat commented Apr 29, 2026

@abhijat is it possible to catch in rdb_test.cc ?

yes I am trying to update the test to cover this

@abhijat abhijat force-pushed the abhijat/feat/implement-chunk-tag-load branch 2 times, most recently from 027fd14 to 336180e Compare April 29, 2026 11:35
@abhijat
Copy link
Copy Markdown
Contributor Author

abhijat commented Apr 29, 2026

@abhijat is it possible to catch in rdb_test.cc ?

I added in existing InterleavedLoad

  • create key and hash which is on shard 1, while test runs on shard 0 (because bug was in non inlined path)
  • split hash into chunks, first chunk goes in db 0
  • add selectdb opcode to switch db to 1 before second chunk
  • assert that hash is reassembled in db 0

with the old code it fails with

/home/abhijat/dev/cc/dragonfly/src/server/rdb_test.cc:1516: Failure
Expected equality of these values:
  Run({"HGET", key, "f1"})
    Which is: nil
  "v1"
    Which is: 0x1ab9b90

/home/abhijat/dev/cc/dragonfly/src/server/rdb_test.cc:1517: Failure
Expected equality of these values:
  Run({"HGET", key, "f2"})
    Which is: nil
  "v2"
    Which is: 0x1ab9b96

ie key, f1=v1 and f2=v2 created on db=1 which was active when last chunk was received, and not db=0, which was supposed to hold the object

with the changes it passes

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/server/rdb_load.cc
abhijat added 5 commits April 29, 2026 18:22
Signed-off-by: Abhijat Malviya <abhijat@dragonflydb.io>
Signed-off-by: Abhijat Malviya <abhijat@dragonflydb.io>
Signed-off-by: Abhijat Malviya <abhijat@dragonflydb.io>
Signed-off-by: Abhijat Malviya <abhijat@dragonflydb.io>
@abhijat abhijat force-pushed the abhijat/feat/implement-chunk-tag-load branch from 336180e to 5f1e22a Compare April 29, 2026 12:53
@abhijat abhijat enabled auto-merge (squash) April 30, 2026 08:42
@abhijat abhijat merged commit 01e8e60 into main Apr 30, 2026
13 checks passed
@abhijat abhijat deleted the abhijat/feat/implement-chunk-tag-load branch April 30, 2026 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants