feat: range aware file write #717
Force-pushed fea540f to 981ba18
Three new APIs to support range-aware writes:
1. `Client::get_file_chunk_hashes(file_id) -> ChunkHashList`
2. `FileUploadSession::register_composed_file(file_info)`
3. `upload_ranges(config, cas_client, original_hash, original_size, dirty_ranges, dirty_source, total_size)`

See https://gist.github.com/XciD/198f7e6cfd68f4a0f19c0c4a37c14b61 for a visual explanation.

Supporting changes:
- `SingleFileCleaner::finish()` now returns `ChunkHashList` (eliminates a CAS round-trip)
- `FileUploadSession::file_info_list()` for the checkpoint-based flow (single session)
- `XorbObject::chunk_hash_sizes()` helper
- `ChunkHashList` type alias in `xet_core_structures::merklehash`
- `get_file_chunk_hashes` implemented for `MemoryClient`/`LocalClient`
… hash

When two dirty regions generate identical content after CDC processing, they will have the same Merkle hash. Previously, the `mdb_by_hash` HashMap would silently drop the second region's `MDBFileInfo` due to the hash collision, causing the second region to incorrectly reuse the first region's segments.

The fix changes `mdb_by_hash` from `HashMap<hash, MDBFileInfo>` to `HashMap<hash, Vec<MDBFileInfo>>`, allowing it to handle multiple regions with the same hash. Regions are matched in order of upload, preserving the expected composition.

Added regression test `test_two_regions_identical_hash_collision` that verifies two dirty regions with identical content don't cause file corruption.
- Unify duplicated chunk extraction logic in MemoryClient using Cow for borrowed/owned variants
- Use a custom serde deserializer for type-safe MerkleHash deserialization in API responses
- Move hash validation from business logic to the deserialization boundary
- Simplify response processing by eliminating manual hash parsing
Four bugs found via Codex review:
1. Truncation boundary leak: the boundary suffix fed CAS bytes beyond total_size into the cleaner, producing wrong chunks and hash. Fix: cap boundary_end at total_size.
2. Identical hash panic: the shard manager deduplicates MDBFileInfo by file_hash (BTreeMap), so two regions with the same content only produce one entry. The old remove(0) would panic on the second region. Fix: clone instead of remove (same hash = same segments).
3. Append hash mismatch: the last original chunk (EOF-terminated) was reused verbatim instead of being re-chunked with the appended data, producing a different hash than a clean upload. Fix: back up first_chunk to include the last original chunk, and download its bytes from CAS via the boundary prefix mechanism.
4. Append gap loss: a dirty range starting after original_size (seek-past-EOF) left bytes [original_size, dirty_start) uncovered. Fix: pull the dirty range start to original_size during append merge.

All tests now verify hash equality with a clean upload of the same content. Added a sparse staging file test to catch the boundary prefix vs staging file data source issue.
The dedup_manager.finalize() now returns ChunkHashList as a second element. Destructure it as _chunk_hashes to fix the WASM build.
- Move DirtyRegion, UploadedRegion, ComposedRegion to module level
- Make register_composed_file and file_info_list pub(crate)
- Add TODO for blocking I/O in async context
- Use pseudo-random data in all tests for reliable multi-chunk CDC
- Replace if-guard with assert! in the hash collision test (512KB data)
- Clean up redundant comments
Replace the single buffered boundary download with two targeted FileReconstructor streams (prefix + suffix) that feed directly into the cleaner. This eliminates the Vec<u8> intermediate buffer and the offset arithmetic for slicing prefix/suffix from it. Each stream covers exactly the bytes needed (typically one CDC chunk), and most cases only need one stream (prefix or suffix, not both).
Replace debug_assert! with real errors for dirty_ranges validation (sorted, non-overlapping, non-empty intervals). Move the no-op early return before validation so empty ranges skip the checks. Add tests for overlapping, empty, and unsorted dirty ranges.
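A minimal sketch of what that validation could look like — illustrative names and a plain `String` error stand in for the crate's actual signature and error type:

```rust
use std::ops::Range;

/// Hypothetical sketch of dirty-range validation: reject empty, overlapping,
/// or unsorted intervals with a real error (not a debug_assert! that is
/// compiled out in release builds).
fn validate_dirty_ranges(ranges: &[Range<u64>]) -> Result<(), String> {
    let mut prev_end = 0u64;
    for (i, r) in ranges.iter().enumerate() {
        if r.start >= r.end {
            // empty or inverted interval
            return Err(format!("dirty range {i} is empty: {:?}", r));
        }
        if i > 0 && r.start < prev_end {
            // either unsorted or overlapping the previous range
            return Err(format!("dirty range {i} overlaps or is unsorted: {:?}", r));
        }
        prev_end = r.end;
    }
    Ok(())
}
```

An empty slice passes, matching the no-op early return described above.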
Force-pushed 5be3af3 to 7eb1169
- Extract helpers: build_dirty_regions, stream_cas_range, merge_or_push
- Replace silent unwrap_or/clamps with explicit debug_assert or runtime errors
- Add precondition validation (dirty_range > total_size)
- Add tests: truncation on a chunk boundary, no-op, dirty range past total_size, build_dirty_regions coalescing, inconsistent chunk data
- Remove duplicate tests (truncation/append hash-only variants)
- Improve comments: ASCII diagrams on tests, clearer docs on truncation/append
- Document the SHA-256 limitation in the upload_ranges doc
- Move test helpers to the bottom of the test module
Force-pushed 7eb1169 to e0db8b9
…hashes-and-compose (#717)

Combined branch for hf-mount that needs both the CAS client factory (PR 675) and the range upload/compose APIs (PR 717). Chunk cache integration from 675 is stubbed out (param accepted but unused) since the struct shapes diverged.
- Extract caller-input validation into `validate_dirty_ranges` helper, now
reused by `upload_ranges` and `upload_fresh_file`.
- Replace `DataError::InternalError` with `DataError::ParameterError` for
parameter-shape errors (sortedness, bounds, append coverage). The
remaining `InternalError`s are for genuine internal failures
(missing reconstruction info, MerkleHashSubtree::merge error, etc.).
- Drop the redundant `covered_up_to >= original_size` guard: `covered_up_to`
starts at `original_size` and only grows via `.max(end)`, so it is always
`>= original_size` inside the loop body.
- Use `ParameterError("file not found")` when reconstruction info is missing
for the requested `original_hash`.
… replace)
Switch `DirtyInput` from output-coordinate ranges (where the reader had to yield
exactly `range.end - range.start` bytes) to original-coordinate edits with an
explicit `new_length`. This expresses pure inserts, pure deletes, and arbitrary
in-place replaces in a single uniform shape.
API:
- `DirtyInput { original_range: Range<u64>, reader, new_length: u64 }`
- `upload_ranges(...)` no longer takes `total_size`; it derives it from the
inputs.
Mapping legacy callers:
- in-place edit: `original_range: a..b, new_length: b - a`
- pure insert: `original_range: X..X, new_length: N`
- pure delete: `original_range: a..b, new_length: 0`
- append: `original_range: original_size..original_size, new_length: N`
- truncate to N: `original_range: N..original_size, new_length: 0`
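The mapping above can be captured in a tiny classifier. This is an illustrative sketch, not the crate's API — the real `DirtyInput` also carries a reader, and the append/truncate rows are just the insert/delete cases at `original_size`:

```rust
use std::ops::Range;

/// Which operation a (original_range, new_length) pair expresses.
#[derive(Debug, PartialEq)]
enum EditKind {
    Noop,    // 0 bytes removed, 0 bytes added
    Insert,  // pure insert at original_range.start (append if at original_size)
    Delete,  // pure delete (truncate if end == original_size)
    InPlace, // removed == added
    Replace, // arbitrary resize replace
}

fn classify(original_range: &Range<u64>, new_length: u64) -> EditKind {
    let removed = original_range.end - original_range.start;
    match (removed, new_length) {
        (0, 0) => EditKind::Noop,
        (0, _) => EditKind::Insert,
        (_, 0) => EditKind::Delete,
        (r, n) if r == n => EditKind::InPlace,
        _ => EditKind::Replace,
    }
}
```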
Implementation:
- Snap each edit's `original_range` to enclosing segment boundaries before
hitting the server. Pure inserts at `original_size` snap to the last segment;
inserts elsewhere snap to the segment containing the position. Server windows
come back equal to the snapped ranges.
- Per-window loop walks `&mut dirty_inputs[input_idx..edits_end]` once;
`edits_end` is found via `take_while` with a closure that tracks the EOF-insert
boundary. Single fold for `removed`/`added` totals; streaming buffer hoisted
out of the inner loop.
- Hash composition (`MerkleHashSubtree::merge`) and segment-swap composition
unchanged.
- `validate_dirty_ranges` now runs at the very top of `upload_ranges`, before
the empty-original short-circuit. Fixes a real bug in the previously pushed
code where `original_size == 0` bypassed validation entirely (Adrien spotted
it on review): the empty-original path's gap-only check accepted overlapping /
oversize ranges and would feed mismatched bytes to the cleaner.
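For intuition, the segment-boundary snapping described above could be sketched as follows. This is a hypothetical helper, not the actual implementation: `bounds` stands in for the cumulative segment byte offsets `[0, s1, ..., original_size]`, and it assumes a non-empty original with in-bounds edits:

```rust
use std::ops::Range;

/// Widen an edit's original_range to enclosing segment boundaries.
/// Pure inserts (start == end) snap to the segment owning `start`;
/// an insert at original_size snaps to the last segment.
fn snap_to_segments(range: &Range<u64>, bounds: &[u64]) -> Range<u64> {
    let original_size = *bounds.last().unwrap();
    let (start, end) = if range.start == range.end {
        // pure insert: treat it as touching one byte of the owning segment
        let p = range.start.min(original_size.saturating_sub(1));
        (p, p + 1)
    } else {
        (range.start, range.end)
    };
    // widen start down to a boundary and end up to a boundary
    let snapped_start = *bounds.iter().rev().find(|&&b| b <= start).unwrap();
    let snapped_end = *bounds.iter().find(|&&b| b >= end).unwrap();
    snapped_start..snapped_end
}
```

Adjacent or overlapping snapped ranges would then be coalesced before hitting the server.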
Tests:
- 6 new resize tests (replace grow, replace shrink, mid-file insert, mid-file
delete, multi-edit mix, insert at segment boundary).
- `test_resize_edits_abc` covers the three motivating one-liners
(`abc -> foobc`, `abc -> fooabc`, `abc -> bc`).
- `test_empty_original_validates_ranges` locks the validation order against the
bug Adrien reported.
- New `assert_edits` helper consumed by all the resize tests; obsolete
`make_dummy_inputs` validation tests for empty-range / append-without-coverage
removed (the new model accepts pure inserts and has no implicit coverage rule).
27/27 tests pass; CI clippy clean.
@lhoestq I've modified the API to fit your need. We can't test right now against prod as the code is not deployed yet.
…h_chunks The chunk hash list returned by SingleFileCleaner::finish is only consumed by upload_ranges (to build partial MerkleHashSubtree nodes for newly- uploaded windows). Every other caller threw it away as _chunks / _chunk_hashes. Keep the 3-tuple variant under finish_with_chunks for the one real consumer and let everyone else use the simpler 2-tuple finish.
debug_assert_eq! is compiled out in release builds, so a malformed get_file_chunk_hashes response (wrong number of windows or hash_ranges) would silently truncate the merge sequence via .pop()/.zip() and produce an incorrect composed file hash with no error to the caller. Replace the three correctness-critical checks (server window count, server hash_ranges count, segment-aligned windows, plus the file_size sanity check on reconstruction info) with real Err returns so a misbehaving server is detected in release rather than silently corrupting the composed reconstruction.
Every dirty edit must land in exactly one returned window. If the server ever returns narrower dirty_byte_ranges than requested, leftover edits would silently drop and the composed file would be corrupt with no error. Promote that invariant to a runtime error, matching the existing response-shape checks above.
rajatarya left a comment
Re-reviewing this since the design has shifted substantially since my March approval (per-edit AsyncRead, resize-edit support, MerkleHashSubtree::merge composition, multi-range API adoption). The redesign is a clear improvement — particularly:
- Per-edit `AsyncRead` + reader ownership: structurally fixes the prior truncation-corruption hazard (callers literally cannot supply zero bytes for a region not in their edit set).
- Defensive contract checks on the server response (`response.windows.len() != n_windows`, `hash_ranges.len() != n_windows + 1`, plus the post-loop `input_idx != dirty_inputs.len()` check) — these turn what would otherwise be silent file-hash corruption into loud errors.
- Validation runs before the empty-original short-circuit — the regression test `test_empty_original_validates_ranges` is exactly the right shape.
- Test matrix is excellent. Round-trip + clean-upload-hash equivalence on every content-modifying test is the gold standard, and the codex regressions are captured (`test_mid_edit_plus_append`, `test_empty_original_append`, `test_truncate_to_empty_matches_clean_empty`).
My approval still stands; logging a few items here that came up on this pass:
- Orphan window file entries (still present from my March review). Each window's `cleaner.finish_with_chunks()` ultimately calls `register_single_file_clean_completion` → `add_file_reconstruction_info(fi)` (file_upload_session.rs:494), and then `register_composed_file` registers the composed MDB on top. So a single `upload_ranges` call writes N+1 file reconstruction entries to the session shard: one per window plus the composed one. Functionally fine — segments are reachable via the composed file — but the per-window entries are unreferenced and pollute the catalog, and v1 GC isn't designed to spot them. Worth confirming this is intentional for v1 and tracking the cleanup, or filing an issue if not yet logged.
- `compute_total_size` (range_upload.rs:437) does plain arithmetic on caller-supplied `new_length`. Since validation only checks `original_range.end <= original_size` (not totals), a malicious or buggy caller passing huge `new_length` values overflows `original_size + added`. A `checked_add` (+ `ParameterError`) would close that.
- nit: `hash_ranges.first().is_some_and(Option::is_none)` (range_upload.rs:285) reads obliquely — a single-line comment like `// leading gap is empty ⇒ first window starts at byte 0` would help future readers parse the double-Option intent.
- nit: `register_composed_file`'s lookup-by-hash path (collecting `mdb_by_hash` from `session.file_info_list()`) implicitly assumes window content-hashes are unique within the session. If two windows happened to produce identical content (unlikely but possible with synthetic data), the HashMap would dedupe them and the `.get(&middle_hash)` would still resolve, but the segment splice would reuse the same `middle_mdb` twice — which is actually correct since the content is identical. Worth a brief comment that this is intentional.
Nothing here changes my approve verdict — items 2-4 are small. Item 1 is the same flag from March; happy to proceed if the call is to ship and clean up later.
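The overflow-safe arithmetic suggested in item 2 could look roughly like this — hypothetical names, with `(Range, new_length)` pairs standing in for the real `DirtyInput` values:

```rust
use std::ops::Range;

/// Derive the composed file size from the edits, surfacing over/underflow
/// as an error instead of wrapping: total = original_size - removed + added.
fn compute_total_size(
    original_size: u64,
    edits: &[(Range<u64>, u64)], // (original_range, new_length)
) -> Result<u64, String> {
    let mut total = original_size;
    for (range, new_length) in edits {
        let removed = range.end - range.start;
        total = total
            .checked_sub(removed)
            .and_then(|t| t.checked_add(*new_length))
            .ok_or_else(|| "total size over/underflows u64".to_string())?;
    }
    Ok(total)
}
```

A huge caller-supplied `new_length` then becomes a parameter error rather than a silently wrapped size.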
Force-pushed 44815c9 to 1ba9082
upload_ranges was emitting composed shards without a verification section, which
cas-server now rejects with `MDBShard("Shard verification failure, missing
verification section")`. Root cause: the May 1 MerkleHashSubtree refactor
(a9d9cfe) dropped client-side computation of FileVerificationEntry for stable
original segments, expecting `original_mdb.verification` to be populated — but
GET /v1/reconstructions has always returned an empty verification list, so the
composed shard always went out without verification.
Fix: extend GET /v2/file-chunk-hashes/{file_id} to return one
FileVerificationEntry per stable original segment (segments wholly outside the
dirty windows), in segment order. range_upload pops these in lockstep with the
segment walk to populate the composed shard's verification section.
This is the minimum data needed: window segments are re-uploaded fresh so their
verification comes from the per-window MDB; only stable segments need the
server's help. ~32 bytes per stable segment, much cheaper than the legacy
"return the whole chunk hash list" endpoint that was replaced by the multi-range
API in 78de688.
Backwards-compat: when the original file has no verification entries
(legacy / test files registered before verification was introduced), the
server returns an empty `gap_verification` and the composed shard is emitted
with `with_verification=false` as before.
Pairs with a xetcas-side change to populate `gap_verification` in the response.
Sim clients (local_client, memory_client) populate it locally via
build_file_chunk_hashes_response so the existing 27 range_upload tests pass
unchanged.
Force-pushed 1ba9082 to 45f5937
The CAS server rejects any shard without a verification section. When the entire file falls within dirty windows (no stable segments), gap_verification is empty and original_mdb.verification is stripped, causing original_has_verification to be false and the composed shard to omit verification. Since the cleaner always produces verification entries for its segments, unconditionally set has_verification=true.
- Use checked arithmetic in compute_total_size to prevent overflow
- Add clarifying comments on the hash_ranges double-Option intent
- Add comment noting mdb_by_hash dedup-by-content is intentional
Add finish_with_chunks_detached() to SingleFileCleaner that uploads xorb data but returns the MDBFileInfo directly instead of registering it in the session shard. upload_ranges now uses the MDBFileInfo from each window directly for composition, so only the final composed file gets registered. This eliminates N unreferenced per-window entries that previously polluted the shard.
…ation

The server always returns this field; no backward compatibility needed.
… improve error msg

- Extract MDB assembly (segment splicing + verification) into compose_mdb()
- Remove hardcoded original_has_verification=true and its dead branches
- Include the file hash in the "file not found" error for easier debugging
…se error

nightly-2026-05-06 broke the __heap_base export needed by wasm-bindgen for threading support. Pin to the last known-good nightly and remove +nightly from build_wasm.sh so the CI-controlled toolchain is used.
…_heap_base error" This reverts commit 16022bd.
rajatarya left a comment
Re-review (post xetcas#987 merge)
This PR has come a long way since my approval on March 27. I went back and re-read the whole thing end-to-end against the new wire shape, and the rework is genuinely good — what shipped is cleaner than what I approved.
What's new since last pass
- Wire shape pivot. Composition is now driven by `windows[] + hash_ranges[]` of opaque `MerkleHashSubtree` summaries (xetcas#987), not flat per-chunk hashes. Per-chunk hashes never cross the wire, so the response stays O(windows + gaps) regardless of file size. Final hash is rebuilt with `MerkleHashSubtree::merge` — no `ChunkHashList`-glue layer, no duplicated tree-folding logic on the client.
- Resize edits. `DirtyInput { original_range, reader, new_length }` collapses in-place edit / replace / pure-insert / pure-delete / append / truncate into a single shape. The `original_range.len() != new_length` branch falls out cleanly thanks to per-edit readers (no `ReadSeek` over a sparse staging file — that bug class is structurally gone).
- My GC concern is addressed at the client layer. The new `register_single_file_clean_completion_detached` + `register_composed_file` split means window-uploads no longer pollute the session shard with orphan `MDBFileInfo` entries — only the final composed file is registered. Combined with sirahd's GC tracking issue (huggingface-internal/xet-garbage-collection#2), the design now does the right thing: window xorbs are findable through the composed file, and there are no dangling shard references.
- Defensive runtime invariants (commit 87b9061): server-contract violations (`windows.len() != n_windows`, `hash_ranges.len() != n_windows + 1`, leftover edits not assigned to any window, `gap_verification` length mismatch with stable-segment count) are now hard runtime errors instead of silent corruption. Exactly the right call for a composition algorithm where `zip` would otherwise drop windows quietly.
What upload_ranges offers, visually
End-to-end flow:
sequenceDiagram
autonumber
participant C as Caller (e.g. hf-mount)
participant U as upload_ranges
participant S as CAS server
participant D as Cleaner / dedup
C->>U: dirty_inputs[]<br/>(original_range, reader, new_length)
Note over U: validate + coalesce<br/>snap to segment boundaries
U->>S: GET /v2/file-chunk-hashes/{id}<br/>X-Range-Dirty: bytes=A-B,C-D
S-->>U: windows[N] + hash_ranges[N+1]<br/>+ gap_verification[stable_segs]
loop one cleaner per window
U->>S: stream CAS prefix (boundary chunks)
U->>D: feed prefix bytes
U->>D: feed reader bytes (the edit)
U->>S: stream CAS suffix
U->>D: feed suffix bytes
D-->>U: chunks + new MDBFileInfo (detached)
end
Note over U: merge_seq = [gap0, w0, gap1, ..., gapN]<br/>combined_hash = merge(seq).hmac(0)
U->>S: register_composed_file(MDBFileInfo)
U->>S: finalize session
U-->>C: XetFileInfo { hash, size }
Why this is a big deal — the wire savings:
flowchart LR
subgraph old ["Old: full re-upload"]
OF["200MB original"] --> OU["download 200MB"]
OU --> OE["edit 1MB"]
OE --> OUP["upload 200MB"]
end
subgraph new ["upload_ranges: 2MB edit"]
NF["200MB original"] --> NS["snap edits to<br/>segment boundaries"]
NS --> NW["server returns<br/>2 windows + 3 gap subtrees"]
NW --> NB["download ~few KB<br/>boundary chunks"]
NB --> NU["upload ~2MB<br/>(2 dirty windows)"]
end
style old fill:#ffe5e5
style new fill:#e5ffe5
Hash composition without re-reading the file:
flowchart TB
F["Original file: chunks 0..M"] --> G0["gap0 subtree<br/>(server-stored)"] & G1["gap1 subtree<br/>(server-stored)"] & G2["gap2 subtree<br/>(server-stored)"]
R["DirtyInput readers"] --> CL1["cleaner: window 0"] --> W0["w0 subtree<br/>(client-computed)"]
R --> CL2["cleaner: window 1"] --> W1["w1 subtree<br/>(client-computed)"]
G0 --> M{{"MerkleHashSubtree::merge<br/>[gap0, w0, gap1, w1, gap2]"}}
W0 --> M
G1 --> M
W1 --> M
G2 --> M
M --> H["final_hash().hmac(0)<br/>== file_hash of edited file"]
Edit-shape coverage (one struct, six operations):
flowchart LR
DI["DirtyInput<br/>{ original_range: a..b, new_length: n }"] --> Q1{"a == b?"}
Q1 -->|yes| Q2{"n > 0?"}
Q1 -->|no| Q3{"n == 0?"}
Q2 -->|yes| INS["pure insert at a"]
Q2 -->|no| NOOP["degenerate no-op"]
Q3 -->|yes| DEL["pure delete a..b"]
Q3 -->|no| Q4{"b - a == n?"}
Q4 -->|yes| IPE["in-place edit"]
Q4 -->|no| RES["resize replace"]
INS -.->|special: a == original_size| APP["append"]
DEL -.->|special: b == original_size| TRU["truncate to a"]
Items to think about (none blocking)
- `chunk_window_builder.rs` duplication. PR description flags this as intentional and temporary, mirroring xetcas#987 server logic so simulation clients answer `get_file_chunk_hashes` without HTTP. Worth filing a follow-up issue (or noting #717 in xetcas) to track the eventual dedup so the duplication doesn't quietly stick around, as duplications always do without a forcing function.
- Empty `gap_verification` on partial-verification shards. When `file_info.verification.len() != file_info.segments.len()`, the simulation builder silently emits an empty `gap_verification`, and then `compose_mdb` errors out at runtime with a less-targeted message. Inline below — either error up-front or document the silent path.
- One readability nit + one defensive question. Inline.
Nothing here blocks the merge — most of these are conversation starters. Approving again on the strength of the current state. Nice work.
        response.hash_ranges.len(),
        n_windows + 1
    )));
}
These three guards (windows count, hash_ranges count at L182, edits-not-assigned-to-window at L269, and the gap_verification length checks in compose_mdb) are exactly the right thing to do. A subtle composition algorithm where zip silently truncates the merge sequence is precisely where you want loud bails over quiet corruption. The error messages even include expected vs actual counts, which makes server contract drift trivial to diagnose.
.take_while(|d| {
    let r = &d.original_range;
    if r.start == r.end {
        r.start < w_end || (r.start == w_end && w_end == original_size)
The boundary case for pure inserts at exactly w_end is genuinely tricky — an insert at byte w_end belongs to the next window unless w_end == original_size (in which case there is no next window). The comment at L201-203 captures the why. Worth referencing test_resize_insert_at_segment_boundary and test_mid_edit_plus_append here so a future reader can find the exercising tests quickly. nit.
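A free-standing restatement of that predicate, for illustration only (the real code closes over mutable loop state rather than taking these as parameters):

```rust
/// Does an edit with [start, end) in original coordinates belong to the
/// window ending at w_end? A pure insert (start == end) at exactly w_end
/// belongs to the NEXT window, unless w_end == original_size — the EOF
/// case, where there is no next window to take it.
fn edit_belongs_to_window(start: u64, end: u64, w_end: u64, original_size: u64) -> bool {
    if start == end {
        // pure insert
        start < w_end || (start == w_end && w_end == original_size)
    } else {
        // non-empty edit: owned by the window containing its start
        start < w_end
    }
}
```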
    stream_cas_range(&ctx, &cas_client, original_hash, cursor, w_end, &mut cleaner).await?;
}

let (_info, chunks, mdb, _metrics) = cleaner.finish_with_chunks_detached().await?;
The _detached split is a real improvement — this resolves the GC concern I had on the prior pass. Window uploads now contribute their xorbs without registering an MDBFileInfo in the session shard, so only the final composed file shows up there. Combined with sirahd's tracking issue, the data lifecycle here is much cleaner than what I originally approved.
let mut hash_ranges = response.hash_ranges;
let trailing_gap = hash_ranges.pop().flatten();
// Leading gap is empty (None) => first window starts at byte 0.
let first_window_at_start = hash_ranges.first().is_some_and(Option::is_none);
nit: hash_ranges.first().is_some_and(Option::is_none) is concise but a bit cryptic — the reader has to remember Vec<Option<T>>::first() returns Option<&Option<T>> and reason about both layers. matches!(hash_ranges.first(), Some(None)) reads more directly. Same below for last_window_at_end if you want symmetry.
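For comparison, a self-contained version of the suggested spelling (generic over the element type; the real vector holds `Option<MerkleHashSubtree>`):

```rust
/// "Is the first element Some(None)?" — i.e. the leading gap is empty, so
/// the first window starts at byte 0. matches! states the two-layer Option
/// pattern directly instead of chaining is_some_and(Option::is_none).
fn first_window_at_start<T>(hash_ranges: &[Option<T>]) -> bool {
    matches!(hash_ranges.first(), Some(None))
}
```

Both spellings are equivalent; `matches!` just spares the reader from unpacking `Option<&Option<T>>` mentally.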
    gap_idx += 1;
    seg_idx += 1;
}
let original_window_end = w.end.min(original_size);
Defensive question, not a fix: under the current server contract, can w.end ever exceed original_size? The server returns chunk-aligned ranges within the original file, and the snap logic on the client also caps at original_size. If w.end > original_size is structurally impossible, this min is dead defense and a debug_assert!(w.end <= original_size) would document the invariant more loudly. If it can happen (e.g. some forward-compat shape), worth a one-line comment about which case justifies it.
// Emit one range hash per stable segment (= no overlap with any window).
// Segments and windows are both monotonic, so a two-pointer walk is O(S+W).
let gap_verification = if file_info.verification.len() == file_info.segments.len() {
The branch at L172 silently emits an empty `gap_verification` when `file_info.verification.len() != file_info.segments.len()`. Defensible for legacy shards that genuinely have no verification entries — but the failure mode for a partially-populated `file_info` would be a confusing runtime error in `compose_mdb` ("ran out of gap_verification entries at stable segment N") rather than something pointed at the actual cause. Two reasonable options:
- Tighten: emit empty only if `file_info.verification.is_empty()`, otherwise return `ClientError::Other(...)` here with a clearer message.
- Or document this branch's contract explicitly so the runtime error in `compose_mdb` is interpretable.
Not blocking — calling it out since the simulation-only path is what tests run against.
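A sketch of the two-pointer walk the comment above describes, using plain byte ranges in place of the real segment and window types:

```rust
use std::ops::Range;

/// Find "stable" segments: those with no overlap with any dirty window.
/// Segments and windows are both sorted and monotonic, so a single
/// two-pointer pass is O(S + W).
fn stable_segment_indices(segments: &[Range<u64>], windows: &[Range<u64>]) -> Vec<usize> {
    let mut stable = Vec::new();
    let mut w = 0;
    for (i, seg) in segments.iter().enumerate() {
        // advance past windows that end at or before this segment
        while w < windows.len() && windows[w].end <= seg.start {
            w += 1;
        }
        let overlaps = w < windows.len() && windows[w].start < seg.end;
        if !overlaps {
            stable.push(i);
        }
    }
    stable
}
```

In the response builder, each index found here would get one verification entry, in segment order.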
@@ -0,0 +1,205 @@
//! Server-side state machine for `GET /v2/file-chunk-hashes/{file_id}` (mirrored from
//! xetcas PR #987) plus a small driver helper used by simulation clients to produce a
//! [`FileChunkHashesResponse`] without routing through HTTP.
The doc comment is honest about the duplication-with-xetcas#987 situation, which I appreciate. Worth filing a follow-up issue to track the eventual dedup (either xetcas pulling this in, or this getting moved to a shared crate) so it doesn't quietly become a permanent fork. Duplications like this almost always do without a forcing function.
/// Distinct from the standard `Range` header (which scopes the response body): this header tags
/// regions that the client intends to re-chunk, and the response covers the whole file (windows +
/// gap subtrees). Value uses the same `bytes=A-B,C-D` syntax as `Range`.
pub const X_RANGE_DIRTY_HEADER: &str = "x-range-dirty";
nit: "x-range-dirty" lowercase is unusual when most header constants in this crate (e.g. SESSION_ID_HEADER at L17 = "X-Xet-Session-Id", REQUEST_ID_HEADER at L19 = "X-Request-Id") use the X-Foo-Bar capitalized form. HTTP headers are case-insensitive on the wire so this is purely cosmetic, but matching the surrounding style would be a small consistency win.
Summary
APIs for range-aware file writes: instead of re-uploading an entire file when only part of it changed, compose a new CAS file from stable segments + re-chunked dirty windows. Supports resize edits (insert / delete / arbitrary replace) in addition to in-place rewrites.
API:
`upload_ranges`

The output file size is derived from the inputs (no `total_size` parameter): `original_size - removed + added`.

Edit shapes (all expressible with the same struct):

| operation | `original_range` | `new_length` |
|---|---|---|
| in-place edit | `a..b` | `b - a` |
| pure insert | `p..p` | `> 0` |
| pure delete | `a..b` | `0` |
| append | `original_size..original_size` | `> 0` |
| truncate to N | `N..original_size` | `0` |

Edits are passed as `dirty_inputs`.

Motivating example:
Per-range `AsyncRead` instead of `ReadSeek` over the staging file

The earlier prototype took `dirty_ranges: &[(u64, u64)] + dirty_source: &mut dyn ReadSeek`. That had a subtle bug: for truncation we silently extended the dirty set with a boundary chunk and read those bytes from the staging file, but if the file was never opened for write, the staging file contains zeros at those positions (the real bytes are in CAS) → silent corruption on the truncation boundary chunk. Pairing each edit with its own reader makes that structurally impossible: any byte not provided by the caller is fetched from CAS.
High level
Step 1 — coalesce + snap edits to segment boundaries
Edits are user-coordinates (byte ranges). We snap each edit's
original_rangeto the enclosing CAS segments so composition can swap whole segments instead of truncating one mid-chunk. Adjacent / overlapping snapped ranges are then coalesced.Pure inserts (
start == end) snap to the segment that ownsstart; an insert atoriginal_sizesnaps to the last segment.Step 2 — server returns windows + gap subtrees
Single CAS call: `GET /v2/file-chunk-hashes/{file_id}` with the segment-aligned ranges in an `X-Range-Dirty: bytes=A-B,C-D` header. Response shape (xetcas#987): `windows[i].chunks` carries the chunk hashes the server actually owns for that window (we re-upload these bytes). `hash_ranges[i]` is the MerkleHashSubtree for the i-th unmodified gap, or `None` when there is no gap there. This is the key to composing the final file hash without touching unmodified bytes.

Step 3 — for each window, stream `[CAS prefix | edits | CAS suffix]` through a fresh cleaner

Pure inserts contribute zero original bytes but still emit `new_length` reader bytes. Pure deletes contribute zero reader bytes. The cleaner produces a new `MDBFileInfo` per window and a `ChunkHashList`.

Step 4 — compose the file hash via `MerkleHashSubtree::merge`

Special-case: if `total_size == 0` (e.g. truncate to empty) the result is `MerkleHash::default()` without HMAC, mirroring `file_hash([])`.

Step 5 — splice segments + register

Walk the original `MDBFileInfo.segments` and replace any segment that falls inside a window with that window's freshly-uploaded segments. Verification entries follow segment-for-segment when present. `metadata_ext = None` (no SHA-256, see Limitations). Then `register_composed_file` + `finalize`.

Multi-window example

Two edits: replace `[50MB, 51MB)` and `[150MB, 151MB)` on a 200MB file:

Empty original short-circuit

When `original_size == 0` there is nothing to compose against — every edit's `original_range` must be `0..0` (validated). We just stream the new bytes through a fresh cleaner (`upload_fresh_file`).

Reviewer note: `chunk_window_builder` is a re-implementation of xetcas

`xet_client/src/cas_client/chunk_window_builder.rs` is a port of the same window-building state machine that already lives in xetcas — it's only used by the local / in-memory simulation clients (`local_client`, `memory_client`) so the mock CAS server returns the same shape as the real one in tests. No need to re-review it as part of this PR: it mirrors logic already reviewed and merged in xetcas#987. A follow-up xetcas PR will deduplicate by removing the server-side copy and pulling this one in (or vice versa); the duplication is intentional and temporary.

Limitations
`metadata_ext = None` since recomputing SHA-256 would require reading the full file. Only suitable for contexts that don't require SHA-256 verification (HF buckets, xet-native repos), not for Git LFS-backed repos.

Tests (27)
Covering all edit shapes + edge cases. Notable:

- `test_resize_edits_abc`
- `test_resize_large_replace_grows_file` — `[a..b)` with much more data
- `test_resize_large_replace_shrinks_file` — `[a..b)` with much less data
- `test_resize_mid_file_insert`
- `test_resize_mid_file_delete`
- `test_resize_multi_edit_mix`
- `test_resize_insert_at_segment_boundary`
- `test_upload_ranges_mid_file_edit`
- `test_upload_ranges_truncation`
- `test_upload_ranges_truncation_empty_staging`
- `test_upload_ranges_truncation_with_overlapping_dirty`
- `test_truncate_to_empty_matches_clean_empty` — `MerkleHash::default()` (matches a fresh empty cleaner)
- `test_upload_ranges_append`
- `test_append_with_gap_before_dirty_range`
- `test_append_sparse_staging_file`
- `test_mid_edit_plus_append`
- `test_empty_original_append` — `original_size == 0` + append falls into the fresh-file path (P2 codex regression)
- `test_empty_original_validates_ranges` — `original_size == 0` still runs validation (reviewer regression)
- `test_upload_ranges_at_file_start`
- `test_upload_ranges_multiple_regions`
- `test_single_input_spanning_many_chunks`
- `test_data_integrity_scenarios`
- `test_noop_returns_original_hash` — `dirty_inputs` → no CAS call, original hash returned
- `test_rejects_dirty_range_past_total_size` — `original_size`
- `test_rejects_overlapping_dirty_ranges`
- `test_rejects_unsorted_dirty_ranges`
- `test_upload_ranges_small_file_mid_edit`

Dependencies

`GET /v2/file-chunk-hashes/{file_id}` with `windows[] + hash_ranges[]` response shape — huggingface-internal/xetcas#987 (merged).

Note
Medium Risk
Adds a new partial-upload composition path (`upload_ranges`) that changes how file hashes and reconstruction plans are computed and registered, which can impact data integrity if windowing/merge invariants are wrong. Scope is contained to new APIs and plumbing but touches core dedup/finalization return shapes.

Overview
Adds a new range-aware upload API, `upload_ranges`, which composes an updated file by re-uploading only chunk-aligned "dirty windows" while reusing stable CAS segments for the rest of the file (including resize edits like insert/delete/replace).

Introduces a new CAS client endpoint wrapper `get_file_chunk_hashes` (HTTP `GET /v2/file-chunk-hashes/{file_id}` using `X-Range-Dirty`) and new response types (`FileChunkHashesResponse`, `ChunkWindow`, `X_RANGE_DIRTY_HEADER`) to return dirty windows plus gap `MerkleHashSubtree` summaries (and segment verification hashes) needed to recompute the final file hash without transferring per-chunk hashes.

Extends the ingestion/dedup pipeline to support composition: `FileDeduper::finalize` now returns the per-file `ChunkHashList`, `SingleFileCleaner` gains `finish_with_chunks` / `finish_with_chunks_detached`, and `FileUploadSession` adds detached completion + `register_composed_file` so window uploads don't create orphan shard entries before the final composed `MDBFileInfo` is registered. Simulation clients implement `get_file_chunk_hashes` locally via the new `chunk_window_builder` helper, and tests are added/updated to cover the new edit shapes and regressions.

Reviewed by Cursor Bugbot for commit a1c8b19.