✨ Serial creation#2635
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughThreads numeric indices through the entry-result pipeline, adds a ReorderByIndex iterator to reorder (idx, EntryResult) pairs, updates spawn/drain to emit/consume indexed results, and updates create/update code and tests to ensure deterministic archive entry ordering. Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI/Input
participant Spawn as spawn_entry_results
participant Chan as Channel
participant Reorder as ReorderByIndex
participant Drain as drain_entry_results
participant Writer as ArchiveWriter
Note over CLI,Writer: Indexed ordering pipeline
CLI->>Spawn: enumerate items -> (idx, item)
Spawn->>Chan: send (idx, EntryResult)
Chan->>Reorder: stream of (idx, EntryResult)
Reorder->>Reorder: buffer out-of-order entries (HashMap)
Reorder->>Drain: emit EntryResult in ascending idx order
Drain->>Writer: write deterministic entries
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @ChanTsune, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical aspect of archive creation and modification: ensuring deterministic ordering of entries. By introducing the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces an OrderedByIndex iterator to ensure entries are processed in the order they were initially provided, addressing potential issues with parallel processing disrupting the intended sequence. This change affects the create and update commands, as well as their corresponding tests, to guarantee consistent entry ordering.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
Note Docstrings generation - SUCCESS |
Docstrings generation was requested by @ChanTsune. * #2635 (comment) The following files were modified: * `cli/src/command/core.rs` * `cli/src/command/core/iter.rs` * `cli/src/command/create.rs` * `cli/src/command/update.rs` * `cli/tests/cli/append/entry_order.rs` * `cli/tests/cli/create/entry_order.rs` * `cli/tests/cli/update/entry_order.rs`
* 📝 Add docstrings to `cli/serial-creation` Docstrings generation was requested by @ChanTsune. * #2635 (comment) The following files were modified: * `cli/src/command/core.rs` * `cli/src/command/core/iter.rs` * `cli/src/command/create.rs` * `cli/src/command/update.rs` * `cli/tests/cli/append/entry_order.rs` * `cli/tests/cli/create/entry_order.rs` * `cli/tests/cli/update/entry_order.rs` * ♻️ Keep only iter.rs docstrings and add unit tests - Revert docstring changes to core.rs, create.rs, update.rs, and test files - Keep OrderedByIndex documentation in iter.rs - Replace doc examples with actual unit tests in mod tests --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: ChanTsune <41658782+ChanTsune@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cli/src/command/update.rs`:
- Around line 539-543: target_files_mapping creation currently uses enumerate on
target_items which allows duplicate EntryName keys to overwrite earlier entries
and leave index gaps that break OrderedByIndex; deduplicate the input paths
before indexing or rebuild contiguous indices when constructing
target_files_mapping: ensure you iterate target_items preserving first
occurrence of each EntryName (e.g., track seen EntryName values and skip
duplicates) or collect into a Vec of unique (EntryName, item) and then
re-enumerate to assign contiguous indices before creating the IndexMap used by
OrderedByIndex; apply the same dedup/reindex approach to the other mappings
mentioned (the remaining-items and non-memmap cases) so no index gaps remain.
🧹 Nitpick comments (2)
cli/src/command/core/iter.rs (2)
76-85: Add a fail-fast signal when indices are missing.If the underlying iterator ends while
bufferstill holds items, gaps cause silent drops. A debug assertion (or log) makes this easier to detect during development.💡 Suggested guard
- None => return None, + None => { + debug_assert!( + self.buffer.is_empty(), + "OrderedByIndex: missing index {} while buffered items remain", + self.next + ); + return None; + }
94-129: Optional: add a test for missing/duplicate indices.Locking the documented gap/duplicate behavior would make future refactors safer.
* 📝 Add docstrings to `cli/serial-creation` Docstrings generation was requested by @ChanTsune. * #2635 (comment) The following files were modified: * `cli/src/command/core.rs` * `cli/src/command/core/iter.rs` * `cli/src/command/create.rs` * `cli/src/command/update.rs` * `cli/tests/cli/append/entry_order.rs` * `cli/tests/cli/create/entry_order.rs` * `cli/tests/cli/update/entry_order.rs` * ♻️ Keep only iter.rs docstrings and add unit tests - Revert docstring changes to core.rs, create.rs, update.rs, and test files - Keep OrderedByIndex documentation in iter.rs - Replace doc examples with actual unit tests in mod tests --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: ChanTsune <41658782+ChanTsune@users.noreply.github.com>
438c91b to
407c9fb
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cli/src/command/core/iter.rs`:
- Around line 15-19: The iterator currently buffers items whose index is less
than self.next and later rewinds to emit them, which breaks the ascending-order
guarantee; change the logic in the iterator implementation (the code paths that
handle incoming idx, check self.next, and insert into the buffer) to drop/ignore
any item where idx < self.next instead of buffering it, and update the
top-of-file doc comment to explicitly state that late duplicates are ignored
(last-wins only for items seen before emission) so consumers know no rewinding
occurs. Ensure references to self.next, the buffer insertion code, and any
emit/flush routines are adjusted to skip late indices and do not trigger
rewinding.
* 📝 Add docstrings to `cli/serial-creation` Docstrings generation was requested by @ChanTsune. * #2635 (comment) The following files were modified: * `cli/src/command/core.rs` * `cli/src/command/core/iter.rs` * `cli/src/command/create.rs` * `cli/src/command/update.rs` * `cli/tests/cli/append/entry_order.rs` * `cli/tests/cli/create/entry_order.rs` * `cli/tests/cli/update/entry_order.rs` * ♻️ Keep only iter.rs docstrings and add unit tests - Revert docstring changes to core.rs, create.rs, update.rs, and test files - Keep OrderedByIndex documentation in iter.rs - Replace doc examples with actual unit tests in mod tests --------- Co-Authored-By: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-Authored-By: ChanTsune <41658782+ChanTsune@users.noreply.github.com>
407c9fb to
5f8f38d
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@cli/tests/cli/create/entry_order.rs`:
- Around line 333-405: The test create_split_preserves_entry_order may never
trigger splitting because test payloads compress too well; modify the test so
the split is actually exercised by (a) writing less-compressible data (e.g.,
random bytes) for large_file and medium_file instead of repeated bytes, and/or
(b) adding an assertion after creating the split archive (via
cli::Cli::try_parse_from(... "c" ... "--split" "400KB" ...).execute()) that
multiple split parts exist (inspect the filesystem for part files or check
archive metadata) before running cli::Cli::try_parse_from(... "concat"
...).execute(), and keep references to the same file variables (large_file,
medium_file, small_file) and the final validation using archive::for_each_entry.
- Around line 68-143: The test create_preserves_walkdir_order currently only
checks that entry_names == entry_names2 (determinism) but doesn't assert the
actual expected traversal; update the test to compute or hardcode the expected
walkdir depth-first order (based on the CLI's traversal policy) and add an
assert_eq!(entry_names, expected_order) (and optionally the same for
entry_names2) before or alongside the existing determinism check; reference the
variables entry_names/entry_names2, the test function
create_preserves_walkdir_order, the CLI invocation via cli::Cli::try_parse_from
and the archive reader archive::for_each_entry to locate where to insert the
expected_order assertion so the test verifies both correctness and determinism.
♻️ Duplicate comments (1)
cli/src/command/core/iter.rs (1)
15-19: Late duplicate indices break ascending-order guarantee.The documentation states duplicate indices result in "last item retained," but if a duplicate index arrives after it has already been emitted (
idx < self.next), it's buffered and may be emitted later via the exhausted-branch rewind (lines 82-86), breaking the ascending-order contract.Also applies to: 88-95
🧹 Nitpick comments (1)
cli/src/command/core/iter.rs (1)
102-174: Tests provide good coverage of ordering scenarios.The test suite covers key edge cases including gaps and reverse ordering. Consider adding a test for duplicate indices to document the expected behavior when the same index appears multiple times.
🧪 Suggested test for duplicate indices
#[test] fn handles_duplicate_indices() { // Second item with index 1 should overwrite the first let items = vec![(0, 'a'), (1, 'b'), (1, 'B'), (2, 'c')]; let out: Vec<_> = ReorderByIndex::new(items.into_iter()).collect(); assert_eq!(out, vec!['a', 'B', 'c']); }
* 📝 Add docstrings to `cli/serial-creation` Docstrings generation was requested by @ChanTsune. * #2635 (comment) The following files were modified: * `cli/src/command/core.rs` * `cli/src/command/core/iter.rs` * `cli/src/command/create.rs` * `cli/src/command/update.rs` * `cli/tests/cli/append/entry_order.rs` * `cli/tests/cli/create/entry_order.rs` * `cli/tests/cli/update/entry_order.rs` * ♻️ Keep only iter.rs docstrings and add unit tests - Revert docstring changes to core.rs, create.rs, update.rs, and test files - Keep OrderedByIndex documentation in iter.rs - Replace doc examples with actual unit tests in mod tests --------- Co-Authored-By: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-Authored-By: ChanTsune <41658782+ChanTsune@users.noreply.github.com>
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.