♻️ Extract SplitArchiveReader to unify memmap/streaming dispatch#2704
Conversation
📝 WalkthroughWalkthroughIntroduces a new SplitArchiveReader abstraction and refactors many CLI commands to use it, replacing direct archive/memmap handling and prior public run_entries/run_transform_entry exports with source-based iteration and transformation APIs across the CLI command modules. Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI Command
participant SAR as SplitArchiveReader
participant Core as core helpers
participant FS as Files/Mmap
CLI->>SAR: SplitArchiveReader::new(collect_split_archives(...))
CLI->>SAR: source.for_each_entry(password, processor) / transform_entries(...)
SAR->>FS: iterate files or mmaps (cfg-specific)
SAR->>Core: call run_read_entries* / run_transform_entry(...)
Core->>SAR: return processed entries/results
SAR->>CLI: yield entries / write transformed output
CLI->>SAR: drop(source)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @ChanTsune, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the archive processing logic within the CLI by introducing a Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request does a great job of refactoring the archive reading logic by introducing the SplitArchiveReader. This significantly reduces boilerplate code across multiple command files and unifies the handling for memmap and streaming reads. The changes are clean and follow the stated goal. I've found a couple of areas for improvement regarding the consistency of the new abstraction and some potential dead code.
I am having trouble creating individual review comments. Click here to see my feedback.
cli/src/command/core/archive_source.rs (33-52)
The use of self.files.drain(..) makes this method destructive for the not(feature = "memmap") case, meaning it can only be called once. This is inconsistent with the memmap version of this function, which is reusable. This could lead to unexpected behavior if the reader is used multiple times when the memmap feature is disabled.
To ensure consistent behavior, the method should be made non-destructive. This can be achieved by iterating over references to the files (.iter()) and seeking to the beginning of each file before use.
You'll need to bring std::io::Seek into scope to use file.seek(...).
This change should also be applied to for_each_entry and for_each_read_entry for consistency.
pub(crate) fn transform_entries<W, F, S>(
&mut self,
writer: W,
password: Option<&[u8]>,
processor: F,
strategy: S,
) -> anyhow::Result<()>
where
W: io::Write,
F: FnMut(io::Result<NormalEntry>) -> io::Result<Option<NormalEntry>>,
S: TransformStrategy,
{
for file in &mut self.files {
file.seek(io::SeekFrom::Start(0))?;
}
super::run_transform_entry(
writer,
self.files.iter(),
|| password,
processor,
strategy,
)
}cli/src/command/list.rs (519-530)
After this refactoring, the run_list_archive function appears to be unused. The logic has been moved directly into the list_archive function using the new SplitArchiveReader. Consider removing run_list_archive to clean up dead code.
d656d42 to
bccd204
Compare
Introduce SplitArchiveReader in cli/src/command/core/archive_source.rs to encapsulate #[cfg(feature = "memmap")] branching into a single location. The struct provides three methods (transform_entries, for_each_entry, for_each_read_entry) with dual cfg'd implementations, replacing the 6-line mmap boilerplate that was duplicated across 11 command files. - Migrated: delete, chmod, chown, strip, migrate, acl, xattr, diff, sort, list, update, stdio - Unified: run_update_archive from two cfg'd copies into one (takes SplitArchiveReader) - Removed: run_list_archive_mem, non-memmap run_entries, run_read_entries/run_read_entries_mem imports in update.rs - Made private: run_transform_entry (only used by archive_source.rs)
bccd204 to
33c165b
Compare
Introduce SplitArchiveReader in cli/src/command/core/archive_source.rs to encapsulate #[cfg(feature = "memmap")] branching into a single location.
The struct provides three methods (transform_entries, for_each_entry, for_each_read_entry) with dual cfg'd implementations, replacing the 6-line mmap boilerplate that was duplicated across 11 command files.
Summary by CodeRabbit