⚗️ Add .mtree support#2644
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAdds Unix-only MTREE support: new Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI
participant Core as read_archive_source
participant Detector as detect_format
participant Mtree as mtree::transform_mtree_entries
participant FS as Filesystem
CLI->>Core: read_archive_source(source, create_options, filter, time_filters, password)
Core->>Detector: probe reader bytes to detect format
Detector-->>Core: SourceFormat (Pna or Mtree)
alt Pna
Core->>Core: transform_archive_entries(reader, ...)
Core-->>CLI: return NormalEntry results
else Mtree
Core->>Mtree: transform_mtree_entries(reader, create_options, filter, time_filters)
Mtree->>FS: stat/read files, resolve metadata, read contents
Mtree-->>Core: NormalEntry results (per-entry errors possible)
Core-->>CLI: return NormalEntry results
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @ChanTsune, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the archive creation capabilities by integrating support for .mtree manifest files. Users can now leverage mtree manifests to define the structure and metadata of files to be archived, offering a granular level of control over properties like file types, permissions, ownership, and modification times. The system intelligently detects the input format, processing either standard PNA archives or the newly supported mtree manifests, and applies the specified metadata, ensuring accurate representation of the file hierarchy within the created archive. This feature is primarily implemented for Unix-based environments, aligning with the common usage of mtree in such systems. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces support for .mtree files, a format for specifying file hierarchies, which is a significant feature for creating reproducible archives. The implementation includes auto-detection between PNA and mtree formats and is well-tested. My review focuses on improving code maintainability by refactoring duplicated logic.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cli/Cargo.toml`:
- Line 54: Update the mtree2 dependency version string from "0.6.14" to "0.6.15"
in the Cargo.toml dependency entry for mtree2 to pick up the optional keyword
bugfix; locate the mtree2 entry (mtree2 = "0.6.14") and change the version to
mtree2 = "0.6.15".
🧹 Nitpick comments (1)
cli/tests/cli/stdio/mtree.rs (1)
23-64: Consider usingtempfilecrate for test directories.Tests create directories in the current working directory without cleanup, which can leave test artifacts. Consider using
tempfile::tempdir()for automatic cleanup and better isolation.This is a minor improvement and not blocking - the unique directory names per test prevent conflicts.
cb26c9b to
95f5665
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@cli/src/command/core/mtree.rs`:
- Around line 311-427: apply_mtree_metadata currently prefers
mtree_entry.uid()/gid()/uname()/gname() over CLI overrides in
KeepOptions::owner_strategy, causing inconsistent precedence with
transform_normal_entry; change the resolution so that
options.uid/options.gid/options.uname/options.gname take precedence when
present—i.e., use options.uid/options.gid if Some(...) before falling back to
mtree_entry.uid()/gid(), and when building uname/gname prefer
options.uname/options.gname first, then mtree_entry.uname()/gname(), then
filesystem lookup (the functions to update are apply_mtree_metadata and the use
of OwnerStrategy::Preserve { options } with
mtree_entry.uid()/gid()/uname()/gname()).
In `@cli/tests/cli/stdio/mtree.rs`:
- Around line 264-270: Update the comment that references a placeholder mtree2
issue: either replace "https://github.com/Byron/mtree-rs/issues/XX" with the
actual upstream issue URL for the mtree2 bug (and keep the surrounding note and
test guidance), or remove the URL entirely if no issue exists; locate the
disabled test comment for stdio_mtree_optional_missing in mtree.rs (the block
referencing mtree2 version 0.6.14 and the `Keyword::Optional` behavior) and make
the replacement so the comment no longer contains the stale "issues/XX"
placeholder.
- Around line 405-433: The test stdio_mtree_missing_required_file_fails is flaky
because leftover nonexistent.txt from prior runs can make the test pass; fix by
ensuring the test uses a clean directory or a temp directory before creating
manifest.mtree—either create a fresh tempdir (tempfile::tempdir) for base or
explicitly remove any preexisting files (e.g.,
base.join("nonexistent.txt").exists() -> fs::remove_file) and recreate the base
directory prior to writing manifest.mtree so the manifest truly references a
missing file.
1efda3a to
5680528
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cli/src/command/core.rs`:
- Around line 50-61: detect_format currently uses reader.fill_buf which may
return fewer than PNA_HEADER.len() bytes on streams, causing mis-detection;
instead, read exactly PNA_HEADER.len() bytes into a temporary buffer (handling
short reads/EOF), then compare that buffer to PNA_HEADER to decide Pna vs Mtree,
and finally chain the temporary buffer back onto the original reader (e.g., via
an io::Cursor and chain) so the downstream code sees the full stream; update
detect_format to use this read-then-chain approach and reference PNA_HEADER,
detect_format, SourceFormat::Pna, and SourceFormat::Mtree when making the
change.
♻️ Duplicate comments (2)
cli/src/command/core/mtree.rs (1)
329-386: CLI owner overrides still lose to mtree metadata.
This makes@mtreebehavior inconsistent withtransform_normal_entry, where CLI overrides win. Preferoptions.*first, then mtree, then FS.🔧 Suggested precedence fix
- let uid = if nochange { - options.uid.unwrap_or(fs_metadata.uid()) - } else { - mtree_entry - .uid() - .or(options.uid) - .unwrap_or(fs_metadata.uid()) - }; - let gid = if nochange { - options.gid.unwrap_or(fs_metadata.gid()) - } else { - mtree_entry - .gid() - .or(options.gid) - .unwrap_or(fs_metadata.gid()) - }; + let uid = if nochange { + options.uid.unwrap_or(fs_metadata.uid()) + } else { + options + .uid + .or_else(|| mtree_entry.uid()) + .unwrap_or(fs_metadata.uid()) + }; + let gid = if nochange { + options.gid.unwrap_or(fs_metadata.gid()) + } else { + options + .gid + .or_else(|| mtree_entry.gid()) + .unwrap_or(fs_metadata.gid()) + }; let uname = { let fallback = || -> io::Result<String> { if let Some(ref uname) = options.uname { Ok(uname.clone()) } else { Ok(User::from_uid(uid.into())? .name() .unwrap_or_default() .into()) } }; - if nochange { + if options.uname.is_some() { + options.uname.clone().unwrap_or_default() + } else if nochange { fallback()? } else if let Some(uname) = mtree_entry.uname() { String::from_utf8_lossy(uname).into_owned() } else { fallback()? } }; let gname = { let fallback = || -> io::Result<String> { if let Some(ref gname) = options.gname { Ok(gname.clone()) } else { Ok(Group::from_gid(gid.into())? .name() .unwrap_or_default() .into()) } }; - if nochange { + if options.gname.is_some() { + options.gname.clone().unwrap_or_default() + } else if nochange { fallback()? } else if let Some(gname) = mtree_entry.gname() { String::from_utf8_lossy(gname).into_owned() } else { fallback()? } };- let uid = mtree_entry.uid().or(options.uid).unwrap_or(0); - let gid = mtree_entry.gid().or(options.gid).unwrap_or(0); + let uid = options.uid.or_else(|| mtree_entry.uid()).unwrap_or(0); + let gid = options.gid.or_else(|| mtree_entry.gid()).unwrap_or(0); - let uname = if let Some(uname) = mtree_entry.uname() { - String::from_utf8_lossy(uname).into_owned() - } else if let Some(ref uname) = options.uname { - uname.clone() - } else { - String::new() - }; - let gname = if let Some(gname) = mtree_entry.gname() { - String::from_utf8_lossy(gname).into_owned() - } else if let Some(ref gname) = options.gname { - gname.clone() - } else { - String::new() - }; + let uname = if let Some(ref uname) = options.uname { + uname.clone() + } else if let Some(uname) = mtree_entry.uname() { + String::from_utf8_lossy(uname).into_owned() + } else { + String::new() + }; + let gname = if let Some(ref gname) = options.gname { + gname.clone() + } else if let Some(gname) = mtree_entry.gname() { + String::from_utf8_lossy(gname).into_owned() + } else { + String::new() + };Also applies to: 414-434
cli/tests/cli/stdio/mtree.rs (1)
303-308: Prevent flakiness ifnonexistent.txtalready exists.
If a previous run left the file behind, this test can pass unexpectedly.🧹 Suggested cleanup
fn stdio_mtree_missing_required_file_fails() { setup(); let base = PathBuf::from("stdio_mtree_missing_required_file_fails"); + let _ = fs::remove_dir_all(&base); fs::create_dir_all(&base).unwrap();
5680528 to
cac6a5e
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cli/src/command/core/mtree.rs`:
- Around line 329-345: apply_mtree_metadata is giving mtree_entry.uid()/gid()
precedence over CLI options, while transform_normal_entry applies CLI options
first; make precedence consistent by applying options.uid/options.gid before
falling back to mtree_entry values (respecting the nochange flag behavior).
Update the uid/gid selection in apply_mtree_metadata to mirror
transform_normal_entry: when nochange is true use
options.unwrap_or(fs_metadata...), otherwise check options.uid/options.gid
first, then mtree_entry.uid()/gid(), then fs_metadata.uid()/gid(); modify the
uid and gid assignments accordingly (referencing apply_mtree_metadata,
transform_normal_entry, mtree_entry.uid()/gid(), options.uid/options.gid(),
fs_metadata.uid()/gid(), and the nochange flag).
♻️ Duplicate comments (2)
cli/tests/cli/stdio/mtree.rs (1)
302-329: Potential test flakiness from leftover files.If a previous test run or manual intervention leaves
nonexistent.txtin thestdio_mtree_missing_required_file_failsdirectory, this test could incorrectly pass. Consider adding cleanup at the start.🧹 Suggested cleanup
fn stdio_mtree_missing_required_file_fails() { setup(); let base = PathBuf::from("stdio_mtree_missing_required_file_fails"); + let _ = fs::remove_dir_all(&base); fs::create_dir_all(&base).unwrap();cli/src/command/core.rs (1)
50-62: Potential mis-detection of PNA on short reads from pipes/stdin.
fill_buf()can return fewer than 8 bytes even when more data will arrive (especially from pipes/stdin), causing a valid PNA stream to be incorrectly treated as mtree. Consider reading the header bytes explicitly and chaining them back into the reader.💡 One approach (read + chain)
-pub(crate) fn detect_format<R: io::BufRead>(reader: &mut R) -> io::Result<SourceFormat> { - let buf = reader.fill_buf()?; - if buf.len() >= PNA_HEADER.len() && buf[..PNA_HEADER.len()] == *PNA_HEADER { - return Ok(SourceFormat::Pna); - } - Ok(SourceFormat::Mtree) -} +pub(crate) fn detect_format<R: io::BufRead>(reader: &mut R) -> io::Result<SourceFormat> { + // fill_buf may return fewer bytes than available on pipes, so keep filling + loop { + let buf = reader.fill_buf()?; + if buf.len() >= PNA_HEADER.len() { + return if buf[..PNA_HEADER.len()] == *PNA_HEADER { + Ok(SourceFormat::Pna) + } else { + Ok(SourceFormat::Mtree) + }; + } + if buf.is_empty() { + // EOF reached with < 8 bytes: cannot be PNA + return Ok(SourceFormat::Mtree); + } + // Buffer not yet full but not EOF - this shouldn't happen with BufReader + // but for safety, treat as mtree + break; + } + Ok(SourceFormat::Mtree) +}
002cd9a to
0fc7d20
Compare
0fc7d20 to
02e9934
Compare
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.