Skip to content

⚗️ Add .mtree support#2644

Merged
ChanTsune merged 2 commits into
mainfrom
cli/@/mtree
Jan 26, 2026
Merged

⚗️ Add .mtree support#2644
ChanTsune merged 2 commits into
mainfrom
cli/@/mtree

Conversation

@ChanTsune
Copy link
Copy Markdown
Owner

@ChanTsune ChanTsune commented Jan 24, 2026

Summary by CodeRabbit

  • New Features

    • Unix-only mtree manifest support with automatic detection between archive and mtree inputs.
    • Stdio-based manifest handling, mixed-source manifests, and optional-entry semantics.
    • Improved metadata handling honoring preserve/nochange for permissions, ownership, and timestamps.
  • Tests

    • Added extensive Unix-only unit and integration tests covering mtree manifests, mixed sources, format detection, metadata behavior, and error cases.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 24, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds Unix-only MTREE support: new cli/src/command/core/mtree.rs module and tests, mtree2 dependency under cfg(unix), format detection (SourceFormat + detect_format), and platform-gated dispatch in read_archive_source to handle PNA vs MTREE inputs.

Changes

Cohort / File(s) Summary
Dependency Addition
\cli/Cargo.toml``
Added mtree2 = "0.6.15" under cfg(unix) dependencies.
Core Format Dispatch Logic
\cli/src/command/core.rs``
Added #[cfg(unix)] pub(crate) mod mtree; introduced pub(crate) enum SourceFormat and pub(crate) fn detect_format; added platform-gated pub(crate) fn read_archive_source that auto-detects PNA vs MTREE and dispatches to transform_archive_entries or mtree::transform_mtree_entries; added detect_format tests.
Mtree Implementation
\cli/src/command/core/mtree.rs``
New Unix-only module implementing MTREE parsing and transformation: transform_mtree_entries, entry creators (file/dir/symlink), metadata application (including nochange semantics), optional-entry handling, filesystem fallbacks, and per-entry error propagation.
Test Inclusion
\cli/tests/cli/stdio.rs``
Added #[cfg(unix)] mod mtree; to include Unix-only MTREE tests.
Mtree Integration Tests
\cli/tests/cli/stdio/mtree.rs``
New comprehensive Unix-only integration tests for MTREE stdio support: manifests, /set defaults, contents=, mixed sources, symlinks, metadata/nochange behavior, and error cases.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI
  participant Core as read_archive_source
  participant Detector as detect_format
  participant Mtree as mtree::transform_mtree_entries
  participant FS as Filesystem

  CLI->>Core: read_archive_source(source, create_options, filter, time_filters, password)
  Core->>Detector: probe reader bytes to detect format
  Detector-->>Core: SourceFormat (Pna or Mtree)
  alt Pna
    Core->>Core: transform_archive_entries(reader, ...)
    Core-->>CLI: return NormalEntry results
  else Mtree
    Core->>Mtree: transform_mtree_entries(reader, create_options, filter, time_filters)
    Mtree->>FS: stat/read files, resolve metadata, read contents
    Mtree-->>Core: NormalEntry results (per-entry errors possible)
    Core-->>CLI: return NormalEntry results
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • @ support #2523 — Related changes to archive-reading flow and read_archive_source/transform_archive_entries that this PR extends with MTREE dispatch.

Poem

🐰 I sniffed the header, PNA or MTREE,
hopped through manifests beneath the tree.
I stitched in times, permissions, and name,
skipped the optional, kept the frame.
tap-tap, the archive springs to be!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main feature being added: support for the .mtree format to the CLI tool.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @ChanTsune, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the archive creation capabilities by integrating support for .mtree manifest files. Users can now leverage mtree manifests to define the structure and metadata of files to be archived, offering a granular level of control over properties like file types, permissions, ownership, and modification times. The system intelligently detects the input format, processing either standard PNA archives or the newly supported mtree manifests, and applies the specified metadata, ensuring accurate representation of the file hierarchy within the created archive. This feature is primarily implemented for Unix-based environments, aligning with the common usage of mtree in such systems.

Highlights

  • Mtree Manifest Support: Introduced the ability to create archives from .mtree manifest files, allowing precise control over archive contents and metadata.
  • Automatic Format Detection: Implemented logic to automatically detect whether an input source is a PNA archive or an mtree manifest based on magic bytes.
  • Metadata Preservation: Enabled the preservation of file metadata (timestamps, permissions, ownership) as specified in mtree manifests, with intelligent fallbacks to filesystem metadata.
  • Unix-Specific Features: Integrated mtree features like symlink handling and detailed ownership (UID, GID, uname, gname) specifically for Unix-like operating systems.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for .mtree files, a format for specifying file hierarchies, which is a significant feature for creating reproducible archives. The implementation includes auto-detection between PNA and mtree formats and is well-tested. My review focuses on improving code maintainability by refactoring duplicated logic.

Comment thread cli/src/command/core.rs
Comment thread cli/src/command/core/mtree.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@cli/Cargo.toml`:
- Line 54: Update the mtree2 dependency version string from "0.6.14" to "0.6.15"
in the Cargo.toml dependency entry for mtree2 to pick up the optional keyword
bugfix; locate the mtree2 entry (mtree2 = "0.6.14") and change the version to
mtree2 = "0.6.15".
🧹 Nitpick comments (1)
cli/tests/cli/stdio/mtree.rs (1)

23-64: Consider using tempfile crate for test directories.

Tests create directories in the current working directory without cleanup, which can leave test artifacts. Consider using tempfile::tempdir() for automatic cleanup and better isolation.

This is a minor improvement and not blocking - the unique directory names per test prevent conflicts.

Comment thread cli/Cargo.toml Outdated
@github-actions github-actions Bot added dependencies Pull requests that update a dependency file cli This issue is about cli application labels Jan 24, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@cli/src/command/core/mtree.rs`:
- Around line 311-427: apply_mtree_metadata currently prefers
mtree_entry.uid()/gid()/uname()/gname() over CLI overrides in
KeepOptions::owner_strategy, causing inconsistent precedence with
transform_normal_entry; change the resolution so that
options.uid/options.gid/options.uname/options.gname take precedence when
present—i.e., use options.uid/options.gid if Some(...) before falling back to
mtree_entry.uid()/gid(), and when building uname/gname prefer
options.uname/options.gname first, then mtree_entry.uname()/gname(), then
filesystem lookup (the functions to update are apply_mtree_metadata and the use
of OwnerStrategy::Preserve { options } with
mtree_entry.uid()/gid()/uname()/gname()).

In `@cli/tests/cli/stdio/mtree.rs`:
- Around line 264-270: Update the comment that references a placeholder mtree2
issue: either replace "https://github.com/Byron/mtree-rs/issues/XX" with the
actual upstream issue URL for the mtree2 bug (and keep the surrounding note and
test guidance), or remove the URL entirely if no issue exists; locate the
disabled test comment for stdio_mtree_optional_missing in mtree.rs (the block
referencing mtree2 version 0.6.14 and the `Keyword::Optional` behavior) and make
the replacement so the comment no longer contains the stale "issues/XX"
placeholder.
- Around line 405-433: The test stdio_mtree_missing_required_file_fails is flaky
because leftover nonexistent.txt from prior runs can make the test pass; fix by
ensuring the test uses a clean directory or a temp directory before creating
manifest.mtree—either create a fresh tempdir (tempfile::tempdir) for base or
explicitly remove any preexisting files (e.g.,
base.join("nonexistent.txt").exists() -> fs::remove_file) and recreate the base
directory prior to writing manifest.mtree so the manifest truly references a
missing file.

Comment thread cli/src/command/core/mtree.rs
Comment thread cli/tests/cli/stdio/mtree.rs Outdated
Comment thread cli/tests/cli/stdio/mtree.rs Outdated
@ChanTsune ChanTsune force-pushed the cli/@/mtree branch 5 times, most recently from 1efda3a to 5680528 Compare January 25, 2026 10:07
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@cli/src/command/core.rs`:
- Around line 50-61: detect_format currently uses reader.fill_buf which may
return fewer than PNA_HEADER.len() bytes on streams, causing mis-detection;
instead, read exactly PNA_HEADER.len() bytes into a temporary buffer (handling
short reads/EOF), then compare that buffer to PNA_HEADER to decide Pna vs Mtree,
and finally chain the temporary buffer back onto the original reader (e.g., via
an io::Cursor and chain) so the downstream code sees the full stream; update
detect_format to use this read-then-chain approach and reference PNA_HEADER,
detect_format, SourceFormat::Pna, and SourceFormat::Mtree when making the
change.
♻️ Duplicate comments (2)
cli/src/command/core/mtree.rs (1)

329-386: CLI owner overrides still lose to mtree metadata.
This makes @mtree behavior inconsistent with transform_normal_entry, where CLI overrides win. Prefer options.* first, then mtree, then FS.

🔧 Suggested precedence fix
-        let uid = if nochange {
-            options.uid.unwrap_or(fs_metadata.uid())
-        } else {
-            mtree_entry
-                .uid()
-                .or(options.uid)
-                .unwrap_or(fs_metadata.uid())
-        };
-        let gid = if nochange {
-            options.gid.unwrap_or(fs_metadata.gid())
-        } else {
-            mtree_entry
-                .gid()
-                .or(options.gid)
-                .unwrap_or(fs_metadata.gid())
-        };
+        let uid = if nochange {
+            options.uid.unwrap_or(fs_metadata.uid())
+        } else {
+            options
+                .uid
+                .or_else(|| mtree_entry.uid())
+                .unwrap_or(fs_metadata.uid())
+        };
+        let gid = if nochange {
+            options.gid.unwrap_or(fs_metadata.gid())
+        } else {
+            options
+                .gid
+                .or_else(|| mtree_entry.gid())
+                .unwrap_or(fs_metadata.gid())
+        };

         let uname = {
             let fallback = || -> io::Result<String> {
                 if let Some(ref uname) = options.uname {
                     Ok(uname.clone())
                 } else {
                     Ok(User::from_uid(uid.into())?
                         .name()
                         .unwrap_or_default()
                         .into())
                 }
             };
-            if nochange {
+            if options.uname.is_some() {
+                options.uname.clone().unwrap_or_default()
+            } else if nochange {
                 fallback()?
             } else if let Some(uname) = mtree_entry.uname() {
                 String::from_utf8_lossy(uname).into_owned()
             } else {
                 fallback()?
             }
         };

         let gname = {
             let fallback = || -> io::Result<String> {
                 if let Some(ref gname) = options.gname {
                     Ok(gname.clone())
                 } else {
                     Ok(Group::from_gid(gid.into())?
                         .name()
                         .unwrap_or_default()
                         .into())
                 }
             };
-            if nochange {
+            if options.gname.is_some() {
+                options.gname.clone().unwrap_or_default()
+            } else if nochange {
                 fallback()?
             } else if let Some(gname) = mtree_entry.gname() {
                 String::from_utf8_lossy(gname).into_owned()
             } else {
                 fallback()?
             }
         };
-        let uid = mtree_entry.uid().or(options.uid).unwrap_or(0);
-        let gid = mtree_entry.gid().or(options.gid).unwrap_or(0);
+        let uid = options.uid.or_else(|| mtree_entry.uid()).unwrap_or(0);
+        let gid = options.gid.or_else(|| mtree_entry.gid()).unwrap_or(0);

-        let uname = if let Some(uname) = mtree_entry.uname() {
-            String::from_utf8_lossy(uname).into_owned()
-        } else if let Some(ref uname) = options.uname {
-            uname.clone()
-        } else {
-            String::new()
-        };
-        let gname = if let Some(gname) = mtree_entry.gname() {
-            String::from_utf8_lossy(gname).into_owned()
-        } else if let Some(ref gname) = options.gname {
-            gname.clone()
-        } else {
-            String::new()
-        };
+        let uname = if let Some(ref uname) = options.uname {
+            uname.clone()
+        } else if let Some(uname) = mtree_entry.uname() {
+            String::from_utf8_lossy(uname).into_owned()
+        } else {
+            String::new()
+        };
+        let gname = if let Some(ref gname) = options.gname {
+            gname.clone()
+        } else if let Some(gname) = mtree_entry.gname() {
+            String::from_utf8_lossy(gname).into_owned()
+        } else {
+            String::new()
+        };

Also applies to: 414-434

cli/tests/cli/stdio/mtree.rs (1)

303-308: Prevent flakiness if nonexistent.txt already exists.
If a previous run left the file behind, this test can pass unexpectedly.

🧹 Suggested cleanup
 fn stdio_mtree_missing_required_file_fails() {
     setup();

     let base = PathBuf::from("stdio_mtree_missing_required_file_fails");
+    let _ = fs::remove_dir_all(&base);
     fs::create_dir_all(&base).unwrap();

Comment thread cli/src/command/core.rs
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@cli/src/command/core/mtree.rs`:
- Around line 329-345: apply_mtree_metadata is giving mtree_entry.uid()/gid()
precedence over CLI options, while transform_normal_entry applies CLI options
first; make precedence consistent by applying options.uid/options.gid before
falling back to mtree_entry values (respecting the nochange flag behavior).
Update the uid/gid selection in apply_mtree_metadata to mirror
transform_normal_entry: when nochange is true use
options.unwrap_or(fs_metadata...), otherwise check options.uid/options.gid
first, then mtree_entry.uid()/gid(), then fs_metadata.uid()/gid(); modify the
uid and gid assignments accordingly (referencing apply_mtree_metadata,
transform_normal_entry, mtree_entry.uid()/gid(), options.uid/options.gid(),
fs_metadata.uid()/gid(), and the nochange flag).
♻️ Duplicate comments (2)
cli/tests/cli/stdio/mtree.rs (1)

302-329: Potential test flakiness from leftover files.

If a previous test run or manual intervention leaves nonexistent.txt in the stdio_mtree_missing_required_file_fails directory, this test could incorrectly pass. Consider adding cleanup at the start.

🧹 Suggested cleanup
 fn stdio_mtree_missing_required_file_fails() {
     setup();

     let base = PathBuf::from("stdio_mtree_missing_required_file_fails");
+    let _ = fs::remove_dir_all(&base);
     fs::create_dir_all(&base).unwrap();
cli/src/command/core.rs (1)

50-62: Potential mis-detection of PNA on short reads from pipes/stdin.

fill_buf() can return fewer than 8 bytes even when more data will arrive (especially from pipes/stdin), causing a valid PNA stream to be incorrectly treated as mtree. Consider reading the header bytes explicitly and chaining them back into the reader.

💡 One approach (read + chain)
-pub(crate) fn detect_format<R: io::BufRead>(reader: &mut R) -> io::Result<SourceFormat> {
-    let buf = reader.fill_buf()?;
-    if buf.len() >= PNA_HEADER.len() && buf[..PNA_HEADER.len()] == *PNA_HEADER {
-        return Ok(SourceFormat::Pna);
-    }
-    Ok(SourceFormat::Mtree)
-}
+pub(crate) fn detect_format<R: io::BufRead>(reader: &mut R) -> io::Result<SourceFormat> {
+    // fill_buf may return fewer bytes than available on pipes, so keep filling
+    loop {
+        let buf = reader.fill_buf()?;
+        if buf.len() >= PNA_HEADER.len() {
+            return if buf[..PNA_HEADER.len()] == *PNA_HEADER {
+                Ok(SourceFormat::Pna)
+            } else {
+                Ok(SourceFormat::Mtree)
+            };
+        }
+        if buf.is_empty() {
+            // EOF reached with < 8 bytes: cannot be PNA
+            return Ok(SourceFormat::Mtree);
+        }
+        // Buffer not yet full but not EOF - this shouldn't happen with BufReader
+        // but for safety, treat as mtree
+        break;
+    }
+    Ok(SourceFormat::Mtree)
+}

Comment thread cli/src/command/core/mtree.rs Outdated
@ChanTsune ChanTsune force-pushed the cli/@/mtree branch 2 times, most recently from 002cd9a to 0fc7d20 Compare January 25, 2026 13:31
Comment thread cli/src/command/core/mtree.rs Fixed
Comment thread cli/src/command/core/mtree.rs Fixed
@ChanTsune ChanTsune merged commit 0875bcb into main Jan 26, 2026
100 checks passed
@ChanTsune ChanTsune deleted the cli/@/mtree branch January 26, 2026 04:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cli This issue is about cli application dependencies Pull requests that update a dependency file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants