Skip to content

@ support#2523

Merged
ChanTsune merged 1 commit into
mainfrom
cli/stdio/@
Jan 18, 2026
Merged

@ support#2523
ChanTsune merged 1 commit into
mainfrom
cli/stdio/@

Conversation

@ChanTsune
Copy link
Copy Markdown
Owner

@ChanTsune ChanTsune commented Dec 28, 2025

Redesign @archive inclusion to preserve CLI argument order exactly. The previous implementation separated filesystem paths and @Archives before processing, which lost the original ordering.

Key changes:

  • Add ItemSource, ArchiveSource, CollectedItem types to core.rs
  • Add collect_items_from_sources for unified processing
  • Process arguments one by one, maintaining order
  • Support @- for reading archives from stdin
  • Detect stdin conflicts (input archive from stdin vs @-)

Summary by CodeRabbit

  • New Features

    • Accept archive inputs via @ (and @- for stdin), preserving interleaved filesystem+archive order; parsing happens after directory changes.
    • Archive inclusion supports passwords, time filters, and ownership/timestamp overrides; archives can be expanded as single or batched entries for writing.
  • Bug Fixes

    • Prevent duplicate stdin sources.
  • Tests

    • Added comprehensive archive-inclusion tests for merging, ordering, filtering, extraction, and integrity.
  • Documentation

    • Improved user-facing docs/help for archive-source usage.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Dec 28, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds archive-aware item sources and streaming entry handling: new types (ArchiveSource, ItemSource, CollectedEntry, crate-private CollectedItem::ArchiveMarker, EntryResult, TimeFilters), helpers (collect_items_from_sources, transform_archive_entries, read_archive_source, transform_normal_entry, drain_entry_results), and wires archive-source handling through create/append/update and tests.

Changes

Cohort / File(s) Summary
Core types & helpers
cli/src/command/core.rs
Adds ArchiveSource (File/Stdin, Display), ItemSource (Filesystem/Archive) with parsing, introduces CollectedEntry and crate-private CollectedItem (Filesystem/ArchiveMarker), adds EntryResult and TimeFilters, PermissionStrategyResolver::has_overrides(), and functions collect_items_from_sources, transform_archive_entries, read_archive_source, transform_normal_entry, plus doc comments and tests.
StdIO CLI integration
cli/src/command/stdio.rs
Parse ItemSource after -C; use collect_items_from_sources; propagate filter, time_filters, and password into create/append flows; add validate_no_duplicate_stdin.
Create pipeline
cli/src/command/create.rs
Replace per-item rayon spawning with spawn_entry_results/drain_entry_results; create flows now accept filter, time_filters, password; consume EntryResult::Single/Batch streams and integrate archive-source reading.
Append pipeline
cli/src/command/append.rs
Switch to spawn_entry_results/drain_entry_results; run_append_archive signature updated to accept filter, time_filters, password; target items mapped to CollectedItem::Filesystem before archiving.
Update pipeline
cli/src/command/update.rs
Replace Vec<CollectedItem>Vec<CollectedEntry> in run_update_archive signatures and imports.
Tests
cli/tests/cli/stdio.rs, cli/tests/cli/stdio/archive_inclusion.rs
Add archive_inclusion test module with helpers (create_test_archive, get_archive_entry_names) and extensive tests covering archive inclusion, ordering, filters, solid/append modes, errors, and integrity.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI (stdio)
    participant Parser as ItemSource Parser
    participant Collector as collect_items_from_sources
    participant FSys as Filesystem
    participant Archive as ArchiveSource Reader
    participant Worker as spawn_entry_results / transformer
    participant Writer as Archive Writer

    CLI->>Parser: parse_many(sources) after -C
    Parser-->>CLI: Vec<ItemSource>
    CLI->>Collector: collect_items_from_sources(sources, filter, time_filters)
    alt filesystem source
        Collector->>FSys: stat/read path
        FSys-->>Collector: CollectedEntry (Filesystem)
    else archive source
        Collector->>Archive: emit ArchiveMarker(ArchiveSource)
    end
    Collector-->>Worker: stream CollectedItem (Filesystem | ArchiveMarker)
    Worker->>Archive: transform_archive_entries(reader, filter, time_filters)  %% for ArchiveMarker
    Archive-->>Worker: Batch of NormalEntry items
    Worker->>Writer: EntryResult::Single/Batch stream
    Writer->>Writer: drain_entry_results -> add_entry / write
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

"I hop through paths both new and old,
I stash the files in batches bold,
Markers twitch and filters hum,
Entries stream — the archive's come,
A rabbit cheers: the build's not cold. 🐇"

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title '@' support is overly vague and generic, using minimal descriptive information that doesn't clearly convey the main change. Consider a more descriptive title like 'Preserve CLI argument order for archive inclusion' or 'Add @archive support with argument order preservation' to better communicate the primary objective.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @ChanTsune, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a fundamental redesign of how command-line arguments are processed when creating and appending archives. The primary goal is to ensure that the order of specified files and included archives is maintained exactly as provided by the user. This enhancement also brings support for reading archive contents directly from standard input and includes robust conflict detection for stdin/stdout usage, making the archiving tool more flexible and predictable.

Highlights

  • Preserved CLI Argument Order: The implementation has been redesigned to ensure that the order of command-line arguments, including both filesystem paths and @archive inclusions, is strictly preserved during archive creation and appending. This addresses a previous limitation where the original ordering was lost.
  • New Data Structures for Item Handling: Introduced ItemSource, ArchiveSource, and CollectedItem enums in core.rs to represent different types of input (filesystem paths, archive files, or stdin for archives) and to facilitate unified, order-preserving processing.
  • Stdin Archive Support: Added support for reading archive entries from standard input using the @- syntax, following bsdtar's convention.
  • Stdin Conflict Detection: Implemented checks to detect and prevent conflicts, such as attempting to read an @- archive from stdin while simultaneously writing the output archive to stdout, or when the input archive for an append operation is also from stdin.
  • Refactored Archive Creation/Appending Logic: The run_create_archive and run_append functions in stdio.rs have been significantly refactored to integrate the new item collection logic and handle mixed sources, including dedicated functions for normal and solid archive creation with inclusions.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant and well-designed feature to support @archive inclusions, preserving argument order. The new data structures like ItemSource and CollectedItem are a great way to handle the mixed sources. The implementation is thorough and includes a comprehensive set of new tests.

My review focuses on a few key areas:

  • A correctness issue with handling encrypted source archives in non-solid mode.
  • Minor improvements for code clarity and efficiency in branches where no archive inclusions are present.

Additionally, the new test suite for archive inclusion is excellent, but it would be even stronger if it included cases for encrypted source archives. This would have likely caught the issue mentioned above.

Overall, this is a solid contribution that greatly enhances the tool's capabilities.

Comment thread cli/src/command/stdio.rs Outdated
Comment thread cli/src/command/stdio.rs Outdated
Comment thread cli/src/command/stdio.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
cli/src/command/core.rs (1)

369-380: Consider edge case: literal @ path handling.

The current implementation treats any argument starting with @ as an archive inclusion. If a user has a legitimate filesystem path starting with @ (rare but possible), they cannot archive it.

Consider documenting this behavior or providing an escape mechanism (e.g., ./@path works as a workaround since it doesn't start with @).

cli/src/command/stdio.rs (1)

1179-1184: Consider including entry names in verbose output.

The verbose output only shows "a (from @{source})" without the actual entry name. For consistency with other verbose outputs (e.g., line 1118, line 1207), consider showing the entry path.

However, since raw_entries() returns opaque raw entries, accessing the path may require additional parsing. If this would require significant changes, the current approach is acceptable.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2f9d47 and fcaba85.

📒 Files selected for processing (4)
  • cli/src/command/core.rs
  • cli/src/command/stdio.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (2)
cli/tests/cli/stdio/archive_inclusion.rs (3)
cli/tests/cli/utils/archive.rs (1)
  • for_each_entry (122-127)
lib/src/archive/write.rs (1)
  • write_header (85-88)
lib/src/entry/builder.rs (1)
  • new_file (196-215)
cli/src/command/stdio.rs (2)
cli/src/command/core.rs (6)
  • collect_items_from_sources (420-444)
  • create_entry (680-750)
  • entry_option (752-772)
  • parse_many (383-385)
  • has_stdin_archive (388-392)
  • new (197-218)
cli/src/command/append.rs (1)
  • open_archive_then_seek_to_end (473-477)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: build-local-artifacts (x86_64-pc-windows-msvc)
  • GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip1)
  • GitHub Check: Test WebAssembly (beta, wasm32-unknown-unknown)
  • GitHub Check: tier1 (macos-latest, nightly)
  • GitHub Check: tier1 (ubuntu-22.04-arm, beta)
  • GitHub Check: tier3_cross (redoxos/redoxer, x86_64-unknown-redox)
  • GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip2)
  • GitHub Check: Test WebAssembly (beta, wasm32-unknown-unknown)
  • GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-unknown-emscripten)
  • GitHub Check: Test WebAssembly (beta, wasm32-wasip2)
  • GitHub Check: Test WebAssembly (stable, wasm32-unknown-unknown)
  • GitHub Check: Test WebAssembly (beta, wasm32-unknown-emscripten)
  • GitHub Check: Test WebAssembly (stable, wasm32-wasip2)
  • GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-unknown-unknown)
  • GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip1)
  • GitHub Check: Test WebAssembly (beta, wasm32-wasip1)
  • GitHub Check: tier1 (ubuntu-latest, beta)
  • GitHub Check: msrv (ubuntu-latest, libpna)
  • GitHub Check: msrv (ubuntu-latest, portable-network-archive)
🔇 Additional comments (12)
cli/tests/cli/stdio.rs (1)

1-6: LGTM!

The new test module declarations are appropriately organized. The archive_inclusion module aligns with the PR's core feature for @ support, and strip_components/unlink_first modules provide additional test coverage.

cli/tests/cli/stdio/archive_inclusion.rs (1)

1-474: Comprehensive test coverage for the new @ archive inclusion feature.

The test suite covers all essential scenarios:

  • Basic inclusion, multiple archives, solid mode, append mode
  • Error cases (non-existent archives, stdin/stdout conflicts)
  • Data integrity with larger content
  • Ordering preservation - the key fix this PR addresses

The stdio_archive_inclusion_ordering test at lines 363-474 is particularly important as it validates the fix for the rayon parallel processing ordering issue mentioned in the PR objectives.

cli/src/command/core.rs (4)

293-299: LGTM!

Adding Clone and Debug derives to StoreAs is appropriate for the new CollectedItem type which contains StoreAs and needs these traits.


301-346: LGTM!

The ArchiveReader and ArchiveSource types are well-designed:

  • ArchiveReader avoids dynamic dispatch by using an enum for file/stdin sources
  • The 64KB buffer size is reasonable for archive I/O
  • Display implementation provides clean output for logging/error messages

382-392: LGTM!

parse_many and has_stdin_archive are straightforward utility methods with correct implementations.


395-444: LGTM!

The CollectedItem enum and collect_items_from_sources function correctly implement order-preserving collection:

  • Sequential processing maintains CLI argument order
  • Shared HardlinkResolver enables cross-path hardlink detection
  • ArchiveMarker placeholders preserve insertion points for archive entries

This addresses the PR's core objective of preserving CLI argument order.

cli/src/command/stdio.rs (6)

523-531: LGTM!

The order of operations is correct:

  1. Parse @archive paths before changing working directory (so paths are canonicalized correctly)
  2. Check for stdin/stdout conflict early with a clear error message

615-654: LGTM!

The branch handling for @archive inclusions is well-structured:

  • Uses WriteOptions::store() for solid mode entries (compression is applied at the solid block level)
  • Properly passes cli_option for the solid container's write options
  • Handles both file and stdout output cases

898-908: LGTM!

Consistent pattern with the create flow. The error message correctly differentiates the conflict scenario for append mode (stdin is used for the input archive, not output).


1017-1041: LGTM!

Clean implementation that processes items in order, correctly handling both filesystem entries and archive markers.


1067-1167: LGTM!

The archive creation functions are well-structured:

  • Clear separation between normal and solid mode paths
  • Solid mode correctly uses SolidEntryBuilder to accumulate entries before building the solid block
  • Both paths maintain the ordering guarantee

1190-1217: LGTM!

Correctly decodes and re-encodes entries for solid mode. The password parameter is properly used for encrypted source archives, and verbose output includes entry names.

Comment thread cli/src/command/stdio.rs Outdated
@github-actions github-actions Bot added the cli This issue is about cli application label Dec 28, 2025
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
cli/src/command/stdio.rs (3)

659-665: Use map with unreachable! instead of filter_map.

Since has_archive_inclusions is false in this branch, all items are guaranteed to be CollectedItem::Filesystem variants. Using filter_map is unnecessary and obscures this invariant.

Based on past review comments, replace with:

🔎 Proposed refactor
-        let target_items: Vec<_> = collected_items
-            .into_iter()
-            .filter_map(|item| match item {
-                CollectedItem::Filesystem(path, store_as) => Some((path, store_as)),
-                CollectedItem::ArchiveMarker(_) => None,
-            })
-            .collect();
+        let target_items: Vec<_> = collected_items
+            .into_iter()
+            .map(|item| match item {
+                CollectedItem::Filesystem(path, store_as) => (path, store_as),
+                CollectedItem::ArchiveMarker(_) => unreachable!("has_archive_inclusions is false"),
+            })
+            .collect();

997-1003: Use map with unreachable! instead of filter_map.

Since has_archive_inclusions is false in this branch, all items are guaranteed to be CollectedItem::Filesystem variants. Using filter_map is unnecessary and obscures this invariant.

Based on past review comments, replace with:

🔎 Proposed refactor
-        let target_items: Vec<_> = collected_items
-            .into_iter()
-            .filter_map(|item| match item {
-                CollectedItem::Filesystem(path, store_as) => Some((path, store_as)),
-                CollectedItem::ArchiveMarker(_) => None,
-            })
-            .collect();
+        let target_items: Vec<_> = collected_items
+            .into_iter()
+            .map(|item| match item {
+                CollectedItem::Filesystem(path, store_as) => (path, store_as),
+                CollectedItem::ArchiveMarker(_) => unreachable!("has_archive_inclusions is false"),
+            })
+            .collect();

1237-1288: Preserve extra chunks in rebuild_entry_for_solid.

The function preserves metadata and xattrs but does not preserve extra chunks (e.g., ACL chunks faCl, faCe). These chunks will be silently dropped when entries are included in a solid archive.

Based on past review comments and patterns in migrate.rs:96-101 and strip.rs:143-148, add extra chunk preservation:

🔎 Proposed fix
     // Apply xattrs
     for xattr in xattrs {
         builder.add_xattr(xattr.clone());
     }
+
+    // Apply extra chunks (e.g., ACL chunks)
+    for chunk in entry.extra_chunks() {
+        builder.add_extra_chunk(chunk.clone());
+    }

     builder.build()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fcaba85 and 82f7c6d.

📒 Files selected for processing (4)
  • cli/src/command/core.rs
  • cli/src/command/stdio.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • cli/tests/cli/stdio.rs
🧰 Additional context used
🧬 Code graph analysis (2)
cli/tests/cli/stdio/archive_inclusion.rs (5)
cli/tests/cli/utils/archive.rs (1)
  • for_each_entry (122-127)
lib/src/entry/header.rs (1)
  • path (80-82)
lib/src/archive/write.rs (1)
  • write_header (85-88)
lib/src/entry/builder.rs (1)
  • new_file (196-215)
lib/src/chunk/types.rs (1)
  • len (106-108)
cli/src/command/stdio.rs (1)
cli/src/command/core.rs (4)
  • collect_items_from_sources (422-446)
  • create_entry (682-752)
  • has_stdin_archive (390-394)
  • new (197-218)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: tier1 (windows-11-arm, beta)
  • GitHub Check: tier1 (windows-latest, stable)
  • GitHub Check: Analyze (rust)
  • GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip2)
  • GitHub Check: tier1 (macos-latest, stable)
  • GitHub Check: tier1 (windows-11-arm, nightly)
  • GitHub Check: tier1 (windows-11-arm, stable)
  • GitHub Check: tier1 (macos-latest, nightly)
  • GitHub Check: tier1 (windows-11-arm, beta)
  • GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-unknown-unknown)
  • GitHub Check: Test WebAssembly (beta, wasm32-wasip2)
  • GitHub Check: Test WebAssembly (stable, wasm32-wasip1)
  • GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip1)
  • GitHub Check: Test WebAssembly (beta, wasm32-unknown-unknown)
  • GitHub Check: Test WebAssembly (beta, wasm32-wasip1)
  • GitHub Check: Test WebAssembly (stable, wasm32-unknown-unknown)
  • GitHub Check: msrv (ubuntu-latest, libpna)
  • GitHub Check: msrv (ubuntu-latest, portable-network-archive)
🔇 Additional comments (17)
cli/src/command/core.rs (7)

23-23: LGTM!

The fmt import is necessary for the Display implementation on ArchiveSource.


293-299: LGTM!

The StoreAs enum correctly represents the different storage strategies for filesystem items.


301-315: LGTM!

The ArchiveReader enum avoids dynamic dispatch overhead while supporting both file and stdin sources. The Read trait implementation correctly delegates to the underlying reader.


350-395: LGTM!

The ItemSource parsing correctly handles the @archive syntax, including @- for stdin. Canonicalizing archive paths (line 376) before changing working directory (via -C) is the right approach to avoid path resolution issues.


397-407: LGTM!

The CollectedItem enum cleanly separates filesystem items from archive markers, enabling order-preserving interleaved processing.


422-446: LGTM!

The function correctly preserves CLI argument order while sharing a HardlinkResolver across all filesystem sources for proper cross-path hardlink detection. The sequential processing ensures predictable ordering.


324-337: This pattern is safe and is already verified by existing tests.

The io::stdin().lock() pattern is safe because io::stdin() returns a thread-local handle to the static stdin resource. The returned StdinLock is valid for accessing that static resource, even though the immediate Stdin value is temporary. The comprehensive test suite in cli/tests/cli/stdio/archive_inclusion.rs (including stdio_archive_inclusion_basic, stdio_archive_inclusion_solid, stdio_archive_inclusion_data_integrity, and others) already verifies that stdin archive inclusion via @- works correctly across various scenarios.

cli/tests/cli/stdio/archive_inclusion.rs (4)

1-8: LGTM!

The imports are appropriate for the test functionality.


9-27: LGTM!

The helper function creates test archives with appropriate defaults. Using unwrap() in test helpers is acceptable.


29-36: LGTM!

The helper function provides a clean way to extract entry names for test assertions.


38-474: LGTM!

The test suite comprehensively covers archive inclusion functionality:

  • Basic and multiple inclusions
  • Solid mode support
  • Append mode
  • Error handling (nonexistent archives, stdin conflicts)
  • Data integrity verification
  • Order preservation

The tests are well-structured and verify both success and failure scenarios.

cli/src/command/stdio.rs (6)

10-14: LGTM!

The import additions are necessary for the archive inclusion functionality.

Also applies to: 26-35


523-531: LGTM!

The stdin conflict detection correctly prevents using @- (archive from stdin) when writing output to stdout, which would cause a resource conflict. The error message is clear.


1021-1046: LGTM!

The function correctly processes items in order, handling filesystem entries and archive markers appropriately. The verbose output helps with debugging.


1073-1109: LGTM!

The dispatcher function cleanly separates solid and normal archive creation logic.


1111-1141: LGTM!

The function correctly creates a normal archive with proper ordering and finalization.


1143-1178: LGTM!

The function correctly creates a solid archive, properly using the SolidEntryBuilder and maintaining entry order.

Comment thread cli/src/command/stdio.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
cli/tests/cli/stdio/archive_inclusion.rs (1)

1-754: Optional: Consider extracting common test patterns.

The test suite is comprehensive and well-structured. For maintainability, consider extracting the common CLI command invocation pattern into a helper function to reduce duplication. However, the current structure is clear and acceptable.

Example helper pattern (optional)
fn run_pna_create(output: &Path, base_dir: &Path, args: &[&str], extra_flags: &[&str]) -> assert_cmd::assert::Assert {
    let mut cmd = cargo_bin_cmd!("pna");
    cmd.args(["--quiet", "experimental", "stdio", "--create", "--unstable", "--overwrite"])
        .arg("-f").arg(output)
        .arg("-C").arg(base_dir)
        .args(extra_flags)
        .args(args);
    cmd.assert()
}

This would reduce the repetitive 10+ line command construction blocks throughout the tests.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c82c935 and 3d6cb52.

📒 Files selected for processing (1)
  • cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (1)
cli/tests/cli/stdio/archive_inclusion.rs (3)
cli/tests/cli/utils/archive.rs (1)
  • for_each_entry (122-127)
lib/src/entry/header.rs (1)
  • path (80-82)
lib/src/archive/write.rs (1)
  • write_header (85-88)
🔇 Additional comments (11)
cli/tests/cli/stdio/archive_inclusion.rs (11)

1-36: Well-structured test helpers.

The imports and helper functions are well-designed. create_test_archive and get_archive_entry_names provide clear abstractions that reduce duplication across the test suite.


38-94: LGTM: Comprehensive basic inclusion test.

The test correctly verifies basic @archive inclusion functionality, including both filesystem and archive sources. The use of HashSet for order-independent verification is appropriate for this scenario.


96-145: LGTM: Multiple archive inclusion test.

The test correctly verifies that multiple @archive inclusions can be processed in a single command, and all entries are preserved.


147-221: LGTM: Solid mode inclusion test with extraction verification.

The test properly verifies that archive inclusion works with solid mode and includes extraction verification to ensure data integrity. This is a thorough test of the feature interaction.


223-259: LGTM: Append mode inclusion test.

The test correctly verifies that @archive inclusion works in append mode, preserving both the original and newly included entries.


261-307: LGTM: Proper error case coverage.

Both error tests are correctly structured:

  • Non-existent archive inclusion properly fails
  • Stdin/stdout conflict detection works as expected

These tests ensure the feature handles error conditions gracefully.


309-361: LGTM: Data integrity verification.

The test properly verifies that archive inclusion preserves data integrity for larger content, including full round-trip extraction verification.


363-474: Excellent coverage of the core PR objective.

This test directly validates the fix for preserving CLI argument order when interleaving filesystem paths with @archive inclusions. The precise index-based assertions and the extraction verification make this a thorough test of the feature's key requirement.


476-591: LGTM: Filter interaction tests.

Both the exclude and include filter tests properly verify that filtering patterns work correctly with @archive inclusion, ensuring entries from included archives respect the specified filters.


593-754: LGTM: Comprehensive edge case coverage.

The final three tests provide excellent coverage of edge cases:

  • Archive-first ordering ensures order preservation works in all argument positions
  • Empty archive inclusion ensures graceful handling of edge cases
  • Combined filters with clear precedence rules (exclude wins) and well-documented expected behavior

These tests complete a thorough test suite for the archive inclusion feature.


43-44: Test artifacts are properly isolated. The setup() function (called at the start of each test) changes the current working directory to env!("CARGO_TARGET_TMPDIR"), a temporary directory managed by Cargo. All test artifacts created with PathBuf::from(...) are therefore isolated in this temporary location, not in the repository or working directory. No changes required.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
cli/tests/cli/stdio/archive_inclusion.rs (3)

75-75: Consider extracting a helper for @archive argument formatting.

The pattern format!("@{}", archive_path.to_str().unwrap()) is repeated throughout the tests. Extracting this into a helper function would improve maintainability and reduce duplication.

🔎 Proposed helper function

Add this helper near the top of the file:

fn archive_arg(path: &Path) -> String {
    format!("@{}", path.to_str().unwrap())
}

Then replace occurrences like:

-            &format!("@{}", source_archive.to_str().unwrap()),
+            &archive_arg(&source_archive),

Also applies to: 131-132, 182-182, 249-249, 334-334, 403-403, 511-511, 571-571, 625-625, 675-675, 727-727, 791-791


319-319: Consider using a named constant for the large content size.

The magic number 10000 for testing data integrity could be more self-documenting as a named constant.

🔎 Proposed refactor
+const LARGE_CONTENT_SIZE: usize = 10000;
+
 fn stdio_archive_inclusion_data_integrity() {
     // ...
-    let large_content = "x".repeat(10000);
+    let large_content = "x".repeat(LARGE_CONTENT_SIZE);

141-144: Consider adding descriptive assertion messages for consistency.

Some tests have detailed assertion messages (e.g., lines 84-92, 520-532) while others use simple assertions without messages. Adding consistent messages would make test failures easier to diagnose.

For example:

-    assert!(entry_names.contains("new.txt"));
-    assert!(entry_names.contains("from_source1.txt"));
-    assert!(entry_names.contains("from_source2.txt"));
-    assert_eq!(entry_names.len(), 3);
+    assert!(entry_names.contains("new.txt"), "Missing new.txt");
+    assert!(entry_names.contains("from_source1.txt"), "Missing from_source1.txt");
+    assert!(entry_names.contains("from_source2.txt"), "Missing from_source2.txt");
+    assert_eq!(entry_names.len(), 3, "Expected exactly 3 entries");

Also applies to: 191-193, 256-258

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3d6cb52 and 11b9c28.

📒 Files selected for processing (1)
  • cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (1)
cli/tests/cli/stdio/archive_inclusion.rs (4)
cli/tests/cli/utils/archive.rs (1)
  • for_each_entry (122-127)
lib/src/entry/header.rs (1)
  • path (80-82)
lib/src/archive/write.rs (1)
  • write_header (85-88)
lib/src/entry/builder.rs (1)
  • new_file (196-215)
🔇 Additional comments (1)
cli/tests/cli/stdio/archive_inclusion.rs (1)

9-36: LGTM! Well-designed helper functions.

The helper functions are clean and reusable. create_test_archive correctly handles parent directory creation and archive writing, while get_archive_entry_names provides a convenient way to verify archive contents.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
cli/tests/cli/stdio/archive_inclusion.rs (2)

38-841: Comprehensive test coverage with opportunities for refactoring.

The test suite provides excellent coverage of archive inclusion scenarios, including:

  • Basic and multiple archive inclusion
  • Solid mode and append operations
  • Error cases (non-existent archives, stdin/stdout conflicts)
  • Ordering preservation
  • Filter interactions (include/exclude patterns)

The tests are well-structured and logically correct. However, there's significant code duplication across test functions, particularly in:

  • Archive creation patterns
  • CLI command invocation
  • Verification steps
Consider extracting common test patterns

You could reduce duplication by introducing helper functions or macros for common operations:

// Helper for running create command with common args
fn run_create_command(output: &Path, args: &[&str]) -> assert_cmd::assert::Assert {
    let mut cmd = cargo_bin_cmd!("pna");
    cmd.args(["--quiet", "experimental", "stdio", "--create", "--unstable", "--overwrite", "-f", output.to_str().unwrap()]);
    cmd.args(args);
    cmd.assert()
}

// Helper for extraction verification
fn extract_and_verify(archive: &Path, extract_dir: &Path) {
    cargo_bin_cmd!("pna")
        .args([
            "--quiet", "experimental", "stdio", "--extract",
            "--unstable", "--overwrite",
            "-f", archive.to_str().unwrap(),
            "--out-dir", extract_dir.to_str().unwrap(),
        ])
        .assert()
        .success();
}

This would make tests more concise and easier to maintain.


319-319: Consider documenting the test data size.

The value 10000 is used to create "large" content for data integrity testing. Adding a constant or comment explaining the choice would improve test clarity.

+    // Use 10KB to verify data integrity with moderately sized content
     let large_content = "x".repeat(10000);
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11b9c28 and ca1dc25.

📒 Files selected for processing (1)
  • cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (1)
cli/tests/cli/stdio/archive_inclusion.rs (4)
cli/tests/cli/utils/archive.rs (1)
  • for_each_entry (122-127)
lib/src/entry/header.rs (1)
  • path (80-82)
lib/src/archive/write.rs (1)
  • write_header (85-88)
lib/src/entry/builder.rs (1)
  • new_file (196-215)
🔇 Additional comments (2)
cli/tests/cli/stdio/archive_inclusion.rs (2)

1-36: Well-structured helper functions.

The imports and helper functions are clear and appropriate for the test module. Both create_test_archive and get_archive_entry_names provide reusable utilities that simplify the test cases.


3-3: No action required. Library versions are current and secure.

The specified versions—pna 0.29.3 (workspace dependency) and assert_cmd 2.1.1—are the latest available and have no known security vulnerabilities as of December 2025.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
cli/src/command/stdio.rs (2)

1189-1208: [Acknowledged] Critical issue already flagged: Entry metadata mismatch.

This function has been flagged in previous reviews for not rebuilding entries after decryption. The FHED header metadata may not match the actual entry data state, causing potential corruption when the destination archive has different encryption settings.

The fix would involve rebuilding entries (similar to append_archive_entries_to_solid) to ensure header metadata matches the actual data state. However, since this has already been thoroughly reviewed and documented, no additional verification is needed at this time.


1235-1286: [Acknowledged] Extra chunks not preserved.

This function has been flagged in previous reviews for not preserving extra chunks (e.g., ACL chunks faCl, faCe). These would be silently dropped when entries are included in a solid archive.

The fix would involve iterating over entry.extra_chunks() and calling builder.add_extra_chunk(chunk.clone()) for each, following the pattern in migrate.rs and strip.rs. Since this has already been documented, no additional verification is needed.

🧹 Nitpick comments (2)
cli/src/command/stdio.rs (2)

657-663: Consider using map instead of filter_map for clarity.

Since has_archive_inclusions is false in this branch, we know that collected_items only contains CollectedItem::Filesystem variants. Using map with unreachable!() for the ArchiveMarker case would express this invariant more clearly.

🔎 Proposed refactor
-        let target_items: Vec<_> = collected_items
-            .into_iter()
-            .filter_map(|item| match item {
-                CollectedItem::Filesystem(path, store_as) => Some((path, store_as)),
-                CollectedItem::ArchiveMarker(_) => None,
-            })
-            .collect();
+        let target_items: Vec<_> = collected_items
+            .into_iter()
+            .map(|item| match item {
+                CollectedItem::Filesystem(path, store_as) => (path, store_as),
+                CollectedItem::ArchiveMarker(_) => unreachable!("no archive markers when has_archive_inclusions is false"),
+            })
+            .collect();

993-999: Consider using map instead of filter_map for clarity.

Since has_archive_inclusions is false in this branch, using map with unreachable!() for the ArchiveMarker case would express this invariant more clearly.

🔎 Proposed refactor
-        let target_items: Vec<_> = collected_items
-            .into_iter()
-            .filter_map(|item| match item {
-                CollectedItem::Filesystem(path, store_as) => Some((path, store_as)),
-                CollectedItem::ArchiveMarker(_) => None,
-            })
-            .collect();
+        let target_items: Vec<_> = collected_items
+            .into_iter()
+            .map(|item| match item {
+                CollectedItem::Filesystem(path, store_as) => (path, store_as),
+                CollectedItem::ArchiveMarker(_) => unreachable!("no archive markers when has_archive_inclusions is false"),
+            })
+            .collect();
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca1dc25 and a724d8d.

📒 Files selected for processing (4)
  • cli/src/command/core.rs
  • cli/src/command/stdio.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (1)
cli/src/command/stdio.rs (3)
cli/src/command/core.rs (5)
  • collect_items_from_sources (390-414)
  • create_entry (650-720)
  • parse_many (353-355)
  • has_stdin_archive (358-362)
  • new (197-218)
cli/src/command/create.rs (1)
  • create_archive_file (510-572)
cli/src/command/append.rs (1)
  • run_append_archive (444-470)
🔇 Additional comments (15)
cli/src/command/core.rs (4)

301-316: LGTM! Clean enum design.

The ArchiveSource enum clearly represents the two archive input sources (file or stdin), and the Display implementation correctly formats "-" for stdin and the path display for files.


365-375: LGTM! Clear separation of concerns.

The CollectedItem enum cleanly separates filesystem items (requiring entry building) from archive markers (requiring entry copying), making the processing logic straightforward.


390-414: LGTM! Correct order preservation and hardlink detection.

The function correctly:

  • Shares a single HardlinkResolver across all sources for cross-path hardlink detection
  • Preserves CLI argument order by processing sources sequentially
  • Separates filesystem collection from archive marker insertion

339-350: The canonicalization error handling is already correct and produces clear error messages.

The fs::canonicalize() call properly propagates io::Error through the ? operator, which gets converted to anyhow::Error in main() and displayed to the user with the appropriate OS error message (e.g., "No such file or directory" for missing files). The early canonicalization before working directory changes is intentional and documented at lines 337–338. No changes needed.

cli/tests/cli/stdio.rs (1)

1-1: LGTM! Test module addition.

The new archive_inclusion test module aligns with the PR's feature addition.

cli/src/command/stdio.rs (10)

11-14: LGTM! Necessary imports for @archive support.

The new imports from core module are all used in the inclusion-aware creation and append flows.


523-531: LGTM! Correct ordering and conflict detection.

The early parsing of @archive paths before the -C working directory change ensures canonicalization works correctly. The stdin/stdout conflict check prevents a critical error scenario.


558-561: LGTM! Correct sequencing documented.

The comment clearly explains why working directory change happens after parsing @archive paths.


575-576: LGTM! Unified collection for mixed sources.

Using collect_items_from_sources instead of collect_items_from_paths enables processing both filesystem and archive sources while preserving CLI argument order.


898-908: LGTM! Consistent conflict detection for append mode.

The early parsing and stdin conflict detection mirrors the create mode logic, correctly preventing conflicts when the input archive comes from stdin.


1017-1046: LGTM! Order-preserving append logic.

The function correctly processes CollectedItem variants in sequence, preserving CLI argument order while delegating to appropriate handlers for filesystem entries and archive inclusions.


1073-1105: LGTM! Clean delegation to mode-specific handlers.

The function appropriately routes to solid or normal archive creation based on the solid flag.


1107-1141: LGTM! Order-preserving normal archive creation.

The function correctly processes items in sequence, creating filesystem entries and copying archive entries in the order specified on the command line.


1143-1187: LGTM! Correct solid archive creation with inclusions.

The function properly:

  • Uses WriteOptions::store() for individual entries (since compression happens at solid level)
  • Adds entries to the solid builder in order
  • Finalizes the solid block and wraps it in an archive

1210-1233: LGTM! Correct entry rebuilding for solid mode.

This function properly rebuilds entries via rebuild_entry_for_solid before adding them to the solid builder, ensuring metadata consistency.

@github-actions github-actions Bot added the lib This issue is about lib crate label Jan 6, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI Agents
In @cli/src/command/core.rs:
- Around line 1389-1428: transform_archive_entries currently opens only a single
reader and thus ignores split archives; update the function to first collect
split parts (using collect_split_archives) for the provided source and pass the
combined reader(s) into Archive::read_header (or adjust/read_archive_source to
accept the collected archive parts), then iterate entries as before;
specifically, locate transform_archive_entries and either call
collect_split_archives on the input path/source before creating
Archive::read_header or modify read_archive_source to return a reader that
already handles collect_split_archives so split archives (e.g.,
@archive.part1.pna + parts) are all processed.
🧹 Nitpick comments (3)
cli/tests/cli/stdio/archive_inclusion.rs (1)

38-95: Consider test directory cleanup.

The tests create directories directly in the working directory (e.g., PathBuf::from("stdio_archive_inclusion_basic")). These directories persist after test runs. Consider using a temporary directory or implementing cleanup to avoid accumulating test artifacts.

🔎 Suggested approach using tempdir
use tempfile::tempdir;

#[test]
fn stdio_archive_inclusion_basic() {
    setup();
    
    let temp = tempdir().unwrap();
    let base = temp.path();
    // ... rest of test using base instead of PathBuf::from("...")
}
cli/src/command/create.rs (1)

570-591: Archive source reading is synchronous within the rayon scope.

Unlike filesystem entries which are spawned into the thread pool (s.spawn_fifo), archive sources at lines 582-587 are read synchronously before sending results to the channel. This means:

  1. Archive source reading blocks the scope's iteration
  2. Multiple @archive arguments are processed sequentially
  3. Filesystem entries interspersed with archives may not be processed until the archive is fully read

This preserves ordering (matching PR objectives) but may impact performance for large archive sources. If this is intentional for ordering guarantees, consider adding a comment.

cli/src/command/core.rs (1)

1468-1492: Consider logging when ownership overrides are ignored.

When owner_options.has_overrides() is true but result.metadata().permission() is None, the ownership overrides are silently ignored. While this might be intentional (can't override non-existent permissions), it could lead to user confusion.

Consider adding a debug or info log message when overrides are specified but cannot be applied due to missing permission metadata.

Proposed enhancement
     // Apply ownership transformations if specified
     if owner_options.has_overrides() {
         if let Some(perm) = result.metadata().permission() {
             let new_perm = pna::Permission::new(
                 owner_options
                     .uid
                     .map(u64::from)
                     .unwrap_or_else(|| perm.uid()),
                 owner_options
                     .uname
                     .clone()
                     .unwrap_or_else(|| perm.uname().to_string()),
                 owner_options
                     .gid
                     .map(u64::from)
                     .unwrap_or_else(|| perm.gid()),
                 owner_options
                     .gname
                     .clone()
                     .unwrap_or_else(|| perm.gname().to_string()),
                 perm.permissions(),
             );
             let metadata = result.metadata().clone().with_permission(Some(new_perm));
             result = result.with_metadata(metadata);
+        } else {
+            log::debug!(
+                "Ownership overrides specified but entry '{}' has no permission metadata",
+                original_name
+            );
         }
     }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a724d8d and 0a94f50.

📒 Files selected for processing (8)
  • cli/src/command/append.rs
  • cli/src/command/core.rs
  • cli/src/command/create.rs
  • cli/src/command/stdio.rs
  • cli/src/command/update.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/archive_inclusion.rs
  • lib/src/entry.rs
🧰 Additional context used
🧬 Code graph analysis (3)
lib/src/entry.rs (2)
cli/src/command/list.rs (1)
  • name (273-280)
lib/src/entry/attr.rs (1)
  • name (36-38)
cli/tests/cli/stdio/archive_inclusion.rs (4)
cli/tests/cli/utils/archive.rs (1)
  • for_each_entry (122-127)
lib/src/entry/header.rs (1)
  • path (80-82)
lib/src/archive/write.rs (1)
  • write_header (85-88)
lib/src/entry/builder.rs (1)
  • new_file (196-215)
cli/src/command/append.rs (1)
cli/src/command/core.rs (5)
  • collect_items_from_paths (483-497)
  • create_entry (727-769)
  • read_archive_source (1431-1447)
  • read_paths (1353-1357)
  • read_paths_stdin (1360-1362)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: tier1 (ubuntu-22.04-arm, beta)
  • GitHub Check: Test WebAssembly (beta, wasm32-wasip2)
  • GitHub Check: plan
  • GitHub Check: Test WebAssembly (stable, wasm32-unknown-emscripten)
  • GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-unknown-emscripten)
  • GitHub Check: Test WebAssembly (beta, wasm32-unknown-emscripten)
  • GitHub Check: tier1 (ubuntu-latest, nightly)
  • GitHub Check: tier1 (windows-latest, stable)
  • GitHub Check: tier1 (windows-latest, nightly)
  • GitHub Check: tier1 (windows-11-arm, nightly)
  • GitHub Check: tier1 (windows-11-arm, beta)
  • GitHub Check: tier1 (ubuntu-latest, stable)
  • GitHub Check: tier1 (ubuntu-22.04-arm, nightly)
  • GitHub Check: tier1 (ubuntu-latest, beta)
  • GitHub Check: tier1 (ubuntu-22.04-arm, beta)
  • GitHub Check: tier1 (ubuntu-22.04-arm, stable)
  • GitHub Check: msrv (ubuntu-latest, libpna)
  • GitHub Check: msrv (ubuntu-latest, pna)
  • GitHub Check: msrv (ubuntu-latest, portable-network-archive)
🔇 Additional comments (22)
lib/src/entry.rs (1)

937-959: LGTM!

The with_name method follows the established builder pattern used by with_metadata and with_xattrs, provides clear documentation with a working example, and correctly updates the header name while preserving all other entry fields.

cli/tests/cli/stdio.rs (1)

1-10: LGTM!

The new archive_inclusion test module is properly declared alongside the existing test modules.

cli/src/command/update.rs (2)

13-16: LGTM!

The import changes correctly bring in CollectedEntry and related types for the updated function signatures.


520-540: Update mode doesn't support @archive inclusion.

The update command uses collect_items_from_paths rather than collect_items_from_sources, which means @archive syntax won't work in update mode. This appears intentional since update mode compares filesystem metadata against archive entries, which doesn't apply to archive-sourced entries.

If this is the intended behavior, consider documenting this limitation or providing a clear error message when users attempt to use @archive with the update command.

cli/tests/cli/stdio/archive_inclusion.rs (3)

9-36: LGTM!

The helper functions create_test_archive and get_archive_entry_names are well-designed, reusable utilities that simplify test authoring. The error handling with unwrap() is appropriate for test code.


489-538: Good test for entry ordering preservation.

This test validates a key PR objective: preserving the exact order of CLI arguments. The test correctly verifies that archive entries appear first, followed by filesystem entries, matching the command-line argument order.


289-310: Test correctly validates the expected failure behavior.

Using @- (archive from stdin) with -f - (output to stdout) cannot succeed because stdin gets locked when the code tries to read the input archive, making it impossible to then read from @-. The test properly expects failure on this unsupported combination. However, note that the failure is implicit (from attempting to lock stdin twice) rather than from explicit conflict detection, and the test does not validate the specific error message.

cli/src/command/stdio.rs (3)

11-15: LGTM!

The import additions correctly bring in ItemSource and collect_items_from_sources for the new source-based collection workflow.


612-626: Good documentation of the -C interaction.

The comment clearly explains why sources are parsed after set_current_dir - this ensures @archive paths are resolved relative to the changed directory, matching the behavior of filesystem paths.


686-696: LGTM!

The create_archive_file calls now correctly pass filter and time_filters to support filtering entries from archive sources.

cli/src/command/create.rs (3)

9-15: LGTM!

The imports correctly bring in the new types and functions needed for archive source handling.


598-633: LGTM!

The result handling correctly processes both EntryResult::Single and EntryResult::Batch variants, properly unwrapping the nested Option and Result types.


709-715: Good error propagation in flat_map.

The flat_map correctly converts EntryResult variants to a flattened iterator. The error case at line 713 (Err(e) => vec![Err(e)]) properly preserves errors by wrapping them in a single-element vector, ensuring they propagate through the iterator.

cli/src/command/append.rs (3)

9-14: LGTM!

The imports correctly mirror those in create.rs, maintaining consistency across the codebase.


461-472: LGTM!

The target items are correctly wrapped as CollectedItem::Filesystem and the new filter/time_filters parameters are properly passed to run_append_archive.


475-524: LGTM!

The run_append_archive function correctly handles both CollectedItem::Filesystem and CollectedItem::ArchiveMarker variants, with proper result processing for both EntryResult::Single and EntryResult::Batch. The implementation is consistent with create_archive_file in create.rs.

cli/src/command/core.rs (6)

273-277: LGTM!

The has_overrides method correctly checks for any ownership override presence.


341-354: LGTM!

The CollectedEntry and StoreAs types are well-designed for representing filesystem items with their storage strategies.


356-411: LGTM!

The archive source parsing logic correctly handles @archive, @- (stdin), and regular filesystem paths. The lazy path validation approach is appropriate for CLI argument parsing.


413-435: LGTM!

The CollectedItem and EntryResult types effectively separate filesystem items from archive markers while enabling efficient batched processing.


450-470: LGTM!

The function correctly preserves CLI argument order while enabling shared hardlink detection across filesystem paths. The separation of filesystem items and archive markers is well-implemented.


483-497: LGTM!

The signature changes to use CollectedEntry are consistently applied across collect_items_from_paths, collect_items_with_state, and create_entry. The refactoring maintains existing logic while introducing the new type structure.

Also applies to: 544-665, 727-769

Comment thread cli/src/command/core.rs
Comment thread cli/src/command/stdio.rs
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @cli/src/command/core.rs:
- Around line 1552-1598: The function transform_normal_entry destructures a
non-existent owner_options from CreateOptions; either add owner_options:
PermissionStrategyResolver to the CreateOptions struct and keep the current
destructuring and uses of owner_options.has_overrides(),
owner_options.uid/gid/uname/gname, or modify transform_normal_entry to read the
ownership resolver from keep_options.permission_strategy (i.e., replace
owner_options.* references with keep_options.permission_strategy.*). Update the
CreateOptions definition if you choose the first approach so the struct includes
owner_options: PermissionStrategyResolver and adjust any callers accordingly.

In @cli/src/command/stdio.rs:
- Around line 1027-1042: Before opening stdin for reading and calling
run_append_archive, validate that when appending to stdout (i.e., archive_path
is None) the collected sources do not include a stdin archive marker;
specifically, after collect_items_from_sources returns target_items check for
any CollectedItem::ArchiveMarker(ArchiveSource::Stdin) (or equivalent
ArchiveSource::Stdin representation) and return an error if found. Locate the
append-to-stdout logic around collect_items_from_sources, Archive::read_header
and run_append_archive and add this guard (or add a unit test like
stdio_archive_inclusion_stdin_stdout_conflict) so run_append_archive will never
attempt to read stdin twice.
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a94f50 and 4cf9672.

📒 Files selected for processing (8)
  • cli/src/command/append.rs
  • cli/src/command/core.rs
  • cli/src/command/create.rs
  • cli/src/command/stdio.rs
  • cli/src/command/update.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/archive_inclusion.rs
  • lib/src/entry.rs
🧰 Additional context used
🧬 Code graph analysis (3)
lib/src/entry.rs (2)
cli/src/command/list.rs (1)
  • name (273-280)
lib/src/entry/attr.rs (1)
  • name (36-38)
cli/src/command/stdio.rs (2)
cli/src/command/core.rs (3)
  • collect_items_from_sources (489-509)
  • collect_split_archives (711-728)
  • parse_many (447-449)
cli/src/command/create.rs (1)
  • create_archive_file (540-633)
cli/src/command/append.rs (3)
cli/src/command/core.rs (6)
  • collect_items_from_paths (522-536)
  • create_entry (766-807)
  • entry_option (809-829)
  • read_archive_source (1534-1550)
  • read_paths (1456-1460)
  • read_paths_stdin (1463-1465)
cli/src/command/create.rs (1)
  • collect_items_from_paths (457-460)
lib/src/entry.rs (2)
  • std (1185-1185)
  • entries (430-444)
🔇 Additional comments (27)
lib/src/entry.rs (1)

937-959: LGTM! Clean builder-pattern method.

The with_name method provides a clean way to rename entries during archive-to-archive copying operations. The implementation correctly follows the builder pattern and the documentation clearly explains the use case.

cli/src/command/core.rs (6)

313-317: LGTM! Useful helper for ownership override detection.

The has_overrides() method provides a clean way to check if any ownership overrides are specified, which is used to conditionally apply ownership transformations.


380-450: LGTM! Well-designed data structures for archive inclusion.

The new types (CollectedEntry, ArchiveSource, ItemSource) provide a clean abstraction for handling mixed filesystem and archive sources. The parsing logic correctly handles the @ prefix convention for archive inclusion.


476-509: LGTM! Correct order-preserving collection logic.

The collect_items_from_sources function correctly processes mixed sources while preserving CLI argument order. The shared HardlinkResolver enables cross-path hardlink detection as intended.


1533-1550: LGTM! Clean delegation pattern.

The function correctly handles both file and stdin archive sources by delegating to transform_archive_entries. The split archive issue applies to the underlying function and has already been flagged.


522-536: LGTM! Consistent refactoring to CollectedEntry.

The updates to collect_items_from_paths, collect_items_with_state, and create_entry consistently use the new CollectedEntry type. The refactoring is mechanical and correct.

Also applies to: 583-704, 766-807


1492-1531: Split archive support needed for @archive inclusion.

As noted in previous reviews, transform_archive_entries and read_archive_source only handle single-part archives. If a split archive is referenced via @archive.part1.pna, only the first part will be processed.

Other commands handle this by calling collect_split_archives before opening the archive. Apply the same pattern here.

Likely an incorrect or invalid review comment.

cli/tests/cli/stdio.rs (1)

1-1: LGTM! Standard module declaration.

cli/src/command/update.rs (1)

13-17: LGTM! Consistent refactoring to CollectedEntry.

The updates align with the CollectedEntry refactoring in core.rs. All usages are updated consistently and correctly.

Also applies to: 525-525, 598-598

cli/tests/cli/stdio/archive_inclusion.rs (1)

1-743: LGTM! Excellent test coverage for archive inclusion.

This comprehensive test suite covers all the major scenarios for the @archive feature:

  • Basic and multiple archive inclusion
  • Solid mode and append operations
  • Error cases (nonexistent archive, stdin/stdout conflict)
  • Data integrity verification
  • Filter interactions (include/exclude)
  • Entry ordering preservation
  • Edge cases (empty archives, combined filters)

The helper functions (create_test_archive, get_archive_entry_names) are well-designed and make the tests clear and maintainable.

cli/src/command/stdio.rs (5)

11-16: LGTM: Import updates align with source-based collection.

The additions of ItemSource and collect_items_from_sources are necessary for the new archive-inclusion processing model.


626-640: LGTM: Source parsing correctly ordered after directory change.

The comment at line 626 clearly explains why sources are parsed after set_current_dir, ensuring @archive paths are resolved relative to the working directory set by -C.


698-714: LGTM: Filter and time_filters properly propagated to archive creation.

Both file-based and stdout-based archive creation now receive the necessary filtering context for archive-inclusion processing.


1002-1003: LGTM: Source parsing consistently ordered after directory change in append paths.

The comment at line 1002 and the usage of ItemSource::parse_many followed by collect_items_from_sources mirrors the create path, ensuring @archive paths respect the -C working directory.

Also applies to: 1016-1018, 1027-1027


1019-1025: LGTM: Filter and time_filters properly propagated to append operations.

Both file-based and stdin/stdout-based append operations now receive the necessary filtering context for archive-inclusion processing.

Also applies to: 1035-1041

cli/src/command/create.rs (7)

9-16: LGTM: Import updates support archive-inclusion workflow.

The additions of CollectedItem, EntryResult, TimeFilters, and read_archive_source are necessary for the new unified item processing model.


457-460: LGTM: Filesystem entries correctly wrapped as CollectedItem.

The mapping to CollectedItem::Filesystem maintains type consistency with the unified processing model. Since the create command path shown here collects only filesystem paths, all items are filesystem variants.


540-550: LGTM: Function signatures extended for archive-inclusion support.

The addition of filter and time_filters parameters enables archive-source processing while maintaining backward compatibility for filesystem-only items.

Also applies to: 646-647


567-589: LGTM: CollectedItem variants handled appropriately in parallel scope.

Filesystem items spawn parallel tasks emitting EntryResult::Single, while archive markers call read_archive_source synchronously and emit EntryResult::Batch. The synchronous processing of archive sources may be intentional to preserve ordering or because the archive reading is already parallelized internally.


595-628: LGTM: EntryResult variants handled consistently in both solid and non-solid modes.

Both archive modes correctly pattern-match on EntryResult::Single and EntryResult::Batch, with proper error propagation and Option unwrapping.


662-678: LGTM: Split archive path mirrors standard path correctly.

The CollectedItem variant handling in split archives is consistent with the standard path, maintaining code uniformity.


685-710: LGTM: Split archive result handling is correct and idiomatic.

The solid mode pattern-matches on EntryResult variants consistently. The non-solid mode uses flat_map to flatten both Single and Batch results, followed by filter_map(Result::transpose) to filter out None values while transposing Result<Option<T>> to Option<Result<T>>. This is idiomatic Rust for handling nested Result/Option types.

Also applies to: 713-713

cli/src/command/append.rs (5)

9-16: LGTM: Import updates align with unified processing model.

The additions mirror those in create.rs, maintaining consistency across create and append workflows.


462-465: LGTM: Filesystem entries correctly wrapped for unified processing.

The mapping to CollectedItem::Filesystem is consistent with the create.rs pattern and maintains type compatibility.


476-482: LGTM: Function signature extended consistently with create workflow.

The addition of filter and time_filters parameters enables archive-source processing in the append path, maintaining consistency with the create workflow.


484-505: LGTM: Parallel processing pattern consistent with create workflow.

The CollectedItem variant handling mirrors create.rs: filesystem items spawn parallel tasks, archive markers are processed synchronously. This maintains ordering and leverages internal parallelism in archive reading.


507-524: LGTM: EntryResult handling and finalization are correct.

The pattern-matching on EntryResult::Single and EntryResult::Batch is consistent with create.rs, with proper error propagation and Option handling. The archive is correctly finalized before returning.

Comment thread cli/src/command/core.rs
Comment thread cli/src/command/stdio.rs
Comment thread cli/src/command/update.rs Fixed
Comment thread cli/src/command/update.rs Fixed
Comment thread cli/src/command/stdio.rs Fixed
Comment thread cli/src/command/stdio.rs Fixed
Comment thread cli/src/command/create.rs Fixed
Comment thread cli/src/command/stdio.rs Fixed
Comment thread cli/src/command/update.rs Fixed
Comment thread cli/src/command/update.rs Fixed
Comment thread cli/src/command/update.rs Fixed
Comment thread cli/src/command/update.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Dismissed
Comment thread cli/src/command/core.rs Dismissed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @cli/src/command/core.rs:
- Around line 1492-1519: read_archive_source currently opens a single file and
doesn't handle split archives like other commands; update read_archive_source to
call collect_split_archives and use the returned parts as the input to
Archive::read_header (or combine them into a single reader/chain) so
transform_archive_entries receives the full split-archive stream, or
alternatively add a clear doc comment on read_archive_source stating that
@archive does not support split archives; reference the functions
collect_split_archives and read_archive_source (and ensure
transform_archive_entries continues to accept the reader produced).
- Around line 1559-1583: The code incorrectly references a non-existent
permission_strategy and wrong enum/variant names; change the pattern to match
keep_options.owner_strategy and the correct enum variant OwnerStrategy::Preserve
{ options } (where options is an OwnerOptions struct with fields uid, gid,
uname, gname), then replace uid/gid/uname/gname bindings with
options.uid/options.gid/options.uname/options.gname when building the new
pna::Permission and updating result.metadata(); keep the existing logic that
only applies when at least one option.is_some() and when
result.metadata().permission() is Some.
🧹 Nitpick comments (2)
cli/src/command/create.rs (1)

569-591: Archive sources processed synchronously within rayon scope.

read_archive_source for ArchiveMarker items is called synchronously (not spawned), while filesystem entries are processed in parallel via spawn_fifo. This is likely intentional since:

  1. Stdin can only be read once and needs ordering
  2. Archive entry order should be preserved

However, for ArchiveSource::File variants, parallel processing could improve performance. Consider spawning file-based archive sources if order preservation isn't required.

cli/src/command/core.rs (1)

470-480: Consider boxing the larger variant to reduce enum size.

Static analysis flags a large size difference between Single and Batch variants. The Batch variant contains a Vec which is significantly larger than the Single variant's contents. This causes all EntryResult instances to use the size of the largest variant.

If memory efficiency is a concern (especially with large numbers of entries in flight), consider boxing the Batch inner type:

Batch(Box<io::Result<Vec<io::Result<Option<NormalEntry>>>>>),

However, if the batching optimization's reduced synchronization overhead is the priority and EntryResult instances are short-lived, the current design is acceptable.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 88e511b and 5e83d33.

📒 Files selected for processing (7)
  • cli/src/command/append.rs
  • cli/src/command/core.rs
  • cli/src/command/create.rs
  • cli/src/command/stdio.rs
  • cli/src/command/update.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (2)
cli/src/command/stdio.rs (1)
cli/src/command/append.rs (1)
  • run_append_archive (477-526)
cli/src/command/append.rs (2)
cli/src/command/create.rs (1)
  • collect_items_from_paths (457-460)
lib/src/entry.rs (2)
  • std (1185-1185)
  • entries (430-444)
🔇 Additional comments (24)
cli/tests/cli/stdio.rs (1)

1-1: LGTM!

The new archive_inclusion test module is properly added to exercise the @archive inclusion functionality introduced by this PR.

cli/src/command/update.rs (2)

13-14: LGTM!

Import changes correctly reflect the rename from CollectedItem to CollectedEntry for filesystem-only collection paths.


526-526: LGTM!

The function signature update to Vec<CollectedEntry> is consistent with the type rename and maintains the update command's filesystem-only semantics (as opposed to the create/append paths which now support @archive inclusions).

cli/src/command/stdio.rs (4)

619-633: Good design: Parse sources after directory change.

The comment and implementation correctly ensure that @archive paths are resolved relative to the new working directory when -C is used. This maintains consistency with how filesystem paths are handled.


698-708: LGTM!

The create_archive_file calls now consistently pass filter and time_filters for both file-based and stdout output paths, enabling proper filtering of entries from included archives.


1002-1025: LGTM!

The append path correctly mirrors the create path's pattern: sources are parsed after the directory change, and run_append_archive receives the filter and time_filters for archive source processing.


1027-1041: Add stdin conflict validation to append-to-stdout path or confirm protection mechanism.

When archive_path is None, stdin is consumed at line 1030 to read the input archive. If target_items contains CollectedItem::ArchiveMarker(ArchiveSource::Stdin) (from @-), run_append_archive will attempt to read stdin again via read_archive_source, causing failure. The create path has protection validated by test stdio_archive_inclusion_stdin_stdout_conflict, but the equivalent validation is not evident in the append path. Either add explicit validation before collect_items_from_sources when archive_path is None, or add a test case to verify the expected behavior.

cli/src/command/create.rs (4)

9-15: LGTM!

Import additions correctly bring in the new types (CollectedItem, EntryResult, TimeFilters) and read_archive_source function needed for archive inclusion support.


457-460: LGTM!

Collected filesystem items are correctly wrapped as CollectedItem::Filesystem to unify with the new enum-based item representation. This maintains backward compatibility for the non-stdio create path which doesn't support @archive inclusions.


597-630: LGTM!

The write loop correctly handles both EntryResult::Single (filesystem entries) and EntryResult::Batch (archive-sourced entries) with proper error propagation and Option handling for filtered entries.


706-712: LGTM!

The flat_map correctly flattens EntryResult variants for the non-solid split path, with proper error handling that converts batch errors into single-element error vectors.

cli/src/command/append.rs (3)

9-14: LGTM!

Import additions correctly mirror the create.rs pattern, bringing in the types and functions needed for unified item processing with archive source support.


463-474: LGTM!

The append command correctly wraps filesystem items and passes filter/time_filters to run_append_archive. This path doesn't parse @archive from CLI args (that's handled by the stdio path), maintaining separation of concerns.


477-526: LGTM!

The run_append_archive implementation correctly mirrors the create path's pattern:

  • Extended signature accepts filter and time_filters
  • Rayon scope processes Filesystem entries in parallel
  • ArchiveMarker entries processed via read_archive_source
  • Result loop handles both Single and Batch variants with proper error propagation

This consistency across create and append paths improves maintainability.

cli/src/command/core.rs (10)

318-324: LGTM!

The has_overrides() method provides a clean way to check if any ownership overrides are specified. The implementation correctly checks all four ownership-related fields.


386-399: LGTM!

The CollectedEntry struct cleanly encapsulates filesystem item data with pre-captured metadata. The separation from the new CollectedItem enum (which now serves as a discriminated union for filesystem vs. archive sources) is well-designed.


401-416: LGTM!

The ArchiveSource enum cleanly models the two input sources (file path and stdin), and the Display implementation follows the - convention for stdin, which is consistent with CLI tool conventions.


430-456: LGTM!

The parsing logic correctly implements the @archive convention. The @- → stdin mapping and @path → file mapping follow established CLI conventions (similar to bsdtar).

Note: Users needing to reference a filesystem path that starts with @ would need to use a workaround like ./@file. This is consistent with how similar tools handle this ambiguity.


458-468: LGTM!

The CollectedItem enum elegantly preserves CLI argument ordering while distinguishing between filesystem entries (ready for processing) and archive markers (indicating where to splice in archive entries). This design cleanly addresses the PR's objective of maintaining original argument order.


495-515: LGTM!

The function correctly preserves CLI argument order while handling mixed sources. The sequential processing with shared HardlinkResolver enables cross-path hardlink detection. Error handling is appropriate with the ? operator for filesystem errors.


528-542: LGTM!

The return type update to Vec<CollectedEntry> is consistent with the type rename, and the function's behavior is unchanged.


772-784: LGTM!

The parameter type update to &CollectedEntry is consistent with the type restructuring, maintaining the same functionality.


1521-1538: LGTM!

The function cleanly handles both file and stdin sources with appropriate buffering. The 64KB buffer is a reasonable choice for archive I/O.


1588-1717: LGTM!

The test updates correctly reflect the type changes from CollectedItem to CollectedEntry, and the test coverage for collect_items_from_paths remains intact.

Comment thread cli/src/command/core.rs
Comment thread cli/src/command/core.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
cli/src/command/stdio.rs (1)

1004-1014: Bug: --no-recursive is ignored in stdio append.
CollectOptions.recursive uses args.recursive (Line 1005), but the actual inverse flag is args.no_recursive. As written, --no-recursive won’t stop descent.

Proposed fix
-    let collect_options = CollectOptions {
-        recursive: args.recursive,
+    let collect_options = CollectOptions {
+        recursive: !args.no_recursive,
         keep_dir: args.keep_dir,
         gitignore: args.gitignore,
         nodump: args.nodump,
         follow_links: args.follow_links,
         follow_command_links: args.follow_command_links,
         one_file_system: args.one_file_system,
         filter: &filter,
         time_filters: &time_filters,
     };
cli/src/command/append.rs (1)

477-526: Append entry order is not guaranteed under rayon scheduling.
If stdio append relies on exact argument order, run_append_archive should enforce ordering (index/buffer) instead of consuming rx in arrival order.

cli/src/command/create.rs (1)

569-633: Ordering is not guaranteed; parallel filesystem tasks can complete before synchronously-processed archives.

The code mixes asynchronous task spawning (Filesystem items via spawn_fifo) with synchronous processing (ArchiveMarker items via direct read_archive_source() call). Since ArchiveMarker sends results immediately within the loop, if an ArchiveMarker appears after a Filesystem item in CLI arguments, the archive's results arrive on the channel before the filesystem item completes, reversing the intended order.

This works correctly only when ArchiveMarkers precede Filesystem items (verified by stdio_archive_inclusion_archive_first test). The reverse case is not tested and will fail: running with file.txt @archive.pna`` produces archive entries before file entries in the output.

To fix, either:

  • Tag each item with an index and buffer results in a BTreeMap<usize, EntryResult>, draining in order
  • Process sequentially when argument order must be preserved (disable parallelism)
🤖 Fix all issues with AI agents
In `@cli/src/command/core.rs`:
- Around line 434-450: ItemSource::parse treats "@" as an empty filename and
doesn't allow escaping a literal leading '@'; change parse to handle "@" as
Archive(ArchiveSource::Stdin), treat arguments starting with "@@" as a
filesystem path with the first '@' removed (i.e. literal leading '@'), keep "@-"
or "@-" semantics if desired, and for single '@' return Stdin instead of
File(PathBuf::from("")). Update parse accordingly (ItemSource::parse) and
parse_many remains fine as it delegates to parse.
♻️ Duplicate comments (2)
cli/src/command/stdio.rs (1)

1026-1042: Append-to-stdout should explicitly reject @- (stdin) sources.
In this branch stdin is already consumed to read the input archive (Line 1030). Allowing @- in target_items makes the command attempt to read stdin again during run_append_archive(...).

Proposed guard (mirrors create-path conflict behavior)
+    use crate::command::core::{ArchiveSource, CollectedItem};
     } else {
         let target_items = collect_items_from_sources(sources, &collect_options, &mut resolver)?;
+
+        if target_items.iter().any(|it| {
+            matches!(it, CollectedItem::ArchiveMarker(ArchiveSource::Stdin))
+        }) {
+            anyhow::bail!("cannot use `@-` when appending to stdout (-f -): stdin is already used for the input archive");
+        }
+
         let mut output_archive = Archive::write_header(io::stdout().lock())?;
         {
             let mut input_archive = Archive::read_header(io::stdin().lock())?;
             for entry in input_archive.raw_entries() {
                 output_archive.add_entry(entry?)?;
             }
         }
         run_append_archive(
             &create_options,
             output_archive,
             target_items,
             &filter,
             &time_filters,
         )
     }
cli/src/command/core.rs (1)

1515-1532: @archive should support split archives (or document it as unsupported).
read_archive_source currently opens only fs::File::open(path) and passes it to Archive::read_header, so @foo.pna won’t include foo.pna.part2... style parts.

🧹 Nitpick comments (4)
cli/tests/cli/stdio/archive_inclusion.rs (2)

9-36: Prefer temp directories over fixed cwd-relative folders.
These tests create persistent directories (e.g., stdio_archive_inclusion_basic/) and don’t clean up; using tempfile::TempDir (or the project’s existing test temp helper if any) would reduce cross-run pollution.


38-146: Consider adding one more ordering test for interleaving (file1, @archive, file2).
Right now most tests validate presence (via HashSet), not ordering; interleaving is the tricky case this PR is about.

cli/src/command/create.rs (1)

662-719: Same ordering risk in split-archive create, plus avoid Vec allocations in the flattener.
The flat_map builds small Vecs per element; you can avoid that with iterator combinators (std::iter::once, Either, etc.), but the bigger issue is still preserving CLI order.

cli/src/command/core.rs (1)

1486-1513: Potential memory spike: transform_archive_entries materializes all transformed entries.
For large included archives this allocates a full Vec before writing; consider a streaming iterator approach if this becomes an issue.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e83d33 and 4d732d7.

📒 Files selected for processing (7)
  • cli/src/command/append.rs
  • cli/src/command/core.rs
  • cli/src/command/create.rs
  • cli/src/command/stdio.rs
  • cli/src/command/update.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • cli/tests/cli/stdio.rs
  • cli/src/command/update.rs
🧰 Additional context used
🧬 Code graph analysis (3)
cli/src/command/append.rs (2)
cli/src/command/core.rs (4)
  • collect_items_from_paths (522-536)
  • read_archive_source (1516-1532)
  • read_paths (1438-1442)
  • read_paths_stdin (1445-1447)
cli/src/command/create.rs (1)
  • collect_items_from_paths (457-460)
cli/src/command/create.rs (2)
cli/src/command/core.rs (2)
  • read_archive_source (1516-1532)
  • read_paths (1438-1442)
lib/src/entry.rs (1)
  • entries (430-444)
cli/src/command/core.rs (4)
lib/src/entry/header.rs (1)
  • path (90-92)
cli/src/command/core/path_filter.rs (1)
  • new (12-20)
cli/src/command/core/path.rs (1)
  • new (28-38)
lib/src/entry/meta.rs (2)
  • uname (210-212)
  • gname (240-242)
🔇 Additional comments (3)
cli/src/command/stdio.rs (1)

619-634: Good: parse @archive sources after -C so relative archive paths behave like other args.
This matches the intended CLI semantics and makes ordering predictable.

cli/tests/cli/stdio/archive_inclusion.rs (1)

489-538: Nice: explicit order assertion for @archive-first behavior.
This directly validates the PR’s main user-visible promise.

cli/src/command/create.rs (1)

457-460: Good: normalize collected filesystem entries into CollectedItem::Filesystem.
Keeps create’s existing path-collection behavior while allowing stdio to inject ArchiveMarkers elsewhere.

Comment thread cli/src/command/core.rs
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
Comment thread cli/src/command/core.rs Fixed
@ChanTsune ChanTsune force-pushed the cli/stdio/@ branch 2 times, most recently from 2f699d2 to d830ba2 Compare January 14, 2026 08:51
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cli/src/command/create.rs (1)

542-635: Entry order is not guaranteed due to mixing inline and spawned task completion in the channel.

The code intends to preserve CLI argument order (documented in core.rs), and test stdio_archive_inclusion_archive_first() validates this for the case where @archive is placed before filesystem files. However, the current implementation has a race condition.

When the loop encounters CollectedItem::ArchiveMarker, it processes the archive synchronously and sends results to the channel immediately. When it encounters CollectedItem::Filesystem, it spawns a task that sends results asynchronously. A channel delivers messages in completion order, not spawn order—so if a spawned filesystem task completes late, but an inline archive marker result was sent early, the order is correct; but reversed input (file.txt @archive.pna``) would cause the archive entries to be sent before the filesystem entry completes, breaking CLI order.

This scenario isn't covered by existing tests. The suggested fix (indexing and sorting by original position before consuming from the channel) correctly solves the problem and ensures order is preserved regardless of input sequence.

cli/src/command/append.rs (1)

477-526: Append ordering is lost due to mixed async/sync operations with rayon spawn_fifo and channel receive order.

Filesystem entries are spawned asynchronously (lines 488-495), while archive markers are processed inline synchronously (lines 496-500). Since results are consumed in channel receive order (line 508+), not submission order, a fast inline archive marker can arrive at the channel before a slower spawned filesystem entry, reordering entries like file1 @archive`` in practice.

♻️ Duplicate comments (2)
cli/src/command/stdio.rs (1)

1002-1042: Reject @- when appending-to-stdout (stdin is already consumed by the base archive).
In the archive_path == None branch, stdin is used to read the input archive (Line 1030), and run_append_archive can later try to read stdin again via @-ArchiveSource::Stdin. If there isn’t an earlier guard, this needs an explicit validation before reading stdin / before calling run_append_archive.

Proposed fix (guard before reading stdin / copying base archive)
 } else {
     let target_items = collect_items_from_sources(sources, &collect_options, &mut resolver)?;
+    // stdin is used for the base archive; disallow using it again via `@-`
+    if target_items.iter().any(|it| matches!(
+        it,
+        crate::command::core::CollectedItem::ArchiveMarker(crate::command::core::ArchiveSource::Stdin)
+    )) {
+        anyhow::bail!("cannot use `@-` when appending to stdout (-f -): stdin is already used for the input archive");
+    }
     let mut output_archive = Archive::write_header(io::stdout().lock())?;
     {
         let mut input_archive = Archive::read_header(io::stdin().lock())?;
         for entry in input_archive.raw_entries() {
             output_archive.add_entry(entry?)?;
         }
     }
     run_append_archive(
         &create_options,
         output_archive,
         target_items,
         &filter,
         &time_filters,
     )
 }
cli/src/command/core.rs (1)

1476-1534: @archive split/multipart support likely missing again (only first part is read).
read_archive_source(ArchiveSource::File) opens a single file (Line 1526-1528). If the archive is split, Archive::read_header(reader) won’t automatically open subsequent parts, so inclusion will be partial or error once has_next_archive() is hit.

Sketch: reuse existing “across archive” traversal for file sources
 pub(crate) fn read_archive_source(
     source: &ArchiveSource,
     create_options: &CreateOptions,
     filter: &PathFilter<'_>,
     time_filters: &TimeFilters,
 ) -> io::Result<Vec<io::Result<Option<NormalEntry>>>> {
     match source {
         ArchiveSource::File(path) => {
-            let reader = io::BufReader::with_capacity(64 * 1024, fs::File::open(path)?);
-            transform_archive_entries(reader, create_options, filter, time_filters)
+            let archives = collect_split_archives(path)?;
+            let readers = archives
+                .into_iter()
+                .map(|f| io::BufReader::with_capacity(64 * 1024, f));
+            let mut results = Vec::new();
+            run_across_archive(readers, |archive| {
+                for entry_result in archive.entries().extract_solid_entries(None) {
+                    match entry_result {
+                        Ok(entry) => {
+                            if filter.excluded(entry.header().path()) {
+                                continue;
+                            }
+                            let ctime = entry.metadata().created_time();
+                            let mtime = entry.metadata().modified_time();
+                            if !time_filters.matches_or_inactive(ctime, mtime) {
+                                continue;
+                            }
+                            results.push(transform_normal_entry(entry, create_options));
+                        }
+                        Err(e) => results.push(Err(e)),
+                    }
+                }
+                Ok(())
+            })?;
+            Ok(results)
         }
         ArchiveSource::Stdin => {
             let reader = io::BufReader::new(io::stdin().lock());
             transform_archive_entries(reader, create_options, filter, time_filters)
         }
     }
 }
🧹 Nitpick comments (6)
cli/src/command/stdio.rs (1)

10-16: Drop collect_items_from_paths from imports if it’s no longer needed in this module.
Right now ItemSource / collect_items_from_sources is introduced, but collect_items_from_paths is still imported (Line 14). If this module no longer needs it (outside run_update), consider tightening imports.

cli/src/command/create.rs (1)

637-720: Non-solid split path: flat_map + vec![..] allocations are avoidable.
This isn’t a blocker, but flat_map currently allocates a Vec per item (Line 706-712). Consider mapping to an iterator without intermediate Vecs if this shows up in profiles.

cli/src/command/core.rs (1)

466-476: Consider boxing EntryResult::Batch to reduce enum size skew.
Tooling already flags “large size difference between variants”; Batch(Box<...>) keeps the channel payload smaller/copy-friendlier.

cli/tests/cli/stdio/archive_inclusion.rs (3)

9-36: Helpers look fine; consider using WriteOptions::store() for more deterministic test archives.
Not required, but it can make tests less sensitive to default option changes.


489-539: Add an ordering test for file then @archive (reverse of archive-first).
You currently assert ordering when @archive comes first (Line 489+). Given the implementation uses concurrency + a channel, the “file first” case is the one most likely to reorder.


289-310: Add a test for append-to-stdout (-f -) combined with @- in append mode.
Create-mode has the stdin/stdout conflict test (Line 289+). Append-to-stdout has the same stdin double-use hazard and should be explicitly covered.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ace7fb0 and 2f699d2.

📒 Files selected for processing (7)
  • cli/src/command/append.rs
  • cli/src/command/core.rs
  • cli/src/command/create.rs
  • cli/src/command/stdio.rs
  • cli/src/command/update.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • cli/src/command/update.rs
  • cli/tests/cli/stdio.rs
🧰 Additional context used
🧬 Code graph analysis (3)
cli/tests/cli/stdio/archive_inclusion.rs (5)
cli/tests/cli/utils/archive.rs (1)
  • for_each_entry (122-127)
lib/src/entry/header.rs (1)
  • path (90-92)
lib/src/archive/write.rs (1)
  • write_header (85-88)
lib/src/entry/builder.rs (1)
  • new_file (199-218)
lib/src/chunk/types.rs (1)
  • len (153-155)
cli/src/command/create.rs (1)
cli/src/command/core.rs (3)
  • collect_items_from_paths (524-538)
  • create_entry (768-809)
  • read_archive_source (1518-1534)
cli/src/command/append.rs (2)
cli/src/command/core.rs (4)
  • collect_items_from_paths (524-538)
  • read_archive_source (1518-1534)
  • read_paths (1440-1444)
  • read_paths_stdin (1447-1449)
cli/src/command/create.rs (1)
  • collect_items_from_paths (457-460)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Run cargo fuzz (split_archive)
  • GitHub Check: tier1 (macos-latest, stable)
  • GitHub Check: tier1 (windows-11-arm, beta)
  • GitHub Check: tier1 (windows-11-arm, stable)
  • GitHub Check: tier1 (windows-latest, nightly)
  • GitHub Check: tier1 (macos-latest, nightly)
  • GitHub Check: tier1 (windows-latest, stable)
  • GitHub Check: tier1 (windows-latest, beta)
  • GitHub Check: tier1 (windows-11-arm, nightly)
  • GitHub Check: tier1 (ubuntu-22.04-arm, stable)
  • GitHub Check: tier1 (ubuntu-22.04-arm, nightly)
  • GitHub Check: tier1 (ubuntu-22.04-arm, beta)
  • GitHub Check: Test WebAssembly (beta, wasm32-unknown-emscripten)
  • GitHub Check: Test on OpenBSD
  • GitHub Check: Test on OmniOS
  • GitHub Check: tier1 (ubuntu-latest, beta)
  • GitHub Check: tier1 (windows-latest, stable)
🔇 Additional comments (4)
cli/src/command/stdio.rs (1)

616-634: Good: @archive parsing happens after -C, and filters/time-filters are plumbed through.
The “parse after changing directory” comment + ItemSource::parse_many(&files) (Line 619-620) matches the intended semantics, and passing &filter/&time_filters into create_archive_file aligns archive-inclusions with filesystem filtering.

Also applies to: 693-709

cli/src/command/create.rs (1)

457-460: Nice simplification: wrapping collected filesystem entries into CollectedItem::Filesystem.
Keeps create-side “no @” behavior unchanged while enabling reuse of the new archive-marker pipeline.

cli/src/command/append.rs (1)

463-474: Good: filter/time-filters are plumbed into append archive entry inclusion.
Wiring &filter and &time_filters through to run_append_archive is the right shape for consistent behavior across filesystem and included archives.

cli/src/command/core.rs (1)

424-511: LGTM: ItemSource + collect_items_from_sources preserve CLI argument ordering.
The model cleanly separates “expand filesystem” vs “insert archive marker” while keeping the input sequence intact.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
cli/src/command/core.rs (1)

1477-1537: @archive split-archive handling still looks incomplete (single-reader).
transform_archive_entries uses Archive::read_header(reader) and never advances via read_next_archive, and read_archive_source opens only one file. If @archive should support multipart archives like other commands, this will likely only process the first part.

cli/src/command/stdio.rs (1)

1027-1042: Append-to-stdout path can double-read stdin when using @-.
When archive_path is None, stdin is consumed to copy the input archive (Line 1030+), and run_append_archive may later try to read stdin again for CollectedItem::ArchiveMarker(ArchiveSource::Stdin). Add an upfront guard that rejects @- in this mode.

🧹 Nitpick comments (6)
cli/src/command/core.rs (2)

424-452: ItemSource::parse semantics are clear; add explicit escape docs for literal leading @.
Docs mention ./@file (good). Consider also documenting Windows-friendly equivalents (e.g., .\\@file) since @-leading filenames aren’t rare.


1539-1585: Ownership override application is correct but could avoid repeated cloning.
uname.clone()/gname.clone() per entry may be a noticeable cost for large includes; consider borrowing or precomputing the fallback strings when overrides are present.

cli/tests/cli/stdio/archive_inclusion.rs (1)

289-310: Add a sibling test for append-to-stdout + @- conflict.
You already cover create-to-stdout conflict; add an append equivalent to prevent regressions of the stdin double-consumption path.

cli/src/command/create.rs (2)

581-586: ArchiveMarker work should be offloaded from the scope thread.
Right now it runs inline, reducing concurrency and making reordering more likely; spawning it (as in the diff above) keeps behavior uniform.

Also applies to: 674-679


706-718: Non-solid split path allocates per message (vec![...]) and can be heavy.
If you keep EntryResult::Batch(Vec<...>), consider iterating batches without vec! wrapping (e.g., via small iterator adapters) to reduce transient allocations.

cli/src/command/append.rs (1)

496-521: Batching entire included archives into memory can be expensive.
read_archive_source returning Vec<...> means large @archive includes may spike RAM. A streaming approach (send entries incrementally, still ordered via indices) would scale better.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f699d2 and d830ba2.

📒 Files selected for processing (7)
  • cli/src/command/append.rs
  • cli/src/command/core.rs
  • cli/src/command/create.rs
  • cli/src/command/stdio.rs
  • cli/src/command/update.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • cli/src/command/update.rs
  • cli/tests/cli/stdio.rs
🧰 Additional context used
🧬 Code graph analysis (4)
cli/src/command/stdio.rs (1)
cli/src/command/core.rs (3)
  • collect_items_from_sources (492-512)
  • collect_split_archives (714-731)
  • parse_many (449-451)
cli/tests/cli/stdio/archive_inclusion.rs (5)
cli/tests/cli/utils/archive.rs (1)
  • for_each_entry (122-127)
lib/src/entry/header.rs (1)
  • path (90-92)
lib/src/archive/write.rs (1)
  • write_header (85-88)
lib/src/entry/builder.rs (1)
  • new_file (199-218)
lib/src/chunk/types.rs (1)
  • len (153-155)
cli/src/command/append.rs (1)
cli/src/command/core.rs (4)
  • collect_items_from_paths (525-539)
  • create_entry (769-810)
  • entry_option (812-832)
  • read_archive_source (1519-1537)
cli/src/command/core.rs (5)
lib/src/entry/header.rs (1)
  • path (90-92)
cli/src/command/core/re/gnu.rs (2)
  • new (39-93)
  • new (151-153)
lib/src/entry.rs (1)
  • reader (1013-1029)
lib/src/entry/meta.rs (2)
  • uname (210-212)
  • gname (240-242)
cli/src/command/core/time_filter.rs (1)
  • matches (74-80)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: tier1 (macos-latest, beta)
  • GitHub Check: tier1 (windows-latest, beta)
  • GitHub Check: tier1 (windows-11-arm, beta)
  • GitHub Check: tier1 (ubuntu-22.04-arm, stable)
  • GitHub Check: tier1 (ubuntu-22.04-arm, nightly)
  • GitHub Check: Run rust-clippy analyzing
  • GitHub Check: tier1 (macos-latest, nightly)
  • GitHub Check: tier1 (windows-latest, stable)
  • GitHub Check: tier1 (macos-latest, stable)
  • GitHub Check: tier1 (ubuntu-22.04-arm, stable)
  • GitHub Check: tier1 (windows-latest, beta)
  • GitHub Check: tier1 (windows-latest, nightly)
  • GitHub Check: tier1 (ubuntu-22.04-arm, beta)
🔇 Additional comments (9)
cli/src/command/core.rs (5)

380-477: New CLI-source abstractions look coherent and well-scoped.
ArchiveSource/ItemSource/CollectedItem/EntryResult separation matches the “preserve CLI ordering” goal and keeps filesystem collection concerns isolated.


492-512: collect_items_from_sources preserves argument order, but downstream must not reorder results.
This function emits markers in-order; ensure the consumer (create/append) doesn’t reintroduce reordering via parallelism.


769-810: create_entry signature migration to CollectedEntry looks correct.
Destructuring + metadata reuse stays straightforward.


675-706: Time filtering on filesystem collection is applied at the right point.
Filtering after metadata acquisition but before push avoids extra work and keeps behavior consistent.


1718-1783: Parsing tests cover the key edge cases.
Nice coverage for @, @-, @path, and ./@literal.

cli/src/command/stdio.rs (1)

619-634: Parsing sources after -C is the right behavior.
This makes @archive paths consistently relative to the adjusted working directory.

Also applies to: 1002-1019

cli/tests/cli/stdio/archive_inclusion.rs (1)

1-743: Great breadth of coverage for inclusion, ordering, filters, and solid mode.
This is the right level of end-to-end validation for a CLI ordering guarantee change.

cli/src/command/create.rs (1)

457-460: Wrapping collected filesystem entries as CollectedItem::Filesystem is fine (no behavior change).
This prepares the pipeline for mixed sources cleanly.

cli/src/command/append.rs (1)

463-474: Signature change to pass filter/time_filters through is consistent.
Good plumbing to make inclusion honor the same filtering semantics.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment thread cli/src/command/append.rs Outdated
Comment thread cli/src/command/create.rs Outdated
@ChanTsune ChanTsune force-pushed the cli/stdio/@ branch 3 times, most recently from 6162aa1 to c366417 Compare January 18, 2026 09:06
Redesign @archive inclusion to preserve CLI argument order exactly.
The previous implementation separated filesystem paths and @Archives
before processing, which lost the original ordering.

Key changes:
- Add ItemSource, ArchiveSource, CollectedItem types to core.rs
- Add collect_items_from_sources for unified processing
- Process arguments one by one, maintaining order
- Support @- for reading archives from stdin
- Detect stdin conflicts (input archive from stdin vs @-)
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@cli/src/command/create.rs`:
- Around line 9-16: The import list in create.rs is not formatted according to
rustfmt (CI failing); run rustfmt (cargo fmt) and reformat the file,
specifically realign and wrap the use/import block that contains symbols like
AclStrategy, CollectOptions, CollectedItem, CreateOptions, EntryResult,
FflagsStrategy, KeepOptions, MIN_SPLIT_PART_BYTES, MacMetadataStrategy,
PathFilter, PathTransformers, PathnameEditor, PermissionStrategyResolver,
TimeFilterResolver, TimeFilters, TimestampStrategyResolver, XattrStrategy,
collect_items_from_paths, drain_entry_results, entry_option,
re::{bsd::SubstitutionRule, gnu::TransformRule}, read_paths, read_paths_stdin,
spawn_entry_results, write_split_archive so the imports conform to rustfmt rules
(line breaks/comma placement) and then re-run cargo fmt --check to ensure CI
passes.
♻️ Duplicate comments (3)
cli/src/command/core.rs (2)

528-565: Ordering can be lost with parallel sends.

spawn_entry_results sends filesystem entries from Rayon tasks and archive batches inline; the receiver processes completion order, not CLI order. This can reorder entries (and hardlink targets) despite the PR’s ordering guarantee. Please add indexed ordering (or a sequential path) before draining.

💡 Possible fix (index + reorder)
+use std::collections::BTreeMap;
-use std::sync::mpsc::Receiver;
+use std::sync::mpsc::Receiver;

-pub(crate) fn spawn_entry_results(...) -> Receiver<EntryResult> {
-    let (tx, rx) = std::sync::mpsc::channel();
+pub(crate) fn spawn_entry_results(...) -> Receiver<(usize, EntryResult)> {
+    let (tx, rx) = std::sync::mpsc::channel::<(usize, EntryResult)>();
     rayon::scope_fifo(|s| {
-        for item in target_items {
+        for (idx, item) in target_items.into_iter().enumerate() {
             match item {
                 CollectedItem::Filesystem(entry) => {
                     let tx = tx.clone();
                     s.spawn_fifo(move |_| {
                         log::debug!("Adding: {}", entry.path.display());
-                        tx.send(EntryResult::Single(create_entry(&entry, create_options)))
+                        tx.send((idx, EntryResult::Single(create_entry(&entry, create_options))))
                             .unwrap_or_else(|e| log::error!("{e}: {}", entry.path.display()));
                     })
                 }
                 CollectedItem::ArchiveMarker(source) => {
                     let result = read_archive_source(
                         &source, create_options, filter, time_filters, password,
                     );
-                    tx.send(EntryResult::Batch(result))
+                    tx.send((idx, EntryResult::Batch(result)))
                         .unwrap_or_else(|e| log::error!("{e}: archive source {}", source));
                 }
             }
         }
         drop(tx);
     });
     rx
}

-pub(crate) fn drain_entry_results<I, F, T>(results: I, mut add_entry: F) -> io::Result<()>
+pub(crate) fn drain_entry_results<I, F, T>(results: I, mut add_entry: F) -> io::Result<()>
 where
-    I: IntoIterator<Item = EntryResult>,
+    I: IntoIterator<Item = (usize, EntryResult)>,
     F: FnMut(NormalEntry) -> io::Result<T>,
 {
-    for result in results {
+    let mut next = 0usize;
+    let mut pending = BTreeMap::<usize, EntryResult>::new();
+    for (idx, result) in results {
+        pending.insert(idx, result);
+        while let Some(result) = pending.remove(&next) {
+            next += 1;
             match result {
                 EntryResult::Single(entry) => {
                     if let Some(entry) = entry? {
                         add_entry(entry)?;
                     }
                 }
                 EntryResult::Batch(entries) => {
                     for entry in entries? {
                         if let Some(entry) = entry? {
                             add_entry(entry)?;
                         }
                     }
                 }
             }
+        }
     }
     Ok(())
 }

1565-1629: @archive ignores split archive parts.

read_archive_source opens only the first file; transform_archive_entries doesn’t advance to subsequent parts. For split archives, @archive.part1.pna will only process part 1. Other commands use collect_split_archives; please align here or explicitly document the limitation.

💡 Possible fix (collect split parts)
-        ArchiveSource::File(path) => {
-            let file = fs::File::open(path)
-                .map_err(|e| io::Error::new(e.kind(), format!("{}: {}", path.display(), e)))?;
-            let reader = io::BufReader::with_capacity(64 * 1024, file);
-            transform_archive_entries(reader, create_options, filter, time_filters, password)
-                .map_err(|e| io::Error::new(e.kind(), format!("{}: {}", path.display(), e)))
-        }
+        ArchiveSource::File(path) => {
+            let archives = collect_split_archives(path)?;
+            let mut results = Vec::new();
+            run_across_archive(
+                archives.into_iter().map(|f| io::BufReader::with_capacity(64 * 1024, f)),
+                |archive| {
+                    for entry_result in archive.entries().extract_solid_entries(password) {
+                        match entry_result {
+                            Ok(entry) => {
+                                if !filter.excluded(entry.header().path()) {
+                                    let ctime = entry.metadata().created_time();
+                                    let mtime = entry.metadata().modified_time();
+                                    if time_filters.matches_or_inactive(ctime, mtime) {
+                                        results.push(transform_normal_entry(entry, create_options));
+                                    }
+                                }
+                            }
+                            Err(e) => results.push(Err(e)),
+                        }
+                    }
+                    Ok(())
+                },
+            )?;
+            Ok(results)
+        }
cli/src/command/stdio.rs (1)

1032-1047: Guard against @- when appending to stdout.

In the archive_path == None branch, stdin is consumed to read the existing archive, and @- would attempt to read stdin again. Add a guard to reject ArchiveSource::Stdin in this branch.

🛠️ Suggested guard
-    } else {
-        let target_items = collect_items_from_sources(sources, &collect_options, &mut resolver)?;
+    } else {
+        let target_items = collect_items_from_sources(sources, &collect_options, &mut resolver)?;
+        if target_items.iter().any(|item| {
+            matches!(item, CollectedItem::ArchiveMarker(ArchiveSource::Stdin))
+        }) {
+            anyhow::bail!("cannot use `@-` when appending to stdout (stdin is already in use)");
+        }
         let mut output_archive = Archive::write_header(io::stdout().lock())?;
         {
             let mut input_archive = Archive::read_header(io::stdin().lock())?;
             for entry in input_archive.raw_entries() {
                 output_archive.add_entry(entry?)?;
             }
         }
         run_append_archive(

You’ll also need to import CollectedItem and ArchiveSource in this module.

Comment thread cli/src/command/create.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cli This issue is about cli application lib This issue is about lib crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants