@ support#2523
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAdds archive-aware item sources and streaming entry handling: new types ( Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI (stdio)
participant Parser as ItemSource Parser
participant Collector as collect_items_from_sources
participant FSys as Filesystem
participant Archive as ArchiveSource Reader
participant Worker as spawn_entry_results / transformer
participant Writer as Archive Writer
CLI->>Parser: parse_many(sources) after -C
Parser-->>CLI: Vec<ItemSource>
CLI->>Collector: collect_items_from_sources(sources, filter, time_filters)
alt filesystem source
Collector->>FSys: stat/read path
FSys-->>Collector: CollectedEntry (Filesystem)
else archive source
Collector->>Archive: emit ArchiveMarker(ArchiveSource)
end
Collector-->>Worker: stream CollectedItem (Filesystem | ArchiveMarker)
Worker->>Archive: transform_archive_entries(reader, filter, time_filters) %% for ArchiveMarker
Archive-->>Worker: Batch of NormalEntry items
Worker->>Writer: EntryResult::Single/Batch stream
Writer->>Writer: drain_entry_results -> add_entry / write
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @ChanTsune, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a fundamental redesign of how command-line arguments are processed when creating and appending archives. The primary goal is to ensure that the order of specified files and included archives is maintained exactly as provided by the user. This enhancement also brings support for reading archive contents directly from standard input and includes robust conflict detection for stdin/stdout usage, making the archiving tool more flexible and predictable. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a significant and well-designed feature to support @archive inclusions, preserving argument order. The new data structures like ItemSource and CollectedItem are a great way to handle the mixed sources. The implementation is thorough and includes a comprehensive set of new tests.
My review focuses on a few key areas:
- A correctness issue with handling encrypted source archives in non-solid mode.
- Minor improvements for code clarity and efficiency in branches where no archive inclusions are present.
Additionally, the new test suite for archive inclusion is excellent, but it would be even stronger if it included cases for encrypted source archives. This would have likely caught the issue mentioned above.
Overall, this is a solid contribution that greatly enhances the tool's capabilities.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
cli/src/command/core.rs (1)
369-380: Consider edge case: literal@path handling.The current implementation treats any argument starting with
@as an archive inclusion. If a user has a legitimate filesystem path starting with@(rare but possible), they cannot archive it.Consider documenting this behavior or providing an escape mechanism (e.g.,
./@pathworks as a workaround since it doesn't start with@).cli/src/command/stdio.rs (1)
1179-1184: Consider including entry names in verbose output.The verbose output only shows
"a (from @{source})"without the actual entry name. For consistency with other verbose outputs (e.g., line 1118, line 1207), consider showing the entry path.However, since
raw_entries()returns opaque raw entries, accessing the path may require additional parsing. If this would require significant changes, the current approach is acceptable.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
cli/src/command/core.rscli/src/command/stdio.rscli/tests/cli/stdio.rscli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (2)
cli/tests/cli/stdio/archive_inclusion.rs (3)
cli/tests/cli/utils/archive.rs (1)
for_each_entry(122-127)lib/src/archive/write.rs (1)
write_header(85-88)lib/src/entry/builder.rs (1)
new_file(196-215)
cli/src/command/stdio.rs (2)
cli/src/command/core.rs (6)
collect_items_from_sources(420-444)create_entry(680-750)entry_option(752-772)parse_many(383-385)has_stdin_archive(388-392)new(197-218)cli/src/command/append.rs (1)
open_archive_then_seek_to_end(473-477)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: build-local-artifacts (x86_64-pc-windows-msvc)
- GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip1)
- GitHub Check: Test WebAssembly (beta, wasm32-unknown-unknown)
- GitHub Check: tier1 (macos-latest, nightly)
- GitHub Check: tier1 (ubuntu-22.04-arm, beta)
- GitHub Check: tier3_cross (redoxos/redoxer, x86_64-unknown-redox)
- GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip2)
- GitHub Check: Test WebAssembly (beta, wasm32-unknown-unknown)
- GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-unknown-emscripten)
- GitHub Check: Test WebAssembly (beta, wasm32-wasip2)
- GitHub Check: Test WebAssembly (stable, wasm32-unknown-unknown)
- GitHub Check: Test WebAssembly (beta, wasm32-unknown-emscripten)
- GitHub Check: Test WebAssembly (stable, wasm32-wasip2)
- GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-unknown-unknown)
- GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip1)
- GitHub Check: Test WebAssembly (beta, wasm32-wasip1)
- GitHub Check: tier1 (ubuntu-latest, beta)
- GitHub Check: msrv (ubuntu-latest, libpna)
- GitHub Check: msrv (ubuntu-latest, portable-network-archive)
🔇 Additional comments (12)
cli/tests/cli/stdio.rs (1)
1-6: LGTM!The new test module declarations are appropriately organized. The
archive_inclusionmodule aligns with the PR's core feature for@support, andstrip_components/unlink_firstmodules provide additional test coverage.cli/tests/cli/stdio/archive_inclusion.rs (1)
1-474: Comprehensive test coverage for the new@archive inclusion feature.The test suite covers all essential scenarios:
- Basic inclusion, multiple archives, solid mode, append mode
- Error cases (non-existent archives, stdin/stdout conflicts)
- Data integrity with larger content
- Ordering preservation - the key fix this PR addresses
The
stdio_archive_inclusion_orderingtest at lines 363-474 is particularly important as it validates the fix for the rayon parallel processing ordering issue mentioned in the PR objectives.cli/src/command/core.rs (4)
293-299: LGTM!Adding
CloneandDebugderives toStoreAsis appropriate for the newCollectedItemtype which containsStoreAsand needs these traits.
301-346: LGTM!The
ArchiveReaderandArchiveSourcetypes are well-designed:
ArchiveReaderavoids dynamic dispatch by using an enum for file/stdin sources- The 64KB buffer size is reasonable for archive I/O
Displayimplementation provides clean output for logging/error messages
382-392: LGTM!
parse_manyandhas_stdin_archiveare straightforward utility methods with correct implementations.
395-444: LGTM!The
CollectedItemenum andcollect_items_from_sourcesfunction correctly implement order-preserving collection:
- Sequential processing maintains CLI argument order
- Shared
HardlinkResolverenables cross-path hardlink detectionArchiveMarkerplaceholders preserve insertion points for archive entriesThis addresses the PR's core objective of preserving CLI argument order.
cli/src/command/stdio.rs (6)
523-531: LGTM!The order of operations is correct:
- Parse
@archivepaths before changing working directory (so paths are canonicalized correctly)- Check for stdin/stdout conflict early with a clear error message
615-654: LGTM!The branch handling for
@archiveinclusions is well-structured:
- Uses
WriteOptions::store()for solid mode entries (compression is applied at the solid block level)- Properly passes
cli_optionfor the solid container's write options- Handles both file and stdout output cases
898-908: LGTM!Consistent pattern with the create flow. The error message correctly differentiates the conflict scenario for append mode (stdin is used for the input archive, not output).
1017-1041: LGTM!Clean implementation that processes items in order, correctly handling both filesystem entries and archive markers.
1067-1167: LGTM!The archive creation functions are well-structured:
- Clear separation between normal and solid mode paths
- Solid mode correctly uses
SolidEntryBuilderto accumulate entries before building the solid block- Both paths maintain the ordering guarantee
1190-1217: LGTM!Correctly decodes and re-encodes entries for solid mode. The password parameter is properly used for encrypted source archives, and verbose output includes entry names.
fcaba85 to
82f7c6d
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
cli/src/command/stdio.rs (3)
659-665: Usemapwithunreachable!instead offilter_map.Since
has_archive_inclusionsisfalsein this branch, all items are guaranteed to beCollectedItem::Filesystemvariants. Usingfilter_mapis unnecessary and obscures this invariant.Based on past review comments, replace with:
🔎 Proposed refactor
- let target_items: Vec<_> = collected_items - .into_iter() - .filter_map(|item| match item { - CollectedItem::Filesystem(path, store_as) => Some((path, store_as)), - CollectedItem::ArchiveMarker(_) => None, - }) - .collect(); + let target_items: Vec<_> = collected_items + .into_iter() + .map(|item| match item { + CollectedItem::Filesystem(path, store_as) => (path, store_as), + CollectedItem::ArchiveMarker(_) => unreachable!("has_archive_inclusions is false"), + }) + .collect();
997-1003: Usemapwithunreachable!instead offilter_map.Since
has_archive_inclusionsisfalsein this branch, all items are guaranteed to beCollectedItem::Filesystemvariants. Usingfilter_mapis unnecessary and obscures this invariant.Based on past review comments, replace with:
🔎 Proposed refactor
- let target_items: Vec<_> = collected_items - .into_iter() - .filter_map(|item| match item { - CollectedItem::Filesystem(path, store_as) => Some((path, store_as)), - CollectedItem::ArchiveMarker(_) => None, - }) - .collect(); + let target_items: Vec<_> = collected_items + .into_iter() + .map(|item| match item { + CollectedItem::Filesystem(path, store_as) => (path, store_as), + CollectedItem::ArchiveMarker(_) => unreachable!("has_archive_inclusions is false"), + }) + .collect();
1237-1288: Preserve extra chunks inrebuild_entry_for_solid.The function preserves metadata and xattrs but does not preserve extra chunks (e.g., ACL chunks
faCl,faCe). These chunks will be silently dropped when entries are included in a solid archive.Based on past review comments and patterns in
migrate.rs:96-101andstrip.rs:143-148, add extra chunk preservation:🔎 Proposed fix
// Apply xattrs for xattr in xattrs { builder.add_xattr(xattr.clone()); } + + // Apply extra chunks (e.g., ACL chunks) + for chunk in entry.extra_chunks() { + builder.add_extra_chunk(chunk.clone()); + } builder.build()
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
cli/src/command/core.rscli/src/command/stdio.rscli/tests/cli/stdio.rscli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- cli/tests/cli/stdio.rs
🧰 Additional context used
🧬 Code graph analysis (2)
cli/tests/cli/stdio/archive_inclusion.rs (5)
cli/tests/cli/utils/archive.rs (1)
for_each_entry(122-127)lib/src/entry/header.rs (1)
path(80-82)lib/src/archive/write.rs (1)
write_header(85-88)lib/src/entry/builder.rs (1)
new_file(196-215)lib/src/chunk/types.rs (1)
len(106-108)
cli/src/command/stdio.rs (1)
cli/src/command/core.rs (4)
collect_items_from_sources(422-446)create_entry(682-752)has_stdin_archive(390-394)new(197-218)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: tier1 (windows-11-arm, beta)
- GitHub Check: tier1 (windows-latest, stable)
- GitHub Check: Analyze (rust)
- GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip2)
- GitHub Check: tier1 (macos-latest, stable)
- GitHub Check: tier1 (windows-11-arm, nightly)
- GitHub Check: tier1 (windows-11-arm, stable)
- GitHub Check: tier1 (macos-latest, nightly)
- GitHub Check: tier1 (windows-11-arm, beta)
- GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-unknown-unknown)
- GitHub Check: Test WebAssembly (beta, wasm32-wasip2)
- GitHub Check: Test WebAssembly (stable, wasm32-wasip1)
- GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-wasip1)
- GitHub Check: Test WebAssembly (beta, wasm32-unknown-unknown)
- GitHub Check: Test WebAssembly (beta, wasm32-wasip1)
- GitHub Check: Test WebAssembly (stable, wasm32-unknown-unknown)
- GitHub Check: msrv (ubuntu-latest, libpna)
- GitHub Check: msrv (ubuntu-latest, portable-network-archive)
🔇 Additional comments (17)
cli/src/command/core.rs (7)
23-23: LGTM!The
fmtimport is necessary for theDisplayimplementation onArchiveSource.
293-299: LGTM!The
StoreAsenum correctly represents the different storage strategies for filesystem items.
301-315: LGTM!The
ArchiveReaderenum avoids dynamic dispatch overhead while supporting both file and stdin sources. TheReadtrait implementation correctly delegates to the underlying reader.
350-395: LGTM!The
ItemSourceparsing correctly handles the@archivesyntax, including@-for stdin. Canonicalizing archive paths (line 376) before changing working directory (via-C) is the right approach to avoid path resolution issues.
397-407: LGTM!The
CollectedItemenum cleanly separates filesystem items from archive markers, enabling order-preserving interleaved processing.
422-446: LGTM!The function correctly preserves CLI argument order while sharing a
HardlinkResolveracross all filesystem sources for proper cross-path hardlink detection. The sequential processing ensures predictable ordering.
324-337: This pattern is safe and is already verified by existing tests.The
io::stdin().lock()pattern is safe becauseio::stdin()returns a thread-local handle to the static stdin resource. The returnedStdinLockis valid for accessing that static resource, even though the immediateStdinvalue is temporary. The comprehensive test suite incli/tests/cli/stdio/archive_inclusion.rs(includingstdio_archive_inclusion_basic,stdio_archive_inclusion_solid,stdio_archive_inclusion_data_integrity, and others) already verifies that stdin archive inclusion via@-works correctly across various scenarios.cli/tests/cli/stdio/archive_inclusion.rs (4)
1-8: LGTM!The imports are appropriate for the test functionality.
9-27: LGTM!The helper function creates test archives with appropriate defaults. Using
unwrap()in test helpers is acceptable.
29-36: LGTM!The helper function provides a clean way to extract entry names for test assertions.
38-474: LGTM!The test suite comprehensively covers archive inclusion functionality:
- Basic and multiple inclusions
- Solid mode support
- Append mode
- Error handling (nonexistent archives, stdin conflicts)
- Data integrity verification
- Order preservation
The tests are well-structured and verify both success and failure scenarios.
cli/src/command/stdio.rs (6)
10-14: LGTM!The import additions are necessary for the archive inclusion functionality.
Also applies to: 26-35
523-531: LGTM!The stdin conflict detection correctly prevents using
@-(archive from stdin) when writing output to stdout, which would cause a resource conflict. The error message is clear.
1021-1046: LGTM!The function correctly processes items in order, handling filesystem entries and archive markers appropriately. The verbose output helps with debugging.
1073-1109: LGTM!The dispatcher function cleanly separates solid and normal archive creation logic.
1111-1141: LGTM!The function correctly creates a normal archive with proper ordering and finalization.
1143-1178: LGTM!The function correctly creates a solid archive, properly using the
SolidEntryBuilderand maintaining entry order.
82f7c6d to
c82c935
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
cli/tests/cli/stdio/archive_inclusion.rs (1)
1-754: Optional: Consider extracting common test patterns.The test suite is comprehensive and well-structured. For maintainability, consider extracting the common CLI command invocation pattern into a helper function to reduce duplication. However, the current structure is clear and acceptable.
Example helper pattern (optional)
fn run_pna_create(output: &Path, base_dir: &Path, args: &[&str], extra_flags: &[&str]) -> assert_cmd::assert::Assert { let mut cmd = cargo_bin_cmd!("pna"); cmd.args(["--quiet", "experimental", "stdio", "--create", "--unstable", "--overwrite"]) .arg("-f").arg(output) .arg("-C").arg(base_dir) .args(extra_flags) .args(args); cmd.assert() }This would reduce the repetitive 10+ line command construction blocks throughout the tests.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (1)
cli/tests/cli/stdio/archive_inclusion.rs (3)
cli/tests/cli/utils/archive.rs (1)
for_each_entry(122-127)lib/src/entry/header.rs (1)
path(80-82)lib/src/archive/write.rs (1)
write_header(85-88)
🔇 Additional comments (11)
cli/tests/cli/stdio/archive_inclusion.rs (11)
1-36: Well-structured test helpers.The imports and helper functions are well-designed.
create_test_archiveandget_archive_entry_namesprovide clear abstractions that reduce duplication across the test suite.
38-94: LGTM: Comprehensive basic inclusion test.The test correctly verifies basic
@archiveinclusion functionality, including both filesystem and archive sources. The use ofHashSetfor order-independent verification is appropriate for this scenario.
96-145: LGTM: Multiple archive inclusion test.The test correctly verifies that multiple
@archiveinclusions can be processed in a single command, and all entries are preserved.
147-221: LGTM: Solid mode inclusion test with extraction verification.The test properly verifies that archive inclusion works with solid mode and includes extraction verification to ensure data integrity. This is a thorough test of the feature interaction.
223-259: LGTM: Append mode inclusion test.The test correctly verifies that
@archiveinclusion works in append mode, preserving both the original and newly included entries.
261-307: LGTM: Proper error case coverage.Both error tests are correctly structured:
- Non-existent archive inclusion properly fails
- Stdin/stdout conflict detection works as expected
These tests ensure the feature handles error conditions gracefully.
309-361: LGTM: Data integrity verification.The test properly verifies that archive inclusion preserves data integrity for larger content, including full round-trip extraction verification.
363-474: Excellent coverage of the core PR objective.This test directly validates the fix for preserving CLI argument order when interleaving filesystem paths with
@archiveinclusions. The precise index-based assertions and the extraction verification make this a thorough test of the feature's key requirement.
476-591: LGTM: Filter interaction tests.Both the exclude and include filter tests properly verify that filtering patterns work correctly with
@archiveinclusion, ensuring entries from included archives respect the specified filters.
593-754: LGTM: Comprehensive edge case coverage.The final three tests provide excellent coverage of edge cases:
- Archive-first ordering ensures order preservation works in all argument positions
- Empty archive inclusion ensures graceful handling of edge cases
- Combined filters with clear precedence rules (exclude wins) and well-documented expected behavior
These tests complete a thorough test suite for the archive inclusion feature.
43-44: Test artifacts are properly isolated. Thesetup()function (called at the start of each test) changes the current working directory toenv!("CARGO_TARGET_TMPDIR"), a temporary directory managed by Cargo. All test artifacts created withPathBuf::from(...)are therefore isolated in this temporary location, not in the repository or working directory. No changes required.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
cli/tests/cli/stdio/archive_inclusion.rs (3)
75-75: Consider extracting a helper for @archive argument formatting.The pattern
format!("@{}", archive_path.to_str().unwrap())is repeated throughout the tests. Extracting this into a helper function would improve maintainability and reduce duplication.🔎 Proposed helper function
Add this helper near the top of the file:
fn archive_arg(path: &Path) -> String { format!("@{}", path.to_str().unwrap()) }Then replace occurrences like:
- &format!("@{}", source_archive.to_str().unwrap()), + &archive_arg(&source_archive),Also applies to: 131-132, 182-182, 249-249, 334-334, 403-403, 511-511, 571-571, 625-625, 675-675, 727-727, 791-791
319-319: Consider using a named constant for the large content size.The magic number
10000for testing data integrity could be more self-documenting as a named constant.🔎 Proposed refactor
+const LARGE_CONTENT_SIZE: usize = 10000; + fn stdio_archive_inclusion_data_integrity() { // ... - let large_content = "x".repeat(10000); + let large_content = "x".repeat(LARGE_CONTENT_SIZE);
141-144: Consider adding descriptive assertion messages for consistency.Some tests have detailed assertion messages (e.g., lines 84-92, 520-532) while others use simple assertions without messages. Adding consistent messages would make test failures easier to diagnose.
For example:
- assert!(entry_names.contains("new.txt")); - assert!(entry_names.contains("from_source1.txt")); - assert!(entry_names.contains("from_source2.txt")); - assert_eq!(entry_names.len(), 3); + assert!(entry_names.contains("new.txt"), "Missing new.txt"); + assert!(entry_names.contains("from_source1.txt"), "Missing from_source1.txt"); + assert!(entry_names.contains("from_source2.txt"), "Missing from_source2.txt"); + assert_eq!(entry_names.len(), 3, "Expected exactly 3 entries");Also applies to: 191-193, 256-258
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (1)
cli/tests/cli/stdio/archive_inclusion.rs (4)
cli/tests/cli/utils/archive.rs (1)
for_each_entry(122-127)lib/src/entry/header.rs (1)
path(80-82)lib/src/archive/write.rs (1)
write_header(85-88)lib/src/entry/builder.rs (1)
new_file(196-215)
🔇 Additional comments (1)
cli/tests/cli/stdio/archive_inclusion.rs (1)
9-36: LGTM! Well-designed helper functions.The helper functions are clean and reusable.
create_test_archivecorrectly handles parent directory creation and archive writing, whileget_archive_entry_namesprovides a convenient way to verify archive contents.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
cli/tests/cli/stdio/archive_inclusion.rs (2)
38-841: Comprehensive test coverage with opportunities for refactoring.The test suite provides excellent coverage of archive inclusion scenarios, including:
- Basic and multiple archive inclusion
- Solid mode and append operations
- Error cases (non-existent archives, stdin/stdout conflicts)
- Ordering preservation
- Filter interactions (include/exclude patterns)
The tests are well-structured and logically correct. However, there's significant code duplication across test functions, particularly in:
- Archive creation patterns
- CLI command invocation
- Verification steps
Consider extracting common test patterns
You could reduce duplication by introducing helper functions or macros for common operations:
// Helper for running create command with common args fn run_create_command(output: &Path, args: &[&str]) -> assert_cmd::assert::Assert { let mut cmd = cargo_bin_cmd!("pna"); cmd.args(["--quiet", "experimental", "stdio", "--create", "--unstable", "--overwrite", "-f", output.to_str().unwrap()]); cmd.args(args); cmd.assert() } // Helper for extraction verification fn extract_and_verify(archive: &Path, extract_dir: &Path) { cargo_bin_cmd!("pna") .args([ "--quiet", "experimental", "stdio", "--extract", "--unstable", "--overwrite", "-f", archive.to_str().unwrap(), "--out-dir", extract_dir.to_str().unwrap(), ]) .assert() .success(); }This would make tests more concise and easier to maintain.
319-319: Consider documenting the test data size.The value
10000is used to create "large" content for data integrity testing. Adding a constant or comment explaining the choice would improve test clarity.+ // Use 10KB to verify data integrity with moderately sized content let large_content = "x".repeat(10000);
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (1)
cli/tests/cli/stdio/archive_inclusion.rs (4)
cli/tests/cli/utils/archive.rs (1)
for_each_entry(122-127)lib/src/entry/header.rs (1)
path(80-82)lib/src/archive/write.rs (1)
write_header(85-88)lib/src/entry/builder.rs (1)
new_file(196-215)
🔇 Additional comments (2)
cli/tests/cli/stdio/archive_inclusion.rs (2)
1-36: Well-structured helper functions.The imports and helper functions are clear and appropriate for the test module. Both
create_test_archiveandget_archive_entry_namesprovide reusable utilities that simplify the test cases.
3-3: No action required. Library versions are current and secure.The specified versions—pna 0.29.3 (workspace dependency) and assert_cmd 2.1.1—are the latest available and have no known security vulnerabilities as of December 2025.
ca1dc25 to
a724d8d
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (2)
cli/src/command/stdio.rs (2)
1189-1208: [Acknowledged] Critical issue already flagged: Entry metadata mismatch.This function has been flagged in previous reviews for not rebuilding entries after decryption. The FHED header metadata may not match the actual entry data state, causing potential corruption when the destination archive has different encryption settings.
The fix would involve rebuilding entries (similar to
append_archive_entries_to_solid) to ensure header metadata matches the actual data state. However, since this has already been thoroughly reviewed and documented, no additional verification is needed at this time.
1235-1286: [Acknowledged] Extra chunks not preserved.This function has been flagged in previous reviews for not preserving extra chunks (e.g., ACL chunks
faCl,faCe). These would be silently dropped when entries are included in a solid archive.The fix would involve iterating over
entry.extra_chunks()and callingbuilder.add_extra_chunk(chunk.clone())for each, following the pattern inmigrate.rsandstrip.rs. Since this has already been documented, no additional verification is needed.
🧹 Nitpick comments (2)
cli/src/command/stdio.rs (2)
657-663: Consider usingmapinstead offilter_mapfor clarity.Since
has_archive_inclusionsisfalsein this branch, we know thatcollected_itemsonly containsCollectedItem::Filesystemvariants. Usingmapwithunreachable!()for theArchiveMarkercase would express this invariant more clearly.🔎 Proposed refactor
- let target_items: Vec<_> = collected_items - .into_iter() - .filter_map(|item| match item { - CollectedItem::Filesystem(path, store_as) => Some((path, store_as)), - CollectedItem::ArchiveMarker(_) => None, - }) - .collect(); + let target_items: Vec<_> = collected_items + .into_iter() + .map(|item| match item { + CollectedItem::Filesystem(path, store_as) => (path, store_as), + CollectedItem::ArchiveMarker(_) => unreachable!("no archive markers when has_archive_inclusions is false"), + }) + .collect();
993-999: Consider usingmapinstead offilter_mapfor clarity.Since
has_archive_inclusionsisfalsein this branch, usingmapwithunreachable!()for theArchiveMarkercase would express this invariant more clearly.🔎 Proposed refactor
- let target_items: Vec<_> = collected_items - .into_iter() - .filter_map(|item| match item { - CollectedItem::Filesystem(path, store_as) => Some((path, store_as)), - CollectedItem::ArchiveMarker(_) => None, - }) - .collect(); + let target_items: Vec<_> = collected_items + .into_iter() + .map(|item| match item { + CollectedItem::Filesystem(path, store_as) => (path, store_as), + CollectedItem::ArchiveMarker(_) => unreachable!("no archive markers when has_archive_inclusions is false"), + }) + .collect();
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
cli/src/command/core.rscli/src/command/stdio.rscli/tests/cli/stdio.rscli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (1)
cli/src/command/stdio.rs (3)
cli/src/command/core.rs (5)
collect_items_from_sources(390-414)create_entry(650-720)parse_many(353-355)has_stdin_archive(358-362)new(197-218)cli/src/command/create.rs (1)
create_archive_file(510-572)cli/src/command/append.rs (1)
run_append_archive(444-470)
🔇 Additional comments (15)
cli/src/command/core.rs (4)
301-316: LGTM! Clean enum design.The
ArchiveSourceenum clearly represents the two archive input sources (file or stdin), and theDisplayimplementation correctly formats "-" for stdin and the path display for files.
365-375: LGTM! Clear separation of concerns.The
CollectedItemenum cleanly separates filesystem items (requiring entry building) from archive markers (requiring entry copying), making the processing logic straightforward.
390-414: LGTM! Correct order preservation and hardlink detection.The function correctly:
- Shares a single
HardlinkResolveracross all sources for cross-path hardlink detection- Preserves CLI argument order by processing sources sequentially
- Separates filesystem collection from archive marker insertion
339-350: The canonicalization error handling is already correct and produces clear error messages.The
fs::canonicalize()call properly propagatesio::Errorthrough the?operator, which gets converted toanyhow::Errorinmain()and displayed to the user with the appropriate OS error message (e.g., "No such file or directory" for missing files). The early canonicalization before working directory changes is intentional and documented at lines 337–338. No changes needed.cli/tests/cli/stdio.rs (1)
1-1: LGTM! Test module addition.The new
archive_inclusiontest module aligns with the PR's feature addition.cli/src/command/stdio.rs (10)
11-14: LGTM! Necessary imports for @archive support.The new imports from
coremodule are all used in the inclusion-aware creation and append flows.
523-531: LGTM! Correct ordering and conflict detection.The early parsing of
@archivepaths before the-Cworking directory change ensures canonicalization works correctly. The stdin/stdout conflict check prevents a critical error scenario.
558-561: LGTM! Correct sequencing documented.The comment clearly explains why working directory change happens after parsing
@archivepaths.
575-576: LGTM! Unified collection for mixed sources.Using
collect_items_from_sourcesinstead ofcollect_items_from_pathsenables processing both filesystem and archive sources while preserving CLI argument order.
898-908: LGTM! Consistent conflict detection for append mode.The early parsing and stdin conflict detection mirrors the create mode logic, correctly preventing conflicts when the input archive comes from stdin.
1017-1046: LGTM! Order-preserving append logic.The function correctly processes
CollectedItemvariants in sequence, preserving CLI argument order while delegating to appropriate handlers for filesystem entries and archive inclusions.
1073-1105: LGTM! Clean delegation to mode-specific handlers.The function appropriately routes to solid or normal archive creation based on the
solidflag.
1107-1141: LGTM! Order-preserving normal archive creation.The function correctly processes items in sequence, creating filesystem entries and copying archive entries in the order specified on the command line.
1143-1187: LGTM! Correct solid archive creation with inclusions.The function properly:
- Uses
WriteOptions::store()for individual entries (since compression happens at solid level)- Adds entries to the solid builder in order
- Finalizes the solid block and wraps it in an archive
1210-1233: LGTM! Correct entry rebuilding for solid mode.This function properly rebuilds entries via
rebuild_entry_for_solidbefore adding them to the solid builder, ensuring metadata consistency.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI Agents
In @cli/src/command/core.rs:
- Around line 1389-1428: transform_archive_entries currently opens only a single
reader and thus ignores split archives; update the function to first collect
split parts (using collect_split_archives) for the provided source and pass the
combined reader(s) into Archive::read_header (or adjust/read_archive_source to
accept the collected archive parts), then iterate entries as before;
specifically, locate transform_archive_entries and either call
collect_split_archives on the input path/source before creating
Archive::read_header or modify read_archive_source to return a reader that
already handles collect_split_archives so split archives (e.g.,
@archive.part1.pna + parts) are all processed.
🧹 Nitpick comments (3)
cli/tests/cli/stdio/archive_inclusion.rs (1)
38-95: Consider test directory cleanup.The tests create directories directly in the working directory (e.g.,
PathBuf::from("stdio_archive_inclusion_basic")). These directories persist after test runs. Consider using a temporary directory or implementing cleanup to avoid accumulating test artifacts.🔎 Suggested approach using tempdir
use tempfile::tempdir; #[test] fn stdio_archive_inclusion_basic() { setup(); let temp = tempdir().unwrap(); let base = temp.path(); // ... rest of test using base instead of PathBuf::from("...") }cli/src/command/create.rs (1)
570-591: Archive source reading is synchronous within the rayon scope.Unlike filesystem entries which are spawned into the thread pool (
s.spawn_fifo), archive sources at lines 582-587 are read synchronously before sending results to the channel. This means:
- Archive source reading blocks the scope's iteration
- Multiple
@archivearguments are processed sequentially- Filesystem entries interspersed with archives may not be processed until the archive is fully read
This preserves ordering (matching PR objectives) but may impact performance for large archive sources. If this is intentional for ordering guarantees, consider adding a comment.
cli/src/command/core.rs (1)
1468-1492: Consider logging when ownership overrides are ignored.When
owner_options.has_overrides()is true butresult.metadata().permission()is None, the ownership overrides are silently ignored. While this might be intentional (can't override non-existent permissions), it could lead to user confusion.Consider adding a debug or info log message when overrides are specified but cannot be applied due to missing permission metadata.
Proposed enhancement
// Apply ownership transformations if specified if owner_options.has_overrides() { if let Some(perm) = result.metadata().permission() { let new_perm = pna::Permission::new( owner_options .uid .map(u64::from) .unwrap_or_else(|| perm.uid()), owner_options .uname .clone() .unwrap_or_else(|| perm.uname().to_string()), owner_options .gid .map(u64::from) .unwrap_or_else(|| perm.gid()), owner_options .gname .clone() .unwrap_or_else(|| perm.gname().to_string()), perm.permissions(), ); let metadata = result.metadata().clone().with_permission(Some(new_perm)); result = result.with_metadata(metadata); + } else { + log::debug!( + "Ownership overrides specified but entry '{}' has no permission metadata", + original_name + ); } }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
cli/src/command/append.rscli/src/command/core.rscli/src/command/create.rscli/src/command/stdio.rscli/src/command/update.rscli/tests/cli/stdio.rscli/tests/cli/stdio/archive_inclusion.rslib/src/entry.rs
🧰 Additional context used
🧬 Code graph analysis (3)
lib/src/entry.rs (2)
cli/src/command/list.rs (1)
name(273-280)lib/src/entry/attr.rs (1)
name(36-38)
cli/tests/cli/stdio/archive_inclusion.rs (4)
cli/tests/cli/utils/archive.rs (1)
for_each_entry(122-127)lib/src/entry/header.rs (1)
path(80-82)lib/src/archive/write.rs (1)
write_header(85-88)lib/src/entry/builder.rs (1)
new_file(196-215)
cli/src/command/append.rs (1)
cli/src/command/core.rs (5)
collect_items_from_paths(483-497)create_entry(727-769)read_archive_source(1431-1447)read_paths(1353-1357)read_paths_stdin(1360-1362)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: tier1 (ubuntu-22.04-arm, beta)
- GitHub Check: Test WebAssembly (beta, wasm32-wasip2)
- GitHub Check: plan
- GitHub Check: Test WebAssembly (stable, wasm32-unknown-emscripten)
- GitHub Check: Test WebAssembly (nightly-2025-12-09, wasm32-unknown-emscripten)
- GitHub Check: Test WebAssembly (beta, wasm32-unknown-emscripten)
- GitHub Check: tier1 (ubuntu-latest, nightly)
- GitHub Check: tier1 (windows-latest, stable)
- GitHub Check: tier1 (windows-latest, nightly)
- GitHub Check: tier1 (windows-11-arm, nightly)
- GitHub Check: tier1 (windows-11-arm, beta)
- GitHub Check: tier1 (ubuntu-latest, stable)
- GitHub Check: tier1 (ubuntu-22.04-arm, nightly)
- GitHub Check: tier1 (ubuntu-latest, beta)
- GitHub Check: tier1 (ubuntu-22.04-arm, beta)
- GitHub Check: tier1 (ubuntu-22.04-arm, stable)
- GitHub Check: msrv (ubuntu-latest, libpna)
- GitHub Check: msrv (ubuntu-latest, pna)
- GitHub Check: msrv (ubuntu-latest, portable-network-archive)
🔇 Additional comments (22)
lib/src/entry.rs (1)
937-959: LGTM!The
with_namemethod follows the established builder pattern used bywith_metadataandwith_xattrs, provides clear documentation with a working example, and correctly updates the header name while preserving all other entry fields.cli/tests/cli/stdio.rs (1)
1-10: LGTM!The new
archive_inclusiontest module is properly declared alongside the existing test modules.cli/src/command/update.rs (2)
13-16: LGTM!The import changes correctly bring in
CollectedEntryand related types for the updated function signatures.
520-540: Update mode doesn't support@archiveinclusion.The update command uses
collect_items_from_pathsrather thancollect_items_from_sources, which means@archivesyntax won't work in update mode. This appears intentional since update mode compares filesystem metadata against archive entries, which doesn't apply to archive-sourced entries.If this is the intended behavior, consider documenting this limitation or providing a clear error message when users attempt to use
@archivewith the update command.cli/tests/cli/stdio/archive_inclusion.rs (3)
9-36: LGTM!The helper functions
create_test_archiveandget_archive_entry_namesare well-designed, reusable utilities that simplify test authoring. The error handling withunwrap()is appropriate for test code.
489-538: Good test for entry ordering preservation.This test validates a key PR objective: preserving the exact order of CLI arguments. The test correctly verifies that archive entries appear first, followed by filesystem entries, matching the command-line argument order.
289-310: Test correctly validates the expected failure behavior.Using
@-(archive from stdin) with-f -(output to stdout) cannot succeed because stdin gets locked when the code tries to read the input archive, making it impossible to then read from@-. The test properly expects failure on this unsupported combination. However, note that the failure is implicit (from attempting to lock stdin twice) rather than from explicit conflict detection, and the test does not validate the specific error message.cli/src/command/stdio.rs (3)
11-15: LGTM!The import additions correctly bring in
ItemSourceandcollect_items_from_sourcesfor the new source-based collection workflow.
612-626: Good documentation of the-Cinteraction.The comment clearly explains why sources are parsed after
set_current_dir- this ensures@archivepaths are resolved relative to the changed directory, matching the behavior of filesystem paths.
686-696: LGTM!The
create_archive_filecalls now correctly passfilterandtime_filtersto support filtering entries from archive sources.cli/src/command/create.rs (3)
9-15: LGTM!The imports correctly bring in the new types and functions needed for archive source handling.
598-633: LGTM!The result handling correctly processes both
EntryResult::SingleandEntryResult::Batchvariants, properly unwrapping the nestedOptionandResulttypes.
709-715: Good error propagation in flat_map.The
flat_mapcorrectly convertsEntryResultvariants to a flattened iterator. The error case at line 713 (Err(e) => vec![Err(e)]) properly preserves errors by wrapping them in a single-element vector, ensuring they propagate through the iterator.cli/src/command/append.rs (3)
9-14: LGTM!The imports correctly mirror those in
create.rs, maintaining consistency across the codebase.
461-472: LGTM!The target items are correctly wrapped as
CollectedItem::Filesystemand the new filter/time_filters parameters are properly passed torun_append_archive.
475-524: LGTM!The
run_append_archivefunction correctly handles bothCollectedItem::FilesystemandCollectedItem::ArchiveMarkervariants, with proper result processing for bothEntryResult::SingleandEntryResult::Batch. The implementation is consistent withcreate_archive_fileincreate.rs.cli/src/command/core.rs (6)
273-277: LGTM!The
has_overridesmethod correctly checks for any ownership override presence.
341-354: LGTM!The
CollectedEntryandStoreAstypes are well-designed for representing filesystem items with their storage strategies.
356-411: LGTM!The archive source parsing logic correctly handles
@archive,@-(stdin), and regular filesystem paths. The lazy path validation approach is appropriate for CLI argument parsing.
413-435: LGTM!The
CollectedItemandEntryResulttypes effectively separate filesystem items from archive markers while enabling efficient batched processing.
450-470: LGTM!The function correctly preserves CLI argument order while enabling shared hardlink detection across filesystem paths. The separation of filesystem items and archive markers is well-implemented.
483-497: LGTM!The signature changes to use
CollectedEntryare consistently applied acrosscollect_items_from_paths,collect_items_with_state, andcreate_entry. The refactoring maintains existing logic while introducing the new type structure.Also applies to: 544-665, 727-769
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @cli/src/command/core.rs:
- Around line 1552-1598: The function transform_normal_entry destructures a
non-existent owner_options from CreateOptions; either add owner_options:
PermissionStrategyResolver to the CreateOptions struct and keep the current
destructuring and uses of owner_options.has_overrides(),
owner_options.uid/gid/uname/gname, or modify transform_normal_entry to read the
ownership resolver from keep_options.permission_strategy (i.e., replace
owner_options.* references with keep_options.permission_strategy.*). Update the
CreateOptions definition if you choose the first approach so the struct includes
owner_options: PermissionStrategyResolver and adjust any callers accordingly.
In @cli/src/command/stdio.rs:
- Around line 1027-1042: Before opening stdin for reading and calling
run_append_archive, validate that when appending to stdout (i.e., archive_path
is None) the collected sources do not include a stdin archive marker;
specifically, after collect_items_from_sources returns target_items check for
any CollectedItem::ArchiveMarker(ArchiveSource::Stdin) (or equivalent
ArchiveSource::Stdin representation) and return an error if found. Locate the
append-to-stdout logic around collect_items_from_sources, Archive::read_header
and run_append_archive and add this guard (or add a unit test like
stdio_archive_inclusion_stdin_stdout_conflict) so run_append_archive will never
attempt to read stdin twice.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
cli/src/command/append.rscli/src/command/core.rscli/src/command/create.rscli/src/command/stdio.rscli/src/command/update.rscli/tests/cli/stdio.rscli/tests/cli/stdio/archive_inclusion.rslib/src/entry.rs
🧰 Additional context used
🧬 Code graph analysis (3)
lib/src/entry.rs (2)
cli/src/command/list.rs (1)
name(273-280)lib/src/entry/attr.rs (1)
name(36-38)
cli/src/command/stdio.rs (2)
cli/src/command/core.rs (3)
collect_items_from_sources(489-509)collect_split_archives(711-728)parse_many(447-449)cli/src/command/create.rs (1)
create_archive_file(540-633)
cli/src/command/append.rs (3)
cli/src/command/core.rs (6)
collect_items_from_paths(522-536)create_entry(766-807)entry_option(809-829)read_archive_source(1534-1550)read_paths(1456-1460)read_paths_stdin(1463-1465)cli/src/command/create.rs (1)
collect_items_from_paths(457-460)lib/src/entry.rs (2)
std(1185-1185)entries(430-444)
🔇 Additional comments (27)
lib/src/entry.rs (1)
937-959: LGTM! Clean builder-pattern method.The
with_namemethod provides a clean way to rename entries during archive-to-archive copying operations. The implementation correctly follows the builder pattern and the documentation clearly explains the use case.cli/src/command/core.rs (6)
313-317: LGTM! Useful helper for ownership override detection.The
has_overrides()method provides a clean way to check if any ownership overrides are specified, which is used to conditionally apply ownership transformations.
380-450: LGTM! Well-designed data structures for archive inclusion.The new types (
CollectedEntry,ArchiveSource,ItemSource) provide a clean abstraction for handling mixed filesystem and archive sources. The parsing logic correctly handles the@prefix convention for archive inclusion.
476-509: LGTM! Correct order-preserving collection logic.The
collect_items_from_sourcesfunction correctly processes mixed sources while preserving CLI argument order. The sharedHardlinkResolverenables cross-path hardlink detection as intended.
1533-1550: LGTM! Clean delegation pattern.The function correctly handles both file and stdin archive sources by delegating to
transform_archive_entries. The split archive issue applies to the underlying function and has already been flagged.
522-536: LGTM! Consistent refactoring toCollectedEntry.The updates to
collect_items_from_paths,collect_items_with_state, andcreate_entryconsistently use the newCollectedEntrytype. The refactoring is mechanical and correct.Also applies to: 583-704, 766-807
1492-1531: Split archive support needed for@archiveinclusion.As noted in previous reviews,
transform_archive_entriesandread_archive_sourceonly handle single-part archives. If a split archive is referenced via@archive.part1.pna, only the first part will be processed.Other commands handle this by calling
collect_split_archivesbefore opening the archive. Apply the same pattern here.Likely an incorrect or invalid review comment.
cli/tests/cli/stdio.rs (1)
1-1: LGTM! Standard module declaration.cli/src/command/update.rs (1)
13-17: LGTM! Consistent refactoring toCollectedEntry.The updates align with the
CollectedEntryrefactoring incore.rs. All usages are updated consistently and correctly.Also applies to: 525-525, 598-598
cli/tests/cli/stdio/archive_inclusion.rs (1)
1-743: LGTM! Excellent test coverage for archive inclusion.This comprehensive test suite covers all the major scenarios for the
@archivefeature:
- Basic and multiple archive inclusion
- Solid mode and append operations
- Error cases (nonexistent archive, stdin/stdout conflict)
- Data integrity verification
- Filter interactions (include/exclude)
- Entry ordering preservation
- Edge cases (empty archives, combined filters)
The helper functions (
create_test_archive,get_archive_entry_names) are well-designed and make the tests clear and maintainable.cli/src/command/stdio.rs (5)
11-16: LGTM: Import updates align with source-based collection.The additions of
ItemSourceandcollect_items_from_sourcesare necessary for the new archive-inclusion processing model.
626-640: LGTM: Source parsing correctly ordered after directory change.The comment at line 626 clearly explains why sources are parsed after
set_current_dir, ensuring@archivepaths are resolved relative to the working directory set by-C.
698-714: LGTM: Filter and time_filters properly propagated to archive creation.Both file-based and stdout-based archive creation now receive the necessary filtering context for archive-inclusion processing.
1002-1003: LGTM: Source parsing consistently ordered after directory change in append paths.The comment at line 1002 and the usage of
ItemSource::parse_manyfollowed bycollect_items_from_sourcesmirrors the create path, ensuring@archivepaths respect the-Cworking directory.Also applies to: 1016-1018, 1027-1027
1019-1025: LGTM: Filter and time_filters properly propagated to append operations.Both file-based and stdin/stdout-based append operations now receive the necessary filtering context for archive-inclusion processing.
Also applies to: 1035-1041
cli/src/command/create.rs (7)
9-16: LGTM: Import updates support archive-inclusion workflow.The additions of
CollectedItem,EntryResult,TimeFilters, andread_archive_sourceare necessary for the new unified item processing model.
457-460: LGTM: Filesystem entries correctly wrapped as CollectedItem.The mapping to
CollectedItem::Filesystemmaintains type consistency with the unified processing model. Since the create command path shown here collects only filesystem paths, all items are filesystem variants.
540-550: LGTM: Function signatures extended for archive-inclusion support.The addition of
filterandtime_filtersparameters enables archive-source processing while maintaining backward compatibility for filesystem-only items.Also applies to: 646-647
567-589: LGTM: CollectedItem variants handled appropriately in parallel scope.Filesystem items spawn parallel tasks emitting
EntryResult::Single, while archive markers callread_archive_sourcesynchronously and emitEntryResult::Batch. The synchronous processing of archive sources may be intentional to preserve ordering or because the archive reading is already parallelized internally.
595-628: LGTM: EntryResult variants handled consistently in both solid and non-solid modes.Both archive modes correctly pattern-match on
EntryResult::SingleandEntryResult::Batch, with proper error propagation and Option unwrapping.
662-678: LGTM: Split archive path mirrors standard path correctly.The CollectedItem variant handling in split archives is consistent with the standard path, maintaining code uniformity.
685-710: LGTM: Split archive result handling is correct and idiomatic.The solid mode pattern-matches on
EntryResultvariants consistently. The non-solid mode usesflat_mapto flatten both Single and Batch results, followed byfilter_map(Result::transpose)to filter outNonevalues while transposingResult<Option<T>>toOption<Result<T>>. This is idiomatic Rust for handling nested Result/Option types.Also applies to: 713-713
cli/src/command/append.rs (5)
9-16: LGTM: Import updates align with unified processing model.The additions mirror those in create.rs, maintaining consistency across create and append workflows.
462-465: LGTM: Filesystem entries correctly wrapped for unified processing.The mapping to
CollectedItem::Filesystemis consistent with the create.rs pattern and maintains type compatibility.
476-482: LGTM: Function signature extended consistently with create workflow.The addition of
filterandtime_filtersparameters enables archive-source processing in the append path, maintaining consistency with the create workflow.
484-505: LGTM: Parallel processing pattern consistent with create workflow.The CollectedItem variant handling mirrors create.rs: filesystem items spawn parallel tasks, archive markers are processed synchronously. This maintains ordering and leverages internal parallelism in archive reading.
507-524: LGTM: EntryResult handling and finalization are correct.The pattern-matching on
EntryResult::SingleandEntryResult::Batchis consistent with create.rs, with proper error propagation and Option handling. The archive is correctly finalized before returning.
88e511b to
5e83d33
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @cli/src/command/core.rs:
- Around line 1492-1519: read_archive_source currently opens a single file and
doesn't handle split archives like other commands; update read_archive_source to
call collect_split_archives and use the returned parts as the input to
Archive::read_header (or combine them into a single reader/chain) so
transform_archive_entries receives the full split-archive stream, or
alternatively add a clear doc comment on read_archive_source stating that
@archive does not support split archives; reference the functions
collect_split_archives and read_archive_source (and ensure
transform_archive_entries continues to accept the reader produced).
- Around line 1559-1583: The code incorrectly references a non-existent
permission_strategy and wrong enum/variant names; change the pattern to match
keep_options.owner_strategy and the correct enum variant OwnerStrategy::Preserve
{ options } (where options is an OwnerOptions struct with fields uid, gid,
uname, gname), then replace uid/gid/uname/gname bindings with
options.uid/options.gid/options.uname/options.gname when building the new
pna::Permission and updating result.metadata(); keep the existing logic that
only applies when at least one option.is_some() and when
result.metadata().permission() is Some.
🧹 Nitpick comments (2)
cli/src/command/create.rs (1)
569-591: Archive sources processed synchronously within rayon scope.
read_archive_sourceforArchiveMarkeritems is called synchronously (not spawned), while filesystem entries are processed in parallel viaspawn_fifo. This is likely intentional since:
- Stdin can only be read once and needs ordering
- Archive entry order should be preserved
However, for
ArchiveSource::Filevariants, parallel processing could improve performance. Consider spawning file-based archive sources if order preservation isn't required.cli/src/command/core.rs (1)
470-480: Consider boxing the larger variant to reduce enum size.Static analysis flags a large size difference between
SingleandBatchvariants. TheBatchvariant contains aVecwhich is significantly larger than theSinglevariant's contents. This causes allEntryResultinstances to use the size of the largest variant.If memory efficiency is a concern (especially with large numbers of entries in flight), consider boxing the
Batchinner type:Batch(Box<io::Result<Vec<io::Result<Option<NormalEntry>>>>>),However, if the batching optimization's reduced synchronization overhead is the priority and
EntryResultinstances are short-lived, the current design is acceptable.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
cli/src/command/append.rscli/src/command/core.rscli/src/command/create.rscli/src/command/stdio.rscli/src/command/update.rscli/tests/cli/stdio.rscli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- cli/tests/cli/stdio/archive_inclusion.rs
🧰 Additional context used
🧬 Code graph analysis (2)
cli/src/command/stdio.rs (1)
cli/src/command/append.rs (1)
run_append_archive(477-526)
cli/src/command/append.rs (2)
cli/src/command/create.rs (1)
collect_items_from_paths(457-460)lib/src/entry.rs (2)
std(1185-1185)entries(430-444)
🔇 Additional comments (24)
cli/tests/cli/stdio.rs (1)
1-1: LGTM!The new
archive_inclusiontest module is properly added to exercise the@archiveinclusion functionality introduced by this PR.cli/src/command/update.rs (2)
13-14: LGTM!Import changes correctly reflect the rename from
CollectedItemtoCollectedEntryfor filesystem-only collection paths.
526-526: LGTM!The function signature update to
Vec<CollectedEntry>is consistent with the type rename and maintains the update command's filesystem-only semantics (as opposed to the create/append paths which now support@archiveinclusions).cli/src/command/stdio.rs (4)
619-633: Good design: Parse sources after directory change.The comment and implementation correctly ensure that
@archivepaths are resolved relative to the new working directory when-Cis used. This maintains consistency with how filesystem paths are handled.
698-708: LGTM!The
create_archive_filecalls now consistently passfilterandtime_filtersfor both file-based and stdout output paths, enabling proper filtering of entries from included archives.
1002-1025: LGTM!The append path correctly mirrors the create path's pattern: sources are parsed after the directory change, and
run_append_archivereceives the filter and time_filters for archive source processing.
1027-1041: Add stdin conflict validation to append-to-stdout path or confirm protection mechanism.When
archive_pathisNone, stdin is consumed at line 1030 to read the input archive. Iftarget_itemscontainsCollectedItem::ArchiveMarker(ArchiveSource::Stdin)(from@-),run_append_archivewill attempt to read stdin again viaread_archive_source, causing failure. The create path has protection validated by teststdio_archive_inclusion_stdin_stdout_conflict, but the equivalent validation is not evident in the append path. Either add explicit validation beforecollect_items_from_sourceswhenarchive_pathisNone, or add a test case to verify the expected behavior.cli/src/command/create.rs (4)
9-15: LGTM!Import additions correctly bring in the new types (
CollectedItem,EntryResult,TimeFilters) andread_archive_sourcefunction needed for archive inclusion support.
457-460: LGTM!Collected filesystem items are correctly wrapped as
CollectedItem::Filesystemto unify with the new enum-based item representation. This maintains backward compatibility for the non-stdio create path which doesn't support@archiveinclusions.
597-630: LGTM!The write loop correctly handles both
EntryResult::Single(filesystem entries) andEntryResult::Batch(archive-sourced entries) with proper error propagation andOptionhandling for filtered entries.
706-712: LGTM!The
flat_mapcorrectly flattensEntryResultvariants for the non-solid split path, with proper error handling that converts batch errors into single-element error vectors.cli/src/command/append.rs (3)
9-14: LGTM!Import additions correctly mirror the create.rs pattern, bringing in the types and functions needed for unified item processing with archive source support.
463-474: LGTM!The append command correctly wraps filesystem items and passes filter/time_filters to
run_append_archive. This path doesn't parse@archivefrom CLI args (that's handled by the stdio path), maintaining separation of concerns.
477-526: LGTM!The
run_append_archiveimplementation correctly mirrors the create path's pattern:
- Extended signature accepts
filterandtime_filters- Rayon scope processes
Filesystementries in parallelArchiveMarkerentries processed viaread_archive_source- Result loop handles both
SingleandBatchvariants with proper error propagationThis consistency across create and append paths improves maintainability.
cli/src/command/core.rs (10)
318-324: LGTM!The
has_overrides()method provides a clean way to check if any ownership overrides are specified. The implementation correctly checks all four ownership-related fields.
386-399: LGTM!The
CollectedEntrystruct cleanly encapsulates filesystem item data with pre-captured metadata. The separation from the newCollectedItemenum (which now serves as a discriminated union for filesystem vs. archive sources) is well-designed.
401-416: LGTM!The
ArchiveSourceenum cleanly models the two input sources (file path and stdin), and theDisplayimplementation follows the-convention for stdin, which is consistent with CLI tool conventions.
430-456: LGTM!The parsing logic correctly implements the
@archiveconvention. The@-→ stdin mapping and@path→ file mapping follow established CLI conventions (similar to bsdtar).Note: Users needing to reference a filesystem path that starts with
@would need to use a workaround like./@file. This is consistent with how similar tools handle this ambiguity.
458-468: LGTM!The
CollectedItemenum elegantly preserves CLI argument ordering while distinguishing between filesystem entries (ready for processing) and archive markers (indicating where to splice in archive entries). This design cleanly addresses the PR's objective of maintaining original argument order.
495-515: LGTM!The function correctly preserves CLI argument order while handling mixed sources. The sequential processing with shared
HardlinkResolverenables cross-path hardlink detection. Error handling is appropriate with the?operator for filesystem errors.
528-542: LGTM!The return type update to
Vec<CollectedEntry>is consistent with the type rename, and the function's behavior is unchanged.
772-784: LGTM!The parameter type update to
&CollectedEntryis consistent with the type restructuring, maintaining the same functionality.
1521-1538: LGTM!The function cleanly handles both file and stdin sources with appropriate buffering. The 64KB buffer is a reasonable choice for archive I/O.
1588-1717: LGTM!The test updates correctly reflect the type changes from
CollectedItemtoCollectedEntry, and the test coverage forcollect_items_from_pathsremains intact.
5e83d33 to
4d732d7
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
cli/src/command/stdio.rs (1)
1004-1014: Bug:--no-recursiveis ignored in stdio append.
CollectOptions.recursiveusesargs.recursive(Line 1005), but the actual inverse flag isargs.no_recursive. As written,--no-recursivewon’t stop descent.Proposed fix
- let collect_options = CollectOptions { - recursive: args.recursive, + let collect_options = CollectOptions { + recursive: !args.no_recursive, keep_dir: args.keep_dir, gitignore: args.gitignore, nodump: args.nodump, follow_links: args.follow_links, follow_command_links: args.follow_command_links, one_file_system: args.one_file_system, filter: &filter, time_filters: &time_filters, };cli/src/command/append.rs (1)
477-526: Append entry order is not guaranteed under rayon scheduling.
If stdio append relies on exact argument order,run_append_archiveshould enforce ordering (index/buffer) instead of consumingrxin arrival order.cli/src/command/create.rs (1)
569-633: Ordering is not guaranteed; parallel filesystem tasks can complete before synchronously-processed archives.The code mixes asynchronous task spawning (
Filesystemitems viaspawn_fifo) with synchronous processing (ArchiveMarkeritems via directread_archive_source()call). SinceArchiveMarkersends results immediately within the loop, if anArchiveMarkerappears after aFilesystemitem in CLI arguments, the archive's results arrive on the channel before the filesystem item completes, reversing the intended order.This works correctly only when
ArchiveMarkersprecedeFilesystemitems (verified bystdio_archive_inclusion_archive_firsttest). The reverse case is not tested and will fail: running withfile.txt@archive.pna`` produces archive entries before file entries in the output.To fix, either:
- Tag each item with an index and buffer results in a
BTreeMap<usize, EntryResult>, draining in order- Process sequentially when argument order must be preserved (disable parallelism)
🤖 Fix all issues with AI agents
In `@cli/src/command/core.rs`:
- Around line 434-450: ItemSource::parse treats "@" as an empty filename and
doesn't allow escaping a literal leading '@'; change parse to handle "@" as
Archive(ArchiveSource::Stdin), treat arguments starting with "@@" as a
filesystem path with the first '@' removed (i.e. literal leading '@'), keep "@-"
or "@-" semantics if desired, and for single '@' return Stdin instead of
File(PathBuf::from("")). Update parse accordingly (ItemSource::parse) and
parse_many remains fine as it delegates to parse.
♻️ Duplicate comments (2)
cli/src/command/stdio.rs (1)
1026-1042: Append-to-stdout should explicitly reject@-(stdin) sources.
In this branch stdin is already consumed to read the input archive (Line 1030). Allowing@-intarget_itemsmakes the command attempt to read stdin again duringrun_append_archive(...).Proposed guard (mirrors create-path conflict behavior)
+ use crate::command::core::{ArchiveSource, CollectedItem}; } else { let target_items = collect_items_from_sources(sources, &collect_options, &mut resolver)?; + + if target_items.iter().any(|it| { + matches!(it, CollectedItem::ArchiveMarker(ArchiveSource::Stdin)) + }) { + anyhow::bail!("cannot use `@-` when appending to stdout (-f -): stdin is already used for the input archive"); + } + let mut output_archive = Archive::write_header(io::stdout().lock())?; { let mut input_archive = Archive::read_header(io::stdin().lock())?; for entry in input_archive.raw_entries() { output_archive.add_entry(entry?)?; } } run_append_archive( &create_options, output_archive, target_items, &filter, &time_filters, ) }cli/src/command/core.rs (1)
1515-1532:@archiveshould support split archives (or document it as unsupported).
read_archive_sourcecurrently opens onlyfs::File::open(path)and passes it toArchive::read_header, so@foo.pnawon’t includefoo.pna.part2...style parts.
🧹 Nitpick comments (4)
cli/tests/cli/stdio/archive_inclusion.rs (2)
9-36: Prefer temp directories over fixed cwd-relative folders.
These tests create persistent directories (e.g.,stdio_archive_inclusion_basic/) and don’t clean up; usingtempfile::TempDir(or the project’s existing test temp helper if any) would reduce cross-run pollution.
38-146: Consider adding one more ordering test for interleaving (file1,@archive,file2).
Right now most tests validate presence (viaHashSet), not ordering; interleaving is the tricky case this PR is about.cli/src/command/create.rs (1)
662-719: Same ordering risk in split-archive create, plus avoidVecallocations in the flattener.
Theflat_mapbuilds smallVecs per element; you can avoid that with iterator combinators (std::iter::once,Either, etc.), but the bigger issue is still preserving CLI order.cli/src/command/core.rs (1)
1486-1513: Potential memory spike:transform_archive_entriesmaterializes all transformed entries.
For large included archives this allocates a fullVecbefore writing; consider a streaming iterator approach if this becomes an issue.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
cli/src/command/append.rscli/src/command/core.rscli/src/command/create.rscli/src/command/stdio.rscli/src/command/update.rscli/tests/cli/stdio.rscli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- cli/tests/cli/stdio.rs
- cli/src/command/update.rs
🧰 Additional context used
🧬 Code graph analysis (3)
cli/src/command/append.rs (2)
cli/src/command/core.rs (4)
collect_items_from_paths(522-536)read_archive_source(1516-1532)read_paths(1438-1442)read_paths_stdin(1445-1447)cli/src/command/create.rs (1)
collect_items_from_paths(457-460)
cli/src/command/create.rs (2)
cli/src/command/core.rs (2)
read_archive_source(1516-1532)read_paths(1438-1442)lib/src/entry.rs (1)
entries(430-444)
cli/src/command/core.rs (4)
lib/src/entry/header.rs (1)
path(90-92)cli/src/command/core/path_filter.rs (1)
new(12-20)cli/src/command/core/path.rs (1)
new(28-38)lib/src/entry/meta.rs (2)
uname(210-212)gname(240-242)
🔇 Additional comments (3)
cli/src/command/stdio.rs (1)
619-634: Good: parse@archivesources after-Cso relative archive paths behave like other args.
This matches the intended CLI semantics and makes ordering predictable.cli/tests/cli/stdio/archive_inclusion.rs (1)
489-538: Nice: explicit order assertion for@archive-first behavior.
This directly validates the PR’s main user-visible promise.cli/src/command/create.rs (1)
457-460: Good: normalize collected filesystem entries intoCollectedItem::Filesystem.
Keeps create’s existing path-collection behavior while allowing stdio to injectArchiveMarkers elsewhere.
4d732d7 to
ace7fb0
Compare
2f699d2 to
d830ba2
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cli/src/command/create.rs (1)
542-635: Entry order is not guaranteed due to mixing inline and spawned task completion in the channel.The code intends to preserve CLI argument order (documented in
core.rs), and teststdio_archive_inclusion_archive_first()validates this for the case where@archiveis placed before filesystem files. However, the current implementation has a race condition.When the loop encounters
CollectedItem::ArchiveMarker, it processes the archive synchronously and sends results to the channel immediately. When it encountersCollectedItem::Filesystem, it spawns a task that sends results asynchronously. A channel delivers messages in completion order, not spawn order—so if a spawned filesystem task completes late, but an inline archive marker result was sent early, the order is correct; but reversed input (file.txt@archive.pna``) would cause the archive entries to be sent before the filesystem entry completes, breaking CLI order.This scenario isn't covered by existing tests. The suggested fix (indexing and sorting by original position before consuming from the channel) correctly solves the problem and ensures order is preserved regardless of input sequence.
cli/src/command/append.rs (1)
477-526: Append ordering is lost due to mixed async/sync operations with rayon spawn_fifo and channel receive order.Filesystem entries are spawned asynchronously (lines 488-495), while archive markers are processed inline synchronously (lines 496-500). Since results are consumed in channel receive order (line 508+), not submission order, a fast inline archive marker can arrive at the channel before a slower spawned filesystem entry, reordering entries like
file1@archive`` in practice.
♻️ Duplicate comments (2)
cli/src/command/stdio.rs (1)
1002-1042: Reject@-when appending-to-stdout (stdin is already consumed by the base archive).
In thearchive_path == Nonebranch, stdin is used to read the input archive (Line 1030), andrun_append_archivecan later try to read stdin again via@-→ArchiveSource::Stdin. If there isn’t an earlier guard, this needs an explicit validation before reading stdin / before callingrun_append_archive.Proposed fix (guard before reading stdin / copying base archive)
} else { let target_items = collect_items_from_sources(sources, &collect_options, &mut resolver)?; + // stdin is used for the base archive; disallow using it again via `@-` + if target_items.iter().any(|it| matches!( + it, + crate::command::core::CollectedItem::ArchiveMarker(crate::command::core::ArchiveSource::Stdin) + )) { + anyhow::bail!("cannot use `@-` when appending to stdout (-f -): stdin is already used for the input archive"); + } let mut output_archive = Archive::write_header(io::stdout().lock())?; { let mut input_archive = Archive::read_header(io::stdin().lock())?; for entry in input_archive.raw_entries() { output_archive.add_entry(entry?)?; } } run_append_archive( &create_options, output_archive, target_items, &filter, &time_filters, ) }cli/src/command/core.rs (1)
1476-1534:@archivesplit/multipart support likely missing again (only first part is read).
read_archive_source(ArchiveSource::File)opens a single file (Line 1526-1528). If the archive is split,Archive::read_header(reader)won’t automatically open subsequent parts, so inclusion will be partial or error oncehas_next_archive()is hit.Sketch: reuse existing “across archive” traversal for file sources
pub(crate) fn read_archive_source( source: &ArchiveSource, create_options: &CreateOptions, filter: &PathFilter<'_>, time_filters: &TimeFilters, ) -> io::Result<Vec<io::Result<Option<NormalEntry>>>> { match source { ArchiveSource::File(path) => { - let reader = io::BufReader::with_capacity(64 * 1024, fs::File::open(path)?); - transform_archive_entries(reader, create_options, filter, time_filters) + let archives = collect_split_archives(path)?; + let readers = archives + .into_iter() + .map(|f| io::BufReader::with_capacity(64 * 1024, f)); + let mut results = Vec::new(); + run_across_archive(readers, |archive| { + for entry_result in archive.entries().extract_solid_entries(None) { + match entry_result { + Ok(entry) => { + if filter.excluded(entry.header().path()) { + continue; + } + let ctime = entry.metadata().created_time(); + let mtime = entry.metadata().modified_time(); + if !time_filters.matches_or_inactive(ctime, mtime) { + continue; + } + results.push(transform_normal_entry(entry, create_options)); + } + Err(e) => results.push(Err(e)), + } + } + Ok(()) + })?; + Ok(results) } ArchiveSource::Stdin => { let reader = io::BufReader::new(io::stdin().lock()); transform_archive_entries(reader, create_options, filter, time_filters) } } }
🧹 Nitpick comments (6)
cli/src/command/stdio.rs (1)
10-16: Dropcollect_items_from_pathsfrom imports if it’s no longer needed in this module.
Right nowItemSource/collect_items_from_sourcesis introduced, butcollect_items_from_pathsis still imported (Line 14). If this module no longer needs it (outsiderun_update), consider tightening imports.cli/src/command/create.rs (1)
637-720: Non-solid split path:flat_map+vec![..]allocations are avoidable.
This isn’t a blocker, butflat_mapcurrently allocates aVecper item (Line 706-712). Consider mapping to an iterator without intermediateVecs if this shows up in profiles.cli/src/command/core.rs (1)
466-476: Consider boxingEntryResult::Batchto reduce enum size skew.
Tooling already flags “large size difference between variants”;Batch(Box<...>)keeps the channel payload smaller/copy-friendlier.cli/tests/cli/stdio/archive_inclusion.rs (3)
9-36: Helpers look fine; consider usingWriteOptions::store()for more deterministic test archives.
Not required, but it can make tests less sensitive to default option changes.
489-539: Add an ordering test forfilethen@archive(reverse of archive-first).
You currently assert ordering when@archivecomes first (Line 489+). Given the implementation uses concurrency + a channel, the “file first” case is the one most likely to reorder.
289-310: Add a test for append-to-stdout (-f -) combined with@-in append mode.
Create-mode has the stdin/stdout conflict test (Line 289+). Append-to-stdout has the same stdin double-use hazard and should be explicitly covered.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
cli/src/command/append.rscli/src/command/core.rscli/src/command/create.rscli/src/command/stdio.rscli/src/command/update.rscli/tests/cli/stdio.rscli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- cli/src/command/update.rs
- cli/tests/cli/stdio.rs
🧰 Additional context used
🧬 Code graph analysis (3)
cli/tests/cli/stdio/archive_inclusion.rs (5)
cli/tests/cli/utils/archive.rs (1)
for_each_entry(122-127)lib/src/entry/header.rs (1)
path(90-92)lib/src/archive/write.rs (1)
write_header(85-88)lib/src/entry/builder.rs (1)
new_file(199-218)lib/src/chunk/types.rs (1)
len(153-155)
cli/src/command/create.rs (1)
cli/src/command/core.rs (3)
collect_items_from_paths(524-538)create_entry(768-809)read_archive_source(1518-1534)
cli/src/command/append.rs (2)
cli/src/command/core.rs (4)
collect_items_from_paths(524-538)read_archive_source(1518-1534)read_paths(1440-1444)read_paths_stdin(1447-1449)cli/src/command/create.rs (1)
collect_items_from_paths(457-460)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: Run cargo fuzz (split_archive)
- GitHub Check: tier1 (macos-latest, stable)
- GitHub Check: tier1 (windows-11-arm, beta)
- GitHub Check: tier1 (windows-11-arm, stable)
- GitHub Check: tier1 (windows-latest, nightly)
- GitHub Check: tier1 (macos-latest, nightly)
- GitHub Check: tier1 (windows-latest, stable)
- GitHub Check: tier1 (windows-latest, beta)
- GitHub Check: tier1 (windows-11-arm, nightly)
- GitHub Check: tier1 (ubuntu-22.04-arm, stable)
- GitHub Check: tier1 (ubuntu-22.04-arm, nightly)
- GitHub Check: tier1 (ubuntu-22.04-arm, beta)
- GitHub Check: Test WebAssembly (beta, wasm32-unknown-emscripten)
- GitHub Check: Test on OpenBSD
- GitHub Check: Test on OmniOS
- GitHub Check: tier1 (ubuntu-latest, beta)
- GitHub Check: tier1 (windows-latest, stable)
🔇 Additional comments (4)
cli/src/command/stdio.rs (1)
616-634: Good:@archiveparsing happens after-C, and filters/time-filters are plumbed through.
The “parse after changing directory” comment +ItemSource::parse_many(&files)(Line 619-620) matches the intended semantics, and passing&filter/&time_filtersintocreate_archive_filealigns archive-inclusions with filesystem filtering.Also applies to: 693-709
cli/src/command/create.rs (1)
457-460: Nice simplification: wrapping collected filesystem entries intoCollectedItem::Filesystem.
Keeps create-side “no @” behavior unchanged while enabling reuse of the new archive-marker pipeline.cli/src/command/append.rs (1)
463-474: Good: filter/time-filters are plumbed into append archive entry inclusion.
Wiring&filterand&time_filtersthrough torun_append_archiveis the right shape for consistent behavior across filesystem and included archives.cli/src/command/core.rs (1)
424-511: LGTM:ItemSource+collect_items_from_sourcespreserve CLI argument ordering.
The model cleanly separates “expand filesystem” vs “insert archive marker” while keeping the input sequence intact.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
cli/src/command/core.rs (1)
1477-1537:@archivesplit-archive handling still looks incomplete (single-reader).
transform_archive_entriesusesArchive::read_header(reader)and never advances viaread_next_archive, andread_archive_sourceopens only one file. If@archiveshould support multipart archives like other commands, this will likely only process the first part.cli/src/command/stdio.rs (1)
1027-1042: Append-to-stdout path can double-read stdin when using@-.
Whenarchive_pathisNone, stdin is consumed to copy the input archive (Line 1030+), andrun_append_archivemay later try to read stdin again forCollectedItem::ArchiveMarker(ArchiveSource::Stdin). Add an upfront guard that rejects@-in this mode.
🧹 Nitpick comments (6)
cli/src/command/core.rs (2)
424-452:ItemSource::parsesemantics are clear; add explicit escape docs for literal leading@.
Docs mention./@file(good). Consider also documenting Windows-friendly equivalents (e.g.,.\\@file) since@-leading filenames aren’t rare.
1539-1585: Ownership override application is correct but could avoid repeated cloning.
uname.clone()/gname.clone()per entry may be a noticeable cost for large includes; consider borrowing or precomputing the fallback strings when overrides are present.cli/tests/cli/stdio/archive_inclusion.rs (1)
289-310: Add a sibling test for append-to-stdout +@-conflict.
You already cover create-to-stdout conflict; add an append equivalent to prevent regressions of the stdin double-consumption path.cli/src/command/create.rs (2)
581-586:ArchiveMarkerwork should be offloaded from the scope thread.
Right now it runs inline, reducing concurrency and making reordering more likely; spawning it (as in the diff above) keeps behavior uniform.Also applies to: 674-679
706-718: Non-solid split path allocates per message (vec![...]) and can be heavy.
If you keepEntryResult::Batch(Vec<...>), consider iterating batches withoutvec!wrapping (e.g., via small iterator adapters) to reduce transient allocations.cli/src/command/append.rs (1)
496-521: Batching entire included archives into memory can be expensive.
read_archive_sourcereturningVec<...>means large@archiveincludes may spike RAM. A streaming approach (send entries incrementally, still ordered via indices) would scale better.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
cli/src/command/append.rscli/src/command/core.rscli/src/command/create.rscli/src/command/stdio.rscli/src/command/update.rscli/tests/cli/stdio.rscli/tests/cli/stdio/archive_inclusion.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- cli/src/command/update.rs
- cli/tests/cli/stdio.rs
🧰 Additional context used
🧬 Code graph analysis (4)
cli/src/command/stdio.rs (1)
cli/src/command/core.rs (3)
collect_items_from_sources(492-512)collect_split_archives(714-731)parse_many(449-451)
cli/tests/cli/stdio/archive_inclusion.rs (5)
cli/tests/cli/utils/archive.rs (1)
for_each_entry(122-127)lib/src/entry/header.rs (1)
path(90-92)lib/src/archive/write.rs (1)
write_header(85-88)lib/src/entry/builder.rs (1)
new_file(199-218)lib/src/chunk/types.rs (1)
len(153-155)
cli/src/command/append.rs (1)
cli/src/command/core.rs (4)
collect_items_from_paths(525-539)create_entry(769-810)entry_option(812-832)read_archive_source(1519-1537)
cli/src/command/core.rs (5)
lib/src/entry/header.rs (1)
path(90-92)cli/src/command/core/re/gnu.rs (2)
new(39-93)new(151-153)lib/src/entry.rs (1)
reader(1013-1029)lib/src/entry/meta.rs (2)
uname(210-212)gname(240-242)cli/src/command/core/time_filter.rs (1)
matches(74-80)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: tier1 (macos-latest, beta)
- GitHub Check: tier1 (windows-latest, beta)
- GitHub Check: tier1 (windows-11-arm, beta)
- GitHub Check: tier1 (ubuntu-22.04-arm, stable)
- GitHub Check: tier1 (ubuntu-22.04-arm, nightly)
- GitHub Check: Run rust-clippy analyzing
- GitHub Check: tier1 (macos-latest, nightly)
- GitHub Check: tier1 (windows-latest, stable)
- GitHub Check: tier1 (macos-latest, stable)
- GitHub Check: tier1 (ubuntu-22.04-arm, stable)
- GitHub Check: tier1 (windows-latest, beta)
- GitHub Check: tier1 (windows-latest, nightly)
- GitHub Check: tier1 (ubuntu-22.04-arm, beta)
🔇 Additional comments (9)
cli/src/command/core.rs (5)
380-477: New CLI-source abstractions look coherent and well-scoped.
ArchiveSource/ItemSource/CollectedItem/EntryResultseparation matches the “preserve CLI ordering” goal and keeps filesystem collection concerns isolated.
492-512:collect_items_from_sourcespreserves argument order, but downstream must not reorder results.
This function emits markers in-order; ensure the consumer (create/append) doesn’t reintroduce reordering via parallelism.
769-810:create_entrysignature migration toCollectedEntrylooks correct.
Destructuring + metadata reuse stays straightforward.
675-706: Time filtering on filesystem collection is applied at the right point.
Filtering after metadata acquisition but before push avoids extra work and keeps behavior consistent.
1718-1783: Parsing tests cover the key edge cases.
Nice coverage for@,@-,@path, and./@literal.cli/src/command/stdio.rs (1)
619-634: Parsing sources after-Cis the right behavior.
This makes@archivepaths consistently relative to the adjusted working directory.Also applies to: 1002-1019
cli/tests/cli/stdio/archive_inclusion.rs (1)
1-743: Great breadth of coverage for inclusion, ordering, filters, and solid mode.
This is the right level of end-to-end validation for a CLI ordering guarantee change.cli/src/command/create.rs (1)
457-460: Wrapping collected filesystem entries asCollectedItem::Filesystemis fine (no behavior change).
This prepares the pipeline for mixed sources cleanly.cli/src/command/append.rs (1)
463-474: Signature change to passfilter/time_filtersthrough is consistent.
Good plumbing to make inclusion honor the same filtering semantics.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
6162aa1 to
c366417
Compare
Redesign @archive inclusion to preserve CLI argument order exactly. The previous implementation separated filesystem paths and @Archives before processing, which lost the original ordering. Key changes: - Add ItemSource, ArchiveSource, CollectedItem types to core.rs - Add collect_items_from_sources for unified processing - Process arguments one by one, maintaining order - Support @- for reading archives from stdin - Detect stdin conflicts (input archive from stdin vs @-)
c366417 to
392c736
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cli/src/command/create.rs`:
- Around line 9-16: The import list in create.rs is not formatted according to
rustfmt (CI failing); run rustfmt (cargo fmt) and reformat the file,
specifically realign and wrap the use/import block that contains symbols like
AclStrategy, CollectOptions, CollectedItem, CreateOptions, EntryResult,
FflagsStrategy, KeepOptions, MIN_SPLIT_PART_BYTES, MacMetadataStrategy,
PathFilter, PathTransformers, PathnameEditor, PermissionStrategyResolver,
TimeFilterResolver, TimeFilters, TimestampStrategyResolver, XattrStrategy,
collect_items_from_paths, drain_entry_results, entry_option,
re::{bsd::SubstitutionRule, gnu::TransformRule}, read_paths, read_paths_stdin,
spawn_entry_results, write_split_archive so the imports conform to rustfmt rules
(line breaks/comma placement) and then re-run cargo fmt --check to ensure CI
passes.
♻️ Duplicate comments (3)
cli/src/command/core.rs (2)
528-565: Ordering can be lost with parallel sends.
spawn_entry_resultssends filesystem entries from Rayon tasks and archive batches inline; the receiver processes completion order, not CLI order. This can reorder entries (and hardlink targets) despite the PR’s ordering guarantee. Please add indexed ordering (or a sequential path) before draining.💡 Possible fix (index + reorder)
+use std::collections::BTreeMap; -use std::sync::mpsc::Receiver; +use std::sync::mpsc::Receiver; -pub(crate) fn spawn_entry_results(...) -> Receiver<EntryResult> { - let (tx, rx) = std::sync::mpsc::channel(); +pub(crate) fn spawn_entry_results(...) -> Receiver<(usize, EntryResult)> { + let (tx, rx) = std::sync::mpsc::channel::<(usize, EntryResult)>(); rayon::scope_fifo(|s| { - for item in target_items { + for (idx, item) in target_items.into_iter().enumerate() { match item { CollectedItem::Filesystem(entry) => { let tx = tx.clone(); s.spawn_fifo(move |_| { log::debug!("Adding: {}", entry.path.display()); - tx.send(EntryResult::Single(create_entry(&entry, create_options))) + tx.send((idx, EntryResult::Single(create_entry(&entry, create_options)))) .unwrap_or_else(|e| log::error!("{e}: {}", entry.path.display())); }) } CollectedItem::ArchiveMarker(source) => { let result = read_archive_source( &source, create_options, filter, time_filters, password, ); - tx.send(EntryResult::Batch(result)) + tx.send((idx, EntryResult::Batch(result))) .unwrap_or_else(|e| log::error!("{e}: archive source {}", source)); } } } drop(tx); }); rx } -pub(crate) fn drain_entry_results<I, F, T>(results: I, mut add_entry: F) -> io::Result<()> +pub(crate) fn drain_entry_results<I, F, T>(results: I, mut add_entry: F) -> io::Result<()> where - I: IntoIterator<Item = EntryResult>, + I: IntoIterator<Item = (usize, EntryResult)>, F: FnMut(NormalEntry) -> io::Result<T>, { - for result in results { + let mut next = 0usize; + let mut pending = BTreeMap::<usize, EntryResult>::new(); + for (idx, result) in results { + pending.insert(idx, result); + while let Some(result) = pending.remove(&next) { + next += 1; match result { EntryResult::Single(entry) => { if let Some(entry) = entry? { add_entry(entry)?; } } EntryResult::Batch(entries) => { for entry in entries? { if let Some(entry) = entry? { add_entry(entry)?; } } } } + } } Ok(()) }
1565-1629:@archiveignores split archive parts.
read_archive_sourceopens only the first file;transform_archive_entriesdoesn’t advance to subsequent parts. For split archives,@archive.part1.pnawill only process part 1. Other commands usecollect_split_archives; please align here or explicitly document the limitation.💡 Possible fix (collect split parts)
- ArchiveSource::File(path) => { - let file = fs::File::open(path) - .map_err(|e| io::Error::new(e.kind(), format!("{}: {}", path.display(), e)))?; - let reader = io::BufReader::with_capacity(64 * 1024, file); - transform_archive_entries(reader, create_options, filter, time_filters, password) - .map_err(|e| io::Error::new(e.kind(), format!("{}: {}", path.display(), e))) - } + ArchiveSource::File(path) => { + let archives = collect_split_archives(path)?; + let mut results = Vec::new(); + run_across_archive( + archives.into_iter().map(|f| io::BufReader::with_capacity(64 * 1024, f)), + |archive| { + for entry_result in archive.entries().extract_solid_entries(password) { + match entry_result { + Ok(entry) => { + if !filter.excluded(entry.header().path()) { + let ctime = entry.metadata().created_time(); + let mtime = entry.metadata().modified_time(); + if time_filters.matches_or_inactive(ctime, mtime) { + results.push(transform_normal_entry(entry, create_options)); + } + } + } + Err(e) => results.push(Err(e)), + } + } + Ok(()) + }, + )?; + Ok(results) + }cli/src/command/stdio.rs (1)
1032-1047: Guard against@-when appending to stdout.In the
archive_path == Nonebranch, stdin is consumed to read the existing archive, and@-would attempt to read stdin again. Add a guard to rejectArchiveSource::Stdinin this branch.🛠️ Suggested guard
- } else { - let target_items = collect_items_from_sources(sources, &collect_options, &mut resolver)?; + } else { + let target_items = collect_items_from_sources(sources, &collect_options, &mut resolver)?; + if target_items.iter().any(|item| { + matches!(item, CollectedItem::ArchiveMarker(ArchiveSource::Stdin)) + }) { + anyhow::bail!("cannot use `@-` when appending to stdout (stdin is already in use)"); + } let mut output_archive = Archive::write_header(io::stdout().lock())?; { let mut input_archive = Archive::read_header(io::stdin().lock())?; for entry in input_archive.raw_entries() { output_archive.add_entry(entry?)?; } } run_append_archive(You’ll also need to import
CollectedItemandArchiveSourcein this module.
392c736 to
0b3b912
Compare
Redesign @archive inclusion to preserve CLI argument order exactly. The previous implementation separated filesystem paths and @Archives before processing, which lost the original ordering.
Key changes:
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.