🐛 Preserve Windows broken symlink metadata#2819
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughReplaces path-based Windows metadata/stat logic with handle-based APIs, adds no-follow ACL retrieval and SecurityDescriptor::try_from_handle, changes HardlinkResolver to honor follow_command_links, adds symlink-focused integration tests, and normalizes Windows symlink paths in pna. Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as Core CLI
participant FS as Windows FS (open_read_metadata, file_information, mode_from_file_information)
participant SEC as SecurityDescriptor (try_from_handle / try_from)
participant ACL as ACL helpers (get_facl / get_facl_nofollow)
CLI->>FS: open_read_metadata(path, follow_symlink)
FS-->>CLI: FileHandle
CLI->>FS: file_information(handle)
FS-->>CLI: BY_HANDLE_FILE_INFORMATION
CLI->>FS: mode_from_file_information(path, info, is_symlink)
FS-->>CLI: file mode
alt is symlink
CLI->>ACL: get_facl_nofollow(path)
ACL->>FS: open_read_metadata(path, follow_symlink=false)
FS-->>ACL: FileHandle
ACL->>SEC: try_from_handle(handle, path)
SEC-->>ACL: SecurityDescriptor
else not symlink
CLI->>ACL: get_facl(path)
ACL->>SEC: try_from(path)
SEC-->>ACL: SecurityDescriptor
end
ACL-->>CLI: chunk::Acl
CLI->>CLI: apply_metadata(mode, ACL)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a long-standing issue on Windows where metadata, particularly ACLs and permissions, for symbolic links (especially broken ones) was not being correctly preserved. The changes introduce a more robust mechanism for interacting with Windows file system objects at a lower level, allowing the tool to read and apply metadata directly to symlinks without attempting to dereference them. This significantly improves the reliability of archiving and extracting files that include symbolic links on Windows systems. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request correctly addresses an issue with preserving metadata for broken symbolic links on Windows by replacing CRT functions with native Windows APIs. The changes are well-structured, introducing helpers like FileHandle for resource management and specific functions to handle symlinks without following them. The addition of comprehensive tests for broken symlinks and symlink-following options is also a great improvement. I have a couple of minor suggestions regarding the use of recently stabilized Rust APIs, which might affect compatibility with older toolchains. Overall, this is a high-quality contribution.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
cli/tests/cli/stdio/symlink.rs (2)
49-58: Prevent silent overwrite of duplicate archive paths in test indexing.At Line 57,
HashMap::insertwill replace an existing entry with the same path. If duplicates slip into an archive, this helper can mask it and make assertions misleading.Proposed tightening
fn archive_entries(path: &Path) -> HashMap<String, (DataKind, Option<String>)> { let mut entries = HashMap::new(); for_each_entry(path, |entry| { let kind = entry.header().data_kind(); let link_target = match kind { DataKind::SymbolicLink => Some(normalize_link_target(&read_symlink_target(&entry))), _ => None, }; - entries.insert(entry.header().path().to_string(), (kind, link_target)); + let prev = entries.insert(entry.header().path().to_string(), (kind, link_target)); + assert!(prev.is_none(), "duplicate archive path detected in test fixture"); }) .unwrap(); entries }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/tests/cli/stdio/symlink.rs` around lines 49 - 58, The helper archive_entries uses HashMap::insert which silently overwrites duplicate paths; change it to detect duplicates and fail loudly: in archive_entries (inside the for_each_entry closure) check whether entries already contains the key (use entries.entry(entry.header().path().to_string()) or entries.contains_key) before inserting, and if present panic/unwrap with a clear message including the duplicate path (and optionally the previous and new DataKind/link_target) so test indexing cannot silently mask duplicate archive entries.
169-307: Consider extracting shared CLI invocation helpers in this test module.The create/extract command scaffolding is repeated several times (e.g., Line 169–184, Line 187–201, Line 209–225, Line 228–242, Line 294–308). Pulling that into tiny helpers will make future option changes less error-prone.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/tests/cli/stdio/symlink.rs` around lines 169 - 307, The test repeats identical CLI invocations for creating and extracting archives; add small helper functions (e.g., create_archive and extract_archive) that wrap cargo_bin_cmd!("pna").args(...).assert().success() and accept the varying parameters (archive path, input dir, flags like "-H", list of paths, and out-dir) and then replace the repeated blocks (the sequences creating test1_archive/test2_archive/test3_archive and extracting to test1_out/test2_out/test3_out) with calls to these helpers; reference and update uses around cargo_bin_cmd!("pna") invocations, test*_archive/test*_out variables, and keep assertions (assert_symlink and archive_entries) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@cli/tests/cli/stdio/symlink.rs`:
- Around line 49-58: The helper archive_entries uses HashMap::insert which
silently overwrites duplicate paths; change it to detect duplicates and fail
loudly: in archive_entries (inside the for_each_entry closure) check whether
entries already contains the key (use
entries.entry(entry.header().path().to_string()) or entries.contains_key) before
inserting, and if present panic/unwrap with a clear message including the
duplicate path (and optionally the previous and new DataKind/link_target) so
test indexing cannot silently mask duplicate archive entries.
- Around line 169-307: The test repeats identical CLI invocations for creating
and extracting archives; add small helper functions (e.g., create_archive and
extract_archive) that wrap cargo_bin_cmd!("pna").args(...).assert().success()
and accept the varying parameters (archive path, input dir, flags like "-H",
list of paths, and out-dir) and then replace the repeated blocks (the sequences
creating test1_archive/test2_archive/test3_archive and extracting to
test1_out/test2_out/test3_out) with calls to these helpers; reference and update
uses around cargo_bin_cmd!("pna") invocations, test*_archive/test*_out
variables, and keep assertions (assert_symlink and archive_entries) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 12b45c80-cf02-49d1-9b68-c6605da1ebe4
📒 Files selected for processing (2)
cli/tests/cli/stdio/symlink.rspna/src/fs.rs
c86b581 to
9767939
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
cli/tests/cli/stdio/symlink.rs (1)
26-47: Consider cleaning up test directories.The test setup functions remove and recreate directories but don't clean up after tests complete. Consider using a test fixture pattern or adding cleanup in test teardown to avoid leaving test artifacts.
💡 Optional: Use tempdir for automatic cleanup
+use tempfile::tempdir; + #[test] fn stdio_broken_symlink_no_follow_roundtrip() { setup(); - let base = PathBuf::from("stdio_broken_symlink_no_follow"); + let base = tempdir().unwrap(); + let base = base.path(); let source = base.join("source");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/tests/cli/stdio/symlink.rs` around lines 26 - 47, The test helper functions init_broken_resource and init_h_upper_resource create directories but never remove them; change them to use temporary directories (e.g., return or accept a tempfile::TempDir) or ensure each test registers teardown cleanup (remove_dir_all) after assertions so artifacts are not left on disk; specifically update these functions to either return a TempDir handle to let the caller drop it or add a companion cleanup step that removes the created dir (use the same Path passed into init_broken_resource/init_h_upper_resource) to guarantee automatic deletion after tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@cli/tests/cli/stdio/symlink.rs`:
- Around line 26-47: The test helper functions init_broken_resource and
init_h_upper_resource create directories but never remove them; change them to
use temporary directories (e.g., return or accept a tempfile::TempDir) or ensure
each test registers teardown cleanup (remove_dir_all) after assertions so
artifacts are not left on disk; specifically update these functions to either
return a TempDir handle to let the caller drop it or add a companion cleanup
step that removes the created dir (use the same Path passed into
init_broken_resource/init_h_upper_resource) to guarantee automatic deletion
after tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 06ff9190-ef36-4b77-b289-7fa74f110624
📒 Files selected for processing (11)
cli/src/command/append.rscli/src/command/core.rscli/src/command/create.rscli/src/command/stdio.rscli/src/command/update.rscli/src/utils/os/windows/acl.rscli/src/utils/os/windows/fs.rscli/src/utils/os/windows/security.rscli/tests/cli/stdio.rscli/tests/cli/stdio/symlink.rspna/src/fs.rs
🚧 Files skipped from review as they are similar to previous changes (3)
- cli/src/command/update.rs
- cli/src/command/append.rs
- cli/src/utils/os/windows/acl.rs
87d6551 to
c1b323d
Compare
Use handle-based Windows security APIs (GetSecurityInfo/SetSecurityInfo) with FILE_FLAG_OPEN_REPARSE_POINT to read and write metadata on symlinks without following them. This fixes broken symlinks losing their mode, ownership, and ACL information during archival. Changes: - Add FileHandle RAII wrapper and open_path with follow_symlink control - Add SecurityDescriptor::try_from_handle and apply_by_handle - Switch lchown and chmod to handle-based APIs for symlink correctness - Add set_facl_nofollow for symlink-aware ACL writes - Extract build_acl_buffer to share between set_d_acl and set_d_acl_by_handle - Use nofollow metadata collection in core.rs for symlink entries
1081cd2 to
9cad013
Compare
Remove path field from SecurityDescriptor and eliminate all path-based Win32 APIs (GetNamedSecurityInfoW/SetNamedSecurityInfoW). All security operations now go through handle-based APIs, making follow_links control purely a handle-opening concern. - Fix UB in build_acl_buffer by setting Vec length after InitializeAcl - Remove SecurityDescriptor::path field and try_from(path) constructor - Extract apply_security_info helper to eliminate duplication - Consolidate ACL constructors from 3 to 1 (try_from_handle) - Unify get_facl/set_facl with follow_links parameter across platforms - Wire set_facl nofollow into extraction path for symlink ACL restoration - Remove inline #[cfg] from core.rs by using unified ACL API - Add Debug derive to FileHandle - Add broken symlink permission preservation test
9cad013 to
d6d7511
Compare
Summary by CodeRabbit