Skip to content

🐛 Preserve Windows broken symlink metadata#2819

Merged
ChanTsune merged 2 commits into
mainfrom
codex/windows-broken-symlink-metadata
Apr 8, 2026
Merged

🐛 Preserve Windows broken symlink metadata#2819
ChanTsune merged 2 commits into
mainfrom
codex/windows-broken-symlink-metadata

Conversation

@ChanTsune
Copy link
Copy Markdown
Owner

@ChanTsune ChanTsune commented Mar 13, 2026

Summary by CodeRabbit

  • Improvements
    • Windows: more robust metadata and ACL handling that uses handle-based reads and differentiates symlinks vs targets.
    • Windows: normalized symlink path separators for more reliable create/restore behavior.
    • Link-following: honoring both general and command-line follow options across create/append/update flows.
  • Tests
    • Added comprehensive symlink integration tests covering archive creation, extraction, targets, and permission preservation.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Replaces path-based Windows metadata/stat logic with handle-based APIs, adds no-follow ACL retrieval and SecurityDescriptor::try_from_handle, changes HardlinkResolver to honor follow_command_links, adds symlink-focused integration tests, and normalizes Windows symlink paths in pna.

Changes

Cohort / File(s) Summary
Core apply_metadata
cli/src/command/core.rs
Switches Windows metadata retrieval to use open_read_metadata handle, file_information, and mode_from_file_information; chooses no-follow vs follow ACL retrieval based on symlink status.
Windows ACL helpers
cli/src/utils/os/windows/acl.rs
Adds get_facl_nofollow and ACL::try_from_nofollow that build ACLs from handles opened without following symlinks; uses handle-based SecurityDescriptor construction.
Windows FS utilities
cli/src/utils/os/windows/fs.rs
Adds FileHandle, open_path/open_read_metadata, file_information, mode_from_file_information, symlink-aware chmod, constants for mode bits; removes legacy stat.
Windows security descriptor
cli/src/utils/os/windows/security.rs
Adds SecurityDescriptor::try_from_handle(handle, path) which obtains owner/group/DACL via GetSecurityInfo on a handle.
Hardlink resolver usage
cli/src/command/...
cli/src/command/append.rs, cli/src/command/create.rs, cli/src/command/stdio.rs, cli/src/command/update.rs
Constructs HardlinkResolver with `(follow_links
Symlink tests
cli/tests/cli/stdio.rs, cli/tests/cli/stdio/symlink.rs
Adds symlink test module and comprehensive tests for broken-symlink round-trips and -H behavior with helpers to inspect archived entries and permissions.
PNA symlink normalization
pna/src/fs.rs
Normalizes Windows path separators for original and link targets, resolves relative targets against normalized parent, and adjusts symlink creation calls for Windows.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as Core CLI
    participant FS as Windows FS (open_read_metadata, file_information, mode_from_file_information)
    participant SEC as SecurityDescriptor (try_from_handle / try_from)
    participant ACL as ACL helpers (get_facl / get_facl_nofollow)

    CLI->>FS: open_read_metadata(path, follow_symlink)
    FS-->>CLI: FileHandle
    CLI->>FS: file_information(handle)
    FS-->>CLI: BY_HANDLE_FILE_INFORMATION
    CLI->>FS: mode_from_file_information(path, info, is_symlink)
    FS-->>CLI: file mode

    alt is symlink
        CLI->>ACL: get_facl_nofollow(path)
        ACL->>FS: open_read_metadata(path, follow_symlink=false)
        FS-->>ACL: FileHandle
        ACL->>SEC: try_from_handle(handle, path)
        SEC-->>ACL: SecurityDescriptor
    else not symlink
        CLI->>ACL: get_facl(path)
        ACL->>SEC: try_from(path)
        SEC-->>ACL: SecurityDescriptor
    end

    ACL-->>CLI: chunk::Acl
    CLI->>CLI: apply_metadata(mode, ACL)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I nibbled handles, chased each link so spry,
I fetched modes from handles, not the sky,
No-follow ACLs hop into place,
Tests tiptoe through symlinks with soft grace,
Hoppy bytes and tidy traces—hooray! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: preserving Windows broken symlink metadata, which aligns with the PR's core modifications to handle-based metadata retrieval and symlink-aware ACL operations.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/windows-broken-symlink-metadata

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a long-standing issue on Windows where metadata, particularly ACLs and permissions, for symbolic links (especially broken ones) was not being correctly preserved. The changes introduce a more robust mechanism for interacting with Windows file system objects at a lower level, allowing the tool to read and apply metadata directly to symlinks without attempting to dereference them. This significantly improves the reliability of archiving and extracting files that include symbolic links on Windows systems.

Highlights

  • Windows Symlink Metadata Preservation: The core logic for applying metadata on Windows has been refactored to correctly handle symbolic links, especially broken ones. This ensures that owner, group, and file mode information can be retrieved and applied directly to the symlink itself, rather than its target.
  • ACL Handling for Symlinks: New functionality has been introduced to retrieve Access Control Lists (ACLs) for symbolic links on Windows without following them. This prevents errors when dealing with symlinks pointing to non-existent targets and ensures the ACLs of the symlink itself are preserved.
  • Windows File System Utilities: Several new low-level Windows file system utilities have been added, including functions to open file handles with specific access rights and symlink following behavior, retrieve detailed file information by handle, and derive file modes from this information. The chmod function was also updated to correctly modify attributes of symlinks directly.
  • New Symlink Tests: A new test module and associated tests have been added to verify the correct behavior of symlink metadata preservation, broken symlink handling, and the -H option (which controls symlink following) during archive creation and extraction on Windows.
Changelog
  • cli/src/command/core.rs
    • Updated Windows metadata application logic to use handle-based security descriptor retrieval.
    • Modified ACL retrieval for Windows to explicitly handle symlinks using a new no-follow function.
  • cli/src/utils/os/windows/acl.rs
    • Implemented get_facl_nofollow to retrieve ACLs of symlinks without dereferencing them.
    • Added ACL::try_from_nofollow to create an ACL object from a symlink's security descriptor.
  • cli/src/utils/os/windows/fs.rs
    • Introduced FileHandle struct for managing Windows file handles.
    • Added open_path and open_read_metadata functions for opening files/symlinks with specific access and symlink following behavior.
    • Implemented file_information to retrieve BY_HANDLE_FILE_INFORMATION for a given handle.
    • Added mode_from_file_information to derive file mode from Windows file information.
    • Modified chmod to correctly apply permissions to symlinks on Windows by directly manipulating their attributes.
    • Removed the deprecated stat function.
  • cli/src/utils/os/windows/security.rs
    • Added try_from_handle method to SecurityDescriptor to retrieve security information using a file handle.
  • cli/tests/cli/stdio.rs
    • Included a new test module symlink for symlink-related tests.
  • cli/tests/cli/stdio/symlink.rs
    • Added stdio_broken_symlink_no_follow_roundtrip test to verify broken symlink metadata preservation.
    • Added stdio_option_h_follows_only_command_line_symlinks test to confirm correct symlink following behavior with the -H option.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly addresses an issue with preserving metadata for broken symbolic links on Windows by replacing CRT functions with native Windows APIs. The changes are well-structured, introducing helpers like FileHandle for resource management and specific functions to handle symlinks without following them. The addition of comprehensive tests for broken symlinks and symlink-following options is also a great improvement. I have a couple of minor suggestions regarding the use of recently stabilized Rust APIs, which might affect compatibility with older toolchains. Overall, this is a high-quality contribution.

Comment thread cli/src/utils/os/windows/fs.rs
Comment thread cli/src/utils/os/windows/fs.rs Outdated
@github-actions github-actions Bot added the cli This issue is about cli application label Mar 13, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
cli/tests/cli/stdio/symlink.rs (2)

49-58: Prevent silent overwrite of duplicate archive paths in test indexing.

At Line 57, HashMap::insert will replace an existing entry with the same path. If duplicates slip into an archive, this helper can mask it and make assertions misleading.

Proposed tightening
 fn archive_entries(path: &Path) -> HashMap<String, (DataKind, Option<String>)> {
     let mut entries = HashMap::new();
     for_each_entry(path, |entry| {
         let kind = entry.header().data_kind();
         let link_target = match kind {
             DataKind::SymbolicLink => Some(normalize_link_target(&read_symlink_target(&entry))),
             _ => None,
         };
-        entries.insert(entry.header().path().to_string(), (kind, link_target));
+        let prev = entries.insert(entry.header().path().to_string(), (kind, link_target));
+        assert!(prev.is_none(), "duplicate archive path detected in test fixture");
     })
     .unwrap();
     entries
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cli/tests/cli/stdio/symlink.rs` around lines 49 - 58, The helper
archive_entries uses HashMap::insert which silently overwrites duplicate paths;
change it to detect duplicates and fail loudly: in archive_entries (inside the
for_each_entry closure) check whether entries already contains the key (use
entries.entry(entry.header().path().to_string()) or entries.contains_key) before
inserting, and if present panic/unwrap with a clear message including the
duplicate path (and optionally the previous and new DataKind/link_target) so
test indexing cannot silently mask duplicate archive entries.

169-307: Consider extracting shared CLI invocation helpers in this test module.

The create/extract command scaffolding is repeated several times (e.g., Line 169–184, Line 187–201, Line 209–225, Line 228–242, Line 294–308). Pulling that into tiny helpers will make future option changes less error-prone.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cli/tests/cli/stdio/symlink.rs` around lines 169 - 307, The test repeats
identical CLI invocations for creating and extracting archives; add small helper
functions (e.g., create_archive and extract_archive) that wrap
cargo_bin_cmd!("pna").args(...).assert().success() and accept the varying
parameters (archive path, input dir, flags like "-H", list of paths, and
out-dir) and then replace the repeated blocks (the sequences creating
test1_archive/test2_archive/test3_archive and extracting to
test1_out/test2_out/test3_out) with calls to these helpers; reference and update
uses around cargo_bin_cmd!("pna") invocations, test*_archive/test*_out
variables, and keep assertions (assert_symlink and archive_entries) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cli/tests/cli/stdio/symlink.rs`:
- Around line 49-58: The helper archive_entries uses HashMap::insert which
silently overwrites duplicate paths; change it to detect duplicates and fail
loudly: in archive_entries (inside the for_each_entry closure) check whether
entries already contains the key (use
entries.entry(entry.header().path().to_string()) or entries.contains_key) before
inserting, and if present panic/unwrap with a clear message including the
duplicate path (and optionally the previous and new DataKind/link_target) so
test indexing cannot silently mask duplicate archive entries.
- Around line 169-307: The test repeats identical CLI invocations for creating
and extracting archives; add small helper functions (e.g., create_archive and
extract_archive) that wrap cargo_bin_cmd!("pna").args(...).assert().success()
and accept the varying parameters (archive path, input dir, flags like "-H",
list of paths, and out-dir) and then replace the repeated blocks (the sequences
creating test1_archive/test2_archive/test3_archive and extracting to
test1_out/test2_out/test3_out) with calls to these helpers; reference and update
uses around cargo_bin_cmd!("pna") invocations, test*_archive/test*_out
variables, and keep assertions (assert_symlink and archive_entries) unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 12b45c80-cf02-49d1-9b68-c6605da1ebe4

📥 Commits

Reviewing files that changed from the base of the PR and between 1ffe4ab and c86b581.

📒 Files selected for processing (2)
  • cli/tests/cli/stdio/symlink.rs
  • pna/src/fs.rs

@ChanTsune ChanTsune force-pushed the codex/windows-broken-symlink-metadata branch from c86b581 to 9767939 Compare March 15, 2026 02:10
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
cli/tests/cli/stdio/symlink.rs (1)

26-47: Consider cleaning up test directories.

The test setup functions remove and recreate directories but don't clean up after tests complete. Consider using a test fixture pattern or adding cleanup in test teardown to avoid leaving test artifacts.

💡 Optional: Use tempdir for automatic cleanup
+use tempfile::tempdir;
+
 #[test]
 fn stdio_broken_symlink_no_follow_roundtrip() {
     setup();
 
-    let base = PathBuf::from("stdio_broken_symlink_no_follow");
+    let base = tempdir().unwrap();
+    let base = base.path();
     let source = base.join("source");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cli/tests/cli/stdio/symlink.rs` around lines 26 - 47, The test helper
functions init_broken_resource and init_h_upper_resource create directories but
never remove them; change them to use temporary directories (e.g., return or
accept a tempfile::TempDir) or ensure each test registers teardown cleanup
(remove_dir_all) after assertions so artifacts are not left on disk;
specifically update these functions to either return a TempDir handle to let the
caller drop it or add a companion cleanup step that removes the created dir (use
the same Path passed into init_broken_resource/init_h_upper_resource) to
guarantee automatic deletion after tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cli/tests/cli/stdio/symlink.rs`:
- Around line 26-47: The test helper functions init_broken_resource and
init_h_upper_resource create directories but never remove them; change them to
use temporary directories (e.g., return or accept a tempfile::TempDir) or ensure
each test registers teardown cleanup (remove_dir_all) after assertions so
artifacts are not left on disk; specifically update these functions to either
return a TempDir handle to let the caller drop it or add a companion cleanup
step that removes the created dir (use the same Path passed into
init_broken_resource/init_h_upper_resource) to guarantee automatic deletion
after tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 06ff9190-ef36-4b77-b289-7fa74f110624

📥 Commits

Reviewing files that changed from the base of the PR and between c86b581 and 9767939.

📒 Files selected for processing (11)
  • cli/src/command/append.rs
  • cli/src/command/core.rs
  • cli/src/command/create.rs
  • cli/src/command/stdio.rs
  • cli/src/command/update.rs
  • cli/src/utils/os/windows/acl.rs
  • cli/src/utils/os/windows/fs.rs
  • cli/src/utils/os/windows/security.rs
  • cli/tests/cli/stdio.rs
  • cli/tests/cli/stdio/symlink.rs
  • pna/src/fs.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • cli/src/command/update.rs
  • cli/src/command/append.rs
  • cli/src/utils/os/windows/acl.rs

@ChanTsune ChanTsune force-pushed the codex/windows-broken-symlink-metadata branch 2 times, most recently from 87d6551 to c1b323d Compare March 28, 2026 14:37
Use handle-based Windows security APIs (GetSecurityInfo/SetSecurityInfo)
with FILE_FLAG_OPEN_REPARSE_POINT to read and write metadata on symlinks
without following them. This fixes broken symlinks losing their mode,
ownership, and ACL information during archival.

Changes:
- Add FileHandle RAII wrapper and open_path with follow_symlink control
- Add SecurityDescriptor::try_from_handle and apply_by_handle
- Switch lchown and chmod to handle-based APIs for symlink correctness
- Add set_facl_nofollow for symlink-aware ACL writes
- Extract build_acl_buffer to share between set_d_acl and set_d_acl_by_handle
- Use nofollow metadata collection in core.rs for symlink entries
@ChanTsune ChanTsune force-pushed the codex/windows-broken-symlink-metadata branch 4 times, most recently from 1081cd2 to 9cad013 Compare April 8, 2026 03:19
Remove path field from SecurityDescriptor and eliminate all
path-based Win32 APIs (GetNamedSecurityInfoW/SetNamedSecurityInfoW).
All security operations now go through handle-based APIs, making
follow_links control purely a handle-opening concern.

- Fix UB in build_acl_buffer by setting Vec length after InitializeAcl
- Remove SecurityDescriptor::path field and try_from(path) constructor
- Extract apply_security_info helper to eliminate duplication
- Consolidate ACL constructors from 3 to 1 (try_from_handle)
- Unify get_facl/set_facl with follow_links parameter across platforms
- Wire set_facl nofollow into extraction path for symlink ACL restoration
- Remove inline #[cfg] from core.rs by using unified ACL API
- Add Debug derive to FileHandle
- Add broken symlink permission preservation test
@ChanTsune ChanTsune force-pushed the codex/windows-broken-symlink-metadata branch from 9cad013 to d6d7511 Compare April 8, 2026 03:55
@ChanTsune ChanTsune merged commit 9ef6ad4 into main Apr 8, 2026
118 of 120 checks passed
@ChanTsune ChanTsune deleted the codex/windows-broken-symlink-metadata branch April 8, 2026 14:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cli This issue is about cli application

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant