Skip to content

Commit f04c033

Browse files
CopilotjamesadevineCopilot
authored
fix(safeoutputs): prevent symlink exfiltration in create-pr Stage 3 (#549)
* Initial plan * fix(safeoutputs): prevent symlink exfiltration in create-pr Stage 3 Replace is_file() (follows symlinks) with symlink_metadata() checks in collect_changes_from_worktree and collect_changes_from_diff_tree so that symlinks in the applied patch are silently skipped rather than followed to arbitrary filesystem paths (e.g. /proc/self/environ). Also add symlink mode (120000) detection to validate_patch_paths as a belt-and-suspenders defence: patches that create or convert files to symlinks are rejected before git apply is attempted. Adds two unit tests covering the new rejection paths. Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/99d4d7f5-eaea-4a34-87e3-ac34bab53ddb Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com> * refactor(safeoutputs): extract push_file_change_skipping_symlinks helper Collapse 7 duplicated symlink_metadata match blocks introduced by the symlink-exfiltration fix into a single helper. Same security semantics: regular file → push change; symlink → warn and skip; other → ignore. Centralizes the use of symlink_metadata so the no-follow invariant lives in one place. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(safeoutputs): surface metadata errors and allow symlink-removal patches Two review fixes for the symlink-exfiltration PR: 1. push_file_change_skipping_symlinks: split the catch-all arm so IO errors from symlink_metadata (e.g. permissions denied) are surfaced via warn! rather than silently swallowed. Non-file, non-symlink entries (directories, fifos, sockets) are still silently skipped. 2. validate_patch_paths: only reject patch lines that INTRODUCE a symlink (new file mode 120000, new mode 120000). Patches that convert a symlink into a regular file (old mode 120000 + new mode 100644) or delete an existing symlink (deleted file mode 120000) are now allowed — they are legitimate cleanup operations and produce a symlink-free worktree, so they pose no exfiltration risk. Adds two positive tests covering symlink→file conversion and symlink deletion; existing negative tests still pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(safeoutputs): surface skipped symlinks in PR description and tighten patch check Addresses PR #549 review feedback ("Address bugs and suggestions"): 1. Skipped symlinks are no longer invisible to the PR author. collect_changes_from_worktree and collect_changes_from_diff_tree now return (Vec<changes>, Vec<skipped_symlink_paths>). When non-empty, an explicit `> [!WARNING]` block listing each skipped path is appended to the PR description before posting, so the agent can see that some intended file content was deliberately dropped for safety. The Stage 3 warn! log is also retained for infrastructure observability. 2. Tightened patch-validation against suffix bypass. validate_patch_paths now compares the trimmed line for exact equality with "new file mode 120000" / "new mode 120000", rather than using starts_with. This eliminates any ambiguity from hypothetical future mode strings sharing the 120000 prefix, while still catching the real attack vectors. A new test exercises trailing whitespace and patch-body content containing '120000' to confirm neither bypasses nor false-rejects occur. 3. New tests: - test_validate_patch_paths_symlink_mode_suffix_not_bypass - test_append_skipped_symlink_notice_empty_is_passthrough - test_append_skipped_symlink_notice_lists_paths All 48 create_pull_request tests pass; full bin suite (1510 tests) green; cargo clippy clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(safeoutputs): sanitize symlink paths embedded in PR description markdown Addresses follow-up review feedback on PR #549. The append_skipped_symlink_notice helper previously embedded raw filenames inside an inline-code span in the PR description. Filenames may legally contain backticks (e.g. `foo�ar`) or control characters (newlines, tabs), which would terminate the code span and garble or break out of the blockquote. The agent is the adversary in this code path, so the risk is display-only (no secondary exfiltration vector), but the previous output was malformed when adversarial filenames were involved. CommonMark code spans do NOT honour backslash escapes — the backtick-count rule terminates the span instead — so the naive `path.replace('`', `\\`)` suggested in review is not actually an escape. Instead, sanitize_path_for_markdown: - Replaces backticks with apostrophes (visually clear, terminator-safe). - Collapses all ASCII control characters (newline, CR, tab, etc.) to '?'. Display-only sanitisation: the canonical path the agent originally requested is unchanged in the upload pipeline; only the markdown rendering of the skipped-symlinks notice is affected. Adds four targeted tests covering backticks, control characters, pass-through of normal paths, and end-to-end sanitisation through append_skipped_symlink_notice. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(safeoutputs): trim() mode-line check and differentiate NotFound metadata errors Two minor review fixes on PR #549. 1. validate_patch_paths now compares mode lines after a full trim() rather than trim_end(). trim_end() leaves leading whitespace intact, so a line like ' new file mode 120000' would silently bypass the check. Git's own format-patch never produces leading-indented mode lines so this was not a realistic attack path, but trim() costs nothing and closes the gap. The existing test now also covers the leading-whitespace and CRLF cases. 2. push_file_change_skipping_symlinks now distinguishes io::ErrorKind::NotFound from other metadata errors. NotFound is a normal transient condition (worktree mid-rebase, file pruned by git apply, etc.) and is logged at debug level. PermissionDenied and other unusual kinds remain at warn so triage isn't drowned by alert fatigue from benign races. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(safeoutputs): surface symlink-skip count and anchor mode check to non-ws header lines Three follow-up review fixes on PR #549. 1. All-symlinks early-return now reports the cause. When every file collected for the PR is a symlink, `changes` is empty and the executor previously returned the generic `No changes detected after applying patch` message ÔÇö the agent had no way to see that its files were dropped for safety (the per-symlink `warn!` only reached Stage 3 infrastructure logs). The IfNoChanges Error/Warn/Ignore branches now append ` (N symlink(s) were skipped for safety: path1, path2, ÔǪ)` to both the log and the ExecutionResult message whenever skipped_symlinks is non-empty. 2. validate_patch_paths mode check anchored to no-leading-whitespace lines. Previously `line.trim() == "new file mode 120000"` could be triggered by a diff CONTEXT line ( `" new file mode 120000"`, with a single leading space) ÔÇö after trim() the two are indistinguishable. Real git header lines never start with whitespace, while diff context lines always do, so the check now also requires the first character of the raw line to be non-whitespace. Trailing whitespace and `\r` still bypass-proofed by `trim()` on the right-hand side of the equality. Added a new explicit test that a context line whose body is literally `new file mode 120000` is allowed through. (The test patch is built with `String + &str` concatenation rather than backslash line-continuation, because Rust eats next-line leading whitespace after `\`, which would silently defeat the property under test.) 3. Documented TOCTOU assumption. Added a doc comment to push_file_change_skipping_symlinks acknowledging the theoretical TOCTOU window between symlink_metadata (lstat) and the subsequent tokio::fs::read inside read_file_change. Not exploitable in Stage 3's deployment model (no concurrent writer to the worktree), but the assumption is now explicit, with a pointer to the proper mitigation (O_NOFOLLOW fd reads) should the deployment model change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(safeoutputs): filter bidi controls in symlink paths and sanitize executor messages Two follow-up review fixes on PR #549. 1. sanitize_path_for_markdown now filters Unicode bidi/zero-width formatting characters in addition to ASCII control chars and backticks. char::is_control() only covers U+0000-U+001F and U+007F-U+009F, so characters like U+202E (RIGHT-TO-LEFT OVERRIDE), U+2066 (LRI), and U+FEFF (BOM) previously passed through unchanged. A filename containing U+202E could visually reverse part of its displayed name inside the GitHub code span and deceive a PR author into misreading the skipped-symlinks warning. No exfiltration vector (the symlink is already blocked) but a real display-spoofing concern; now collapsed to '?' alongside other formatting chars. Explicit ranges filtered: U+200B-U+200F (ZW joiners + LRM/RLM), U+202A-U+202E (LRE/RLE/PDF/LRO/RLO), U+2066-U+2069 (LRI/RLI/FSI/PDI), U+FEFF. Ordinary non-ASCII letters/emoji/CJK pass through unchanged. 2. ExecutionResult symlink_suffix paths now reuse sanitize_path_for_markdown. The early-return failure/success/warning messages previously embedded raw skipped-symlink filenames via skipped_symlinks.join(', '). Filenames containing commas, control chars, or bidi controls could garble or spoof the message. Sanitizing per-path before joining keeps the message readable and consistent with the PR-description block. Two new tests: - test_sanitize_path_for_markdown_filters_bidi_controls (U+202E + sample set) - test_sanitize_path_for_markdown_keeps_normal_unicode (cafe / Japanese / emoji passthrough) Full suite (1516 tests) green; cargo clippy clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com> Co-authored-by: James Devine <devinejames@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 91d39a9 commit f04c033

1 file changed

Lines changed: 505 additions & 41 deletions

File tree

0 commit comments

Comments
 (0)