Skip to content

Commit 6d55eed

Browse files
jamesadevineCopilot
andcommitted
fix(safeoutputs): sanitize symlink paths embedded in PR description markdown
Addresses follow-up review feedback on PR #549. The append_skipped_symlink_notice helper previously embedded raw filenames inside an inline-code span in the PR description. Filenames may legally contain backticks (e.g. `foo�ar`) or control characters (newlines, tabs), which would terminate the code span and garble or break out of the blockquote. The agent is the adversary in this code path, so the risk is display-only (no secondary exfiltration vector), but the previous output was malformed when adversarial filenames were involved. CommonMark code spans do NOT honour backslash escapes — the backtick-count rule terminates the span instead — so the naive `path.replace('`', `\\`)` suggested in review is not actually an escape. Instead, sanitize_path_for_markdown: - Replaces backticks with apostrophes (visually clear, terminator-safe). - Collapses all ASCII control characters (newline, CR, tab, etc.) to '?'. Display-only sanitisation: the canonical path the agent originally requested is unchanged in the upload pipeline; only the markdown rendering of the skipped-symlinks notice is affected. Adds four targeted tests covering backticks, control characters, pass-through of normal paths, and end-to-end sanitisation through append_skipped_symlink_notice. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 1b953bf commit 6d55eed

1 file changed

Lines changed: 69 additions & 1 deletion

File tree

src/safeoutputs/create_pull_request.rs

Lines changed: 69 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1905,12 +1905,39 @@ fn append_skipped_symlink_notice(body: &str, skipped_symlinks: &[String]) -> Str
19051905
);
19061906
for path in skipped_symlinks {
19071907
notice.push_str("> - `");
1908-
notice.push_str(path);
1908+
notice.push_str(&sanitize_path_for_markdown(path));
19091909
notice.push_str("`\n");
19101910
}
19111911
format!("{}{}", body, notice)
19121912
}
19131913

1914+
/// Make an arbitrary filesystem path safe to embed inside an inline-code span
1915+
/// in a markdown PR description.
1916+
///
1917+
/// The agent that produced the PR is the adversary in this code path: it can
1918+
/// plant filenames containing characters that would otherwise break out of the
1919+
/// inline-code context and garble (or HTML-inject into) the description.
1920+
/// CommonMark code spans do NOT honour backslash escapes (the backtick-count
1921+
/// rule terminates the span instead), so naive `path.replace('`', "\\`")` is
1922+
/// not actually an escape — it just inserts a stray backslash. Instead we:
1923+
///
1924+
/// - Replace backticks with an apostrophe (visually clear, terminator-safe).
1925+
/// - Collapse all ASCII control characters — including newlines, carriage
1926+
/// returns, and tabs — to a single `?` so a multi-line filename can't break
1927+
/// the blockquote.
1928+
///
1929+
/// This is display-only sanitisation; the canonical path the agent actually
1930+
/// requested is unchanged in the upload pipeline.
1931+
fn sanitize_path_for_markdown(path: &str) -> String {
1932+
path.chars()
1933+
.map(|c| match c {
1934+
'`' => '\'',
1935+
c if c.is_control() => '?',
1936+
c => c,
1937+
})
1938+
.collect()
1939+
}
1940+
19141941
/// Read a file and produce an ADO push change entry.
19151942
/// Handles both text (rawtext) and binary (base64encoded) content.
19161943
async fn read_file_change(
@@ -2483,6 +2510,47 @@ index 0000000..abcdefg
24832510
assert!(out.contains("`subdir/leak`"));
24842511
}
24852512

2513+
#[test]
2514+
fn test_sanitize_path_for_markdown_replaces_backticks() {
2515+
// Backticks would otherwise terminate the inline-code span and let the
2516+
// adversarial filename break out of the PR description's blockquote.
2517+
let out = sanitize_path_for_markdown("foo`bar`baz");
2518+
assert!(!out.contains('`'), "all backticks must be removed: {out}");
2519+
assert_eq!(out, "foo'bar'baz");
2520+
}
2521+
2522+
#[test]
2523+
fn test_sanitize_path_for_markdown_collapses_control_chars() {
2524+
// Newlines, carriage returns, tabs, and other ASCII control characters
2525+
// are all replaced with '?' so a multi-line filename can't break the
2526+
// blockquote / list layout.
2527+
let out = sanitize_path_for_markdown("evil\nname\rwith\ttabs\x07bell");
2528+
assert_eq!(out, "evil?name?with?tabs?bell");
2529+
}
2530+
2531+
#[test]
2532+
fn test_sanitize_path_for_markdown_passthrough_normal() {
2533+
let out = sanitize_path_for_markdown("src/safeoutputs/create_pull_request.rs");
2534+
assert_eq!(out, "src/safeoutputs/create_pull_request.rs");
2535+
}
2536+
2537+
#[test]
2538+
fn test_append_skipped_symlink_notice_sanitizes_paths() {
2539+
// End-to-end: backticks and newlines in a filename must not break the
2540+
// blockquote.
2541+
let body = "Some PR description";
2542+
let skipped = vec!["evil`name\nwith\rnewlines".to_string()];
2543+
let out = append_skipped_symlink_notice(body, &skipped);
2544+
// The sanitised path appears in the listing,
2545+
assert!(out.contains("`evil'name?with?newlines`"));
2546+
// and the raw garbling characters are not anywhere in the notice.
2547+
let notice_only = &out[body.len()..];
2548+
assert!(
2549+
!notice_only.contains('\n') || notice_only.lines().all(|l| l.is_empty() || l.starts_with('>')),
2550+
"every non-empty line of the notice must remain inside the blockquote"
2551+
);
2552+
}
2553+
24862554
#[test]
24872555
fn test_validate_single_path_valid() {
24882556
assert!(validate_single_path("src/main.rs").is_ok());

0 commit comments

Comments
 (0)