Skip to content

feat(safe-outputs): add upload-artifact safe output#363

Closed
Copilot wants to merge 3 commits into
mainfrom
copilot/add-upload-artifact-safeoutput
Closed

feat(safe-outputs): add upload-artifact safe output#363
Copilot wants to merge 3 commits into
mainfrom
copilot/add-upload-artifact-safeoutput

Conversation

Copilot AI commented Apr 30, 2026

Copy link
Copy Markdown
Contributor

Adds a new upload-artifact safe output that lets agents publish a workspace file as an Azure DevOps pipeline artifact attached to the current build run — the closest ADO equivalent to GitHub Actions' actions/upload-artifact.

Summary

Today an agent has no clean way to surface a build deliverable (generated report, screenshot, log bundle, compiled binary). upload-attachment only attaches files to a work item; there is no analogue for the run-summary Artifacts list. gh-aw's upload-asset solves the same problem for GitHub by committing files to an orphaned branch, and it points users at actions/upload-artifact for the run-scoped variant. ADO needs its own first-class tool because the agent runs inside the AWF network-isolated sandbox and can't invoke PublishPipelineArtifact@1 directly.

The flow is split across two stages, mirroring how create-pull-request stages its patch file:

Stage 1 (MCP, in the agent sandbox):

  1. Validates the agent-supplied UploadArtifactParamsartifact_name (charset, length, no leading .) and file_path (no traversal, no absolute paths, no .git, no symlink escape outside the workspace). Validation leverages the shared primitives in src/validate.rs (is_valid_version for the artifact-name charset, is_safe_path_segment per file-path component) instead of inlining char allowlists.
  2. Resolves file_path against bounding_directory (the agent's workspace root), canonicalises it, and rejects symlink escapes.
  3. Rejects directories — single files only.
  4. Copies the file into the safe-outputs output_directory under a generated unique name (upload-artifact-<artifact_name>-<short-id>.<ext>, extension preserved for the Stage 3 allow-list check). This is essential because the agent's sandbox workspace is no longer accessible at execution time.
  5. Constructs the result via UploadArtifactResult::new(artifact_name, file_path, staged_file, file_size) and writes the NDJSON record.

Stage 3 (executor, outside the sandbox):

  1. Reads the staged file from ctx.working_directory.join(staged_file) and re-canonicalises it inside the working directory (defense in depth).
  2. Applies operator-supplied name-prefix, then enforces allowed-artifact-names, allowed-extensions, and max-file-size limits.
  3. Rejects text files containing ADO pipeline-command sequences (##vso[ or ##[) using contains_pipeline_command; binary files skip that scan.
  4. Rejects non-UTF-8 paths and any path containing \n/\r/] (using contains_newline) so the logging command can't be malformed/escaped.
  5. Emits ##vso[artifact.upload artifactname=NAME]<staged-abs-path> on stdout. The build agent picks this up and asynchronously attaches the file to the current run.

Dry-run summarises the action without emitting the command.

Front-matter shape

safe-outputs:
  upload-artifact:
    max-file-size: 52428800              # default: 50 MB
    allowed-extensions: [".png", ".pdf", ".log"]
    allowed-artifact-names: ["agent-*"]  # `*` suffix → prefix match
    name-prefix: "agent-"
    max: 5

Files changed

  • src/safeoutputs/upload_artifact.rs — new tool: UploadArtifactParams, UploadArtifactResult (with artifact_name, file_path, staged_file, file_size fields and an explicit ::new() constructor), internal UploadArtifactResultFields mirror to satisfy the tool_result! macro, validation, config, and Stage 3 executor (+19 unit tests, including 3 executor tests covering staged-file success, missing-file failure, and ##vso[-injection rejection). Uses shared validators from src/validate.rs for charset, path-segment, pipeline-command and newline checks.
  • src/safeoutputs/mod.rs — module, re-export, WRITE_REQUIRING_SAFE_OUTPUTS, ALL_KNOWN_SAFE_OUTPUTS, consistency test.
  • src/execute.rs — dispatcher entry and per-run budget registration.
  • src/mcp.rs — MCP tool method (upload-artifact) now performs Stage 1 validation, workspace-bounded canonicalisation, directory rejection, and copies the file into output_directory before writing the NDJSON record.
  • docs/safe-outputs.md — reference section under the existing upload-attachment entry, updated to describe the Stage 1 staging and Stage 3 validation split.

Notes / future work

  • Single-file only for now; directory uploads can be layered on later by emitting one logging command per file or by switching to containerfolder.
  • Uses ADO's classic build-artifact logging command rather than the modern pipeline-artifact REST API. The result still appears in the run's Artifacts list. We can switch to PublishPipelineArtifact later without changing the agent-facing API if needed.

Test plan

  • cargo build
  • cargo test ✅ (1034 tests, 0 failures, including 19 new tests for this tool covering name/path validation, config defaults, YAML round-trip, serialization, and Stage 3 executor behaviour against a staged file in a temporary working directory)
  • cargo clippy --all-targets --all-features ✅ (no warnings on the new file)
  • Code-review feedback addressed:
    • Clearer "directory not supported" wording.
    • Explicit UTF-8 path check (replacing to_string_lossy()).
    • Refactored to leverage built-in validators from src/validate.rs (is_valid_version, is_safe_path_segment, contains_pipeline_command, contains_newline) instead of inlining char-allowlist / injection-detection logic. The ##vso[ scan now also catches ##[, a strict improvement.
    • Restructured to stage the agent's file into the safe-outputs output_directory during the MCP call (Stage 1) and read it back from ctx.working_directory in Stage 3, mirroring the create-pull-request pattern. Without this, the Stage 3 executor would have had nothing to publish because the sandbox workspace is no longer accessible by then.

Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/0d0cdabb-69cb-46a4-b5b8-b2dba48abf00

Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
@jamesadevine jamesadevine marked this pull request as ready for review April 30, 2026 10:39
@jamesadevine

Copy link
Copy Markdown
Collaborator

/rust-review

@github-actions

github-actions Bot commented Apr 30, 2026

Copy link
Copy Markdown
Contributor

Rust PR Reviewer completed successfully!

@github-actions

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Good security-conscious design overall, but has one meaningful security gap and two configuration footguns worth addressing before merge.

Findings

🔒 Security Concerns

  • src/safeoutputs/upload_artifact.rs:216-228 — Extension filter uses agent-supplied path, not canonical path (symlink bypass)

    The allowed-extensions check runs against self.file_path (the agent's unresolved string) before symlink resolution. An agent can create a symlink inside the workspace — e.g. report.pdf → malware.exe — and the filter sees .pdf while the pipeline artifact receives .exe content. The fix is to apply the extension check against canonical (already computed at line 235) rather than self.file_path:

    // after computing `canonical`:
    let has_valid_ext = config.allowed_extensions.iter().any(|ext| {
        canonical
            .to_string_lossy()
            .to_lowercase()
            .ends_with(&ext.to_lowercase())
    });

    Or more robustly, check canonical.extension(). Since canonical is always valid UTF-8 at this point (checked at line 284), this is straightforward.

⚠️ Suggestions

  • src/safeoutputs/upload_artifact.rs:217-221ends_with extension matching without enforcing a leading dot

    allowed_extensions entries are not validated to start with .. An operator who writes allowed-extensions: ["pdf"] instead of [".pdf"] gets a suffix check: "apdf" and "logpdf" both pass. Should either (a) normalise entries to start with . at config load, or (b) validate that each entry starts with . and surface a clear error if not.

  • src/safeoutputs/upload_artifact.rs:52-76\n/\r not rejected in file_path during validate()

    Newlines are only checked in abs_path after canonicalization (line 300). Paths containing \n are technically valid on Linux; source_directory.join(path_with_newline) and canonicalize() can succeed if the file exists, putting a newline in the log output at line 168. Adding ensure!(!self.file_path.contains('\n') && !self.file_path.contains('\r'), ...) in validate() is cheap defense-in-depth consistent with blocking \0.

✅ What Looks Good

  • The symlink-escape guard using canonical.starts_with(&canonical_base) is correct — Rust's Path::starts_with does component-wise comparison, not string prefix, so /workspace-evil/file can't spoof /workspace.
  • Re-validating final_name's charset (line 184-195) after applying the operator-controlled name-prefix is the right defensive pattern; a malicious prefix containing ] or ##vso[ is caught here before the logging command is emitted.
  • Explicit UTF-8 path check (line 284) and ]/\n/\r guard on abs_path (line 300) are well-targeted; this forecloses the logging-command injection vectors correctly.
  • Whole-file read for ##vso[ scanning is bounded by max_file_size (checked first at line 261), so worst-case memory usage is deterministic.
  • The max: N budget field flows through the executor dispatch machinery rather than being duplicated in the config struct — consistent with all other safe-output tools.

Generated by Rust PR Reviewer for issue #363 · ● 1.1M ·

))
}
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot we have many built in validators - try and leverage them

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored in 63b598c to use the built-in validators from src/validate.rs:

  • artifact_name charset → is_valid_version ([A-Za-z0-9._-], non-empty)
  • file_path per-component → is_safe_path_segment (covers empty/absolute, .., embedded separators, leading . — including .git)
  • File-content injection scan → contains_pipeline_command (now also catches ##[, not just ##vso[)
  • abs_path newline guard → contains_newline

Inline checks now only remain for things validators don't cover (length cap, leading . on artifact name, null byte / : in path, and the ] guard for the logging command).

Comment thread src/mcp.rs
"Artifact '{}' queued from file '{}'. The file will be published during safe output processing.",
result.artifact_name, result.file_path
))]))
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot right now this will do nothing, in the end. We need to save the agent's provided file to the output_directory, and then retrieve it in the executor phase. This is a similar idea to create-pull-request tool.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restructured in fb479e5 to mirror the create-pull-request staging pattern:

Stage 1 (MCP, mcp.rs) now:

  1. Validates UploadArtifactParams.
  2. Resolves file_path against bounding_directory (workspace root) and canonicalizes to guard against symlink escape.
  3. Rejects directories.
  4. Copies the file into self.output_directory under a generated unique name upload-artifact-<artifact_name>-<short-id>.<ext> (extension preserved for the Stage 3 allow-list check).
  5. Constructs the result via UploadArtifactResult::new(artifact_name, file_path, staged_file, file_size) and writes the NDJSON record.

Stage 3 (upload_artifact.rs::execute_impl) now reads from ctx.working_directory.join(&self.staged_file) and re-canonicalizes (defense in depth), then runs the existing size / extension / name-allow-list / name-prefix / ##vso[-injection / abs-path-charset checks before emitting ##vso[artifact.upload artifactname=...]<staged-abs-path>.

The result struct gained staged_file and file_size fields and an explicit ::new() constructor (an internal UploadArtifactResultFields mirror keeps the tool_result! macro happy). Added 3 new executor tests (success, missing-file, ##vso[-rejection) on top of the existing 16 — all 1034 tests + clippy pass.

Copilot AI commented Apr 30, 2026

Copy link
Copy Markdown
Contributor Author

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • spsprodeus21.vssps.visualstudio.com
    • Triggering command: /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 w2p68lww5ql6qymnxhf.0kc5tw5.rcgu.o 8xibsud4vdl1zfp0xf5.0kc5tw5.rcgu.o 5fwso3las68i1yt4v1y.0kc5tw5.rcgu.o 65snsk4r9ka7ns9oxa1.0kc5tw5.rcgu.o nt79chsxk0t8dkruebs.0kc5tw5.rcgu.o hqwtedbizwfardrdckl.0kc5tw5.rcgu.o 8kp4p7l5gve4ov6i9zn.0kc5tw5.rcgu.o krw7k4z8nw604v1xndo.0kc5tw5.rcgu.o m88hjs94rmehnpdtlpp.0kc5tw5.rcgu.o 9bqqpzuwbrfqmgtsv24.0kc5tw5.rcgu.o l614xgvwgtz0cyr5f4y.0kc5tw5.rcgu.o bgmv�� f104hcb02x5n2w8ijq6.0kc5tw5.rcgu.o plk8bd2oe0m70raahhk.0kc5tw5.rcgu.o known-linux-gnu//home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/libsyn-1500b8e8e9aaac8c.rlib /lto-wrapper uild.aa15b61421aclippy uild.aa15b61421a--all-targets 53f90v25swf.rcgu--all-features (dns block)
  • spsprodweu4.vssps.visualstudio.com
    • Triggering command: /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 w2p68lww5ql6qymnxhf.0kc5tw5.rcgu.o 8xibsud4vdl1zfp0xf5.0kc5tw5.rcgu.o 5fwso3las68i1yt4v1y.0kc5tw5.rcgu.o 65snsk4r9ka7ns9oxa1.0kc5tw5.rcgu.o nt79chsxk0t8dkruebs.0kc5tw5.rcgu.o hqwtedbizwfardrdckl.0kc5tw5.rcgu.o 8kp4p7l5gve4ov6i9zn.0kc5tw5.rcgu.o krw7k4z8nw604v1xndo.0kc5tw5.rcgu.o m88hjs94rmehnpdtlpp.0kc5tw5.rcgu.o 9bqqpzuwbrfqmgtsv24.0kc5tw5.rcgu.o l614xgvwgtz0cyr5f4y.0kc5tw5.rcgu.o bgmv�� f104hcb02x5n2w8ijq6.0kc5tw5.rcgu.o plk8bd2oe0m70raahhk.0kc5tw5.rcgu.o known-linux-gnu//home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/libsyn-1500b8e8e9aaac8c.rlib /lto-wrapper uild.aa15b61421aclippy uild.aa15b61421a--all-targets 53f90v25swf.rcgu--all-features (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

…ectory in Stage 3

Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/c3b3f8ed-5b9d-45ed-aeaa-8d69a4dcb450

Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>

Copilot AI commented Apr 30, 2026

Copy link
Copy Markdown
Contributor Author

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • spsprodeus21.vssps.visualstudio.com
    • Triggering command: /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 cfg(feature, values(&#34;compact&#34;, &#34;default&#34;)) -C metadata=583dcf441e5062de g/de�� lib/rustlib/x86_-Wl,--as-needed hqsqgbanjoy9kawfl2pkxr6.1wu53mj.rcgu.o xsqflco2lxc9sur4wtv3os8.1wu53mj.rcgu.o 0fvpkkr9gyz1gs78w5xezep.1wu53mj.rcgu.o zmtb6p6w8tv9czk4vulrqkb.1wu53mj.rcgu.o 0ww9khhej8sw3fw5a66574n.1wu53mj.rcgu.o yltg7hhctc3l8wch8knkjr5.1wu53mj.rcgu.o zrxv1fxgkjips2h5l0jd4ez.1wu53mj.rcgu.o uarpsdjdn5m5aa4s05ijszs.1wu53mj.rcgu.o kj69li1lqbftyla3ul14coz.1wu53mj.rcgu.o eu67jlba58zk1rhnhm6d5ud.1wu53mj.rcgu.o gwcnyh1yz1al6x4py7fmot7.1wu53mj.rcgu.o zmcmt0kl6my624kva8yjjp3.1wu53mj.rcgu.o kg9nggsl0f6z94nnz1qxh9h.1wu53mj.rcgu.o hy1z9ndhr2albq2ppqg7p28.1wu53mj.rcgu.o (dns block)
    • Triggering command: /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/derive_more_impl-46e9b1cf09dfa-C /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/derive_more_impl-46e9b1cf09dfadebuginfo=2 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/rustcGmbcyr/rmeta.o pE/list -Wl,--as-needed -Wl,-Bstatic /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/libconvert_case-e2adeda40af585--cfg mcp_macros.52012350d6ae5fae-cgu.0.rcgu.o mcp_macros.52012350d6ae5fae-cgu.1.rcgu.o mcp_macros.52012350d6ae5fae-cgu.2.rcgu.o mcp_macros.52012350d6ae5fae-cgu.3.rcgu.o mcp_macros.52012350d6ae5fae-cgu.4.rcgu.o mcp_macros.52012350d6ae5fae-cgu.5.rcgu.o target/debug/dep--json=diagnostic-rendered-ansi,artifacts,future-incompat 0pdea57r5az4pvselb3mo69g.rcgu.o target/debug/depcc target/debug/dep-Wl,--version-script=/home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/rustcGmbc/home/REDACTED/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/subtle-2.6.1/src/lib.rs ib 6d.rlib (dns block)
  • spsprodweu4.vssps.visualstudio.com
    • Triggering command: /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 cfg(feature, values(&#34;compact&#34;, &#34;default&#34;)) -C metadata=583dcf441e5062de g/de�� lib/rustlib/x86_-Wl,--as-needed hqsqgbanjoy9kawfl2pkxr6.1wu53mj.rcgu.o xsqflco2lxc9sur4wtv3os8.1wu53mj.rcgu.o 0fvpkkr9gyz1gs78w5xezep.1wu53mj.rcgu.o zmtb6p6w8tv9czk4vulrqkb.1wu53mj.rcgu.o 0ww9khhej8sw3fw5a66574n.1wu53mj.rcgu.o yltg7hhctc3l8wch8knkjr5.1wu53mj.rcgu.o zrxv1fxgkjips2h5l0jd4ez.1wu53mj.rcgu.o uarpsdjdn5m5aa4s05ijszs.1wu53mj.rcgu.o kj69li1lqbftyla3ul14coz.1wu53mj.rcgu.o eu67jlba58zk1rhnhm6d5ud.1wu53mj.rcgu.o gwcnyh1yz1al6x4py7fmot7.1wu53mj.rcgu.o zmcmt0kl6my624kva8yjjp3.1wu53mj.rcgu.o kg9nggsl0f6z94nnz1qxh9h.1wu53mj.rcgu.o hy1z9ndhr2albq2ppqg7p28.1wu53mj.rcgu.o (dns block)
    • Triggering command: /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/ado_aw-654342efa6a15a60 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/derive_more_impl-46e9b1cf09dfa-C /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/derive_more_impl-46e9b1cf09dfadebuginfo=2 /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/rustcGmbcyr/rmeta.o pE/list -Wl,--as-needed -Wl,-Bstatic /home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/libconvert_case-e2adeda40af585--cfg mcp_macros.52012350d6ae5fae-cgu.0.rcgu.o mcp_macros.52012350d6ae5fae-cgu.1.rcgu.o mcp_macros.52012350d6ae5fae-cgu.2.rcgu.o mcp_macros.52012350d6ae5fae-cgu.3.rcgu.o mcp_macros.52012350d6ae5fae-cgu.4.rcgu.o mcp_macros.52012350d6ae5fae-cgu.5.rcgu.o target/debug/dep--json=diagnostic-rendered-ansi,artifacts,future-incompat 0pdea57r5az4pvselb3mo69g.rcgu.o target/debug/depcc target/debug/dep-Wl,--version-script=/home/REDACTED/work/ado-aw/ado-aw/target/debug/deps/rustcGmbc/home/REDACTED/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/subtle-2.6.1/src/lib.rs ib 6d.rlib (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@jamesadevine

Copy link
Copy Markdown
Collaborator

/rust-review

@github-actions

github-actions Bot commented May 1, 2026

Copy link
Copy Markdown
Contributor

Rust PR Reviewer completed successfully!

@github-actions

github-actions Bot commented May 1, 2026

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good overall — solid security posture with defense-in-depth, good test coverage, and clean integration with existing patterns. Two issues worth addressing before merge.

Findings

🐛 Bugs / Logic Issues

  • src/safeoutputs/upload_artifact.rs:362 — The emitted ADO logging command is missing the containerfolder property. The [documented format]((learn.microsoft.com/redacted) is:
    ##vso[artifact.upload containerfolder=FOLDER;artifactname=NAME]PATH
    ```
    The current code emits:
    ```
    ##vso[artifact.upload artifactname=NAME]PATH
    
    Without containerfolder, behavior is agent-version-dependent — the file may end up in an unnamed container root or the artifact may not appear as expected in the run's Artifacts tab. A reasonable default is to use final_name as both (i.e. containerfolder=NAME;artifactname=NAME), which is the pattern most ADO examples use for single-artifact uploads.

⚠️ Suggestions

  • src/safeoutputs/upload_artifact.rs:296–340 — The file content (up to 50 MB) is read into memory for the contains_pipeline_command scan before the ctx.dry_run early-exit check at line 345. In dry-run mode the scan result is discarded, so the read is wasted. Moving the dry_run guard to just before the file read (after the size check) would avoid allocating up to 50 MB unnecessarily. Note: upload_attachment.rs has the same ordering, so this isn't a new regression — but this new tool operates on potentially much larger files.

  • src/safeoutputs/upload_artifact.rs:256–265 — The extension allowlist check uses self.file_path (the agent-supplied value stored in the NDJSON record) rather than the staged file's actual extension. Stage 1 encodes the source extension in the staged filename (e.g. upload-artifact-name-<id>.pdf), so comparing against Path::new(&self.staged_file).extension() would be equivalent and slightly more tamper-resistant — if the NDJSON file_path field is modified between stages, the extension check could be bypassed while the staged file points to a different type. The staged file is still bounded to working_directory, limiting practical impact, but the staged extension is already available.

✅ What Looks Good

  • The two-stage copy-then-emit pattern (mirroring create-pull-request) is the right approach; the sandbox workspace genuinely isn't accessible at execution time.
  • Canonicalize + starts_with bounding checks in both Stage 1 (MCP) and Stage 3 are solid defense-in-depth against symlink escapes, even if the NDJSON staged_file field is tampered with.
  • Using shared is_valid_version, is_safe_path_segment, contains_pipeline_command, and contains_newline validators rather than inlining char allowlists is exactly the right convention for this codebase.
  • The ##[ catch (in addition to ##vso[) in contains_pipeline_command is a strict improvement over upload_attachment.rs.
  • 19 tests with executor tests that use tempfile and actually exercise the staged-file path are well above the bar for this kind of tool.

Generated by Rust PR Reviewer for issue #363 · ● 940.5K ·

jamesadevine added a commit that referenced this pull request May 2, 2026
* feat(safe-outputs): add unified upload-build-artifact safe output

Combines the concepts from #363 (upload-artifact, current build) and #373
(upload-build-artifact, arbitrary build) into a single tool that handles
both use cases via the ADO build attachments REST API.

Key design:
- build_id is Optional — omit to target the current pipeline run
  (resolved from BUILD_BUILDID in Stage 3)
- REST API (PUT .../builds/{buildId}/attachments/{type}/{name}) for all
  cases, replacing the ##vso[artifact.upload] logging command approach
- allowed-build-ids check is skipped when targeting the current build
  (implicitly trusted)
- Full Stage 1 → Stage 3 file staging contract: validate, canonicalize,
  bounds-check, stage file with extension preservation, re-canonicalize
  and re-verify in Stage 3

Supersedes #363 and #373.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: move file read after dry-run guard and use Path::extension() for allowlist check

- Move std::fs::read after the dry_run early return to avoid reading
  up to 50 MB into memory only to discard it in dry-run mode.
- Switch extension allowlist from suffix matching on the full path
  to Path::extension() comparison, preventing false matches like
  'catalog' matching allowed-extensions: ['log'].

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: restore cache-memory heading in safe-outputs.md

The heading was accidentally dropped when the upload-build-artifact
section was inserted, leaving the cache-memory paragraph as a stray
note under the wrong section.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: use async file read, add staged-file integrity check, clarify comments

- Switch std::fs::read to tokio::fs::read().await to avoid blocking
  the tokio runtime for files up to 50 MB.
- Add a warn!() when the live staged-file size differs from the size
  recorded at Stage 1, catching potential file tampering between stages.
- Add comments on is_valid_version() call sites noting intentional
  reuse for artifact-name and attachment-type charset validation.
- Add comment on staged filename explaining max-length reasoning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: make staged-file size mismatch a hard failure

A size mismatch between Stage 1 and Stage 3 indicates the staged file
was modified between stages — fail hard rather than warning and
uploading potentially tampered content. Adds a test for the integrity
check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add attachment-type dot guard, Stage 1 size cap, executor tests

- Add starts_with('.') guard to attachment-type validation, matching
  the artifact_name check for consistency.
- Add defense-in-depth file size cap at Stage 1 (MCP) using the
  default 50 MB limit to prevent staging-disk exhaustion before
  Stage 3 enforces the operator's configured limit.
- Add three missing executor tests: extension allowlist rejection,
  artifact-name allowlist rejection, and name-prefix application.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add SHA-256 cross-stage integrity verification

Add SHA-256 hash recording at Stage 1 and verification at Stage 3 for
both file-staging safe outputs:

- upload-build-artifact: staged_sha256 field recorded after copying
  the file, verified before uploading to ADO. Catches same-size file
  swaps that the size-only check would miss.
- create-pull-request: patch_sha256 field recorded after writing the
  patch, verified before applying. Field is Optional for backward
  compatibility with existing NDJSON records.

Also adds sha256_hex() helper in upload_build_artifact module, reused
by both tools and the MCP handlers.

A separate issue (#381) tracks adding the full staging pattern
(including SHA-256) to upload-workitem-attachment, which currently
reads directly from source_directory without staging.

Also fixes: name-prefix length cap (50 chars), attachment-type
leading-dot guard, comment typo in MCP handler.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: make patch_sha256 a required field on CreatePrResult

Pipelines are versioned with their YAML — no need for backward
compatibility with old NDJSON records. Making the hash optional
created a security hole where a missing hash would skip verification
entirely. Now patch_sha256 is a required String and verification is
unconditional.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: move sha256_hex to src/hash.rs, fix prefix slice panic

- Move sha256_hex() from upload_build_artifact.rs to a dedicated
  src/hash.rs module, removing the coupling between create_pr,
  mcp.rs and the upload_build_artifact module.
- Fix potential panic in name-prefix error path: &prefix[..20]
  can panic on multi-byte UTF-8 boundaries. Use
  prefix.chars().take(20).collect() instead.
- Revert upload_build_artifact module visibility back to private.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add SHA-256 mismatch tests, fix TOCTOU, extract is_valid_artifact_name

- Add non-dry-run SHA-256 mismatch tests for both upload-build-artifact
  and create-pull-request (the mismatch path fires before any HTTP call,
  so no credentials are needed).
- Fix TOCTOU in MCP handler: hash the source bytes before writing to
  the staging directory, instead of re-reading the staged copy.
- Extract is_valid_artifact_name() in validate.rs to decouple artifact
  name validation from version string validation, preventing silent
  breakage if is_valid_version is ever tightened.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: import DEFAULT_MAX_FILE_SIZE in MCP handler

Replace the floating STAGE1_MAX_FILE_SIZE copy with a direct import
of DEFAULT_MAX_FILE_SIZE from upload_build_artifact, ensuring the
Stage 1 cap stays in sync if the default is ever changed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants