Skip to content

feat(safe-outputs): add upload-build-artifact safe output#373

Closed
Copilot wants to merge 2 commits into
mainfrom
copilot/add-attach-build-artifact
Closed

feat(safe-outputs): add upload-build-artifact safe output#373
Copilot wants to merge 2 commits into
mainfrom
copilot/add-attach-build-artifact

Conversation

Copilot AI commented May 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds upload-build-artifact, a safe output that lets an agent attach a workspace file to an arbitrary (typically already-finished) Azure DevOps build via the build attachments REST API. Fills the gap left by upload-artifact (#363), which can only target the current run via ##vso[artifact.upload].

The split mirrors create-pull-request and upload-artifact:

  • Stage 1 (MCP, in sandbox) — validates build_id / artifact_name / file_path, canonicalises the file inside the agent workspace (rejecting symlink escapes and directories), and copies it into output_directory under a generated unique name. The sandbox workspace is gone by Stage 3, so the file has to be staged.
  • Stage 3 (executor) — re-canonicalises the staged file inside working_directory, applies operator policy, then PUTs the bytes to …/_apis/build/builds/{buildId}/attachments/{type}/{name}?api-version=7.1-preview.1.

Files

  • src/safeoutputs/upload_build_artifact.rs — new tool: UploadBuildArtifactParams, UploadBuildArtifactResult (with build_id, artifact_name, file_path, staged_file, file_size and an explicit ::new() constructor), internal UploadBuildArtifactResultFields to satisfy the tool_result! macro, validation reusing is_valid_version / is_safe_path_segment from src/validate.rs, and the Stage 3 REST executor (+23 unit/executor tests).
  • src/safeoutputs/mod.rs — module, re-export, WRITE_REQUIRING_SAFE_OUTPUTS, ALL_KNOWN_SAFE_OUTPUTS, requires-write test.
  • src/execute.rs — dispatcher entry and per-run budget registration.
  • src/mcp.rsupload-build-artifact MCP tool: Stage 1 validation, workspace-bounded canonicalisation, directory rejection, file copy into output_directory, NDJSON write.
  • docs/safe-outputs.md — reference section.

Front-matter shape

safe-outputs:
  upload-build-artifact:
    max-file-size: 52428800              # default: 50 MB
    allowed-extensions: [".png", ".pdf", ".log"]
    allowed-artifact-names: ["agent-*"]  # `*` suffix → prefix match
    allowed-build-ids: [12345, 67890]    # empty = any build the token can reach
    name-prefix: "agent-"
    attachment-type: "agent-artifact"    # used as {type} in the URL; default "agent-artifact"
    max: 3

Notes / future work

  • Single-file only; directory uploads can be layered on later.
  • attachment-type is operator-controlled and re-validated against the artifact-name charset before being interpolated into the URL.
  • The default attachment-type is agent-artifact so the executor's contributions are visually distinguishable from the build's own pipeline-artifact list.

Test plan

  • cargo build
  • cargo test ✅ (all green; 23 new tests covering name/path/build-id validation, config defaults + YAML round-trip, serialization, and Stage 3 executor behaviour against a staged file in a temporary working directory — including allow-list, oversize, and missing-file rejection paths)
  • cargo clippy --all-targets --all-features ✅ (no new warnings on the new file)
  • Code-review feedback addressed: log a warning when the ADO build-attachment response JSON can't be parsed instead of silently swallowing the error.

Copilot AI and others added 2 commits May 1, 2026 10:41
@jamesadevine

Copy link
Copy Markdown
Collaborator

/rust-review

@github-actions

github-actions Bot commented May 1, 2026

Copy link
Copy Markdown
Contributor

Rust PR Reviewer completed successfully!

@github-actions

github-actions Bot commented May 1, 2026

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Well-implemented — the two-phase path canonicalization and defense-in-depth are solid. Two minor issues worth addressing before merge.


Findings

⚠️ Suggestions

  • src/safeoutputs/upload_build_artifact.rs — extension check operates on NDJSON file_path, not the staged file's actual extension

    In execute_impl, allowed-extensions are tested against self.file_path (the agent-supplied value stored in NDJSON):

    let has_valid_ext = config.allowed_extensions.iter().any(|ext| {
        self.file_path.to_lowercase().ends_with(&ext.to_lowercase())
    });

    The NDJSON field is agent-writable (Stage 1), so a malicious agent could write file_path: "report.pdf" while staging a .exe. Stage 2 detection is the safety net here, but checking the staged filename's extension (which Stage 1 derives from file_path and embeds in the staged name) would be strictly tamper-resistant:

    let has_valid_ext = config.allowed_extensions.iter().any(|ext| {
        self.staged_file.to_lowercase().ends_with(&ext.to_lowercase())
    });

    This matches how Stage 1 constructs staged_filename — the extension is already embedded there.

  • src/safeoutputs/upload_build_artifact.rsname_prefix not charset-validated at config parse time

    attachment_type gets an explicit is_valid_version check before URL interpolation (good). name_prefix does not — it's only sanitized by SanitizeConfig (strips control chars), and the combined final_name is validated at runtime instead. This means an operator who configures name-prefix: "my prefix " (with a space) gets an opaque runtime error saying "Resolved artifact name 'my prefix agent-report' is not valid" rather than a clear config-time error. Symmetry with attachment_type suggests validating name_prefix's charset at config load time (or in Validate).

✅ What Looks Good

  • Two-phase path safety — Stage 1 canonicalize + starts_with(bounding_directory) and Stage 3 canonicalize + starts_with(working_directory) are both correct and catch symlink escapes. The ordering (canonicalize first, then prefix-check) is right.
  • staged_file NDJSON value is canonicalize-checked at Stage 3 — even if the NDJSON is tampered with a path traversal value, the starts_with(canonical_base) guard catches it.
  • attachment_type re-validated before URL interpolation — the is_valid_version guard on attachment_type before it's embedded in the URL is solid defense-in-depth.
  • final_name re-validated at Stage 3is_valid_version(&final_name) after prefix concatenation prevents operator-prefix-induced charset violations from reaching the URL.
  • HTTP error handlingresponse.json().await.unwrap_or_else(|e| { warn!(...) }) correctly degrades gracefully rather than panicking on a malformed success response.
  • unwrap() uses are all test-only — production code paths consistently use ?/.context()/anyhow::bail!.

Generated by Rust PR Reviewer for issue #373 · ● 729.2K ·

jamesadevine added a commit that referenced this pull request May 2, 2026
* feat(safe-outputs): add unified upload-build-artifact safe output

Combines the concepts from #363 (upload-artifact, current build) and #373
(upload-build-artifact, arbitrary build) into a single tool that handles
both use cases via the ADO build attachments REST API.

Key design:
- build_id is Optional — omit to target the current pipeline run
  (resolved from BUILD_BUILDID in Stage 3)
- REST API (PUT .../builds/{buildId}/attachments/{type}/{name}) for all
  cases, replacing the ##vso[artifact.upload] logging command approach
- allowed-build-ids check is skipped when targeting the current build
  (implicitly trusted)
- Full Stage 1 → Stage 3 file staging contract: validate, canonicalize,
  bounds-check, stage file with extension preservation, re-canonicalize
  and re-verify in Stage 3

Supersedes #363 and #373.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: move file read after dry-run guard and use Path::extension() for allowlist check

- Move std::fs::read after the dry_run early return to avoid reading
  up to 50 MB into memory only to discard it in dry-run mode.
- Switch extension allowlist from suffix matching on the full path
  to Path::extension() comparison, preventing false matches like
  'catalog' matching allowed-extensions: ['log'].

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: restore cache-memory heading in safe-outputs.md

The heading was accidentally dropped when the upload-build-artifact
section was inserted, leaving the cache-memory paragraph as a stray
note under the wrong section.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: use async file read, add staged-file integrity check, clarify comments

- Switch std::fs::read to tokio::fs::read().await to avoid blocking
  the tokio runtime for files up to 50 MB.
- Add a warn!() when the live staged-file size differs from the size
  recorded at Stage 1, catching potential file tampering between stages.
- Add comments on is_valid_version() call sites noting intentional
  reuse for artifact-name and attachment-type charset validation.
- Add comment on staged filename explaining max-length reasoning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: make staged-file size mismatch a hard failure

A size mismatch between Stage 1 and Stage 3 indicates the staged file
was modified between stages — fail hard rather than warning and
uploading potentially tampered content. Adds a test for the integrity
check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add attachment-type dot guard, Stage 1 size cap, executor tests

- Add starts_with('.') guard to attachment-type validation, matching
  the artifact_name check for consistency.
- Add defense-in-depth file size cap at Stage 1 (MCP) using the
  default 50 MB limit to prevent staging-disk exhaustion before
  Stage 3 enforces the operator's configured limit.
- Add three missing executor tests: extension allowlist rejection,
  artifact-name allowlist rejection, and name-prefix application.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add SHA-256 cross-stage integrity verification

Add SHA-256 hash recording at Stage 1 and verification at Stage 3 for
both file-staging safe outputs:

- upload-build-artifact: staged_sha256 field recorded after copying
  the file, verified before uploading to ADO. Catches same-size file
  swaps that the size-only check would miss.
- create-pull-request: patch_sha256 field recorded after writing the
  patch, verified before applying. Field is Optional for backward
  compatibility with existing NDJSON records.

Also adds sha256_hex() helper in upload_build_artifact module, reused
by both tools and the MCP handlers.

A separate issue (#381) tracks adding the full staging pattern
(including SHA-256) to upload-workitem-attachment, which currently
reads directly from source_directory without staging.

Also fixes: name-prefix length cap (50 chars), attachment-type
leading-dot guard, comment typo in MCP handler.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: make patch_sha256 a required field on CreatePrResult

Pipelines are versioned with their YAML — no need for backward
compatibility with old NDJSON records. Making the hash optional
created a security hole where a missing hash would skip verification
entirely. Now patch_sha256 is a required String and verification is
unconditional.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: move sha256_hex to src/hash.rs, fix prefix slice panic

- Move sha256_hex() from upload_build_artifact.rs to a dedicated
  src/hash.rs module, removing the coupling between create_pr,
  mcp.rs and the upload_build_artifact module.
- Fix potential panic in name-prefix error path: &prefix[..20]
  can panic on multi-byte UTF-8 boundaries. Use
  prefix.chars().take(20).collect() instead.
- Revert upload_build_artifact module visibility back to private.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add SHA-256 mismatch tests, fix TOCTOU, extract is_valid_artifact_name

- Add non-dry-run SHA-256 mismatch tests for both upload-build-artifact
  and create-pull-request (the mismatch path fires before any HTTP call,
  so no credentials are needed).
- Fix TOCTOU in MCP handler: hash the source bytes before writing to
  the staging directory, instead of re-reading the staged copy.
- Extract is_valid_artifact_name() in validate.rs to decouple artifact
  name validation from version string validation, preventing silent
  breakage if is_valid_version is ever tightened.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: import DEFAULT_MAX_FILE_SIZE in MCP handler

Replace the floating STAGE1_MAX_FILE_SIZE copy with a direct import
of DEFAULT_MAX_FILE_SIZE from upload_build_artifact, ensuring the
Stage 1 cap stays in sync if the default is ever changed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants