Skip to content

Commit 5c33e40

Browse files
feat(audit): add ado-aw audit <build-id-or-url> command (#691)
* feat(audit): add `ado-aw audit <build-id-or-url>` command Single-run audit: download a build's artifacts, run every analyzer (firewall, MCP gateway, OTel, safe outputs, detection verdict, build timeline, missing tools/data/noops), and emit a Markdown or JSON report. ADO-side counterpart to `gh aw audit`. New module tree under `src/audit/`: - `model.rs` — `AuditData` (drift-compatible with gh-aw's top-level contract; adds ADO-specific `detection_analysis`, `safe_output_execution`, `rejected_safe_outputs` sections). - `url.rs` — parses bare IDs, dev.azure.com URLs, legacy visualstudio.com URLs, and on-prem Azure DevOps Server URLs (with optional `&j=`/`&t=`/`&s=` job/step anchors). - `cache.rs` — CLI-version-keyed `run-summary.json` with atomic writes. - `analyzers/{firewall,policy,mcp,otel,safe_outputs,detection,missing,jobs}.rs` — eight defensive NDJSON/REST analyzers. - `findings.rs` — eight heuristic rules emitting severity-rated findings + recommendations. - `render/{console,json}.rs` — two renderers; JSON shape is the public contract. - `cli.rs` — orchestration: URL parse → auth → metadata fetch → artifact download → analyzers → findings → cache → render. Unified rejection trace: when the aggregate `THREAT_DETECTION_RESULT` has any threat flag set, every proposal lands in `not_processed_due_to_aggregate_gate` carrying the aggregate `reasons[]`, exactly one severity-`high` `KeyFinding` is emitted, and a `rejected_safe_outputs` rollup appears at the top level. Pipeline-side runtime additions (so an `ado-aw audit` of an existing build has the data it needs): - `src/data/*-base.yml` (via `AdoAwMarkerExtension`): emits `staging/aw_info.json` at runtime with engine, model, agent name, source path, target, compiler version, and ADO build context. - `src/execute.rs`: writes a per-item `safe-outputs-executed.ndjson` in `<output-dir>` so the audit can show the proposed → detection → executed trace. CLI surface: ado-aw audit <build-id-or-url> -o, --output <dir> # default ./logs --json --org / --project / --pat --artifacts <agent,detection,safe-outputs> --no-cache New dependencies: `zip` (artifact unpack), `wiremock` (dev only — integration test mock server). Tests: 80 new audit unit tests + 3 integration tests against a fake ADO REST server (happy path, permission-denied, cache hit) using a thin `ADO_AW_TEST_ORG_URL` test seam. 1740 total tests pass. Docs: new `docs/audit.md`; updates to `docs/cli.md`, `README.md`, `AGENTS.md` index, and `prompts/debug-ado-agentic-workflow.md` (Step 1 first-move + new Step 2a-prime + `AuditData` reference + jq-diff fallback). Out of scope (explicit follow-ups): diff mode, cross-run trends, `--parse` log.md/firewall.md, job/step-anchored audit, MCP-exposed audit, per-item detection verdict (upstream coordination with gh-aw), partial-approval gating, AWF policy-manifest plumbing, AWF token-usage.jsonl, `audit-manifest.json` build inventory. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(audit): address PR review feedback Three issues raised by the Rust PR Reviewer on #691: 1. **Lexicographic sort wrong for multi-digit run IDs.** Previously `find_artifact_dir` / `find_verdict_path` / `top_level_dirs_with_prefix` picked the "lexicographically last" `<prefix>_<id>` directory, which sorts `_9` after `_10` (because `'9' > '1'`). On a build retry that produced both `analyzed_outputs_9` and `analyzed_outputs_10`, the older verdict would be read and the run could be mis-classified as safe. New `crate::audit::cmp_numeric_suffix` extracts the trailing token after the final `_`, parses it as `u64`, and compares numerically with a lexicographic tie-breaker for non-numeric suffixes. All three call sites now use it. Regression tests added in mod.rs, detection.rs, and cli.rs. 2. **Security: `ADO_AW_TEST_ORG_URL` was always active in production.** The override was `#[doc(hidden)]` but not gated by build mode, so a stray env var (debugging leftover, hostile CI environment) could silently redirect ADO REST calls to an attacker-controlled URL in a release binary. Gated on `cfg(debug_assertions)`: debug builds (`cargo test`, `cargo run`) keep the override AND emit a loud `warn!` on every invocation; release builds (all published artifacts via `cargo build --release`) replace the body with a no-op so a stray env var has no effect. The integration test in `tests/audit_it.rs` continues to work because `cargo test` builds in debug mode. 3. **Blocking `std::fs::read_dir` in async context.** `safe_outputs.rs` had two helpers (`top_level_dirs_with_prefix`, `collect_named_files`) using sync I/O from inside `async fn analyze_safe_outputs`. On a Tokio multi-thread runtime this blocks an executor thread for the duration of the directory walk. Both helpers converted to `async fn` using `tokio::fs::read_dir`. The recursive `collect_named_files` uses `Box::pin` to satisfy the async-recursion shape (consistent with the existing pattern in `crate::detect::scan_directory`). Tests: 1745 unit tests + 3 integration tests pass (up from 1740 — 5 new regression tests for the numeric-suffix bug). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(audit): replace brittle status detection with structural flags Two issues in src/execute.rs that made the executed-NDJSON manifest silently mis-classify entries: 1. is_budget_exhausted parsed the human-readable message ("Skipped...maximum...already reached"). Any phrasing tweak to check_budget would have silently downgraded budget-exhausted records to status: "failed" in every audit log, with no compile-time signal. 2. is_warning() entries (e.g. noop / missing-tool that succeeded without ADO credentials) were emitted as status: "skipped", conflating successful-but-no-op runs with the rejection bucket used by the audit rollup. Fixes: - Add ExecutionResult.budget_exhausted: bool with budget_exhausted() constructor and is_budget_exhausted() accessor. check_budget now emits a structurally-tagged result; execution_record_status keys off the flag. Refactor-safe. - Map warning results to status: "warning" (distinct from "skipped"). Add SafeOutputStatus::Warning; counted toward executed_count, never toward rejected_by_execution_count or the rejection rollup. - Update affected unit tests; add coverage for budget_exhausted serialization round-trip and warning-status analyzer mapping. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(audit): attach auth to artifact downloads and guard against PAT exfil Two issues surfaced by the adversarial review: 1. download_build_artifact discarded its �uth argument src/ado/mod.rs::download_build_artifact took &AdoAuth but called client.get(download_url).send() with no �uth.apply(...) wrapper, silenced by let _ = auth;. For ADO artifact resource types whose downloadUrl is not pre-signed (legacy/Container artifacts and many on-prem ADO Server topologies) the request returned 401/403, and the 401 branch then misleadingly told the user to check their PAT scopes — which had never been sent. Fix: wrap with �uth.apply(...) to match list_build_artifacts / get_build. Pre-signed Artifact Services URLs continue to work because they ignore the Authorization header in favor of their sig= query string. 2. �udit <URL> sent the local PAT to any host in the URL src/audit/url.rs accepted on-prem URLs with an arbitrary host; apply_parsed_context_overrides plumbed that host straight into ctx.org_url; and AdoAuth::apply then attached the PAT via HTTP Basic Auth — to whatever host the URL named. A user social-engineered into running �do-aw audit https://attacker.example.com/Coll/Proj/_build/results?buildId=1 would silently exfiltrate their PAT. Fix: validate_audit_url_host() now runs before any auth-bearing request. Microsoft-managed cloud hosts (dev.azure.com, *.visualstudio.com) are always trusted. Any other host must match the host in the trusted ADO context, which is itself resolved from --org or the local git remote — both explicit, locally controlled trust anchors. The check is host-suffix anchored (rejects .visualstudio.com bare-suffix and �isualstudio.com.attacker.example lookalikes) and case-insensitive. Adds 11 unit tests covering: cloud-host allowlist (positive and case-insensitive), bare-suffix / lookalike rejection, trusted-context match, trusted-context mismatch, no-trusted-context fallback, and host extraction edge cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(audit): decouple trust-anchor resolution from --project requirement Follow-up to ea9c032. The previous fix derived the trusted host from resolve_ado_context(), which requires BOTH --org AND --project when running outside a git repo. That created a UX regression: when a user ran �do-aw audit https://onprem.example.com/coll/proj/_build/results?buildId=42 from an arbitrary folder with --org https://onprem.example.com/coll, the trust anchor still came back as None (because --project wasn't passed), and validate_audit_url_host then rejected the URL host telling the user to "pass --org" — which they already had. Fix: introduce resolve_trusted_host(cwd, org_flag) that derives a host from --org if provided (any form normalize_org_url accepts), else from the git remote of cwd, else None. validate_audit_url_host now takes Option<&str> directly. Full-context resolution still runs afterward for its original purpose (supplying defaults the URL overrides supersede). Adds 4 regression tests, including one for the original failure mode: running in an arbitrary folder with --org alone (no --project) must yield a usable trust anchor. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 9c500e3 commit 5c33e40

34 files changed

Lines changed: 10775 additions & 192 deletions

AGENTS.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -227,7 +227,10 @@ index to jump to the right page.
227227
in `src/data/base.yml`, `src/data/1es-base.yml`, `src/data/job-base.yml`, and `src/data/stage-base.yml` and how it is replaced.
228228
- [`docs/cli.md`](docs/cli.md)`ado-aw` CLI commands (`init`, `compile`,
229229
`check`, `mcp`, `mcp-http`, `execute`, `secrets`, `enable`, `disable`,
230-
`remove`, `list`, `status`, `run`; `configure` is a deprecated hidden alias).
230+
`remove`, `list`, `status`, `run`, `audit`; `configure` is a deprecated hidden alias).
231+
- [`docs/audit.md`](docs/audit.md)`ado-aw audit`: accepted build-id / URL
232+
forms, artifact layout, cache behavior, rejection tracing, and `AuditData`
233+
report shape.
231234
- [`docs/mcp.md`](docs/mcp.md) — MCP server configuration (stdio containers,
232235
HTTP servers, env passthrough).
233236
- [`docs/mcpg.md`](docs/mcpg.md) — MCP Gateway architecture and pipeline

Cargo.lock

Lines changed: 154 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.toml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,8 @@ base64 = "0.22.1"
3535
glob-match = "0.2.1"
3636
similar = "3.1.0"
3737
sha2 = "0.11.0"
38+
zip = { version = "8.6.0", default-features = false, features = ["deflate"] }
3839

3940
[dev-dependencies]
4041
reqwest = { version = "0.12", features = ["blocking"] }
42+
wiremock = "0.6"

README.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -496,6 +496,8 @@ network:
496496

497497
## CLI Reference
498498

499+
- `audit <build-id-or-url>` - Audit a single Azure DevOps build: download artifacts, analyze logs, render Markdown or JSON report. See [`docs/audit.md`](docs/audit.md).
500+
499501
```
500502
ado-aw [OPTIONS] <COMMAND>
501503
@@ -513,6 +515,7 @@ Commands:
513515
list List matched ADO definitions with their latest-run state
514516
status Per-pipeline status block for matched ADO definitions
515517
run Queue builds for matched ADO definitions (optionally poll to completion)
518+
audit Audit a single Azure DevOps build: download artifacts, analyze logs, render a report
516519
517520
Options:
518521
-v, --verbose Enable info-level logging

0 commit comments

Comments
 (0)