Commit 37f4160

docs(site): add ado-aw audit reference page (#826)
1 parent 21fa57c commit 37f4160

3 files changed

Lines changed: 163 additions & 0 deletions

site/astro.config.mjs

Lines changed: 1 addition & 0 deletions
@@ -70,6 +70,7 @@ export default defineConfig({
 { label: 'Filter IR', slug: 'reference/filter-ir' },
 { label: 'ado-script', slug: 'reference/ado-script' },
 { label: 'Codemods', slug: 'reference/codemods' },
+{ label: 'Audit', slug: 'reference/audit' },
 ],
 },
 {
Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
---
title: "ado-aw audit"
description: "Audit a completed Azure DevOps agentic pipeline build: download artifacts, run analyzers, and render a structured report."
---

import { Steps } from '@astrojs/starlight/components';

`ado-aw audit` inspects one completed Azure DevOps build at a time. It downloads the three audit artifact families (agent outputs, detection outputs, safe outputs), runs the built-in analyzers (firewall, MCP gateway, OTel, safe outputs, detection verdict, build timeline, and missing-tool / missing-data / noop extraction), and renders a structured console report or the raw `AuditData` JSON.

## Usage

```
ado-aw audit <build-id-or-url> [options]
```

## Accepted input formats

| Input | Example |
|---|---|
| Numeric build ID | `12345` |
| dev.azure.com URL | `https://dev.azure.com/my-org/My%20Project/_build/results?buildId=12345` |
| dev.azure.com URL with job/step anchors | `...?buildId=12345&j=<guid>&t=<guid>` (accepted; the build-level audit still runs) |
| Legacy visualstudio.com URL | `https://my-org.visualstudio.com/proj/_build/results?buildId=12345` |
| On-prem Azure DevOps Server URL | `https://onprem.example.com/DefaultCollection/MyProject/_build/results?buildId=12345` |

URL-encoded project segments are decoded automatically. Both `t=` and `s=` are accepted as step-anchor parameters.
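
As a rough illustration of the input handling described above, the following Python sketch splits a bare ID or a build-results URL into its parts. It is not the tool's actual parser: the function name `parse_build_ref` and the returned dictionary keys are hypothetical, and it assumes the project is always the last path segment before `/_build`, which holds for every example in the table.

```python
import re
from urllib.parse import parse_qs, unquote, urlparse


def parse_build_ref(value: str) -> dict:
    """Split a bare build ID or a build-results URL into its parts (illustrative)."""
    if re.fullmatch(r"[0-9]+", value):
        # Bare ID: org and project have to come from flags or git-remote detection.
        return {"build_id": int(value), "host": None, "project": None, "step": None}

    url = urlparse(value)
    query = parse_qs(url.query)
    if "/_build" not in url.path or "buildId" not in query:
        raise ValueError(f"not a build results URL: {value!r}")

    # Path segments before /_build, percent-decoded; the last one is the project.
    segments = [unquote(s) for s in url.path.split("/_build", 1)[0].split("/") if s]
    if not segments:
        raise ValueError(f"no project segment in URL: {value!r}")

    # Both t= and s= are accepted as step anchors.
    step = (query.get("t") or query.get("s") or [None])[0]
    return {
        "build_id": int(query["buildId"][0]),
        "host": url.netloc,
        "project": segments[-1],
        "step": step,
    }
```

Under these assumptions, the dev.azure.com example from the table yields the project `My Project`, and a URL with job or step anchors still resolves to the same build ID.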

## Flags

| Flag | Default | Behavior |
|---|---|---|
| `-o, --output <dir>` | `./logs` | Directory under which `<dir>/build-<id>/` is written. |
| `--json` | off | Emit the full `AuditData` as JSON to stdout. Suppresses the trailing `Audit complete` stderr line. |
| `--org <url>` | auto | ADO organization override for bare build IDs. Full build URLs supply this directly. |
| `--project <name>` | auto | ADO project override for bare build IDs. Full build URLs supply this directly. |
| `--pat <token>` | env | Personal Access Token. Also reads `AZURE_DEVOPS_EXT_PAT`. Falls back to the Azure CLI auth chain when omitted. |
| `--artifacts <set,...>` | all | Restrict download + analysis to a subset. Valid values: `agent`, `detection`, `safe-outputs` (`safe_outputs` is also accepted). |
| `--no-cache` | off | Force re-processing even if `<dir>/build-<id>/run-summary.json` already exists. |

## Behavior
41+
42+
- **Input resolution.** Bare IDs use `--org` / `--project` or git-remote auto-detection. Full build URLs contribute host, org, and project — those URL-derived values win over CLI flags.
43+
- **Artifact scope.** Only `agent_outputs*`, `analyzed_outputs*`, and `safe_outputs*` are fetched. All other published build artifacts are ignored.
44+
- **Artifact refresh.** If a local artifact directory already exists, it is renamed aside before re-download and restored if the download fails — no data is lost on a network error.
45+
- **Analyzer failures are soft.** The command records a warning, keeps any successfully-derived sections, and still renders the report.
46+
- **Multiple directories.** When multiple local directories share one recognized prefix, the lexicographically last match wins.
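
The refresh and multiple-directory rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the tool's implementation: the helper names, the `download` callback, and the hidden `.<name>.old` backup name are all hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def pick_artifact_dir(build_dir: Path, prefix: str) -> Optional[Path]:
    """Return the lexicographically last directory whose name starts with prefix."""
    names = sorted(
        p.name for p in build_dir.iterdir() if p.is_dir() and p.name.startswith(prefix)
    )
    return build_dir / names[-1] if names else None


def refresh_artifact(target: Path, download: Callable[[Path], None]) -> None:
    """Re-download into target, restoring the previous copy if the download fails."""
    # Hypothetical backup name; the leading dot keeps it from matching a prefix.
    backup = target.with_name("." + target.name + ".old")
    had_previous = target.exists()
    if had_previous:
        if backup.exists():
            shutil.rmtree(backup)
        target.rename(backup)  # move the existing copy aside
    try:
        download(target)
    except Exception:
        # Discard any partial download, put the old copy back, then re-raise.
        shutil.rmtree(target, ignore_errors=True)
        if had_previous:
            backup.rename(target)
        raise
    if had_previous:
        shutil.rmtree(backup)  # the new copy is in place; drop the old one
```

Renaming aside rather than deleting first is what makes the failure path lossless: the previous directory is only removed once the new download has completed.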

## Output layout

```
<output>/build-<id>/
├── run-summary.json                  # Cached AuditData, CLI-version-keyed
├── agent_outputs[_<BuildId>]/        # Agent stage artifacts
│   ├── staging/
│   │   ├── safe_outputs.ndjson       # Agent's safe-output proposals
│   │   ├── aw_info.json              # Runtime engine / agent / source metadata
│   │   └── otel.jsonl                # Copilot OTel (when emitted)
│   └── logs/
│       ├── firewall/                 # AWF Squid proxy logs
│       ├── mcpg/                     # MCP Gateway logs
│       ├── safeoutputs.log           # SafeOutputs HTTP server log
│       └── agent-output.txt          # Filtered agent stdout
├── analyzed_outputs[_<BuildId>]/     # Detection stage artifacts
│   ├── threat-analysis.json          # Aggregate verdict + reasons
│   └── threat-analysis-output.txt
└── safe_outputs[_<BuildId>]/         # SafeOutputs stage artifacts
    └── safe-outputs-executed.ndjson  # Per-item execution log
```

`aw_info.json`, `otel.jsonl`, and `safe_outputs.ndjson` are searched in `staging/` first, then at the artifact top level, so older artifact layouts still audit cleanly.
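
The staging-first lookup can be pictured with a short Python sketch. The function name `find_artifact_file` is hypothetical; only the search order (first `staging/`, then the artifact top level) comes from the text above.

```python
from pathlib import Path
from typing import Optional


def find_artifact_file(artifact_dir: Path, name: str) -> Optional[Path]:
    """Look for name under staging/ first, then at the artifact top level."""
    for candidate in (artifact_dir / "staging" / name, artifact_dir / name):
        if candidate.is_file():
            return candidate
    return None  # not present in either layout
```

With this order, a current-layout artifact resolves to `staging/aw_info.json`, while an older artifact that only has a top-level `aw_info.json` still resolves.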

## Report shape (`AuditData`)

Optional sections are omitted from `--json` output when empty.

| Key | Source |
|---|---|
| `overview` | ADO build metadata + `aw_info.json` (engine, model, agent name, source, target). |
| `task_domain` | Audit heuristics over the run's prompts and outputs. |
| `behavior_fingerprint` | Higher-level heuristics over the run's behavior patterns. |
| `agentic_assessments` | Higher-level assessments emitted by the analyzers. |
| `metrics` | OTel JSONL (`otel.jsonl`) plus audit-time warning/error counts. |
| `key_findings` | Heuristic rules + analyzer findings (e.g. aggregate-gate rejection). |
| `recommendations` | Follow-up actions derived from findings. |
| `performance_metrics` | Derived from `metrics`, runtime duration, tool usage, and firewall counts. |
| `engine_config` | Runtime engine configuration from `aw_info.json`. |
| `safe_output_summary` | Counts of proposed / executed / rejected / not-processed items. |
| `safe_output_execution` | Per-item trace joining proposal + detection + execution. |
| `rejected_safe_outputs` | Rollup of rejections by reason/threat flag. |
| `detection_analysis` | Contents of `threat-analysis.json`. |
| `mcp_server_health` | MCPG logs aggregated per server. |
| `mcp_tool_usage` | MCPG logs aggregated per `(server, tool)`. |
| `mcp_failures` | MCPG `tool_error` / `server_error` events. |
| `jobs` | ADO `/timeline` records filtered to `type: Job`. |
| `firewall_analysis` | AWF Squid proxy logs aggregated by domain. |
| `policy_analysis` | AWF policy artifacts aggregated into allow/deny summaries. |
| `missing_tools` / `missing_data` / `noops` | NDJSON entries from the corresponding SafeOutputs MCP tools. |
| `downloaded_files` | One entry per file under `<output>/build-<id>/`. |
| `errors` / `warnings` | Run-level error/warning aggregates. |
| `tool_usage` | High-level tool-usage rollups derived from telemetry. |
| `created_items` | Successfully executed items with extracted id/url/title. |
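
Because empty optional sections are omitted rather than emitted as empty values, a consumer of the `--json` output should not index keys unconditionally. The Python sketch below shows one defensive way to read it. The helper `split_sections` is hypothetical, and it treats each section's value as opaque, since the inner shapes are not documented here.

```python
import json
from typing import Dict, List, Tuple


def split_sections(raw_json: str, wanted: List[str]) -> Tuple[Dict[str, object], List[str]]:
    """Return the requested sections that are present, and the keys that were omitted."""
    audit = json.loads(raw_json)
    present = {key: audit[key] for key in wanted if key in audit}
    omitted = [key for key in wanted if key not in audit]
    return present, omitted
```

A typical use would be to capture the report with `ado-aw audit <build-id-or-url> --json > audit.json`, read the file, and pass its text to the helper along with the keys of interest.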

## Rejected safe-output trace

When `threat-analysis.json` reports any threat flag, the audit treats the entire SafeOutputs batch as rejected by the aggregate gate and records each proposal with:

- `status: not_processed_due_to_aggregate_gate`
- `applies_to_whole_batch: true`
- `rejection_reason`: the aggregate `reasons[]` from `threat-analysis.json`, joined with `; `

One severity-`high` finding is also emitted summarizing the gate decision: which threat flags fired, how many proposals were dropped, and the full aggregate reasons.
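
The gate logic described above can be sketched in Python. Several details are assumptions made for illustration: the function name, the `flags` mapping of flag name to boolean inside the parsed `threat-analysis.json`, and the field names of the finding other than its severity. Only the three per-proposal fields and the `; ` join come from the text.

```python
from typing import Dict, List, Optional, Tuple

GATE_STATUS = "not_processed_due_to_aggregate_gate"


def apply_aggregate_gate(
    proposals: List[dict], analysis: dict
) -> Tuple[List[dict], Optional[Dict[str, object]]]:
    """Mark every proposal as dropped when any threat flag fired (illustrative)."""
    # Assumed shape: {"flags": {"<flag name>": bool, ...}, "reasons": [str, ...]}
    fired = sorted(name for name, hit in analysis.get("flags", {}).items() if hit)
    if not fired:
        return [], None  # gate did not trigger; nothing is rejected here

    reason = "; ".join(analysis.get("reasons", []))
    trace = [
        {
            "proposal": proposal,
            "status": GATE_STATUS,
            "applies_to_whole_batch": True,
            "rejection_reason": reason,
        }
        for proposal in proposals
    ]
    # One high-severity finding summarizing the whole decision (field names assumed).
    finding = {
        "severity": "high",
        "threat_flags": fired,
        "dropped": len(trace),
        "reasons": reason,
    }
    return trace, finding
```

The point of the sketch is that a single fired flag rejects the whole batch, so every proposal carries the same `rejection_reason` regardless of which item triggered detection.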

:::note[Per-item verdicts]
`threat-analysis.json` currently emits an aggregate verdict only. Per-item detection verdicts are a planned follow-up.
:::

## Cache behavior

`<output>/build-<id>/run-summary.json` is written after each successful run.

| Scenario | Behavior |
|---|---|
| Cached `ado_aw_version` matches current CLI | Report rendered from cache; download/analysis skipped. |
| Cache missing, unparseable, or from a different version | Cache ignored; build reprocessed from scratch. |
| `--no-cache` passed | Always reprocesses. |

The cache-hit info line is printed only in console mode (not with `--json`).
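
The three scenarios in the table reduce to a small decision, sketched here in Python. The function name and signature are hypothetical, and the sketch assumes `ado_aw_version` is a top-level key of `run-summary.json`, which the text does not state explicitly.

```python
import json
from pathlib import Path
from typing import Optional


def load_cached_audit(build_dir: Path, cli_version: str, no_cache: bool = False) -> Optional[dict]:
    """Return the cached report if it is usable, else None to signal reprocessing."""
    if no_cache:
        return None  # --no-cache always reprocesses
    try:
        data = json.loads((build_dir / "run-summary.json").read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None  # missing or unparseable cache is ignored
    if not isinstance(data, dict) or data.get("ado_aw_version") != cli_version:
        return None  # written by a different CLI version
    return data
```

Keying the cache on the CLI version means an upgrade that changes the analyzers never serves a stale report.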

## Permission failures

- The initial build-metadata fetch is live ADO only. A 401/403 at this step is fatal.
- If artifact listing or download returns 401/403 and at least one recognized artifact family exists locally, the audit continues from local cache and records a warning.
- If artifact listing or download returns 401/403 and no local cache exists, the command emits a structured error pointing at the manual escape hatch:

```bash
az pipelines runs artifact download --run-id <id> --artifact-name <name> --path <dir>
```
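
The three permission cases above can be summarized as a small decision function. This Python sketch is illustrative only: the `Outcome` enum, the step names, and the function name are hypothetical, and it models only the 401/403 paths that the list describes.

```python
from enum import Enum


class Outcome(Enum):
    FATAL = "fatal"
    CONTINUE_FROM_LOCAL = "continue_from_local"
    STRUCTURED_ERROR = "structured_error"


def on_permission_failure(step: str, has_local_artifacts: bool) -> Outcome:
    """Map a 401/403 at a given step to the behavior described above."""
    if step == "metadata":
        # The build-metadata fetch has no local fallback.
        return Outcome.FATAL
    if step in ("artifact_list", "artifact_download"):
        if has_local_artifacts:
            return Outcome.CONTINUE_FROM_LOCAL  # audit proceeds and records a warning
        return Outcome.STRUCTURED_ERROR  # points at the manual download command
    raise ValueError(f"unknown step: {step!r}")
```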

## Related

- [CLI Commands](/ado-aw/setup/cli/): full CLI reference
- [Safe Outputs](/ado-aw/reference/safe-outputs/): what agent proposals look like
- [Network](/ado-aw/reference/network/): AWF firewall configuration
- [ado-aw-debug](/ado-aw/reference/ado-aw-debug/): debug-only front-matter knobs

site/src/content/docs/setup/cli.mdx

Lines changed: 18 additions & 0 deletions
@@ -209,6 +209,24 @@ Options:
 - `--org`, `--project`, `--pat` -- same as `enable`
 - `--dry-run` -- preview the planned queue body without calling the ADO API
 
+### `audit <build-id-or-url>`
+
+Audit one completed Azure DevOps agentic pipeline build. Downloads the three audit artifact families (agent outputs, detection outputs, safe outputs), runs the built-in analyzers, and renders a structured console report.
+
+```bash
+ado-aw audit <build-id-or-url> [--json] [--output <dir>] [--artifacts <set,...>] [--no-cache]
+```
+
+Options:
+
+- `--json` -- emit the full `AuditData` as JSON to stdout instead of the console report
+- `-o, --output <dir>` -- local directory for downloaded artifacts and the cached report (default: `./logs`)
+- `--artifacts <set,...>` -- restrict download to `agent`, `detection`, and/or `safe-outputs`
+- `--no-cache` -- re-process even when a cached `run-summary.json` already exists
+- `--org`, `--project`, `--pat` -- same as `enable`
+
+See the [Audit reference](/ado-aw/reference/audit/) for accepted URL formats, report shape, cache behavior, and permission failure handling.
+
 ## Internal / pipeline runtime commands
 
 These commands are used by the compiled pipeline itself and are not typically called by users directly.
