Commit 69a634b

jamesadevine, Copilot, and Copilot authored
feat(safeoutputs): add ado-aw-debug.create-issue for dogfood pipelines (#492)
* feat(safeoutputs): add ado-aw-debug.create-issue for dogfood pipelines

Adds a new top-level `ado-aw-debug:` front-matter section that gates two debug-only knobs intended for dogfooding ado-aw pipelines from Azure DevOps back into `githubnext/ado-aw`:

* `skip-integrity: bool` - OR-ed with the existing `--skip-integrity` CLI flag.
* `create-issue:` - files a GitHub issue against an operator-configured target repository when the agent calls the `create-issue` MCP tool.

The `create-issue` tool is **not** a regular safe output. It is default-deny at three independent layers:

1. The SafeOutputs MCP filter strips it from the tool router via a new `DEBUG_ONLY_TOOLS` constant unless explicitly enabled.
2. The compiler only emits `--enabled-tools create-issue` when `ado-aw-debug.create-issue:` is set, and rejects `safe-outputs.create-issue:` outright so the tool can't be smuggled in via the regular safe-outputs surface.
3. Stage 3 maintains an `ExecutionContext.debug_enabled_tools` set populated only from `ado-aw-debug:`. The executor refuses any `create-issue` NDJSON entry whose tool name is absent from the set, closing the gap where a forged entry could otherwise bypass the MCP-layer gate.

Stage 3 authenticates against GitHub using a dedicated `ADO_AW_DEBUG_GITHUB_TOKEN` ADO pipeline variable surfaced through a new `github_token` field on `ExecutionContext`. The token is separate from the read-only `GITHUB_TOKEN` the agent sees in Stage 1.

Other notable design choices:

* `target-repo` is operator-only; the agent has no parameter to redirect issues elsewhere.
* `allowed-labels` is **default-deny**: empty/absent rejects every agent-supplied label. Operators must opt in to unrestricted with `["*"]`.
* The `target-repo` validator follows GitHub's login spec (no underscores or dots in owner segments).
* Final issue title length is validated **after** `title-prefix` application.
* Stage 3 error messages neutralise `##vso[…]` sequences in agent-supplied labels so a forged NDJSON entry can't echo a live pipeline command into stdout.

Adds 30+ targeted tests across `safeoutputs::create_issue`, `mcp::tests`, `compile::common`, `compile::types`, and a fixture-based integration test asserting the YAML wiring. All 1356 dev-mode tests pass; clippy net-new errors: 0.

See `docs/ado-aw-debug.md` for the full schema, security framing, and PAT setup instructions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(debug): gate create-issue enablement on successful config serialization

Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/10030c23-a74f-4f97-b828-1242747b8ada

Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
1 parent dcbdc29 commit 69a634b

18 files changed

Lines changed: 1893 additions & 46 deletions

AGENTS.md

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -198,6 +198,9 @@ index to jump to the right page.
198198
- [`docs/safe-outputs.md`](docs/safe-outputs.md) — full reference for every
199199
safe-output tool agents can use to propose actions (PRs, work items, wiki
200200
pages, comments, etc.) plus their per-agent configuration.
201+
- [`docs/ado-aw-debug.md`](docs/ado-aw-debug.md) — debug-only `ado-aw-debug:`
202+
front-matter section (`skip-integrity`, `create-issue` for filing GitHub
203+
issues from dogfood pipelines). NOT a regular safe-output.
201204

202205
### Compiler internals & operations
203206

docs/ado-aw-debug.md

Lines changed: 195 additions & 0 deletions
@@ -0,0 +1,195 @@
1+
# `ado-aw-debug:` — Debug-only front-matter section
2+
3+
_Part of the [ado-aw documentation](../AGENTS.md)._
4+
5+
> ⚠️ **This section is for dogfood pipelines only.**
6+
> Anything declared under `ado-aw-debug:` is **not** part of the regular
7+
> agent surface and is not recommended for general use. Knobs here exist so
8+
> the team can validate `githubnext/ado-aw` changes against real Azure
9+
> DevOps pipelines and file failures back to GitHub for triage. Each knob
10+
> bypasses or weakens a normal safety control.
11+
12+
The compiler accepts a top-level `ado-aw-debug:` block in agent front
13+
matter. Currently exposed knobs:
14+
15+
| Knob | Purpose | Default |
16+
| --- | --- | --- |
17+
| `skip-integrity` | Omit the "Verify pipeline integrity" step from the generated YAML. OR-ed with the `--skip-integrity` CLI flag. | `false` |
18+
| `create-issue` | Enable the [debug-only `create-issue`](#create-issue) safe output. | absent (disabled) |
19+
20+
Unrecognised keys under `ado-aw-debug:` cause a compile-time error
21+
(`#[serde(deny_unknown_fields)]`).
22+
23+
## `create-issue`
24+
25+
Files a GitHub issue against an operator-configured target repository.
26+
Used to surface failures from ADO-hosted dogfood pipelines back to
27+
`githubnext/ado-aw` for triage.
28+
29+
### Why it's gated
30+
31+
`create-issue` is **default-deny** at three layers:
32+
33+
1. **MCP layer.** The SafeOutputs MCP server lists `create-issue` in
34+
[`DEBUG_ONLY_TOOLS`][debug-only-tools], so the route is removed from
35+
the tool router unless the compiler explicitly opts in via
36+
`--enabled-tools`.
37+
2. **Compiler layer.** `--enabled-tools create-issue` is only emitted
38+
when `ado-aw-debug.create-issue:` is present in front matter. The
39+
compiler also rejects `safe-outputs.create-issue:` outright, so the
40+
tool can't be smuggled in via the regular safe-outputs surface.
41+
3. **Executor layer.** Stage 3 maintains a separate
42+
`ExecutionContext.debug_enabled_tools` set populated only from
43+
`ado-aw-debug:`. The executor refuses any NDJSON `create-issue`
44+
entry that isn't in that set, so a forged or smuggled NDJSON entry
45+
fails closed before any token is read.
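
The executor-layer check in item 3 can be sketched roughly as follows. This is an illustration only: the `ExecutionContext` struct is a pared-down stand-in, `check_debug_gate` is a hypothetical helper name, and the contents of `DEBUG_ONLY_TOOLS` are assumed from the description above rather than copied from `src/safeoutputs/mod.rs`.

```rust
use std::collections::HashSet;

/// Tools that stay disabled unless the debug section enables them.
/// The name mirrors the docs; the contents are an assumption.
const DEBUG_ONLY_TOOLS: &[&str] = &["create-issue"];

/// Hypothetical, pared-down stand-in for the Stage 3 execution context.
struct ExecutionContext {
    /// Populated only from `ado-aw-debug:` in front matter.
    debug_enabled_tools: HashSet<String>,
}

/// Returns Ok(()) when an NDJSON entry for `tool` may run, Err with a reason otherwise.
fn check_debug_gate(ctx: &ExecutionContext, tool: &str) -> Result<(), String> {
    if DEBUG_ONLY_TOOLS.contains(&tool) && !ctx.debug_enabled_tools.contains(tool) {
        // Fail closed: this runs before any token is read.
        return Err(format!("tool '{tool}' is debug-only and was not enabled"));
    }
    Ok(())
}

fn main() {
    let disabled = ExecutionContext { debug_enabled_tools: HashSet::new() };
    let enabled = ExecutionContext {
        debug_enabled_tools: HashSet::from(["create-issue".to_string()]),
    };
    println!("{:?}", check_debug_gate(&disabled, "create-issue"));
    println!("{:?}", check_debug_gate(&enabled, "create-issue"));
}
```

The point of keeping a separate set is that a forged NDJSON line is rejected even if the MCP-layer filter was never consulted.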
46+
47+
[debug-only-tools]: ../src/safeoutputs/mod.rs
48+
49+
### Front-matter schema
50+
51+
```yaml
ado-aw-debug:
  create-issue:
    target-repo: githubnext/ado-aw      # REQUIRED. Operator-only; agent has no override.
    title-prefix: "[pipeline-failure] " # Optional; prepended to every agent title.
    labels:                             # Optional; static labels always applied.
      - pipeline-failure
      - automated
    allowed-labels:                     # Optional; default-deny (see below).
      - "agent-*"
      - "pipeline-failure"
    assignees:                          # Optional; static assignees always applied.
      - "jamesdevine"
    max: 3                              # Optional; per-run budget. Default 1.
```
66+
67+
* **`target-repo`** is required. Format `owner/repo`. The agent has no
68+
parameter to override it; you cannot redirect issues to a different
69+
repository at runtime.
70+
* **`title-prefix`** is prepended at execution time. The final title length
71+
(prefix + agent title) must be ≤ 256 characters; longer titles fail at
72+
Stage 3.
73+
* **`labels`** are applied unconditionally to every issue, on top of any
74+
agent-supplied labels that pass `allowed-labels`.
75+
* **`allowed-labels`** is **default-deny**: an empty or absent list means
76+
**no agent-supplied labels are accepted**. To accept any agent label,
77+
set `allowed-labels: ["*"]` explicitly. Patterns may include `*`
78+
wildcards (e.g. `"agent-*"`).
79+
* **Allowed-label matching is case-insensitive.** It uses the same
80+
`tag_matches_pattern` helper as ADO tag allow-lists. GitHub labels are
81+
case-sensitive, so `allowed-labels: ["safe"]` will also admit
82+
`SAFE` and `Safe` — keep that in mind when modelling policy.
83+
* **`assignees`** are merged with agent-supplied assignees. There is
84+
intentionally no `allowed-assignees` allowlist in v1; if you need
85+
one, configure assignees only via the static `assignees:` list and
86+
skip the agent parameter.
87+
* **`max`** controls per-run budget the same way it does for other
88+
safe-output tools.
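
The `allowed-labels` and `title-prefix` rules above can be sketched as follows. This is a minimal illustration, not the project's code: `label_matches`, `label_allowed`, and `final_title` are hypothetical names (the real matcher is `tag_matches_pattern`, whose implementation is not shown here), and counting the 256-character limit in Unicode scalar values is an assumption.

```rust
/// Assumed limit from the schema notes: prefix + agent title must fit in 256 characters.
const MAX_TITLE_CHARS: usize = 256;

/// Case-insensitive match where `*` matches any run of characters (hypothetical helper).
fn label_matches(pattern: &str, label: &str) -> bool {
    let p: Vec<char> = pattern.to_lowercase().chars().collect();
    let l: Vec<char> = label.to_lowercase().chars().collect();
    let (mut pi, mut li) = (0, 0);
    // Position of the most recent `*` and the label index it was tried at.
    let mut star: Option<(usize, usize)> = None;
    while li < l.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, li));
            pi += 1;
        } else if pi < p.len() && p[pi] == l[li] {
            pi += 1;
            li += 1;
        } else if let Some((sp, sl)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            li = sl + 1;
            star = Some((sp, sl + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain unmatched in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

/// Default-deny: an empty allow-list admits nothing.
fn label_allowed(allowed: &[&str], label: &str) -> bool {
    allowed.iter().any(|p| label_matches(p, label))
}

/// Prepends the optional prefix, then validates the final length.
fn final_title(prefix: Option<&str>, agent_title: &str) -> Result<String, String> {
    let title = format!("{}{}", prefix.unwrap_or(""), agent_title);
    let len = title.chars().count();
    if len > MAX_TITLE_CHARS {
        return Err(format!("title is {} characters, limit is {}", len, MAX_TITLE_CHARS));
    }
    Ok(title)
}

fn main() {
    let allowed = ["agent-*", "pipeline-failure"];
    for label in ["Agent-Flaky", "pipeline-failure", "security"] {
        println!("{label}: {}", label_allowed(&allowed, label));
    }
    println!("{:?}", final_title(Some("[pipeline-failure] "), "Build broke on main"));
}
```

Validating the length after the prefix is applied means a long operator prefix reduces the room left for the agent's title.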
89+
90+
### Agent-supplied parameters
91+
92+
The agent calls the `create-issue` MCP tool with:
93+
94+
```jsonc
{
  "title": "Pipeline failure on main",
  "body": "<markdown body, ≥ 30 chars>",
  "labels": ["pipeline-failure"],   // optional
  "assignees": ["copilot"]          // optional
}
```
102+
103+
The MCP-side `Validate` impl rejects ADO pipeline-command sequences in
104+
labels and assignees (see [src/safeoutputs/create_issue.rs](../src/safeoutputs/create_issue.rs)).
105+
Stage 3 also neutralises `##vso[…]` in any error messages it produces, so
106+
agent-supplied content cannot escape the executor's stdout.
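
As a rough sketch of the two behaviours described here (reject on the MCP side, neutralise in Stage 3 error output), assuming only the literal `##vso[` prefix needs handling. Both function names and the replacement text are hypothetical; the actual implementation in `create_issue.rs` may use a different encoding.

```rust
/// Validator-style check: does the value contain an ADO logging-command prefix?
/// Only the literal `##vso[` form is considered in this sketch.
fn contains_pipeline_command(value: &str) -> bool {
    value.contains("##vso[")
}

/// Break up `##vso[` so text echoed to stdout is no longer a live command.
/// Inserting a space between the two `#` characters is an assumed encoding.
fn neutralise_vso(input: &str) -> String {
    input.replace("##vso[", "# #vso[")
}

fn main() {
    let label = "bug ##vso[task.setvariable variable=x]1";
    if contains_pipeline_command(label) {
        // The offending value can still be reported, but only in neutralised form.
        println!("rejected label: {}", neutralise_vso(label));
    }
}
```

Rejecting at validation time and neutralising at print time are complementary: the second still protects stdout when an NDJSON entry never passed through the MCP validator.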
107+
108+
### Pipeline variable: `ADO_AW_DEBUG_GITHUB_TOKEN`
109+
110+
Stage 3 authenticates against GitHub using the
111+
**`ADO_AW_DEBUG_GITHUB_TOKEN`** ADO pipeline variable. The compiler emits
112+
113+
```yaml
env:
  SYSTEM_ACCESSTOKEN: $(SC_WRITE_TOKEN)                      # only when permissions.write is set
  ADO_AW_DEBUG_GITHUB_TOKEN: $(ADO_AW_DEBUG_GITHUB_TOKEN)    # only when ado-aw-debug.create-issue is set
```
118+
119+
into the executor step's `env:` block. The token is **not** exposed to
120+
the agent in Stage 1 — the read-only `GITHUB_TOKEN` the agent sees is a
121+
separate variable wired through `engine.env` and used only for GitHub
122+
MCP read access.
123+
124+
### Setting up the PAT
125+
126+
1. **Generate a fine-grained PAT** scoped to **only** `target-repo` (e.g.
127+
`githubnext/ado-aw`). Required permissions:
128+
* Repository access: only the target repo.
129+
* Permissions: **Issues** = Read and write. Nothing else.
130+
2. **Store as a secret pipeline variable** named exactly
131+
`ADO_AW_DEBUG_GITHUB_TOKEN`. Mark it secret. Do **not** copy it into
132+
`engine.env` or any non-secret variable.
133+
3. **Confirm the operator-configured target-repo matches the PAT scope.**
134+
The compiler validator only checks shape (`owner/repo`); it cannot
135+
verify the PAT has access. If the PAT lacks Issues:write, the Stage 3
136+
call fails with the GitHub API error and Stage 3 reports
137+
`succeeded with issues`.
138+
4. `ado-aw configure` does **not** automate this variable today — set it
139+
manually in the ADO pipeline definition.
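
The shape-only check mentioned in step 3 might look roughly like the sketch below. The function names are hypothetical and the character rules are assumptions based on GitHub's commonly documented naming constraints (owner: alphanumerics and hyphens, no leading or trailing hyphen, at most 39 characters; repository: alphanumerics, `-`, `_`, `.`). The compiler's actual validator may differ.

```rust
/// Owner segment: no underscores or dots, per the assumed GitHub login rules.
fn is_valid_owner(owner: &str) -> bool {
    !owner.is_empty()
        && owner.len() <= 39
        && !owner.starts_with('-')
        && !owner.ends_with('-')
        && owner.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
}

/// Repository segment: a slightly wider character set than the owner.
fn is_valid_repo_name(repo: &str) -> bool {
    !repo.is_empty()
        && repo != "."
        && repo != ".."
        && repo.chars().all(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.'))
}

/// Checks shape only; it cannot tell whether the PAT can reach the repository.
fn validate_target_repo(value: &str) -> Result<(&str, &str), String> {
    match value.split_once('/') {
        Some((owner, repo)) if is_valid_owner(owner) && is_valid_repo_name(repo) => {
            Ok((owner, repo))
        }
        _ => Err(format!("target-repo must look like owner/repo, got '{value}'")),
    }
}

fn main() {
    for candidate in ["githubnext/ado-aw", "github_next/ado-aw", "ado-aw"] {
        println!("{candidate}: {:?}", validate_target_repo(candidate));
    }
}
```

Because a check like this is purely syntactic, a well-formed `target-repo` that the PAT cannot write to is only discovered when the Stage 3 API call fails.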
140+
141+
### Auto-footer
142+
143+
Every issue gets an auto-appended traceability footer that looks like:
144+
145+
```markdown
<!-- ado-aw -->
---
Pipeline: `dogfood-failure-reporter`
Run: <https://dev.azure.com/myorg/MyProject/_build/results?buildId=42>
Trigger: `Manual`
```
152+
153+
The `<!-- ado-aw -->` marker is stable so that future tooling can locate
154+
the generated content without parsing prose. The footer is built from
155+
`BUILD_BUILDID`, `BUILD_DEFINITIONNAME`, `BUILD_REASON`,
156+
`SYSTEM_TEAMFOUNDATIONCOLLECTIONURI` and `SYSTEM_TEAMPROJECT` — these are
157+
present whenever Stage 3 runs inside an ADO pipeline.
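
A minimal sketch of how such a footer could be assembled from those variables. The `build_footer` name and the exact URL construction are assumptions inferred from the example footer, and a map stands in for the process environment so the logic is easy to exercise; project names containing spaces would additionally need URL encoding, which is omitted here.

```rust
use std::collections::HashMap;

/// Builds the traceability footer, or None when any required variable is missing.
fn build_footer(env: &HashMap<&str, &str>) -> Option<String> {
    // The collection URI usually ends with '/', so trim it before joining.
    let collection = env.get("SYSTEM_TEAMFOUNDATIONCOLLECTIONURI")?.trim_end_matches('/');
    let project = env.get("SYSTEM_TEAMPROJECT")?;
    let build_id = env.get("BUILD_BUILDID")?;
    let definition = env.get("BUILD_DEFINITIONNAME")?;
    let reason = env.get("BUILD_REASON")?;
    Some(format!(
        "<!-- ado-aw -->\n---\nPipeline: `{definition}`\nRun: <{collection}/{project}/_build/results?buildId={build_id}>\nTrigger: `{reason}`\n"
    ))
}

fn main() {
    let env = HashMap::from([
        ("SYSTEM_TEAMFOUNDATIONCOLLECTIONURI", "https://dev.azure.com/myorg/"),
        ("SYSTEM_TEAMPROJECT", "MyProject"),
        ("BUILD_BUILDID", "42"),
        ("BUILD_DEFINITIONNAME", "dogfood-failure-reporter"),
        ("BUILD_REASON", "Manual"),
    ]);
    match build_footer(&env) {
        Some(footer) => print!("{footer}"),
        None => eprintln!("not running inside an ADO pipeline"),
    }
}
```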
158+
159+
If your pipeline / org / project names are sensitive, do not enable
160+
`create-issue` against a public repo.
161+
162+
### Security checklist
163+
164+
- [ ] Target repo's GitHub PAT is scoped to that repo only and only has
165+
Issues:write.
166+
- [ ] `ADO_AW_DEBUG_GITHUB_TOKEN` is stored as a secret pipeline
167+
variable, never hard-coded or printed.
168+
- [ ] `allowed-labels` is set explicitly. Empty means default-deny;
169+
`["*"]` accepts any agent label — pick deliberately.
170+
- [ ] `target-repo` is private if the agent's prompts or pipeline
171+
metadata are sensitive (the auto-footer publishes ADO run URLs and
172+
pipeline names).
173+
- [ ] `skip-integrity` is **not** enabled in pipelines triggered by
174+
untrusted PRs.
175+
176+
## `skip-integrity`
177+
178+
Equivalent to passing `--skip-integrity` on the `ado-aw compile` CLI.
179+
Setting either one, or both, omits the `Verify pipeline integrity`
180+
step from the generated YAML.
181+
182+
The integrity step downloads the same `ado-aw` binary the pipeline was
183+
compiled with and runs `ado-aw check` against the committed pipeline
184+
file. Without it, a tampered `*.yml` won't be caught at run time.
185+
186+
Use this only for short-lived dogfood pipelines where you're iterating
187+
on the compiler and re-compiling frequently.
188+
189+
## See also
190+
191+
- [`docs/safe-outputs.md`](safe-outputs.md) — regular safe-outputs
192+
surface (`create-issue` is **not** in it).
193+
- [`docs/cli.md`](cli.md) — `--skip-integrity` CLI flag.
194+
- [`docs/template-markers.md`](template-markers.md) — `{{ executor_ado_env }}`
195+
and `{{ integrity_check }}` markers and their conditional behaviour.

docs/cli.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ Global flags (apply to all subcommands): `--verbose, -v` (enable info-level logg
1313
- The agent auto-downloads the ado-aw compiler and handles the full lifecycle (create → compile → check)
1414
- `compile [<path>]` - Compile a markdown file to Azure DevOps pipeline YAML. If no path is given, auto-discovers and recompiles all detected agentic pipelines in the current directory.
1515
- `--output, -o <path>` - Optional output path for the generated YAML (only valid when a path is provided). If the path is an existing directory, the compiled YAML is written inside that directory using the default filename derived from the markdown source (e.g. `foo.md` → `<dir>/foo.lock.yml`).
16-
- `--skip-integrity` - *(debug builds only)* Omit the "Verify pipeline integrity" step from the generated pipeline. Useful during local development when the compiled output won't match a released compiler version. This flag is not available in release builds.
16+
- `--skip-integrity` - *(debug builds only)* Omit the "Verify pipeline integrity" step from the generated pipeline. Useful during local development when the compiled output won't match a released compiler version. This flag is not available in release builds. OR-ed with `ado-aw-debug.skip-integrity:` in front matter — either is sufficient. See [`docs/ado-aw-debug.md`](ado-aw-debug.md).
1717
- `--debug-pipeline` - *(debug builds only)* Include MCPG debug diagnostics in the generated pipeline: `DEBUG=*` environment variable for verbose MCPG logging, stderr streaming to log files, and a "Verify MCP backends" step that probes each backend with MCP initialize + tools/list before the agent runs. This flag is not available in release builds.
1818
- For `target: job` and `target: stage`, the output is an ADO YAML template (not a complete pipeline). Job names are prefixed with the agent name for uniqueness. Triggers configured via `on:` are ignored with a warning.
1919
- `check <pipeline>` - Verify that a compiled pipeline matches its source markdown

docs/safe-outputs.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,12 @@
22

33
_Part of the [ado-aw documentation](../AGENTS.md)._
44

5+
> ℹ️ The debug-only `create-issue` tool (used by dogfood pipelines to file
6+
> failure reports back to GitHub) is **not** a safe output and is not
7+
> configurable here. It is gated by a separate `ado-aw-debug:` front-matter
8+
> section and stripped from the SafeOutputs MCP server unless explicitly
9+
> enabled. See [`docs/ado-aw-debug.md`](ado-aw-debug.md).
10+
511
## Safe Outputs Configuration
612

713
The front matter supports a `safe-outputs:` field for configuring specific tool behaviors:

docs/template-markers.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -264,7 +264,7 @@ Generates the "Verify pipeline integrity" pipeline step that downloads the relea
264264

265265
The step sets `workingDirectory: {{ trigger_repo_directory }}` so that the relative `{{ pipeline_path }}` argument resolves correctly when `repos:` produces a multi-repo `$(Build.SourcesDirectory)` layout, and so `ado-aw check`'s internal recompile can infer the ADO org from the trigger repo's git remote.
266266

267-
When the compiler is built with `--skip-integrity` (debug builds only), this placeholder is replaced with an empty string and the integrity step is omitted from the generated pipeline.
267+
When the compiler is run with `--skip-integrity` (debug builds only) **OR** when the agent's front matter sets `ado-aw-debug.skip-integrity: true`, this placeholder is replaced with an empty string and the integrity step is omitted from the generated pipeline. The two settings are OR-ed; either is sufficient. See [`docs/ado-aw-debug.md`](ado-aw-debug.md).
268268

269269
## {{ mcpg_debug_flags }}
270270

@@ -443,9 +443,12 @@ If `permissions.write` is not configured, this marker is replaced with an empty
443443

444444
## {{ executor_ado_env }}
445445

446-
Generates the complete `env:` block (including the `env:` key) for the Stage 3 executor step when `permissions.write` is configured. Sets `SYSTEM_ACCESSTOKEN` to the write service connection token (`SC_WRITE_TOKEN`).
446+
Generates the complete `env:` block (including the `env:` key) for the Stage 3 executor step. The block contains zero, one, or two lines depending on which features are configured:
447447

448-
If `permissions.write` is not configured, this marker is replaced with an empty string so that no `env:` block is emitted at all. Note: `System.AccessToken` is never used directly — all ADO tokens come from explicitly configured service connections.
448+
* `SYSTEM_ACCESSTOKEN: $(SC_WRITE_TOKEN)` — emitted when `permissions.write` is configured. Provides the write-capable ADO token to the executor.
449+
* `ADO_AW_DEBUG_GITHUB_TOKEN: $(ADO_AW_DEBUG_GITHUB_TOKEN)` — emitted when `ado-aw-debug.create-issue` is configured. Provides the GitHub PAT used by the debug-only `create-issue` safe output. See [`docs/ado-aw-debug.md`](ado-aw-debug.md).
450+
451+
If neither feature is configured, this marker is replaced with an empty string so that no `env:` block is emitted at all. Note: `System.AccessToken` is never used directly — all ADO tokens come from explicitly configured service connections, and the GitHub PAT is sourced from a dedicated pipeline variable separate from the read-only `GITHUB_TOKEN` the agent sees in Stage 1.
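
The zero, one, or two line behaviour can be sketched as below. This is illustrative only: the function signature and the two-space indentation are assumptions, not the compiler's actual code.

```rust
/// Renders the executor step's `env:` block, or an empty string when nothing is needed.
fn executor_ado_env(write_permission: bool, debug_create_issue: bool) -> String {
    let mut lines = Vec::new();
    if write_permission {
        // Emitted when `permissions.write` is configured.
        lines.push("  SYSTEM_ACCESSTOKEN: $(SC_WRITE_TOKEN)");
    }
    if debug_create_issue {
        // Emitted when `ado-aw-debug.create-issue` is configured.
        lines.push("  ADO_AW_DEBUG_GITHUB_TOKEN: $(ADO_AW_DEBUG_GITHUB_TOKEN)");
    }
    if lines.is_empty() {
        // No `env:` key at all, so the step YAML stays valid.
        return String::new();
    }
    format!("env:\n{}", lines.join("\n"))
}

fn main() {
    println!("{}", executor_ado_env(true, true));
}
```

Returning an empty string rather than a bare `env:` key avoids emitting a mapping with no entries.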
449452

450453
## {{ compiler_version }}
451454

Lines changed: 51 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,51 @@
1+
---
name: "Dogfood Failure Reporter"
description: "Files a GitHub issue on githubnext/ado-aw when a dogfood pipeline run fails"
on:
  schedule: daily
permissions:
  read: my-read-arm-connection
ado-aw-debug:
  create-issue:
    target-repo: githubnext/ado-aw
    title-prefix: "[pipeline-failure] "
    labels:
      - pipeline-failure
      - automated
    allowed-labels:
      - "agent-*"
      - "pipeline-failure"
    assignees:
      - jamesdevine
    max: 3
---
22+
23+
## Dogfood Failure Reporter
24+
25+
You are a dogfood failure-reporting agent for `githubnext/ado-aw`. You run
26+
in Azure DevOps inside an AWF-isolated sandbox.
27+
28+
### Tasks
29+
30+
1. Read the pipeline run logs available under `$BUILD_SOURCESDIRECTORY`
31+
for any signs of recent failures.
32+
2. For each distinct failure, file **one** GitHub issue using the
33+
`create-issue` MCP tool with:
34+
- A concise `title` describing the failure.
35+
- A markdown `body` with reproduction steps, log excerpts, and links
36+
to relevant ADO build URLs.
37+
- `labels: ["pipeline-failure"]` (must match the `allowed-labels` allowlist
38+
configured by the operator).
39+
3. Limit yourself to **at most 3** issues per run (the `max` budget).
40+
4. If you cannot file an issue (e.g., the failure isn't reproducible),
41+
call `report-incomplete` instead — do **not** invent details.
42+
43+
### Important
44+
45+
- Do not attempt to redirect issues to a different repository — the agent
46+
has no `target_repo` parameter and the target is fixed by the operator.
47+
- The `ADO_AW_DEBUG_GITHUB_TOKEN` PAT is **not** visible to you; it is
48+
used only by Stage 3 to authenticate against GitHub.
49+
- Issues are reviewed for prompt injection by Stage 2 before they are
50+
filed, so do not include text that looks like ADO pipeline commands
51+
(`##vso[...]`) — they will be flagged and the run rejected.
