| 1 | +# Execution Context |
| 2 | + |
| 3 | +_Part of the [ado-aw documentation](../AGENTS.md)._ |
| 4 | + |
| 5 | +The **execution-context plugin** stages per-run context (changed files, |
| 6 | +diffs, base/head SHAs, file snapshots, metadata) on disk in a stable |
| 7 | +layout under `aw-context/` *before* the agent starts. The agent then |
| 8 | +reads these files instead of running its own `git fetch` / `git diff` |
| 9 | +plumbing. |
| 10 | + |
| 11 | +This is an always-on compiler extension. There is no `tools:` entry to |
| 12 | +enable it; per-trigger contributors gate themselves based on the |
| 13 | +agent's `on:` configuration. |
| 14 | + |
| 15 | +> Background and motivation: this feature was tracked in |
| 16 | +> [issue #860](https://github.com/githubnext/ado-aw/issues/860). |
| 17 | + |
| 18 | +## Why this exists |
| 19 | + |
| 20 | +PR-reviewer agents almost always need the same precondition: a fully |
| 21 | +fetched target branch, resolved base / head SHAs, a unified diff, and |
| 22 | +optionally pre / post snapshots of touched files. ADO's default |
| 23 | +`checkout: self` is shallow (`fetchDepth: 1`), doesn't fetch the PR |
| 24 | +target branch, and (deliberately) does not persist credentials into |
| 25 | +`.git/config` for OAuth bearer reuse. Every PR-reviewer agent has |
| 26 | +historically rebuilt the same ~120 lines of bash to work around this. |
| 27 | + |
| 28 | +The execution-context plugin owns that step centrally: |
| 29 | + |
| 30 | +- One canonical implementation that evolves with the framework. |
| 31 | +- Driven by ADO's predefined `System.PullRequest.*` variables — no |
| 32 | + manual ref discovery. |
| 33 | +- Inside the trust boundary: the bearer token used to fetch is |
| 34 | + scoped to the precompute step's process env and never reaches the |
| 35 | + agent container or `.git/config`. |
| 36 | + |
| 37 | +## v1 contributors |
| 38 | + |
| 39 | +| Contributor | Trigger | Output layout | |
| 40 | +|-------------|----------------|--------------------------| |
| 41 | +| `pr` | `on.pr` | `aw-context/pr/*` | |
| 42 | + |
| 43 | +Future trigger contributors (pipeline-completion, schedule, manual) |
| 44 | +plug in via the same internal `ContextContributor` trait without |
| 45 | +breaking the agent-facing layout. |
| 46 | + |
| 47 | +## Front-matter surface |
| 48 | + |
| 49 | +```yaml |
| 50 | +execution-context: |
| 51 | + enabled: true # master switch; defaults to true |
| 52 | + pr: # PR contributor configuration |
| 53 | + enabled: true # defaults to true when `on.pr` is configured |
| 54 | + scope: # pathspecs scoping diff + snapshots |
| 55 | + - "src/**" |
| 56 | + - "docs/**" |
| 57 | + - ":(top,glob)*.yml" |
| 58 | + unified: 3 # `-U` lines of context for diff.patch |
| 59 | + max-diff-bytes: 524288 # truncate diff.patch beyond this many bytes |
| 60 | + snapshots: true # write head-files/ and base-files/ |
| 61 | +``` |
| 62 | + |
| 63 | +All keys are optional. When the `execution-context:` block is omitted |
| 64 | +entirely, defaults are *"on for the triggers configured in `on:`"*. |
| 65 | + |
| 66 | +### Fields |
| 67 | + |
| 68 | +- **`enabled`** (`bool`, default `true`) — master switch. When `false`, |
| 69 | + no contributor runs and no `aw-context/` is staged. |
| 70 | +- **`pr.enabled`** (`bool`, default `true` when `on.pr` is set) — |
| 71 | + whether to activate the PR contributor. Set `false` to opt out on |
| 72 | + huge monorepos where the targeted fetch + diff cost is unacceptable |
| 73 | + (the agent then has to roll its own equivalent). |
| 74 | +- **`pr.scope`** (`list[string]`, default `[]` = all paths) — pathspecs |
| 75 | + passed to `git diff -- <scope>` for both `changed-files-in-scope.txt` |
| 76 | + and `diff.patch`. Sanitised at compile time. |
| 77 | +- **`pr.unified`** (`u32`, default `3`) — `-U` lines of context for |
| 78 | + `diff.patch`. |
| 79 | +- **`pr.max-diff-bytes`** (`u64`, default `524288` / 512 KiB) — cap on |
| 80 | + `diff.patch` size. When exceeded, the file ends with a literal |
| 81 | + marker line `--- TRUNCATED at <N> bytes; full diff suppressed ---` |
| 82 | + so the agent knows it is reading a partial diff. |
| 83 | +- **`pr.snapshots`** (`bool`, default `true`) — whether to write per-file |
| 84 | + pre / post snapshots under `head-files/` and `base-files/`. Disable on |
| 85 | + large changes if you only need the diff. |
| 86 | + |
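The `pr.max-diff-bytes` cap can be pictured with a short shell sketch. It assumes that the first `<N>` bytes of the diff are kept and the marker line is appended after them; that behaviour and the helper name `cap_diff` are illustrative assumptions, not the code the compiler actually generates.

```bash
# Hypothetical helper: cap a diff file at max_bytes and append the marker.
# Assumption: the first max_bytes bytes are kept; the real step may differ.
cap_diff() {
  local file="$1" max_bytes="$2" size
  size="$(wc -c < "$file" | tr -d ' ')"
  if [ "$size" -gt "$max_bytes" ]; then
    head -c "$max_bytes" "$file" > "${file}.tmp"
    # Leading newline so the marker always sits on its own line.
    printf '\n--- TRUNCATED at %s bytes; full diff suppressed ---\n' \
      "$max_bytes" >> "${file}.tmp"
    mv "${file}.tmp" "$file"
  fi
}
```

A reader of `diff.patch` only needs to compare the last line against the marker to know whether the diff is complete.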
| 87 | +## Agent-visible layout |
| 88 | + |
| 89 | +For PR-triggered builds, the precompute step stages files under |
| 90 | +`$(Build.SourcesDirectory)/aw-context/` (i.e. relative to the agent's |
| 91 | +working directory): |
| 92 | + |
| 93 | +``` |
| 94 | +aw-context/ |
| 95 | + status.txt # OK | (errors propagate to per-contributor files) |
| 96 | + trigger.txt # pr (today; future: pipeline / schedule / manual) |
| 97 | + metadata.txt # build_id, build_reason, repository, source_branch |
| 98 | + pr/ |
| 99 | + status.txt # OK | NO_PR_CONTEXT | DIFF_RESOLUTION_FAILED | CONTEXT_GENERATION_FAILED |
| 100 | + metadata.txt # pr_id, source_branch, target_branch, base_sha, head_sha |
| 101 | + changed-files.txt # full `git diff --name-status` |
| 102 | + changed-files-in-scope.txt # name-status restricted to `scope` |
| 103 | + diff.patch # unified diff, scoped, capped, may end with TRUNCATED marker |
| 104 | + head-files/<path> # post-PR snapshots of A/M/T/R*/C* files in scope |
| 105 | + base-files/<path> # pre-PR snapshots of D files in scope |
| 106 | + error.txt # only present when pr/status.txt != OK |
| 107 | +``` |
| 108 | + |
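The split between `head-files/` and `base-files/` in the layout above follows the status letter of each `git diff --name-status` entry. The sketch below shows one way that routing could look. The helper name `snapshot_path` is hypothetical, and using the new path for rename and copy entries is an assumption based on the "post-PR" wording.

```bash
# Hypothetical helper: map one tab-separated `git diff --name-status`
# line to the snapshot path the layout would use for it.
# Rename/copy lines carry both the old and the new path.
snapshot_path() {
  local status old new
  status="$(printf '%s\n' "$1" | cut -f1)"
  old="$(printf '%s\n' "$1" | cut -f2)"
  new="$(printf '%s\n' "$1" | cut -f3)"
  case "$status" in
    A|M|T) printf 'head-files/%s\n' "$old" ;;  # single-path entries
    R*|C*) printf 'head-files/%s\n' "$new" ;;  # new path (assumption)
    D)     printf 'base-files/%s\n' "$old" ;;  # only the base still has it
    *)     return 1 ;;                         # other statuses get no snapshot
  esac
}
```

For example, a line `D<TAB>gone.txt` maps to `base-files/gone.txt`, while `M<TAB>src/a.rs` maps to `head-files/src/a.rs`.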
| 109 | +**Agents MUST read `aw-context/pr/status.txt` first** and act on its |
| 110 | +value: |
| 111 | + |
| 112 | +- `OK` — `aw-context/pr/*` is fully populated. Prefer reading those |
| 113 | + files over running `git fetch` / `git diff` yourself. |
| 114 | +- `NO_PR_CONTEXT` — the build is not a PR (e.g. manual queue of a |
| 115 | + PR-triggered pipeline). Skip PR-specific logic. |
| 116 | +- `DIFF_RESOLUTION_FAILED` — the precompute step ran but could not |
| 117 | + resolve the base / head SHAs. See `aw-context/pr/error.txt` for the |
| 118 | + reason. Surface this in your output rather than silently producing |
| 119 | + an empty review. |
| 120 | +- `CONTEXT_GENERATION_FAILED` — base / head SHAs resolved, but at |
| 121 | + least one of the `git diff` commands that populates the staged |
| 122 | + files failed. The `metadata.txt` file is still trustworthy, but |
| 123 | + `changed-files.txt`, `changed-files-in-scope.txt`, or `diff.patch` |
| 124 | + may be empty or partial. See `aw-context/pr/error.txt`. |
| 125 | + |
| 126 | +If `aw-context/pr/status.txt` does not exist at all (e.g. when the extension |
| 127 | +is disabled or the precompute step was skipped), treat it as `NO_PR_CONTEXT`. |
| 128 | + |
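For a user `steps:` entry that wants to branch on the same status values, a minimal shell sketch follows. It assumes `status.txt` holds a single status token on its first line; the helper names `read_pr_status` and `pr_context_summary` are hypothetical and not part of ado-aw.

```bash
# Hypothetical helper: print the PR context status for a given
# aw-context directory, falling back to NO_PR_CONTEXT when absent.
read_pr_status() {
  local status_file="${1:-aw-context}/pr/status.txt"
  if [ -f "$status_file" ]; then
    # First line only, with any trailing carriage return stripped.
    head -n 1 "$status_file" | tr -d '\r'
  else
    printf 'NO_PR_CONTEXT\n'
  fi
}

# Hypothetical dispatcher: turn the status into the action to take.
pr_context_summary() {
  case "$(read_pr_status "$1")" in
    OK)            echo "use staged aw-context/pr files" ;;
    NO_PR_CONTEXT) echo "skip PR-specific logic" ;;
    DIFF_RESOLUTION_FAILED|CONTEXT_GENERATION_FAILED)
                   echo "surface aw-context/pr/error.txt" ;;
    *)             echo "unknown status" ;;
  esac
}
```

Treating the missing-file case inside the reader keeps every caller on the same four-way branch.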
| 129 | +## What the precompute step does |
| 130 | + |
| 131 | +The PR contributor's generated bash step: |
| 132 | + |
| 133 | +1. **Reads `System.PullRequest.*` from the environment.** No manual ref |
| 134 | + discovery — ADO already populates `SourceBranch`, `TargetBranch`, |
| 135 | + and `PullRequestId`. If they are missing, writes `NO_PR_CONTEXT` |
| 136 | + and exits 0. |
| 137 | +2. **Detects merge-commit shape first.** If `HEAD` has two parents |
| 138 | + (the synthetic merge commit ADO checks out for PR builds), uses |
| 139 | + `HEAD^1` / `HEAD^2` as base / head and skips the target-branch |
| 140 | + fetch entirely. Otherwise: |
| 141 | +3. **Fetches the PR target branch with progressive deepening** — |
| 142 | + `--depth=200`, then `500`, then `2000`, then finally `--unshallow`. |
| 143 | + **After each successful fetch, attempts `git merge-base |
| 144 | + origin/<target> HEAD`** and continues to the next depth if it |
| 145 | + cannot resolve yet. This keeps bandwidth bounded in the common case |
| 146 | + while still covering the long-tail case of a PR against an old base. |
| 147 | + If every depth is exhausted, the step writes `DIFF_RESOLUTION_FAILED`. |
| 148 | +4. **Writes `metadata.txt`, `changed-files.txt`, |
| 149 | + `changed-files-in-scope.txt`, `diff.patch`.** The diff is scoped to |
| 150 | + `pr.scope` (or all paths if empty) and truncated at `pr.max-diff-bytes` |
| 151 | + with a literal marker. If any of these `git diff` invocations fails, |
| 152 | + the status becomes `CONTEXT_GENERATION_FAILED` rather than `OK`. |
| 153 | +5. **Snapshots** (when `pr.snapshots: true`) — for each in-scope file: |
| 154 | + `head-files/<path>` for `A`/`M`/`T`/`R*`/`C*` entries, |
| 155 | + `base-files/<path>` for `D` entries. |
| 156 | +6. **Writes the final status** to `pr/status.txt` and `status.txt`. |
| 157 | + |
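The progressive-deepening loop from step 3 can be sketched as follows. This is an illustration under stated assumptions rather than the generated step: the function names are hypothetical, the target is assumed to be tracked as `origin/<branch>`, and the bearer-token handling described under "Trust boundary" is left out for brevity.

```bash
# Hooks kept separate so the loop itself stays readable.
# $1 = depth flag (--depth=N or --unshallow), $2 = target branch short name
fetch_target() {
  git fetch --quiet "$1" origin "+refs/heads/$2:refs/remotes/origin/$2"
}

# $1 = target branch short name
resolve_merge_base() {
  git merge-base "origin/$1" HEAD
}

# Try each depth in turn; print the merge base and return 0 on success.
deepen_until_resolved() {
  local target="$1" depth base
  for depth in --depth=200 --depth=500 --depth=2000 --unshallow; do
    # A failed fetch moves on to the next, deeper attempt.
    fetch_target "$depth" "$target" || continue
    if base="$(resolve_merge_base "$target" 2>/dev/null)"; then
      printf '%s\n' "$base"
      return 0
    fi
  done
  return 1  # the caller would record DIFF_RESOLUTION_FAILED here
}
```

Calling `deepen_until_resolved main` prints the merge-base SHA as soon as one depth is sufficient, so shallow PRs never pay for the deeper fetches.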
| 158 | +The step is gated by `condition: eq(variables['Build.Reason'], |
| 159 | +'PullRequest')` so it is a no-op on manual or scheduled queues of a |
| 160 | +PR-triggered pipeline. |
| 161 | + |
| 162 | +## Trust boundary |
| 163 | + |
| 164 | +The PR contributor must fetch the PR target branch (which the default |
| 165 | +checkout does not), but doing so requires an OAuth bearer. ado-aw |
| 166 | +preserves the Stage 1 read-only invariant with these design choices: |
| 167 | + |
| 168 | +| Mechanism | Decision | |
| 169 | +|---------------------------------------------|----------| |
| 170 | +| Override `checkout: self` with `persistCredentials: true` | **Rejected.** It would write the build identity's bearer into `.git/config` inside the workspace, which is then mounted into the AWF sandbox where the agent could read and exfiltrate it. | |
| 171 | +| Override `checkout: self` with `fetchDepth: 0` | **Rejected.** Unnecessary — the precompute fetches exactly the two refs it needs. | |
| 172 | +| In-step `SYSTEM_ACCESSTOKEN` + bash bearer wrapper | **Adopted.** `SYSTEM_ACCESSTOKEN` is mapped from `$(System.AccessToken)` only into the precompute step's process env. A `git_fetch` wrapper injects `git -c http.extraheader="Authorization: bearer ${SYSTEM_ACCESSTOKEN}" fetch …`. The token lives only in the bash step's process memory and is never written to disk. | |
| 173 | + |
| 174 | +After the precompute step exits, the bearer is gone from the runtime |
| 175 | +environment the agent inherits, `.git/config` contains no |
| 176 | +`http.extraheader` line, and the agent container is started by AWF |
| 177 | +with its own (read-only) MI from the ARM service connection. |
| 178 | + |
| 179 | +The compile-time test `test_execution_context_pr_does_not_leak_system_accesstoken` |
| 180 | +asserts that generated YAML never contains `persistCredentials: true`, |
| 181 | +never writes to `.git/config`, and that `SYSTEM_ACCESSTOKEN` appears |
| 182 | +only in the execution-context prepare step. |
| 183 | + |
| 184 | +## Migrating from a hand-rolled precompute |
| 185 | + |
| 186 | +If you have an existing PR-reviewer agent with a `steps:` block that |
| 187 | +manually fetches the target branch, resolves merge-base, and emits a |
| 188 | +diff: delete that block, ensure `on.pr` is configured, and read from |
| 189 | +`aw-context/pr/*` in your agent prompt. The prompt supplement is |
| 190 | +appended automatically — you do not need to mention the layout in your |
| 191 | +own markdown body. |
| 192 | + |
| 193 | +## Notes and edge cases |
| 194 | + |
| 195 | +- **`AW_PR_*` env vars are not surfaced.** ado-aw's agent-env-var |
| 196 | + channel rejects ADO `$(...)` expressions for injection-defence |
| 197 | + reasons, and bouncing values through pipeline output variables |
| 198 | + introduces a second source of truth. Agents read everything from |
| 199 | + `aw-context/pr/metadata.txt`. |
| 200 | +- **No `git` / `cat` / `ls` is added to the agent's bash allow-list.** |
| 201 | + The agent reads `aw-context/*` using its normal file-reading |
| 202 | + mechanism (the `edit` tool, native copilot reads, etc.), not via |
| 203 | + shell. This avoids silently widening the bash capability surface |
| 204 | + when the user has restricted bash. |
| 205 | +- **Non-`self` checkouts in `repos:`.** v1 only diffs the `self` |
| 206 | + checkout. The PR contributor does not currently produce contexts |
| 207 | + for additional repository checkouts. |
| 208 | +- **Workspace alias.** When `workspace:` points to a non-`self` alias, |
| 209 | + `aw-context/` is still relative to `$(Build.SourcesDirectory)` — |
| 210 | + i.e. the pipeline's working directory, not the workspace alias's |
| 211 | + directory. |
| 212 | +- **Ordering.** The precompute step runs after the standard |
| 213 | + `- checkout: self` and before any user `steps:`, so user `steps:` |
| 214 | + can also read `aw-context/` if needed. |
| 215 | + |
| 216 | +## Compiler internals |
| 217 | + |
| 218 | +- Always-on `ExecContextExtension` in |
| 219 | + `src/compile/extensions/exec_context/mod.rs` (`ExtensionPhase::Tool`). |
| 220 | +- Internal `ContextContributor` trait in `contributor.rs`. v1 ships one |
| 221 | + contributor: `PrContextContributor` in `pr.rs`. |
| 222 | +- Front-matter types: `ExecutionContextConfig` and `PrContextConfig` in |
| 223 | + `src/compile/types.rs`. |
| 224 | +- Compile tests live in `tests/compiler_tests.rs` (search for |
| 225 | + `test_execution_context_pr_*`). |
| 226 | +- The generated bash is shellchecked by `tests/bash_lint_tests.rs` via |
| 227 | + the `execution-context-agent.md` fixture. |