Commit 74206d8

jamesadevine and Copilot committed
refactor(compile): detect az at pipeline time so missing azure-cli no longer crashes 1ES
Reviewer-requested fix: static AWF bind-mounts for /opt/az and /usr/bin/az would break `docker run` on runners without azure-cli pre-installed (notably some 1ES self-hosted pools), failing the pipeline before the agent ever started.

Replace the static mounts with a runtime detection prepare step that sets the ADO pipeline variable AW_AZ_MOUNTS via `##vso[task.setvariable]` when both /usr/bin/az and /opt/az exist on the host, or emits a `task.logissue` warning and leaves the variable unset otherwise.

The AWF invocation in the compiled YAML now includes a single `$(AW_AZ_MOUNTS) \` line in the --mount chain. ADO interpolates the variable at step start: present -> the two --mount args appear; absent -> the line collapses to whitespace. No new trait method is added; only the existing `prepare_steps` hook is used.

- AzureCliExtension: required_awf_mounts() now returns []; prepare_steps() emits the detection bash step
- generate_awf_mounts: appends `$(AW_AZ_MOUNTS) \` when AzureCli is present
- Tests: rewrite static-mount assertions to assert the detection step + the pipeline variable injection, plus a regression guard that no static az mount is emitted
- Docs: docs/network.md and docs/tools.md updated with the runtime-detection design and operator implications

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
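The detection rule in the commit message can be modelled in a few lines of Rust. This is only a sketch: the real step is a bash script that this diff does not show, and the names `detection_command` and `detect_on_host`, as well as the warning wording, are hypothetical. The two `##vso[...]` prefixes follow the Azure DevOps logging-command syntax named in the message.

```rust
use std::path::Path;

/// Value the commit message says is stored in AW_AZ_MOUNTS when az is found.
const AZ_MOUNT_ARGS: &str =
    "--mount /opt/az:/opt/az:ro --mount /usr/bin/az:/usr/bin/az:ro";

/// Hypothetical helper: returns the ADO logging command the detection step
/// would print, given whether the launcher shim and the venv directory exist.
fn detection_command(shim_exists: bool, venv_exists: bool) -> String {
    if shim_exists && venv_exists {
        // Both paths present: publish the two --mount arguments as a variable.
        format!("##vso[task.setvariable variable=AW_AZ_MOUNTS]{AZ_MOUNT_ARGS}")
    } else {
        // Either path missing: warn and leave the variable unset.
        // The message text here is illustrative, not taken from the commit.
        "##vso[task.logissue type=warning]azure-cli not found on host; \
         az will be unavailable inside the agent sandbox"
            .to_string()
    }
}

/// Hypothetical wrapper that probes the two host paths named in the commit.
fn detect_on_host() -> String {
    detection_command(
        Path::new("/usr/bin/az").exists(),
        Path::new("/opt/az").exists(),
    )
}
```

Keeping the decision in a pure function (`detection_command`) separate from the filesystem probe makes the "both paths or nothing" rule easy to check without a runner.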
1 parent 1b641ee commit 74206d8

6 files changed

Lines changed: 356 additions & 96 deletions


docs/network.md

Lines changed: 48 additions & 12 deletions
@@ -51,18 +51,54 @@ The following domains are always allowed. Most are defined in `CORE_ALLOWED_HOST
 
 ## Always-on Azure CLI (`az`)
 
-Every compiled pipeline mounts the host's `az` binary (from `/opt/az` and
-`/usr/bin/az`) into the AWF container and adds the Azure auth and
-management hosts listed above (`login.microsoftonline.com`,
-`login.windows.net`, `management.azure.com`, `graph.microsoft.com`,
-`aka.ms`) to the allowlist. This mirrors gh-aw's "assume `gh` is on the
-runner" model: agents can call `az` from their bash tool without
-opting in.
-
-The host is assumed to have `azure-cli` pre-installed. Microsoft-hosted
-`ubuntu-latest` agents satisfy this; 1ES self-hosted pool operators must
-bake `azure-cli` into their images. If `/opt/az` is missing on the host,
-the AWF mount will fail at runtime with a clear error.
+Every compiled pipeline adds the Azure auth and management hosts listed
+above (`login.microsoftonline.com`, `login.windows.net`,
+`management.azure.com`, `graph.microsoft.com`, `aka.ms`) to the AWF
+allowlist and emits a small *Detect Azure CLI on host* prepare step
+that runs early in the Agent job. This mirrors gh-aw's "assume `gh` is
+on the runner" model: agents can call `az` from their bash tool
+without opting in — *when the runner has it*.
+
+### Runtime detection and graceful degradation
+
+Because `azure-cli` is not universally pre-installed on every ADO
+runner image (notably some 1ES self-hosted pools), the compiler does
+**not** declare static AWF bind-mounts for `/opt/az` and `/usr/bin/az`.
+Static mounts would cause `docker run` to fail with "bind source path
+does not exist" on runners without `az`, breaking the pipeline before
+the agent ever started.
+
+Instead, the prepare step does the detection itself at pipeline time:
+
+* If both `/usr/bin/az` (the launcher shim) and `/opt/az` (the Python
+  venv that `az` actually runs in) exist on the host, the step sets
+  the ADO pipeline variable
+  `AW_AZ_MOUNTS=--mount /opt/az:/opt/az:ro --mount /usr/bin/az:/usr/bin/az:ro`
+  via `##vso[task.setvariable]`.
+* If either is missing, the step emits a
+  `##vso[task.logissue type=warning]` explaining `az` won't be
+  available inside the agent sandbox and leaves `AW_AZ_MOUNTS` unset
+  (which expands to the empty string).
+
+The AWF invocation in the compiled YAML then includes a literal
+`$(AW_AZ_MOUNTS) \` line on its own in the `--mount` chain.
+At step start, ADO interpolates that pipeline variable into the bash
+script: when az is present the two `--mount` args appear; when it's
+absent the line collapses to empty whitespace + the `\` continuation,
+which is a no-op.
+
+### Operator implications
+
+- **Microsoft-hosted `ubuntu-latest`**: `az` is detected, mounted, and
+  available inside the agent sandbox. Nothing to do.
+- **1ES self-hosted runners *with* azure-cli baked in**: same as above.
+- **1ES self-hosted runners *without* azure-cli**: the pipeline runs
+  successfully, but agents that invoke `az` get the standard
+  `command not found` inside the sandbox. The warning emitted by the
+  prepare step is visible in the ADO log as a yellow-flagged issue on
+  the build summary; treat it as a signal to either ignore (if no
+  agent on that runner needs `az`) or to install `azure-cli` on the
+  runner image.
 
 See [`docs/tools.md`](tools.md#built-in-clis) for the agent-facing
 contract (auth scope, available subcommands).
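To make the two outcomes of the `$(AW_AZ_MOUNTS) \` line concrete, here is a small Rust model of the substitution followed by shell word splitting. It is a sketch under stated assumptions: `expand_line` and `argv_tokens` are hypothetical helpers, the "unset means empty string" behaviour is taken from the documentation text in this diff rather than verified against ADO, and `split_whitespace` only approximates bash's default word splitting.

```rust
/// Placeholder exactly as it appears in the compiled script line.
const PLACEHOLDER: &str = "$(AW_AZ_MOUNTS)";

/// Substitute the pipeline variable into one script line.
/// `None` models the unset case as an empty string; that is an assumption
/// taken from the docs in this diff, not independently verified ADO behaviour.
fn expand_line(line: &str, value: Option<&str>) -> String {
    line.replace(PLACEHOLDER, value.unwrap_or(""))
}

/// Approximate bash word splitting for a single continued line:
/// drop the trailing `\` continuation, then split on whitespace.
fn argv_tokens(line: &str) -> Vec<String> {
    let trimmed = line.trim_end();
    let body = trimmed.strip_suffix('\\').unwrap_or(trimmed);
    body.split_whitespace().map(str::to_string).collect()
}
```

With the variable set to the value shown in the docs, `argv_tokens` yields four tokens (two `--mount` flags and their specs); with it unset, it yields none, so the line contributes no arguments to the command.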

docs/tools.md

Lines changed: 28 additions & 9 deletions
@@ -94,18 +94,37 @@ host is presumed to have each binary pre-installed.
 
 ### Azure CLI (`az`)
 
-Every compiled pipeline mounts the host's `az` binary into the AWF
-container (`/opt/az` + `/usr/bin/az`, read-only) and adds the Azure
-auth and management hosts (`login.microsoftonline.com`,
-`login.windows.net`, `management.azure.com`, `graph.microsoft.com`,
-`aka.ms`) to the AWF allowlist. The compiler does not install `az` —
-the host is assumed to already have `azure-cli` installed.
+Every compiled pipeline adds the Azure auth and management hosts
+(`login.microsoftonline.com`, `login.windows.net`,
+`management.azure.com`, `graph.microsoft.com`, `aka.ms`) to the AWF
+allowlist and emits a *Detect Azure CLI on host* prepare step in the
+Agent job. The compiler does not install `az`.
+
+**Runtime detection + graceful degradation.** The detection step does
+two things at pipeline time:
+
+1. If `/usr/bin/az` (the launcher shim) and `/opt/az` (the Python
+   venv that `az` runs in) both exist on the runner, it sets the
+   pipeline variable
+   `AW_AZ_MOUNTS=--mount /opt/az:/opt/az:ro --mount /usr/bin/az:/usr/bin/az:ro`.
+2. If either is missing, it emits a yellow ADO warning
+   (`##vso[task.logissue type=warning]`) and leaves the variable
+   unset.
+
+The AWF invocation includes a `$(AW_AZ_MOUNTS) \` line in its
+`--mount` chain. ADO expands the variable at step start: present →
+the two mounts appear; absent → the line collapses to nothing. No
+static `--mount` is emitted for `/opt/az` or `/usr/bin/az`, so the
+pipeline never crashes `docker run` with "bind source path does not
+exist" on runners without `az`. See
+[`docs/network.md`](network.md#always-on-azure-cli-az) for the full
+design.
 
 | Host posture | What you get |
 | ------------------------------------- | --------------------------------------------------------- |
-| Microsoft-hosted `ubuntu-latest` | Works out of the box (`az` is pre-installed) |
-| 1ES self-hosted pool image | Works if the pool operator baked `azure-cli` into the image |
-| Host missing `/opt/az` | AWF mount fails at runtime with a clear error |
+| Microsoft-hosted `ubuntu-latest` | Detected → mounted → `az` available in the sandbox |
+| 1ES self-hosted pool with `azure-cli` | Same as above |
+| 1ES self-hosted pool *without* `az` | Pipeline runs; warning in ADO log; `az` is `command not found` inside the sandbox |
 
 **Auth scope (important).** The compiler does not authenticate `az` for
 general use. Two paths are supported:

src/compile/common.rs

Lines changed: 33 additions & 11 deletions
@@ -2909,15 +2909,33 @@ pub fn generate_awf_mounts(extensions: &[super::extensions::Extension]) -> Strin
         .flat_map(|ext| ext.required_awf_mounts())
         .collect();
 
-    if mounts.is_empty() {
+    // When the always-on AzureCli extension is enabled, append a
+    // pipeline-variable reference that expands at pipeline time to
+    // either `--mount /opt/az:/opt/az:ro --mount /usr/bin/az:/usr/bin/az:ro`
+    // (when the runner has azure-cli installed) or to nothing (when it
+    // doesn't). The detection + setvariable happens in
+    // `AzureCliExtension::prepare_steps`. This avoids static bind-mounts
+    // that would crash `docker run` on 1ES self-hosted runners without
+    // azure-cli pre-installed.
+    let inject_az_var = extensions
+        .iter()
+        .any(|ext| matches!(ext, super::extensions::Extension::AzureCli(_)));
+
+    if mounts.is_empty() && !inject_az_var {
         return "\\".to_string();
     }
 
-    mounts
+    let mut lines: Vec<String> = mounts
         .iter()
         .map(|m| format!("--mount \"{}\" \\", m))
-        .collect::<Vec<_>>()
-        .join("\n")
+        .collect();
+    if inject_az_var {
+        // Unquoted on purpose: bash word-splits the pipeline-var value
+        // into separate `--mount <spec>` tokens. The value contains only
+        // path chars + `:` + spaces, no shell metachars.
+        lines.push("$(AW_AZ_MOUNTS) \\".to_string());
+    }
+    lines.join("\n")
 }
 
 /// Generates a dedicated pipeline step that writes a `GITHUB_PATH` file
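The hunk above depends on the crate's `Extension` type, so it cannot be run on its own. The sketch below restates the same string-building logic over plain inputs so its three output shapes can be checked in isolation. `render_awf_mounts` and its parameters are hypothetical stand-ins, not part of the commit, and the `/tmp/x` mount spec is an invented example.

```rust
/// Hypothetical stand-in for `generate_awf_mounts`: `mounts` replaces the
/// specs collected from extensions, and `inject_az_var` replaces the check
/// for the AzureCli extension.
fn render_awf_mounts(mounts: &[&str], inject_az_var: bool) -> String {
    // Nothing to emit: keep a bare continuation so the surrounding
    // multi-line command stays well-formed.
    if mounts.is_empty() && !inject_az_var {
        return "\\".to_string();
    }

    // One quoted `--mount` line per static mount spec.
    let mut lines: Vec<String> = mounts
        .iter()
        .map(|m| format!("--mount \"{}\" \\", m))
        .collect();

    // The pipeline-variable line is left unquoted so the shell can split
    // its value into separate arguments.
    if inject_az_var {
        lines.push("$(AW_AZ_MOUNTS) \\".to_string());
    }
    lines.join("\n")
}
```

In every branch the result ends with ` \` or is a bare `\`, which matches the `ends_with(" \\")` assertion visible in the test hunk for the non-empty case.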
@@ -6779,20 +6797,24 @@ safe-outputs:
     #[test]
     fn test_generate_awf_mounts_no_extensions() {
         // Even with a minimal front matter, the always-on Azure CLI
-        // extension contributes its two AWF mounts (/opt/az + /usr/bin/az).
-        // The "no mounts" name is historical; this test now verifies the
-        // always-on baseline.
+        // extension contributes a `$(AW_AZ_MOUNTS) \` injection line
+        // (no static mounts — those are runtime-detected by the
+        // AzureCli prepare step which sets the pipeline variable).
+        // The "no mounts" name is historical; this test now verifies
+        // the always-on baseline.
         let fm = minimal_front_matter();
         let exts = crate::compile::extensions::collect_extensions(&fm);
         let _ctx = crate::compile::extensions::CompileContext::for_test(&fm);
         let result = generate_awf_mounts(&exts);
         assert!(
-            result.contains(r#"--mount "/opt/az:/opt/az:ro""#),
-            "always-on Azure CLI mount /opt/az should be present: {result}"
+            result.contains("$(AW_AZ_MOUNTS) \\"),
+            "always-on Azure CLI injection line $(AW_AZ_MOUNTS) \\ should be present \
+             (so the AzureCli prepare step's pipeline variable expands into runtime mounts): {result}"
         );
         assert!(
-            result.contains(r#"--mount "/usr/bin/az:/usr/bin/az:ro""#),
-            "always-on Azure CLI mount /usr/bin/az should be present: {result}"
+            !result.contains(r#"--mount "/opt/az:/opt/az:ro""#),
+            "must NOT emit a static /opt/az --mount — that would crash docker run on \
+             runners without azure-cli. The mount is contributed via $(AW_AZ_MOUNTS) instead: {result}"
         );
         assert!(
             result.ends_with(" \\"),
