Skip to content

Commit 7fe562f

Browse files
jamesadevineCopilot
andcommitted
fix(compile): graceful-degradation bug + cleanup per PR review
Address two findings from the Rust PR Reviewer bot on PR #873. 1. CRITICAL — AW_AZ_MOUNTS undefined when az is missing: The runtime-detection step in AzureCliExtension only set the AW_AZ_MOUNTS pipeline variable in the detected branch. In the missing-az branch the variable was left undefined. ADO leaves an undefined $(VAR) as the LITERAL STRING "$(VAR)" in subsequent bash steps (it does NOT expand to empty). Bash sees $(AW_AZ_MOUNTS), interprets it as a $(...) command substitution, tries to execute a program named AW_AZ_MOUNTS, gets exit 127, and the AWF invocation step dies under `set -e` — the exact 1ES failure mode the refactor set out to prevent. Fix: always emit `##vso[task.setvariable variable=AW_AZ_MOUNTS]`, with an empty value in the missing branch. ADO then expands $(AW_AZ_MOUNTS) to nothing and the trailing `\` line becomes a harmless continuation no-op. Regression guards (both lock this in): - azure_cli.rs::test_azure_cli_prepare_steps_defines_aw_az_mounts_in_else_branch counts `setvariable` occurrences (must be 2) and asserts the else block contains an empty-value setvariable line. - compiler_tests.rs::test_default_pipeline_mounts_az_and_allows_azure_hosts asserts the same 2× count on the compiled lock.yml. 2. Cleanup — delete WRITE_REQUIRING_SAFE_OUTPUTS: The const was retained with #[allow(dead_code)] after the removal of `validate_write_permissions`, but its only consumers left were the two tests that exercised the const itself. Each `*Result` type already carries `REQUIRES_WRITE: bool` for any caller (compiler, audit, runtime) that needs the same info. Deleting the const removes a dead-code annotation and one otherwise-purposeless list to maintain when adding new tools. Test cleanup: removed `test_write_requiring_subset_of_all_known` (purely exercised the deleted const) and rewrote `test_all_known_completeness` to use a HashSet-based duplicate check on ALL_KNOWN_SAFE_OUTPUTS plus the ALWAYS_ON/NON_MCP disjointness check (preserves the meaningful invariants). Validation: - cargo build: clean - cargo test: 1748 unit + 119 compiler + all integration pass - cargo clippy --all-targets --all-features -- -D warnings: clean - tests/safe-outputs/azure-cli.lock.yml regenerated (+1 line) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 74206d8 commit 7fe562f

4 files changed

Lines changed: 97 additions & 70 deletions

File tree

src/compile/extensions/azure_cli.rs

Lines changed: 65 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -26,9 +26,16 @@ use super::{AwfMount, CompilerExtension, CompileContext, ExtensionPhase};
2626
/// sets the ADO pipeline variable `AW_AZ_MOUNTS` to
2727
/// `--mount /opt/az:/opt/az:ro --mount /usr/bin/az:/usr/bin/az:ro`
2828
/// via `##vso[task.setvariable]`.
29-
/// * Otherwise, the step emits a `##vso[task.logissue type=warning]`
30-
/// explaining `az` won't be available inside the agent sandbox and
31-
/// leaves `AW_AZ_MOUNTS` unset (expands to the empty string).
29+
/// * Otherwise, the step sets `AW_AZ_MOUNTS` to the **empty string**
30+
/// (still via `##vso[task.setvariable]`) and emits a
31+
/// `##vso[task.logissue type=warning]` explaining `az` won't be
32+
/// available inside the agent sandbox. Setting the variable to empty
33+
/// is important: ADO leaves an *undefined* `$(VAR)` as the literal
34+
/// string `$(VAR)` in later bash steps, where bash would interpret
35+
/// it as a command substitution (`$(...)`) and fail under
36+
/// `set -e` with exit 127. An empty-but-defined variable expands to
37+
/// nothing, and the `$(AW_AZ_MOUNTS) \` line in the AWF chain
38+
/// becomes a harmless `\`-continuation no-op.
3239
///
3340
/// The AWF invocation in `base.yml`/`1es-base.yml`/etc. then includes a
3441
/// `$(AW_AZ_MOUNTS) \` line (injected by
@@ -107,6 +114,15 @@ impl CompilerExtension for AzureCliExtension {
107114
// contains only path chars, `:`, and spaces — no shell
108115
// metachars — so unquoted expansion is safe.
109116
//
117+
// Both branches MUST set the variable (the else branch sets it
118+
// to empty string). If left undefined, ADO leaves the literal
119+
// `$(AW_AZ_MOUNTS)` in subsequent bash steps, where bash
120+
// interprets it as a `$(...)` command substitution, tries to
121+
// run a program named `AW_AZ_MOUNTS`, gets exit 127, and the
122+
// AWF invocation step dies under `set -e` — the opposite of
123+
// graceful degradation. Defining the variable as empty makes
124+
// ADO expand it to nothing, leaving a harmless `\`-continuation.
125+
//
110126
// Warning text is intentionally short and operator-facing.
111127
// Agents that don't invoke `az` are unaffected; agents that do
112128
// will get a normal "command not found" inside the sandbox.
@@ -116,6 +132,7 @@ impl CompilerExtension for AzureCliExtension {
116132
echo "##vso[task.setvariable variable=AW_AZ_MOUNTS]--mount /opt/az:/opt/az:ro --mount /usr/bin/az:/usr/bin/az:ro"
117133
echo "Azure CLI detected on host; mounting /opt/az and /usr/bin/az into AWF sandbox."
118134
else
135+
echo "##vso[task.setvariable variable=AW_AZ_MOUNTS]"
119136
echo "##vso[task.logissue type=warning]Azure CLI not detected on this runner (missing /usr/bin/az or /opt/az). The az command will not be available inside the agent sandbox. Install azure-cli on the runner image to enable it."
120137
fi
121138
displayName: "Detect Azure CLI on host (for AWF mount)"
@@ -240,6 +257,51 @@ mod tests {
240257
);
241258
}
242259

260+
#[test]
261+
fn test_azure_cli_prepare_steps_defines_aw_az_mounts_in_else_branch() {
262+
// Regression guard for the graceful-degradation bug:
263+
// if the `else` branch doesn't explicitly setvariable on
264+
// AW_AZ_MOUNTS, ADO leaves the literal `$(AW_AZ_MOUNTS)` in
265+
// the subsequent AWF bash step, bash interprets it as a
266+
// `$(...)` command substitution, tries to execute a program
267+
// named AW_AZ_MOUNTS, gets exit 127, and `set -e` kills the
268+
// step — exactly the failure mode this PR set out to prevent.
269+
let ext = AzureCliExtension;
270+
let fm = fm();
271+
let ctx = CompileContext::for_test(&fm);
272+
let step = ext.prepare_steps(&ctx).into_iter().next().unwrap();
273+
274+
// Count setvariable occurrences — must be 2 (one per branch).
275+
let setvar_count = step
276+
.matches("##vso[task.setvariable variable=AW_AZ_MOUNTS]")
277+
.count();
278+
assert_eq!(
279+
setvar_count, 2,
280+
"AW_AZ_MOUNTS must be set in BOTH branches of the if/else (got {setvar_count}); \
281+
leaving it undefined in the missing-az branch causes bash to interpret \
282+
the literal `$(AW_AZ_MOUNTS)` as command substitution and fail under set -e. \
283+
Step:\n{step}"
284+
);
285+
286+
// Verify the else branch sets it to empty (no `--mount` chars
287+
// after the `]`). We slice the step from "else" to "fi" and
288+
// assert the else block contains a setvariable line that ends
289+
// with `]"` (closing-bracket-then-quote = empty value).
290+
let else_start = step.find("else").expect("must have else branch");
291+
let fi_end = step[else_start..].find("fi").expect("must have fi");
292+
let else_block = &step[else_start..else_start + fi_end];
293+
assert!(
294+
else_block.contains("##vso[task.setvariable variable=AW_AZ_MOUNTS]\""),
295+
"else branch must set AW_AZ_MOUNTS to empty string (line must end with `]\"`), got:\n{else_block}"
296+
);
297+
// And the else branch must NOT include any --mount arg (would
298+
// mean we're accidentally setting non-empty when az is missing).
299+
assert!(
300+
!else_block.contains("--mount"),
301+
"else branch must not contain --mount args (those belong to the detected branch only): {else_block}"
302+
);
303+
}
304+
243305
#[test]
244306
fn test_azure_cli_prepare_steps_uses_pipefail() {
245307
// Bash steps in this repo's lint policy require `set -eo

src/safeoutputs/mod.rs

Lines changed: 13 additions & 67 deletions
Original file line numberDiff line numberDiff line change
@@ -26,41 +26,6 @@ pub const ALWAYS_ON_TOOLS: &[&str] = tool_names![
2626
ReportIncompleteResult,
2727
];
2828

29-
/// Safe-output tools that perform write operations against ADO.
30-
/// Compile-time derived from tool types via `ToolResult::NAME`.
31-
///
32-
/// **Informational only.** The Stage 3 executor always receives a
33-
/// `SYSTEM_ACCESSTOKEN` (sourced from the pipeline's built-in
34-
/// `$(System.AccessToken)` by default, or from a `permissions.write` ARM
35-
/// service connection when one is configured). This list is kept so other
36-
/// compiler/runtime code (e.g. audit) can identify write-bearing tools, but
37-
/// the compiler no longer fails when one is configured without an ARM SC.
38-
///
39-
/// Adding a new write-requiring tool: create the struct with `tool_result!{ write = true, ... }`,
40-
/// then add its type to this list.
41-
#[allow(dead_code)]
42-
pub const WRITE_REQUIRING_SAFE_OUTPUTS: &[&str] = tool_names![
43-
CreateWorkItemResult,
44-
CommentOnWorkItemResult,
45-
UpdateWorkItemResult,
46-
CreatePrResult,
47-
CreateWikiPageResult,
48-
UpdateWikiPageResult,
49-
AddPrCommentResult,
50-
LinkWorkItemsResult,
51-
QueueBuildResult,
52-
CreateGitTagResult,
53-
AddBuildTagResult,
54-
CreateBranchResult,
55-
UpdatePrResult,
56-
UploadBuildAttachmentResult,
57-
UploadPipelineArtifactResult,
58-
UploadWorkitemAttachmentResult,
59-
SubmitPrReviewResult,
60-
ReplyToPrCommentResult,
61-
ResolvePrThreadResult,
62-
];
63-
6429
/// Non-MCP safe-output keys handled by the compiler/executor, not the MCP server.
6530
/// These must not appear in `--enabled-tools` or they cause real MCP tools to be
6631
/// filtered out (the router has no route for them).
@@ -810,17 +775,6 @@ pub use upload_workitem_attachment::*;
810775
mod tests {
811776
use super::*;
812777

813-
#[test]
814-
fn test_write_requiring_subset_of_all_known() {
815-
for name in WRITE_REQUIRING_SAFE_OUTPUTS {
816-
assert!(
817-
ALL_KNOWN_SAFE_OUTPUTS.contains(name),
818-
"WRITE_REQUIRING_SAFE_OUTPUTS entry '{}' is missing from ALL_KNOWN_SAFE_OUTPUTS",
819-
name
820-
);
821-
}
822-
}
823-
824778
#[test]
825779
fn test_always_on_subset_of_all_known() {
826780
for name in ALWAYS_ON_TOOLS {
@@ -876,40 +830,32 @@ mod tests {
876830
const { assert!(!ReportIncompleteResult::REQUIRES_WRITE); }
877831
}
878832

879-
/// Verify ALL_KNOWN_SAFE_OUTPUTS has exactly the right count:
880-
/// write tools + diagnostics + non-MCP keys.
833+
/// Verify ALL_KNOWN_SAFE_OUTPUTS contains no duplicate entries, and
834+
/// that the always-on and non-MCP sub-lists are disjoint.
881835
#[test]
882836
fn test_all_known_completeness() {
883-
// The three sub-lists must be disjoint — a tool in multiple lists would
884-
// be duplicated in ALL_KNOWN and the count would mismatch.
885-
for name in WRITE_REQUIRING_SAFE_OUTPUTS {
837+
// No duplicates: a tool name appearing twice in ALL_KNOWN would
838+
// mean `all_safe_output_names!` was given the same type twice
839+
// and would silently break tool routing.
840+
let mut seen = std::collections::HashSet::new();
841+
for name in ALL_KNOWN_SAFE_OUTPUTS {
886842
assert!(
887-
!ALWAYS_ON_TOOLS.contains(name),
888-
"Tool '{}' appears in both WRITE_REQUIRING and ALWAYS_ON — lists must be disjoint",
889-
name
890-
);
891-
assert!(
892-
!NON_MCP_SAFE_OUTPUT_KEYS.contains(name),
893-
"Tool '{}' appears in both WRITE_REQUIRING and NON_MCP — lists must be disjoint",
843+
seen.insert(*name),
844+
"ALL_KNOWN_SAFE_OUTPUTS contains duplicate entry '{}'",
894845
name
895846
);
896847
}
848+
849+
// ALWAYS_ON and NON_MCP must be disjoint — a diagnostic tool
850+
// that also appears as a non-MCP key would be both routed and
851+
// intercepted, giving inconsistent behaviour.
897852
for name in ALWAYS_ON_TOOLS {
898853
assert!(
899854
!NON_MCP_SAFE_OUTPUT_KEYS.contains(name),
900855
"Tool '{}' appears in both ALWAYS_ON and NON_MCP — lists must be disjoint",
901856
name
902857
);
903858
}
904-
905-
let expected = WRITE_REQUIRING_SAFE_OUTPUTS.len()
906-
+ ALWAYS_ON_TOOLS.len()
907-
+ NON_MCP_SAFE_OUTPUT_KEYS.len();
908-
assert_eq!(
909-
ALL_KNOWN_SAFE_OUTPUTS.len(),
910-
expected,
911-
"ALL_KNOWN_SAFE_OUTPUTS should be the union of write + diagnostic + non-MCP lists"
912-
);
913859
}
914860

915861
// ─── validate_git_ref_name ──────────────────────────────────────────────

tests/compiler_tests.rs

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4577,6 +4577,24 @@ fn test_default_pipeline_mounts_az_and_allows_azure_hosts() {
45774577
Compiled:\n{compiled}"
45784578
);
45794579

4580+
// (1a) Regression guard: `setvariable` for AW_AZ_MOUNTS must appear
4581+
// TWICE — once per branch of the if/else. If the missing-az branch
4582+
// skips the setvariable, ADO leaves the literal `$(AW_AZ_MOUNTS)`
4583+
// in the AWF bash step, where bash interprets it as a `$(...)`
4584+
// command substitution, attempts to run a program named
4585+
// `AW_AZ_MOUNTS`, gets exit 127, and `set -e` kills the pipeline —
4586+
// the exact failure mode this PR set out to prevent on runners
4587+
// without azure-cli installed.
4588+
let setvar_count = compiled
4589+
.matches("##vso[task.setvariable variable=AW_AZ_MOUNTS]")
4590+
.count();
4591+
assert_eq!(
4592+
setvar_count, 2,
4593+
"AW_AZ_MOUNTS must be set in BOTH branches of the detection step (got {setvar_count} \
4594+
occurrences); leaving it unset in the missing-az branch breaks `set -e` in the \
4595+
AWF invocation. See AzureCliExtension::prepare_steps for the rationale."
4596+
);
4597+
45804598
// (2) The AWF invocation must reference $(AW_AZ_MOUNTS) so the
45814599
// pipeline-variable value (the two --mount args, or empty) is
45824600
// word-split into the docker run command at runtime. Unquoted on

tests/safe-outputs/azure-cli.lock.yml

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)