Skip to content

Commit 92c5d4d

Browse files
jamesadevineCopilot
andcommitted
fix(compile): quote pipeline name in generated YAML to handle colons and quotes
Front-matter agent names like `Daily safe-output smoke: noop` produced invalid YAML when substituted bare into the top-level `name:` line: ADO YAML parsers (and `yaml.safe_load`) saw the second colon as a mapping indicator and rejected the file with "Mapping values are not allowed in this context." A sister bug affected hand-quoted `displayName: "{{ agent_name }}"` lines in 1es-base.yml:37 and stage-base.yml:6: an agent name containing an embedded `"` would produce `displayName: "My "special" agent"` — invalid YAML. Introduce a single `{{ pipeline_name }}` template marker that emits the front-matter agent name plus the `-$(BuildID)` suffix as a YAML double-quoted scalar with proper escaping for `\`, `"`, `\n`, `\r`, `\t`, and ASCII control characters. The new `yaml_double_quoted` helper does the encoding; `$(...)` ADO macros pass through untouched because `$` has no special meaning inside a YAML double-quoted scalar. The marker replaces: * `{{ agent_name }}-$(BuildID)` on the top-level `name:` line in base.yml and 1es-base.yml, * hand-quoted `displayName: "{{ agent_name }}"` on the stage display name line in 1es-base.yml and stage-base.yml. The stage `displayName:` now carries the `-$(BuildID)` suffix as well. This is a small cosmetic change — stage labels in the ADO UI read e.g. "My agent-12345" instead of "My agent" — but it provides a clear visual link between the stage label and the build run. Other usages of `{{ agent_name }}` are unchanged. The marker is still used inside src/data/threat-analysis.md (a markdown body, not parsed as YAML), which is the only remaining safe substitution position. Tests: * 8 new unit tests for `yaml_double_quoted` cover plain strings, the original colon bug, backslash / embedded-quote / control-character escaping, ADO macro passthrough, and unicode. * 2 new integration tests (`test_compiled_yaml_survives_tricky_agent_name_{standalone,1es}`) compile fixtures whose names contain both embedded `"` and `:`, parse the output via serde_yaml, and assert the escaped forms appear verbatim in the generated `name:` and `displayName:` lines. * `test_compiled_yaml_structure` updated to require the new marker. * All 26 smoke-suite lock files in tests/safe-outputs regenerated with properly quoted names. * docs/template-markers.md documents the new marker and tightens the `{{ agent_name }}` warning to make clear it's only safe in non-YAML positions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 29b221e commit 92c5d4d

34 files changed

Lines changed: 428 additions & 214 deletions

docs/template-markers.md

Lines changed: 40 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -97,7 +97,46 @@ This distinction allows resources (like templates) to be available as pipeline r
9797

9898
## {{ agent_name }}
9999

100-
Should be replaced with the human-readable name from the front matter (e.g., "Daily Code Review"). This is used for display purposes like stage names.
100+
Should be replaced with the human-readable name from the front matter
101+
(e.g., `Daily Code Review`). The value is substituted **as-is**, with
102+
no quoting or escaping — front-matter `name` values are free-form and
103+
have not been validated against YAML scalar rules.
104+
105+
> ⚠️ This marker is only safe inside a position that is **not parsed as
106+
> YAML** (currently only `src/data/threat-analysis.md`, which is a
107+
> markdown body). Every YAML position inside the generated pipelines
108+
> that takes the agent name uses
109+
> [`{{ pipeline_name }}`](#-pipeline_name-) instead, which emits a
110+
> fully-quoted-and-escaped double-quoted YAML scalar so colons,
111+
> embedded `"`, and other plain-scalar-unsafe characters in the agent
112+
> name cannot break parsing.
113+
114+
## {{ pipeline_name }}
115+
116+
Should be replaced with the front-matter agent name plus the
117+
`-$(BuildID)` suffix, always emitted as a **YAML double-quoted
118+
scalar** with proper escaping for `\`, `"`, `\n`, `\r`, `\t`, and other
119+
ASCII control characters. Used for every `name:` and `displayName:`
120+
position in the generated YAML that takes the agent name.
121+
122+
For an agent named `My "special": agent`, this expands to:
123+
124+
```yaml
125+
name: "My \"special\": agent-$(BuildID)"
126+
```
127+
128+
`$(BuildID)` is an ADO macro and is expanded at queue time after YAML
129+
parsing; `$` has no special meaning inside a YAML double-quoted scalar
130+
so the macro passes through untouched.
131+
132+
This marker is used:
133+
134+
- in `src/data/base.yml` and `src/data/1es-base.yml` for the top-level
135+
pipeline `name:` line (the ADO run name),
136+
- in `src/data/1es-base.yml` and `src/data/stage-base.yml` for the
137+
stage `displayName:` (so the stage label includes the build number,
138+
giving a clear visual link between the ADO UI stage label and the
139+
underlying run).
101140

102141
## {{ engine_install_steps }}
103142

src/compile/common.rs

Lines changed: 106 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -996,6 +996,40 @@ pub fn sanitize_filename(name: &str) -> String {
996996
.join("-")
997997
}
998998

999+
/// Emit `s` as a YAML double-quoted scalar (always quoted, never plain).
1000+
///
1001+
/// We always quote because the value is substituted into YAML positions
1002+
/// where colons and other plain-scalar-unsafe characters are common in
1003+
/// agent names (e.g. `"Daily safe-output smoke: noop"`). A bare scalar
1004+
/// like `name: Daily safe-output smoke: noop-$(BuildID)` is invalid YAML
1005+
/// because the second colon is interpreted as a mapping indicator.
1006+
///
1007+
/// `$(...)` ADO macros pass through untouched — `$` has no special meaning
1008+
/// inside a YAML double-quoted scalar and ADO expands the macro at queue
1009+
/// time after YAML parsing.
1010+
///
1011+
/// `reject_pipeline_injection` already strips newlines and template /
1012+
/// pipeline-command sequences from front-matter `name` values, so the
1013+
/// escape table only has to cover `\` and `"`. Tabs and ASCII control
1014+
/// characters are escaped too as a belt-and-braces measure.
1015+
pub fn yaml_double_quoted(s: &str) -> String {
1016+
let mut out = String::with_capacity(s.len() + 2);
1017+
out.push('"');
1018+
for ch in s.chars() {
1019+
match ch {
1020+
'\\' => out.push_str("\\\\"),
1021+
'"' => out.push_str("\\\""),
1022+
'\n' => out.push_str("\\n"),
1023+
'\r' => out.push_str("\\r"),
1024+
'\t' => out.push_str("\\t"),
1025+
c if (c as u32) < 0x20 => out.push_str(&format!("\\x{:02x}", c as u32)),
1026+
c => out.push(c),
1027+
}
1028+
}
1029+
out.push('"');
1030+
out
1031+
}
1032+
9991033
/// Default self-hosted pool for 1ES templates.
10001034
pub const DEFAULT_ONEES_POOL: &str = "AZS-1ES-L-MMS-ubuntu-22.04";
10011035
/// Default Microsoft-hosted VM image for non-1ES templates.
@@ -2856,6 +2890,14 @@ pub async fn compile_shared(
28562890
let checkout_steps = generate_checkout_steps(&front_matter.checkout);
28572891
let checkout_self = generate_checkout_self();
28582892
let agent_name = sanitize_filename(&front_matter.name);
2893+
// Display string for every "name:"/"displayName:" position in the
2894+
// generated YAML that takes the agent name. Always emitted as a YAML
2895+
// double-quoted scalar with the `-$(BuildID)` suffix so:
2896+
// * colons and embedded `"` in the agent name can't break parsing,
2897+
// * ADO appends the build number to every run name, and
2898+
// * stage labels link visibly to the build via the same suffix.
2899+
let pipeline_name =
2900+
yaml_double_quoted(&format!("{}-$(BuildID)", front_matter.name));
28592901

28602902
// 3. Run extension validations
28612903
for ext in extensions {
@@ -3069,6 +3111,7 @@ pub async fn compile_shared(
30693111
("{{ checkout_repositories }}", &checkout_steps),
30703112
("{{ agent }}", &agent_name),
30713113
("{{ agent_name }}", &front_matter.name),
3114+
("{{ pipeline_name }}", &pipeline_name),
30723115
("{{ agent_description }}", &front_matter.description),
30733116
("{{ engine_run }}", &engine_run),
30743117
("{{ engine_run_detection }}", &engine_run_detection),
@@ -3996,6 +4039,69 @@ mod tests {
39964039
assert_eq!(sanitize_filename("test_case"), "test-case");
39974040
}
39984041

4042+
// ─── yaml_double_quoted ──────────────────────────────────────────────────
4043+
4044+
#[test]
4045+
fn test_yaml_double_quoted_plain_string() {
4046+
assert_eq!(yaml_double_quoted("hello"), r#""hello""#);
4047+
}
4048+
4049+
#[test]
4050+
fn test_yaml_double_quoted_string_with_colon_is_safe() {
4051+
// The bug this helper exists to fix: an agent name like
4052+
// "Daily safe-output smoke: noop" must not be emitted bare in the
4053+
// top-level pipeline `name:` line, where the second colon would
4054+
// be parsed as a YAML mapping indicator.
4055+
assert_eq!(
4056+
yaml_double_quoted("Daily safe-output smoke: noop-$(BuildID)"),
4057+
r#""Daily safe-output smoke: noop-$(BuildID)""#
4058+
);
4059+
}
4060+
4061+
#[test]
4062+
fn test_yaml_double_quoted_escapes_backslash() {
4063+
assert_eq!(yaml_double_quoted(r"a\b"), r#""a\\b""#);
4064+
}
4065+
4066+
#[test]
4067+
fn test_yaml_double_quoted_escapes_double_quote() {
4068+
assert_eq!(yaml_double_quoted(r#"say "hi""#), r#""say \"hi\"""#);
4069+
}
4070+
4071+
#[test]
4072+
fn test_yaml_double_quoted_escapes_whitespace_controls() {
4073+
assert_eq!(yaml_double_quoted("a\nb"), r#""a\nb""#);
4074+
assert_eq!(yaml_double_quoted("a\rb"), r#""a\rb""#);
4075+
assert_eq!(yaml_double_quoted("a\tb"), r#""a\tb""#);
4076+
}
4077+
4078+
#[test]
4079+
fn test_yaml_double_quoted_escapes_other_control_chars() {
4080+
// Bell (0x07) is a low ASCII control char — should escape as \x07.
4081+
assert_eq!(yaml_double_quoted("a\u{0007}b"), r#""a\x07b""#);
4082+
}
4083+
4084+
#[test]
4085+
fn test_yaml_double_quoted_passes_through_ado_macros() {
4086+
// $(BuildID), $(Build.SourcesDirectory) etc. have no special meaning
4087+
// inside a YAML double-quoted scalar; ADO expands them at queue time
4088+
// after YAML parsing.
4089+
assert_eq!(
4090+
yaml_double_quoted("$(Build.BuildId)/$(System.JobId)"),
4091+
r#""$(Build.BuildId)/$(System.JobId)""#
4092+
);
4093+
}
4094+
4095+
#[test]
4096+
fn test_yaml_double_quoted_passes_through_unicode() {
4097+
// Non-ASCII characters pass through as-is — YAML 1.2 supports UTF-8
4098+
// in double-quoted scalars natively.
4099+
assert_eq!(
4100+
yaml_double_quoted("résumé — 你好"),
4101+
r#""résumé — 你好""#
4102+
);
4103+
}
4104+
39994105
// ─── generate_pr_trigger ─────────────────────────────────────────────────
40004106

40014107
#[test]

src/data/1es-base.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
# This template extends the 1ES Unofficial Pipeline Template with Copilot CLI,
33
# AWF network isolation, and MCP Gateway — matching the standalone pipeline model.
44

5-
name: {{ agent_name }}-$(BuildID)
5+
name: {{ pipeline_name }}
66
{{ parameters }}
77
{{ schedule }}
88
{{ pr_trigger }}
@@ -34,7 +34,7 @@ extends:
3434
runPrerequisitesOnImage: false # Pool image has 1ES prerequisites preinstalled
3535
stages:
3636
- stage: AgentStage
37-
displayName: "{{ agent_name }}"
37+
displayName: {{ pipeline_name }}
3838
jobs:
3939
{{ setup_job }}
4040

src/data/base.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11

2-
name: {{ agent_name }}-$(BuildID)
2+
name: {{ pipeline_name }}
33
{{ parameters }}
44
resources:
55
repositories:

src/data/stage-base.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33

44
stages:
55
- stage: {{ stage_prefix }}
6-
displayName: "{{ agent_name }}"
6+
displayName: {{ pipeline_name }}
77
jobs:
88
{{ setup_job }}
99
- job: {{ stage_prefix }}_Agent

tests/compiler_tests.rs

Lines changed: 41 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,7 @@ fn assert_required_markers(content: &str) {
6969
"{{ checkout_repositories }}",
7070
"{{ allowed_domains }}",
7171
"{{ source_path }}",
72-
"{{ agent_name }}",
72+
"{{ pipeline_name }}",
7373
"{{ engine_run }}",
7474
"{{ compiler_version }}",
7575
"{{ integrity_check }}",
@@ -3528,6 +3528,46 @@ fn test_1es_compiled_output_is_valid_yaml() {
35283528
);
35293529
}
35303530

3531+
/// Names with embedded `"` and `:` must survive YAML escaping in both
3532+
/// the top-level `name:` line and any `displayName:` positions.
3533+
///
3534+
/// Regression: until `{{ pipeline_name }}` was introduced both positions
3535+
/// used a bare `{{ agent_name }}` substitution which broke if the
3536+
/// front-matter name contained colons (`name: a: b` parsed as a YAML
3537+
/// mapping) or embedded double quotes (`displayName: "a "b" c"` parsed
3538+
/// as broken scalars). Now both positions go through `yaml_double_quoted`
3539+
/// via a single `{{ pipeline_name }}` marker.
3540+
#[test]
3541+
fn test_compiled_yaml_survives_tricky_agent_name_standalone() {
3542+
let compiled = compile_fixture("tricky-name-agent.md");
3543+
assert_valid_yaml(&compiled, "tricky-name-agent.md");
3544+
3545+
// The top-level pipeline name must contain the escaped form of the
3546+
// embedded `"` (rendered as `\"`) AND retain the colon.
3547+
assert!(
3548+
compiled.contains(r#"name: "My \"special\": agent with quotes-$(BuildID)""#),
3549+
"standalone output should contain escaped pipeline name; got:\n{compiled}"
3550+
);
3551+
}
3552+
3553+
#[test]
3554+
fn test_compiled_yaml_survives_tricky_agent_name_1es() {
3555+
let compiled = compile_fixture("tricky-name-1es-agent.md");
3556+
assert_valid_yaml(&compiled, "tricky-name-1es-agent.md");
3557+
3558+
// Top-level pipeline name AND the 1ES stage displayName both go through
3559+
// the same `{{ pipeline_name }}` marker, so both carry the `-$(BuildID)`
3560+
// suffix and the escaped quotes.
3561+
assert!(
3562+
compiled.contains(r#"name: "My \"special\": agent with quotes (1ES)-$(BuildID)""#),
3563+
"1ES output should contain escaped pipeline name; got:\n{compiled}"
3564+
);
3565+
assert!(
3566+
compiled.contains(r#"displayName: "My \"special\": agent with quotes (1ES)-$(BuildID)""#),
3567+
"1ES output should contain escaped stage displayName; got:\n{compiled}"
3568+
);
3569+
}
3570+
35313571
/// Test that the minimal standalone fixture produces valid YAML with correct structure
35323572
#[test]
35333573
fn test_standalone_minimal_compiled_output_is_valid_yaml() {
Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
---
2+
name: 'My "special": agent with quotes (1ES)'
3+
description: "Fixture covering YAML escaping for embedded \" and : in name (1ES target)"
4+
target: 1es
5+
permissions:
6+
read: my-read-arm-connection
7+
---
8+
9+
## Tricky-Name Agent (1ES)
10+
11+
Fixture used by `tests/compiler_tests.rs` to verify that agent names
12+
containing embedded double quotes and colons survive YAML escaping for
13+
the 1ES target, where the stage's `displayName:` is generated via
14+
`{{ agent_display_name }}` and the top-level pipeline `name:` is
15+
generated via `{{ pipeline_display_name }}`.
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
---
2+
name: 'My "special": agent with quotes'
3+
description: "Fixture covering YAML escaping for embedded \" and : in name"
4+
target: standalone
5+
permissions:
6+
read: my-read-arm-connection
7+
---
8+
9+
## Tricky-Name Agent
10+
11+
Fixture used by `tests/compiler_tests.rs` to verify that agent names
12+
containing embedded double quotes and colons survive YAML escaping in
13+
both the top-level `name:` line (via `{{ pipeline_display_name }}`) and
14+
any `displayName:` positions (via `{{ agent_display_name }}`).

tests/safe-outputs/add-build-tag.lock.yml

Lines changed: 8 additions & 8 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)