Skip to content

Commit 9647ebc

Browse files
committed
ci(prcheck): use 'azldev component changed' for affected-component detection
Replace the bash 'git diff | grep /sources' step with structured azldev commands for lock validation, change detection, and scoped rendering. Pipeline step order: 1. Lock check -- 'azldev component update -a -q -O json', fail if any component has changed == true. Lock-update JSON published as a pipeline artifact for triage. 2. Changed-component detection -- 'azldev component changed --include-unchanged' writes the full per-component JSON to disk, published via ob_outputDirectory. The --include-unchanged flag ensures the JSON contains every known component, which is needed for the renderable-set filter in step 4. 3. Source/identity consistency tripwire -- hard-fail if any component reports sourcesChange == true with a changeType not in the allow-list {added, changed, deleted}. Prevents unauthenticated rewrites of the rendered 'sources' file under an existing component's identity. Data path is severed (subsequent steps skip); PR check remains advisory until ADO task 19179 removes job-level continueOnError. 4. Scoped render -- render set is the union of components flagged by 'azldev component changed' (inputs differ) and components whose spec tree was touched directly in the PR (git diff under specs/, mapped back to component names by compute_render_set.py). Deleted and unknown components are excluded via a renderable-set filter built from the full --include-unchanged JSON. 5. Prcheck API -- switches from --components <csv> to --changed-components-file <path>, filtering to entries with sourcesChange == true and changeType in {added, changed} (allow-list, mirroring the consistency tripwire). Also: * Add --changed-components-file flag (mutually exclusive with --components) and _load_components_from_file() to run_prcheck.py. Uses an allow-list of changeType values for defense-in-depth. * Add compute_render_set.py for render-set computation. * Document AZLDEV_ALLOW_ROOT in ADO pipeline instructions (OneBranch containers run as root, azldev refuses by default). * Mark changedComponentsFile pipeline variable as isreadonly=true. * Switch API_BASE_URL to $(ApiBaseDirectUrl) (bypasses AFD).
1 parent 4fcb765 commit 9647ebc

7 files changed

Lines changed: 381 additions & 74 deletions

File tree

.github/instructions/ado-pipeline.instructions.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -175,6 +175,18 @@ Avoid shell scripts beyond the smallest possible wiring (env exports, `##vso[...
175175

176176
Python scripts are easier to test locally, easier to review, and avoid the foot-guns of bash quoting / globbing.
177177

178+
### `azldev` in OneBranch
179+
180+
OneBranch runs all steps as `root`. `azldev` refuses to run many commands as root by default (a safety measure for developer workstations). To allow it in CI, set `AZLDEV_ALLOW_ROOT=1` -- but **set it inline as a prefix on each `azldev` invocation**, not at the step level. Inline scoping makes it obvious which calls actually need the override and avoids leaking the flag to unrelated commands in the same step.
181+
182+
```bash
183+
AZLDEV_ALLOW_ROOT=1 azldev component update --check-only -a -q
184+
```
185+
186+
Symptom that you need this: `ERR Error: this command may not be run as root`.
187+
188+
This is NOT safe for general use, only for disposable CI environments.
189+
178190
## Security hardening
179191

180192
Apply all of these unless there is a documented reason not to:

.github/workflows/ado/templates/sources-upload-stages.yml

Lines changed: 40 additions & 71 deletions
Original file line numberDiff line numberDiff line change
@@ -117,79 +117,48 @@ stages:
117117
env:
118118
AZLDEV_COMMIT: $(AzldevCommit)
119119
120+
# Verify lock files are current. --check-only validates without
121+
# writing, exits nonzero if any lock would change.
120122
- script: |
121123
set -euo pipefail
122-
123-
# Workaround for an ADO git config error during spec rendering.
124-
# The config key may not be present on every agent image, so tolerate its absence.
125-
git config --unset extensions.worktreeConfig || true
126-
127-
# Full history is needed for spec rendering to work.
128-
if [ "$(git rev-parse --is-shallow-repository)" = "true" ]; then
129-
echo "##[group]Fetching full git history"
130-
git fetch --unshallow
131-
echo "##[endgroup]"
132-
fi
133-
134-
specs_dir="$(azldev config dump -q -f json | jq -r '.project.renderedSpecsDir')"
135-
136-
echo "##[group]Specs rendering"
137-
azldev component render -q -a --clean-stale
138-
echo "##[endgroup]"
139-
140-
# Check for any new or modified files under the specs directory.
141-
if ! python3 .github/workflows/scripts/render-specs-check/check_rendered_specs.py --specs-dir "$specs_dir"; then
142-
echo "##[group]Git diff"
143-
git --no-pager diff -- "$specs_dir"
144-
echo "##[endgroup]"
145-
echo "##[error]Specs are not up to date. Run 'azldev component render -q -a --clean-stale' and try again."
146-
exit 1
147-
fi
148-
displayName: "Verify rendered specs are up to date"
149-
# TODO: Re-enabled failing once specs in the main branch are updated and stable.
150-
# ADO task: 19179
151-
continueOnError: true
152-
124+
.github/workflows/scripts/control-tower/verify_locks.sh \
125+
--output-file "$(Build.ArtifactStagingDirectory)/lock-update.json" \
126+
--publish-dir "$(ob_outputDirectory)"
127+
displayName: "Verify lock files are up to date"
128+
129+
# Detect changed components. azldev hard-fails if any component has
130+
# sourcesChange == true without a corresponding "identity change"
131+
# -- meaning the component's input fingerprint (.comp.toml + lock)
132+
# didn't move (changeType not in {added, changed, deleted}). That
133+
# combination would let attacker-supplied bytes ride into the
134+
# lookaside under an unchanged component identity, so we treat it
135+
# as hostile and fail closed (supply-chain drift tripwire).
136+
# --include-unchanged ensures the full component list is available
137+
# for downstream consumers.
153138
- script: |
154139
set -euo pipefail
155-
156-
cat << EOF
157-
NOTE: It's possible more components changed between the source and target commits than displayed below.
158-
This workflow only checks for updates in the 'sources' files and ignores all other changes.
159-
Only components present on the source commit are reported; components that exist only on the
160-
target branch are intentionally excluded. Pure renames are reported under their source-branch
161-
name, since the component name is part of the cache blob name we upload.
162-
EOF
163-
164-
# Diff direction is TARGET -> SOURCE so additions/modifications represent the source-side state:
165-
# * --no-renames decomposes a rename into an add (new path, source side) + delete (old path,
166-
# target side); we want the source-side name because the component name is part of the cache
167-
# blob name uploaded for the PR. It also silences git's diff.renameLimit warning on large diffs.
168-
# * --diff-filter=d (lowercase) excludes deletions, which here correspond to paths present only
169-
# on the target branch -- those components are not on the PR and must not be uploaded.
170-
# Net effect: the listed '*/sources' files are exactly those present on the source commit that
171-
# differ from target, including pure renames surfaced under the source-branch name.
172-
sources_files="$(
173-
git diff --no-renames --diff-filter=d "$TARGET_COMMIT" "$SOURCE_COMMIT" --name-only \
174-
| { grep '/sources$' || true; } \
175-
| sort -u
176-
)"
177-
178-
components=()
179-
while IFS= read -r line; do
180-
# Skip the spurious empty iteration when no '*/sources' files changed.
181-
if [[ -n "$line" ]]; then
182-
components+=("$(basename "$(dirname "$line")")")
183-
fi
184-
done <<< "$sources_files"
185-
186-
components="$(IFS=, ; echo "${components[*]}")"
187-
echo "Affected components: $components"
188-
echo "##vso[task.setvariable variable=components]$components"
189-
env:
190-
SOURCE_COMMIT: $(sourceCommit)
191-
TARGET_COMMIT: $(targetCommit)
192-
displayName: "Determine affected components"
140+
json_file="$(Build.ArtifactStagingDirectory)/changed-components/changed-components.json"
141+
.github/workflows/scripts/control-tower/compute_changed.sh \
142+
--output-file "$json_file" \
143+
--publish-dir "$(ob_outputDirectory)" \
144+
--source-commit "$(sourceCommit)" \
145+
--target-commit "$(targetCommit)"
146+
echo "##vso[task.setvariable variable=changedComponentsFile;isreadonly=true]$json_file"
147+
displayName: "Compute changed components"
148+
149+
# Render the components that need re-rendering (azldev-flagged plus
150+
# any with hand-edited spec files), in --check-only mode: azldev
151+
# renders to a staging area and diffs against the on-disk output
152+
# without writing. Exits nonzero if any component's rendered output
153+
# would change, catching both stale renders and direct hand-edits.
154+
- script: |
155+
set -euo pipefail
156+
.github/workflows/scripts/control-tower/render_and_verify.sh \
157+
--output-dir "$(Build.ArtifactStagingDirectory)/render-check" \
158+
--changed-components-file "$(changedComponentsFile)" \
159+
--source-commit "$(sourceCommit)" \
160+
--target-commit "$(targetCommit)"
161+
displayName: "Verify rendered specs are up to date"
193162
194163
- task: AzureCLI@2
195164
displayName: "Call Control Tower 'prcheck' API"
@@ -204,14 +173,14 @@ stages:
204173
--api-audience "$API_AUDIENCE" \
205174
--api-base-url "$API_BASE_URL" \
206175
--build-reason "$BUILD_REASON" \
207-
--components "$COMPONENTS" \
176+
--changed-components-file "$CHANGED_COMPONENTS_FILE" \
208177
--source-commit "$SOURCE_COMMIT" \
209178
--repo-uri "$UPSTREAM_REPO_URL"
210179
env:
211180
API_AUDIENCE: $(ApiAudience)
212181
API_BASE_URL: $(ApiBaseDirectUrl)
213182
BUILD_REASON: $(Build.Reason)
214-
COMPONENTS: $(components)
183+
CHANGED_COMPONENTS_FILE: $(changedComponentsFile)
215184
SOURCE_COMMIT: $(sourceCommit)
216185
# TODO: Target commit is not used. Will be needed once we move detection of affected components to CT.
217186
# ADO task: 18816
Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
#!/usr/bin/env bash
2+
# Compute the set of changed components between source and target commits.
3+
#
4+
# Writes the JSON to <output-file> and copies it into <publish-dir>/changed-components/
5+
# for triage artifact publication.
6+
#
7+
# 'azldev component changed' compares the source and target commits and emits
8+
# one entry per component with its 'changeType' ('added', 'changed',
9+
# 'unchanged', or 'deleted') plus a 'sourcesChange' flag indicating whether
10+
# the rendered 'sources' file differs between commits.
11+
#
12+
# azldev hard-fails if any component has sourcesChange == true without a
13+
# corresponding identity change (added/changed/deleted) -- supply-chain
14+
# drift protection.
15+
#
16+
# Only components with sourcesChange == true AND changeType in {added, changed}
17+
# are forwarded to Control Tower for upload.
18+
19+
set -euo pipefail
20+
21+
usage() { echo "Usage: $0 --output-file FILE --publish-dir DIR --source-commit SHA --target-commit SHA" >&2; exit 1; }
22+
23+
while [[ $# -gt 0 ]]; do
24+
case "$1" in
25+
--output-file) output_file="$2"; shift 2 ;;
26+
--publish-dir) publish_dir="$2"; shift 2 ;;
27+
--source-commit) source_commit="$2"; shift 2 ;;
28+
--target-commit) target_commit="$2"; shift 2 ;;
29+
*) usage ;;
30+
esac
31+
done
32+
[[ -z "${output_file:-}" || -z "${publish_dir:-}" || -z "${source_commit:-}" || -z "${target_commit:-}" ]] && usage
33+
34+
mkdir -p "$(dirname "$output_file")"
35+
36+
# Publish the changed-components JSON for post-mortem triage on EVERY exit
37+
# path, not just success -- if azldev hard-fails on a consistency tripwire
38+
# the partial JSON is exactly what an operator needs to investigate.
39+
publish_artifact() {
40+
if [[ -s "$output_file" ]]; then
41+
mkdir -p "$publish_dir/changed-components"
42+
cp "$output_file" "$publish_dir/changed-components/" || true
43+
fi
44+
}
45+
trap publish_artifact EXIT
46+
47+
echo "##[group]Computing changed components"
48+
AZLDEV_ALLOW_ROOT=1 azldev component changed --from "$target_commit" --to "$source_commit" -a --include-unchanged -O json > "$output_file"
49+
echo "##[endgroup]"
50+
51+
echo "##[group]Changed components (non-unchanged only)"
52+
jq '[.[] | select(.changeType != "unchanged")]' "$output_file"
53+
echo "##[endgroup]"
54+
55+
echo "##[group]Upload set (sourcesChange == true, changeType in {added, changed})"
56+
upload_count=$(jq -r '[.[] | select(.sourcesChange == true and (.changeType | IN("added", "changed")))] | length' "$output_file")
57+
jq -r '.[] | select(.sourcesChange == true and (.changeType | IN("added", "changed"))) | .component' "$output_file" | sort
58+
echo "Total: $upload_count component(s) to upload."
59+
echo "##[endgroup]"
Lines changed: 86 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,86 @@
1+
"""Compute the render set: components flagged by `azldev component changed`
2+
plus components whose spec tree was touched directly in the PR.
3+
4+
Emits one component name per line on stdout (azldev dedupes internally).
5+
"""
6+
7+
import argparse
8+
import json
9+
from pathlib import Path
10+
11+
12+
def _load_entries(path: Path) -> list[dict]:
13+
"""Load the changed-components JSON."""
14+
return json.loads(path.read_text(encoding="utf-8"))
15+
16+
17+
def _renderable_components(entries: list[dict]) -> set[str]:
18+
"""Component names that azldev can still render (everything except deleted)."""
19+
return {e["component"] for e in entries if e.get("changeType") != "deleted"}
20+
21+
22+
def from_changed(entries: list[dict]) -> list[str]:
23+
"""Components from `azldev component changed` JSON.
24+
25+
Includes anything that is not 'deleted' and either has a non-'unchanged'
26+
changeType or has sourcesChange=true (safety net).
27+
"""
28+
out = []
29+
for e in entries:
30+
change_type = e.get("changeType")
31+
sources_change = e.get("sourcesChange") is True
32+
if change_type == "deleted":
33+
continue
34+
if change_type == "unchanged" and not sources_change:
35+
continue
36+
out.append(e["component"])
37+
return out
38+
39+
40+
def from_specs_diff(path: Path, specs_dir: Path, renderable: set[str]) -> list[str]:
41+
"""Components with any modified file under specs_dir.
42+
43+
Spec layout is rigid: <specs_dir>/<first-char>/<component>/...
44+
so the component name is the second segment under specs_dir.
45+
46+
Only components in ``renderable`` are included -- a deleted component's
47+
.comp.toml is gone so ``azldev component render`` would fail, and a
48+
path that doesn't map to any known component is noise.
49+
"""
50+
prefix = str(specs_dir).rstrip("/") + "/"
51+
out = []
52+
for line in path.read_text(encoding="utf-8").splitlines():
53+
if not line.startswith(prefix):
54+
continue
55+
parts = line[len(prefix) :].split("/", 2)
56+
if len(parts) >= 2 and parts[1]:
57+
name = parts[1]
58+
if name in renderable:
59+
out.append(name)
60+
return out
61+
62+
63+
def main() -> None:
64+
p = argparse.ArgumentParser()
65+
p.add_argument("--changed-components-file", type=Path, required=True)
66+
p.add_argument("--specs-diff-file", type=Path, required=True)
67+
p.add_argument("--specs-dir", type=Path, required=True)
68+
args = p.parse_args()
69+
70+
entries = _load_entries(args.changed_components_file)
71+
renderable = _renderable_components(entries)
72+
73+
# Dedupe across both sources -- a component appearing in azldev's
74+
# changed list AND with hand-edited specs would otherwise print twice,
75+
# and a component with N modified spec files would print N times.
76+
# dict.fromkeys preserves first-seen order.
77+
names = dict.fromkeys([
78+
*from_changed(entries),
79+
*from_specs_diff(args.specs_diff_file, args.specs_dir, renderable),
80+
])
81+
for name in names:
82+
print(name)
83+
84+
85+
if __name__ == "__main__":
86+
main()
Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
#!/usr/bin/env bash
2+
# Render changed specs and verify the rendered tree is clean.
3+
4+
set -euo pipefail
5+
6+
usage() { echo "Usage: $0 --output-dir DIR --changed-components-file FILE --source-commit SHA --target-commit SHA" >&2; exit 1; }
7+
8+
while [[ $# -gt 0 ]]; do
9+
case "$1" in
10+
--output-dir) output_dir="$2"; shift 2 ;;
11+
--changed-components-file) changed_components_file="$2"; shift 2 ;;
12+
--source-commit) source_commit="$2"; shift 2 ;;
13+
--target-commit) target_commit="$2"; shift 2 ;;
14+
*) usage ;;
15+
esac
16+
done
17+
[[ -z "${output_dir:-}" || -z "${changed_components_file:-}" || -z "${source_commit:-}" || -z "${target_commit:-}" ]] && usage
18+
19+
# azldev's renderedSpecsDir is absolute. Translate to repo-relative
20+
# so it matches git's output ('git diff --name-only' always emits
21+
# repo-relative paths regardless of the path arg form).
22+
specs_dir_abs="$(AZLDEV_ALLOW_ROOT=1 azldev config dump -q -f json | jq -r '.project.renderedSpecsDir')"
23+
specs_dir="$(realpath --relative-to="$(pwd)" "$specs_dir_abs")"
24+
25+
# Capture git diff under the specs tree so the render set can
26+
# include components whose specs were edited directly (which
27+
# azldev's input-fingerprint view of "changed" would miss).
28+
# --no-renames prevents collapse of delete+add into a rename
29+
# entry which would lose the old path. The Python script
30+
# filters out deleted/unknown components using the full
31+
# changed-components JSON.
32+
mkdir -p "$output_dir"
33+
specs_diff_file="$output_dir/specs-diff.txt"
34+
git diff --no-renames --name-only "$target_commit" "$source_commit" -- "$specs_dir" > "$specs_diff_file"
35+
36+
# Render set is the union of:
37+
# - components flagged by 'azldev component changed' (inputs differ)
38+
# - components whose spec tree was touched directly in the PR
39+
changed=$(python3 .github/workflows/scripts/control-tower/compute_render_set.py \
40+
--changed-components-file "$changed_components_file" \
41+
--specs-diff-file "$specs_diff_file" \
42+
--specs-dir "$specs_dir")
43+
44+
if [[ -z "$changed" ]]; then
45+
echo "No changed components -- skipping render."
46+
else
47+
changed_count=$(echo "$changed" | wc -l)
48+
echo "Rendering $changed_count component(s) (azldev dedupes internally)..."
49+
echo "##[group]Render set"
50+
# shellcheck disable=SC2001
51+
echo "$changed" | sed 's/^/ - /'
52+
echo "##[endgroup]"
53+
echo "##[group]Specs rendering + verification"
54+
# --check-only renders to a staging area and diffs against on-disk specs.
55+
# Exits nonzero if any rendered file differs from what's committed.
56+
printf '%s\n' "$changed" | xargs -d '\n' env AZLDEV_ALLOW_ROOT=1 azldev component render --check-only --
57+
echo "##[endgroup]"
58+
fi

0 commit comments

Comments
 (0)