Skip to content

Commit 023bba4

Browse files
committed
ci(prcheck): use 'azldev component changed' for affected-component detection
Replace the bash 'git diff | grep /sources' step with structured azldev commands for lock validation, change detection, and scoped rendering. Pipeline step order: 1. Lock check -- 'azldev component update -a -q -O json', fail if any component has changed == true. Lock-update JSON published as a pipeline artifact for triage. 2. Changed-component detection -- 'azldev component changed --include-unchanged' writes the full per-component JSON to disk, published via ob_outputDirectory. The --include-unchanged flag ensures the JSON contains every known component, which is needed for the renderable-set filter in step 4. 3. Source/identity consistency tripwire -- hard-fail if any component reports sourcesChange == true with a changeType not in the allow-list {added, changed, deleted}. Prevents unauthenticated rewrites of the rendered 'sources' file under an existing component's identity. Data path is severed (subsequent steps skip); PR check remains advisory until ADO task 19179 removes job-level continueOnError. 4. Scoped render -- render set is the union of components flagged by 'azldev component changed' (inputs differ) and components whose spec tree was touched directly in the PR (git diff under specs/, mapped back to component names by compute_render_set.py). Deleted and unknown components are excluded via a renderable-set filter built from the full --include-unchanged JSON. 5. Prcheck API -- switches from --components <csv> to --changed-components-file <path>, filtering to entries with sourcesChange == true and changeType in {added, changed} (allow-list, mirroring the consistency tripwire). Also: * Add --changed-components-file flag (mutually exclusive with --components) and _load_components_from_file() to run_prcheck.py. Uses an allow-list of changeType values for defense-in-depth. * Add compute_render_set.py for render-set computation. * Document AZLDEV_ALLOW_ROOT in ADO pipeline instructions (OneBranch containers run as root, azldev refuses by default). * Mark changedComponentsFile pipeline variable as isreadonly=true. * Switch API_BASE_URL to $(ApiBaseDirectUrl) (bypasses AFD).
1 parent 4fcb765 commit 023bba4

4 files changed

Lines changed: 319 additions & 52 deletions

File tree

.github/instructions/ado-pipeline.instructions.md

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -175,6 +175,17 @@ Avoid shell scripts beyond the smallest possible wiring (env exports, `##vso[...
175175

176176
Python scripts are easier to test locally, easier to review, and avoid the foot-guns of bash quoting / globbing.
177177

178+
### `azldev` in OneBranch
179+
180+
OneBranch run all steps as `root`. `azldev` refuses to run many commands as root by default (a safety measure for developer workstations). To allow it in CI, set the environment variable `AZLDEV_ALLOW_ROOT=1` if azldev returns the error `ERR Error: this command may not be run as root`. This is NOT safe for general use, only for use in disposable CI environments. Set it in the `env:` block of the step, not inline in the script body.
181+
182+
```yaml
183+
env:
184+
# OneBranch containers run as root. azldev refuses to run as root
185+
# by default, disable the root security check for this step.
186+
AZLDEV_ALLOW_ROOT: "1"
187+
```
188+
178189
## Security hardening
179190

180191
Apply all of these unless there is a documented reason not to:

.github/workflows/ado/templates/sources-upload-stages.yml

Lines changed: 153 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -120,23 +120,163 @@ stages:
120120
- script: |
121121
set -euo pipefail
122122
123-
# Workaround for an ADO git config error during spec rendering.
123+
# Workaround for an ADO git config error.
124124
# The config key may not be present on every agent image, so tolerate its absence.
125125
git config --unset extensions.worktreeConfig || true
126126
127-
# Full history is needed for spec rendering to work.
127+
# Full history is needed for lock resolution and spec rendering.
128128
if [ "$(git rev-parse --is-shallow-repository)" = "true" ]; then
129129
echo "##[group]Fetching full git history"
130130
git fetch --unshallow
131131
echo "##[endgroup]"
132132
fi
133133
134-
specs_dir="$(azldev config dump -q -f json | jq -r '.project.renderedSpecsDir')"
134+
echo "##[group]Updating lock files"
135+
lock_update_file="$ARTIFACT_STAGING_DIR/lock-update.json"
136+
azldev component update -a -q -O json > "$lock_update_file"
137+
echo "##[endgroup]"
138+
139+
# Also copy into ob_outputDirectory so OneBranch auto-publishes the
140+
# lock-update artifact for post-mortem triage.
141+
ob_lock_dir="$OB_OUTPUT_DIR/lock-update"
142+
mkdir -p "$ob_lock_dir"
143+
cp "$lock_update_file" "$ob_lock_dir/"
144+
145+
# Check if any locks drifted -- azldev tells us directly.
146+
# If locks are not valid, we can't guarantee that any further steps are valid, so fail the PR check immediately.
147+
# azldev emits a JSON 'null' when nothing changed; '. // []' coerces that to an
148+
# empty array so the downstream '[.[] | ...]' pipeline is well-typed.
149+
drifted=$(jq '(. // []) | [.[] | select(.changed == true)] | length' "$lock_update_file")
150+
if [[ "$drifted" -gt 0 ]]; then
151+
echo "##[error]$drifted lock file(s) are not up to date. Run 'azldev component update -a' and commit the result."
152+
echo "##[group]Drifted components"
153+
jq -r '(. // []) | .[] | select(.changed == true) | .component' "$lock_update_file"
154+
echo "##[endgroup]"
155+
exit 1
156+
fi
157+
displayName: "Verify lock files are up to date"
158+
env:
159+
ARTIFACT_STAGING_DIR: $(Build.ArtifactStagingDirectory)
160+
# OneBranch containers run as root. azldev refuses to run as root
161+
# by default, disable the root security check for this step.
162+
AZLDEV_ALLOW_ROOT: "1"
163+
OB_OUTPUT_DIR: $(ob_outputDirectory)
164+
165+
- script: |
166+
set -euo pipefail
167+
168+
artifact_dir="$ARTIFACT_STAGING_DIR/changed-components"
169+
json_file="$artifact_dir/changed-components.json"
170+
mkdir -p "$artifact_dir"
171+
172+
# 'azldev component changed' compares the source and target commits and emits one
173+
# entry per component with its 'changeType' ('added', 'changed', 'unchanged', or
174+
# 'deleted') plus a 'sourcesChange' flag indicating whether the rendered 'sources'
175+
# file differs between commits. Only components with sourcesChange == true AND
176+
# changeType != 'deleted' are forwarded to Control Tower for upload.
135177
136-
echo "##[group]Specs rendering"
137-
azldev component render -q -a --clean-stale
178+
echo "##[group]Computing changed components"
179+
azldev component changed --from "$TARGET_COMMIT" --to "$SOURCE_COMMIT" -a --include-unchanged -O json > "$json_file"
138180
echo "##[endgroup]"
139181
182+
echo "##[group]Changed components (non-unchanged only)"
183+
jq '[.[] | select(.changeType != "unchanged")]' "$json_file"
184+
echo "##[endgroup]"
185+
186+
echo "##[group]Upload set (sourcesChange == true, not deleted)"
187+
upload_count=$(jq -r '(. // []) | [.[] | select(.sourcesChange == true and .changeType != "deleted")] | length' "$json_file")
188+
jq -r '(. // []) | .[] | select(.sourcesChange == true and .changeType != "deleted") | .component' "$json_file" | sort
189+
echo "Total: $upload_count component(s) to upload."
190+
echo "##[endgroup]"
191+
192+
# Also write into ob_outputDirectory so OneBranch auto-publishes it
193+
# as a build artifact (explicit PublishPipelineArtifact is forbidden).
194+
ob_dir="$OB_OUTPUT_DIR/changed-components"
195+
mkdir -p "$ob_dir"
196+
cp "$json_file" "$ob_dir/"
197+
198+
echo "##vso[task.setvariable variable=changedComponentsFile;isreadonly=true]$json_file"
199+
env:
200+
ARTIFACT_STAGING_DIR: $(Build.ArtifactStagingDirectory)
201+
# OneBranch containers run as root. azldev refuses to run as root
202+
# by default, disable the root security check for this step.
203+
AZLDEV_ALLOW_ROOT: "1"
204+
OB_OUTPUT_DIR: $(ob_outputDirectory)
205+
SOURCE_COMMIT: $(sourceCommit)
206+
TARGET_COMMIT: $(targetCommit)
207+
displayName: "Compute changed components"
208+
209+
- script: |
210+
set -euo pipefail
211+
212+
# Source/identity consistency tripwire. The rendered 'sources' file
213+
# describes the lookaside tarballs Control Tower will fetch and serve to
214+
# builds. If a PR mutates that file but the component's input fingerprint
215+
# is unchanged, we cannot tell from this side whether that's a hostile
216+
# supply-chain swap or an accidental hand-edit / renderer non-determinism
217+
# bug. We treat any such record as hostile-by-default and hard-fail, so
218+
# the new bytes never reach the lookaside under an existing identity.
219+
#
220+
# The legitimate cases are explicitly enumerated as an allow-list (added,
221+
# changed, deleted) so any future 'changeType' value -- e.g. 'renamed'
222+
# that doesn't pull the sources blob into the input fingerprint -- fails
223+
# closed until the policy here is reviewed.
224+
bogus_count=$(jq -r '(. // []) | [.[] | select(.sourcesChange == true and (.changeType | IN("added", "changed", "deleted") | not))] | length' "$CHANGED_COMPONENTS_FILE")
225+
if [[ "$bogus_count" -gt 0 ]]; then
226+
echo "##[error]$bogus_count component(s) report a sources change without an accompanying identity change. Refusing to forward to Control Tower."
227+
jq -r '(. // []) | .[] | select(.sourcesChange == true and (.changeType | IN("added", "changed", "deleted") | not)) | " - " + .component + " (changeType=" + (.changeType // "null") + ")"' "$CHANGED_COMPONENTS_FILE"
228+
exit 1
229+
fi
230+
echo "OK: every sources change is accompanied by an identity change."
231+
# When this step fails, subsequent steps in the job are skipped -- the
232+
# bad data never reaches Control Tower. However, the job-level
233+
# continueOnError: true means the PR check still reports green to
234+
# GitHub. The consistency tripwire becomes truly blocking once ADO
235+
# task 19179 removes the job-level flag.
236+
env:
237+
CHANGED_COMPONENTS_FILE: $(changedComponentsFile)
238+
displayName: "Source/identity consistency check"
239+
240+
- script: |
241+
set -euo pipefail
242+
243+
# azldev's renderedSpecsDir is absolute. Translate to repo-relative
244+
# so it matches git's output ('git diff --name-only' always emits
245+
# repo-relative paths regardless of the path arg form).
246+
specs_dir_abs="$(azldev config dump -q -f json | jq -r '.project.renderedSpecsDir')"
247+
specs_dir="$(realpath --relative-to="$(pwd)" "$specs_dir_abs")"
248+
249+
# Capture git diff under the specs tree so the render set can
250+
# include components whose specs were edited directly (which
251+
# azldev's input-fingerprint view of "changed" would miss).
252+
# --no-renames prevents collapse of delete+add into a rename
253+
# entry which would lose the old path. The Python script
254+
# filters out deleted/unknown components using the full
255+
# changed-components JSON.
256+
specs_diff_file="$ARTIFACT_STAGING_DIR/specs-diff.txt"
257+
git diff --no-renames --name-only "$TARGET_COMMIT" "$SOURCE_COMMIT" -- "$specs_dir" > "$specs_diff_file"
258+
259+
# Render set is the union of:
260+
# - components flagged by 'azldev component changed' (inputs differ)
261+
# - components whose spec tree was touched directly in the PR
262+
changed=$(python3 .github/workflows/scripts/control-tower/compute_render_set.py \
263+
--changed-components-file "$CHANGED_COMPONENTS_FILE" \
264+
--specs-diff-file "$specs_diff_file" \
265+
--specs-dir "$specs_dir")
266+
267+
if [[ -z "$changed" ]]; then
268+
echo "No changed components -- skipping render."
269+
else
270+
changed_count=$(echo "$changed" | wc -l)
271+
echo "Rendering $changed_count component(s) (azldev dedupes internally)..."
272+
echo "##[group]Render set"
273+
echo "$changed" | sed 's/^/ - /'
274+
echo "##[endgroup]"
275+
echo "##[group]Specs rendering"
276+
printf '%s\n' "$changed" | xargs azldev component render -q --
277+
echo "##[endgroup]"
278+
fi
279+
140280
# Check for any new or modified files under the specs directory.
141281
if ! python3 .github/workflows/scripts/render-specs-check/check_rendered_specs.py --specs-dir "$specs_dir"; then
142282
echo "##[group]Git diff"
@@ -145,51 +285,15 @@ stages:
145285
echo "##[error]Specs are not up to date. Run 'azldev component render -q -a --clean-stale' and try again."
146286
exit 1
147287
fi
148-
displayName: "Verify rendered specs are up to date"
149-
# TODO: Re-enabled failing once specs in the main branch are updated and stable.
150-
# ADO task: 19179
151-
continueOnError: true
152-
153-
- script: |
154-
set -euo pipefail
155-
156-
cat << EOF
157-
NOTE: It's possible more components changed between the source and target commits than displayed below.
158-
This workflow only checks for updates in the 'sources' files and ignores all other changes.
159-
Only components present on the source commit are reported; components that exist only on the
160-
target branch are intentionally excluded. Pure renames are reported under their source-branch
161-
name, since the component name is part of the cache blob name we upload.
162-
EOF
163-
164-
# Diff direction is TARGET -> SOURCE so additions/modifications represent the source-side state:
165-
# * --no-renames decomposes a rename into an add (new path, source side) + delete (old path,
166-
# target side); we want the source-side name because the component name is part of the cache
167-
# blob name uploaded for the PR. It also silences git's diff.renameLimit warning on large diffs.
168-
# * --diff-filter=d (lowercase) excludes deletions, which here correspond to paths present only
169-
# on the target branch -- those components are not on the PR and must not be uploaded.
170-
# Net effect: the listed '*/sources' files are exactly those present on the source commit that
171-
# differ from target, including pure renames surfaced under the source-branch name.
172-
sources_files="$(
173-
git diff --no-renames --diff-filter=d "$TARGET_COMMIT" "$SOURCE_COMMIT" --name-only \
174-
| { grep '/sources$' || true; } \
175-
| sort -u
176-
)"
177-
178-
components=()
179-
while IFS= read -r line; do
180-
# Skip the spurious empty iteration when no '*/sources' files changed.
181-
if [[ -n "$line" ]]; then
182-
components+=("$(basename "$(dirname "$line")")")
183-
fi
184-
done <<< "$sources_files"
185-
186-
components="$(IFS=, ; echo "${components[*]}")"
187-
echo "Affected components: $components"
188-
echo "##vso[task.setvariable variable=components]$components"
189288
env:
289+
ARTIFACT_STAGING_DIR: $(Build.ArtifactStagingDirectory)
290+
# OneBranch containers run as root. azldev refuses to run as root
291+
# by default, disable the root security check for this step.
292+
AZLDEV_ALLOW_ROOT: "1"
293+
CHANGED_COMPONENTS_FILE: $(changedComponentsFile)
190294
SOURCE_COMMIT: $(sourceCommit)
191295
TARGET_COMMIT: $(targetCommit)
192-
displayName: "Determine affected components"
296+
displayName: "Render changed specs and verify rendered tree"
193297
194298
- task: AzureCLI@2
195299
displayName: "Call Control Tower 'prcheck' API"
@@ -204,14 +308,14 @@ stages:
204308
--api-audience "$API_AUDIENCE" \
205309
--api-base-url "$API_BASE_URL" \
206310
--build-reason "$BUILD_REASON" \
207-
--components "$COMPONENTS" \
311+
--changed-components-file "$CHANGED_COMPONENTS_FILE" \
208312
--source-commit "$SOURCE_COMMIT" \
209313
--repo-uri "$UPSTREAM_REPO_URL"
210314
env:
211315
API_AUDIENCE: $(ApiAudience)
212316
API_BASE_URL: $(ApiBaseDirectUrl)
213317
BUILD_REASON: $(Build.Reason)
214-
COMPONENTS: $(components)
318+
CHANGED_COMPONENTS_FILE: $(changedComponentsFile)
215319
SOURCE_COMMIT: $(sourceCommit)
216320
# TODO: Target commit is not used. Will be needed once we move detection of affected components to CT.
217321
# ADO task: 18816
Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,84 @@
1+
"""Compute the render set: components flagged by `azldev component changed`
2+
plus components whose spec tree was touched directly in the PR.
3+
4+
Emits one component name per line on stdout (azldev dedupes internally).
5+
"""
6+
7+
import argparse
8+
import json
9+
from pathlib import Path
10+
11+
12+
def _load_entries(path: Path) -> list[dict]:
13+
"""Load the changed-components JSON, coercing null to []."""
14+
return json.loads(path.read_text(encoding="utf-8")) or []
15+
16+
17+
def _renderable_components(entries: list[dict]) -> set[str]:
18+
"""Component names that azldev can still render (everything except deleted)."""
19+
return {
20+
e["component"]
21+
for e in entries
22+
if e.get("changeType") != "deleted"
23+
}
24+
25+
26+
def from_changed(entries: list[dict]) -> list[str]:
27+
"""Components from `azldev component changed` JSON.
28+
29+
Includes anything that is not 'deleted' and either has a non-'unchanged'
30+
changeType or has sourcesChange=true (safety net).
31+
"""
32+
out = []
33+
for e in entries:
34+
change_type = e.get("changeType")
35+
sources_change = e.get("sourcesChange") is True
36+
if change_type == "deleted":
37+
continue
38+
if change_type == "unchanged" and not sources_change:
39+
continue
40+
out.append(e["component"])
41+
return out
42+
43+
44+
def from_specs_diff(path: Path, specs_dir: Path, renderable: set[str]) -> list[str]:
45+
"""Components with any modified file under specs_dir.
46+
47+
Spec layout is rigid: <specs_dir>/<first-char>/<component>/...
48+
so the component name is the second segment under specs_dir.
49+
50+
Only components in ``renderable`` are included -- a deleted component's
51+
.comp.toml is gone so ``azldev component render`` would fail, and a
52+
path that doesn't map to any known component is noise.
53+
"""
54+
prefix = str(specs_dir).rstrip("/") + "/"
55+
out = []
56+
for line in path.read_text(encoding="utf-8").splitlines():
57+
if not line.startswith(prefix):
58+
continue
59+
parts = line[len(prefix):].split("/", 2)
60+
if len(parts) >= 2 and parts[1]:
61+
name = parts[1]
62+
if name in renderable:
63+
out.append(name)
64+
return out
65+
66+
67+
def main() -> None:
68+
p = argparse.ArgumentParser()
69+
p.add_argument("--changed-components-file", type=Path, required=True)
70+
p.add_argument("--specs-diff-file", type=Path, required=True)
71+
p.add_argument("--specs-dir", type=Path, required=True)
72+
args = p.parse_args()
73+
74+
entries = _load_entries(args.changed_components_file)
75+
renderable = _renderable_components(entries)
76+
77+
for name in from_changed(entries):
78+
print(name)
79+
for name in from_specs_diff(args.specs_diff_file, args.specs_dir, renderable):
80+
print(name)
81+
82+
83+
if __name__ == "__main__":
84+
main()

0 commit comments

Comments
 (0)