Commit ff517e7

dmcilvaney authored and christopherco committed
ci(pipeline): add package build to Control Tower integration
Add a 'Submit package build to Control Tower' step that calls the
/api/Scenario/package endpoint after the prcheck step succeeds. The step is
skipped for PR triggers so unmerged code never kicks off a build.

Naming: the pipeline now does more than source upload, so rename to reflect
what it actually does (call Control Tower).

* sources-upload.yml -> control-tower-integration.yml
* sources-upload-stages.yml -> control-tower-integration-stages.yml
* Stage PRCheck -> Integration
* Job CallControlTowerAPI -> UploadAndBuild

Both API calls live in a single job. Two jobs would have doubled the
fixed-cost OneBranch SDL/binary-analysis injections, required a cross-job
artifact handoff for the changed-components JSON, and bought us only marginal
isolation -- upload finishes in minutes and the package-build call only
briefly polls to confirm acceptance.

The ADO pipeline definition's 'YAML file path' must be updated in the portal
from sources-upload.yml to control-tower-integration.yml when this lands.

Package-build step details:

* condition: and(succeeded(), ne(Build.Reason, PullRequest))
* Reuses the changed-components JSON from earlier in the job, filters to
  changeType in {added, changed}, submits with packageTarget=azl4,
  isScratchBuild=true.
* run_package_build.py polls briefly (default 5 min) just to confirm the job
  left Queued; full build progress is monitored in CT itself.
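The build-set filter named in the step details can be illustrated with a small, self-contained sketch. The field names `changeType` and `component` come from this commit; the helper name `select_build_set`, the sample entries, and the `"deleted"` change-type value are hypothetical and used only for illustration.

```python
import json

# Change types that trigger a rebuild, per the commit message.
BUILD_CHANGE_TYPES = {"added", "changed"}


def select_build_set(entries: list) -> list[str]:
    """Return sorted, de-duplicated names of components that need a rebuild."""
    names = set()
    for entry in entries:
        if not isinstance(entry, dict):
            continue  # ignore malformed entries instead of failing the run
        name = entry.get("component")
        if (
            entry.get("changeType") in BUILD_CHANGE_TYPES
            and isinstance(name, str)
            and name
        ):
            names.add(name)
    return sorted(names)


# Hypothetical sample shaped like the changed-components JSON.
SAMPLE = json.loads(
    """
    [
      {"component": "curl", "changeType": "changed", "sourcesChange": false},
      {"component": "zlib", "changeType": "added", "sourcesChange": true},
      {"component": "oldpkg", "changeType": "deleted", "sourcesChange": false}
    ]
    """
)

if __name__ == "__main__":
    print(select_build_set(SAMPLE))
```

Note that `curl` is selected even though its `sourcesChange` flag is false: the filter keys on `changeType` only.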
1 parent ad99659 commit ff517e7

5 files changed

Lines changed: 319 additions & 14 deletions

.github/workflows/ado/sources-upload.yml

Lines changed: 7 additions & 1 deletion
@@ -7,7 +7,13 @@
 # .github/workflows/ado/templates/sources-upload-stages.yml
 #
 # Authenticates via Workload Identity Federation (OIDC) and calls the Control
-# Tower prcheck API with PR context.
+# Tower APIs to:
+#   1. Validate that the rendered sources of every changed component can be
+#      fetched from the lookaside (prcheck). The actual upload happens later
+#      from the merge queue, not here.
+#   2. Submit scratch package builds for changed components.
+#
+# Helper scripts live under .github/workflows/scripts/control-tower/.
 #
 # Prerequisites (ADO / Azure Portal):
 #   1. Entra ID App Registration with audience URI

.github/workflows/ado/templates/sources-upload-stages.yml

Lines changed: 29 additions & 3 deletions
@@ -1,6 +1,6 @@
 # Microsoft Corporation
 #
-# Raw stages template for the sources-upload PR check pipeline.
+# Raw stages template for the Control Tower integration pipeline.
 #
 # This template is OneBranch-agnostic: it declares the stages/jobs/steps that
 # do the actual work and exposes the OneBranch-coupled knobs as parameters.
@@ -27,9 +27,9 @@ parameters:
     type: number
 
 stages:
-  - stage: PRCheck
+  - stage: Integration
     jobs:
-      - job: CallControlTowerAPI
+      - job: UploadAndBuild
         # Non-blocking PR check: failing steps still render red error annotations
         # and surface in the build-issues view, but the job resolves to
         # SucceededWithIssues and the run to PartiallySucceeded, which the Azure
@@ -186,3 +186,29 @@ stages:
               # ADO task: 18816
               TARGET_COMMIT: $(targetCommit)
               UPSTREAM_REPO_URL: $(Build.Repository.Uri)
+
+          - task: AzureCLI@2
+            displayName: "Submit package build to Control Tower"
+            inputs:
+              azureSubscription: ${{ parameters.serviceConnection }}
+              scriptType: bash
+              scriptLocation: inlineScript
+              inlineScript: |
+                set -euo pipefail
+
+                python3 .github/workflows/scripts/control-tower/run_package_build.py \
+                  --api-audience "$API_AUDIENCE" \
+                  --api-base-url "$API_BASE_URL" \
+                  --build-reason "$BUILD_REASON" \
+                  --changed-components-file "$CHANGED_COMPONENTS_FILE" \
+                  --package-target azl4 \
+                  --official-build \
+                  --commit-sha "$SOURCE_COMMIT" \
+                  --repo-uri "$UPSTREAM_REPO_URL"
+            env:
+              API_AUDIENCE: $(ApiAudience)
+              API_BASE_URL: $(ApiBaseDirectUrl)
+              BUILD_REASON: $(Build.Reason)
+              CHANGED_COMPONENTS_FILE: $(changedComponentsFile)
+              SOURCE_COMMIT: $(sourceCommit)
+              UPSTREAM_REPO_URL: $(Build.Repository.Uri)
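To see how the flags passed by this step end up in the request body, here is a minimal sketch that mirrors the flag-to-payload mapping in run_package_build.py from this commit. The helper name `build_payload` and the sample argument values are hypothetical, and the `packages` list and authentication are left out for brevity.

```python
import argparse


def build_payload(argv: list[str]) -> dict:
    """Parse a subset of the script's flags and build the request body."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--package-target", required=True)
    parser.add_argument("--repo-uri", required=True)
    parser.add_argument("--build-reason", required=True)
    parser.add_argument("--commit-sha", default=None)
    # store_true: the flag is False unless it appears on the command line.
    parser.add_argument("--official-build", action="store_true", default=False)
    args = parser.parse_args(argv)

    payload = {
        "repoUri": args.repo_uri,
        "packageTarget": args.package_target,
        # Scratch is the default; passing --official-build turns it off.
        "isScratchBuild": not args.official_build,
        "buildReason": args.build_reason,
    }
    if args.commit_sha is not None:
        payload["commitSha"] = args.commit_sha
    return payload


# Hypothetical argument values, for illustration only.
BASE_ARGS = [
    "--package-target", "azl4",
    "--repo-uri", "https://example.invalid/repo.git",
    "--build-reason", "IndividualCI",
]

if __name__ == "__main__":
    print(build_payload(BASE_ARGS))
    print(build_payload(BASE_ARGS + ["--official-build"]))
```

With the flag set as in the step above, the sketch yields `isScratchBuild: false`; omitting `--official-build` yields `isScratchBuild: true`.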

.github/workflows/scripts/control-tower/client.py

Lines changed: 10 additions & 6 deletions
@@ -268,16 +268,20 @@ def poll_until_terminal(
     job_id: str,
     poll_interval_seconds: int,
     poll_timeout_seconds: int,
-) -> Optional[dict]:
+) -> tuple[dict, bool]:
     """Poll the job status until it reaches a terminal state or the timeout expires.
 
-    Returns the final status dict, or ``None`` if the local timeout was hit
-    before the job reached a terminal state.
+    Returns ``(last_status_dict, timed_out)``:
+    - ``timed_out == False``: the job reached a terminal state, last_status_dict
+      is that final state.
+    - ``timed_out == True``: the local timeout expired first, last_status_dict
+      is the most recent non-terminal observation (caller can inspect
+      ``status`` to distinguish "still Queued" from "Running").
     """
     start = time.monotonic()
     deadline = start + poll_timeout_seconds
     previous_status: Optional[str] = None
-    job_status_object: Optional[dict] = None
+    job_status_object: dict = {}
 
     while True:
         job_status_object = get_job_status(
@@ -307,15 +311,15 @@ def poll_until_terminal(
             )
 
         if current_status not in NON_TERMINAL_STATUSES:
-            return job_status_object
+            return job_status_object, False
 
         remaining = deadline - time.monotonic()
         if remaining <= 0:
             print(
                 f"##[warning]Local poll timeout of {poll_timeout_seconds}s reached "
                 f"while job {job_id} was still in status '{current_status}'."
             )
-            return None
+            return job_status_object, True
 
         time.sleep(min(poll_interval_seconds, max(1, int(remaining))))
 
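The new `(last_status_dict, timed_out)` return contract can be exercised without a live service. The sketch below is a simplified stand-in, not the function from client.py: `poll_sketch`, `FakeClock`, the injected `get_status` callable, and the contents of `NON_TERMINAL_STATUSES` are all assumptions made for illustration.

```python
import time
from typing import Callable

# Assumed contents; the real set is defined in client.py and is not shown in this diff.
NON_TERMINAL_STATUSES = {"Queued", "Running"}


def poll_sketch(
    get_status: Callable[[], dict],
    interval_s: float,
    timeout_s: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> tuple[dict, bool]:
    """Poll until a terminal status or the deadline; return (last_status, timed_out)."""
    deadline = clock() + timeout_s
    while True:
        last = get_status()
        if last.get("status") not in NON_TERMINAL_STATUSES:
            return last, False  # terminal state reached
        remaining = deadline - clock()
        if remaining <= 0:
            return last, True  # deadline hit; hand back the last observation
        sleep(min(interval_s, remaining))


class FakeClock:
    """Deterministic clock for the demo: time advances only when we 'sleep'."""

    def __init__(self) -> None:
        self.now = 0.0

    def monotonic(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


if __name__ == "__main__":
    clk = FakeClock()
    last, timed_out = poll_sketch(
        lambda: {"status": "Running"}, 10, 25, sleep=clk.sleep, clock=clk.monotonic
    )
    print(last["status"], timed_out)
```

Returning the last observation instead of `None` lets the caller report whether the job was still queued or already running when the local timeout expired.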

Lines changed: 267 additions & 0 deletions
@@ -0,0 +1,267 @@
+"""Submit a package-build job to the Control Tower service and wait briefly.
+
+Flow:
+  1. Read the changed-components JSON.
+  2. Filter to the build set: ``changeType in {added, changed}`` -- any
+     component whose inputs changed needs a rebuild, regardless of whether
+     its ``sourcesChange`` flag is set.
+  3. POST ``/api/Scenario/package`` with the build request.
+  4. Poll briefly (default 5 min) until the job reaches a terminal state
+     (success or failure) or the local timeout expires. The goal is to
+     catch jobs that fail immediately on submission, not to wait for the
+     full build -- a non-terminal status at timeout is treated as
+     acceptance and the build continues async.
+  5. Exit 0 if the job started (or completed). Exit 1 only on submission
+     failure or immediate terminal failure.
+"""
+
+import argparse
+import json
+import sys
+from pathlib import Path
+
+from azure.identity import DefaultAzureCredential
+
+import client as ct
+
+
+def _load_build_components(path: Path) -> list[str]:
+    """Filter the ``azldev component changed`` JSON to the build set.
+
+    The build set is every component with ``changeType`` in ``{added, changed}``
+    — these are the components whose inputs differ between source and target
+    and therefore need a rebuild. Unlike the upload set, we do NOT filter on
+    ``sourcesChange`` here: a component can need a rebuild even if its source
+    tarballs didn't change (e.g. an overlay or build-config change).
+
+    Deleted components are excluded — there is nothing to build.
+    """
+    try:
+        raw = path.read_text(encoding="utf-8")
+    except OSError as exc:
+        raise SystemExit(
+            f"##[error]Failed to read --changed-components-file {path!s}: {exc}"
+        ) from exc
+
+    try:
+        entries = json.loads(raw)
+    except json.JSONDecodeError as exc:
+        raise SystemExit(
+            f"##[error]--changed-components-file {path!s} is not valid JSON: {exc}"
+        ) from exc
+
+    if not isinstance(entries, list):
+        raise SystemExit(
+            f"##[error]--changed-components-file {path!s} top-level value "
+            f"must be a JSON array (got {type(entries).__name__})."
+        )
+
+    build_change_types = {"added", "changed"}
+    components: list[str] = []
+    for entry in entries:
+        if not isinstance(entry, dict):
+            continue
+        if entry.get("changeType") in build_change_types:
+            name = entry.get("component")
+            if isinstance(name, str) and name:
+                components.append(name)
+
+    return sorted(set(components))
+
+
+def _parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(
+        description="Submit a package-build job to the Control Tower service.",
+    )
+    parser.add_argument(
+        "--api-audience",
+        required=True,
+        help="Entra ID audience URI (e.g. api://<client-id>)",
+    )
+    parser.add_argument(
+        "--api-base-url",
+        required=True,
+        help="Base URL of the Control Tower service",
+    )
+    parser.add_argument(
+        "--build-reason",
+        required=True,
+        help="ADO build reason (PullRequest, IndividualCI, ...). Used for the "
+        "local skip guard -- package builds are not submitted for PR triggers.",
+    )
+    parser.add_argument(
+        "--changed-components-file",
+        required=True,
+        type=Path,
+        help="Path to the raw JSON output of 'azldev component changed -a -O json'.",
+    )
+    parser.add_argument(
+        "--package-target",
+        required=True,
+        help="Package target identifier (e.g. 'azl4').",
+    )
+    parser.add_argument(
+        "--repo-uri",
+        required=True,
+        help="Upstream repository URI.",
+    )
+    parser.add_argument(
+        "--commit-sha",
+        default=None,
+        help="Source commit SHA to build from.",
+    )
+    parser.add_argument(
+        "--branch",
+        default=None,
+        help="Source branch name (alternative to --commit-sha).",
+    )
+    parser.add_argument(
+        "--official-build",
+        action="store_true",
+        default=False,
+        help="Submit as a non-scratch (official, persisted) build. The default "
+        "is to submit a scratch build -- official is opt-in so the caller has "
+        "to explicitly say they want a persisted artifact.",
+    )
+    parser.add_argument(
+        "--poll-interval-seconds",
+        type=int,
+        default=10,
+        help="How often to poll the job status endpoint (default: 10).",
+    )
+    parser.add_argument(
+        "--poll-timeout-seconds",
+        type=int,
+        default=300,
+        help=(
+            "Maximum time to wait for the job to reach a terminal state "
+            "(default: 300 = 5 min). This is NOT the build timeout -- we "
+            "just want to catch jobs that fail immediately on submission. "
+            "A non-terminal status at timeout is treated as acceptance."
+        ),
+    )
+    return parser.parse_args()
+
+
+def main() -> None:
+    args = _parse_args()
+
+    if args.poll_interval_seconds <= 0:
+        print("##[error]--poll-interval-seconds must be a positive integer.")
+        sys.exit(2)
+    if args.poll_timeout_seconds <= 0:
+        print("##[error]--poll-timeout-seconds must be a positive integer.")
+        sys.exit(2)
+
+    components = _load_build_components(args.changed_components_file)
+
+    base_url = args.api_base_url.rstrip("/")
+
+    if args.build_reason == "PullRequest":
+        print(
+            "Skipping Control Tower call -- pull request triggers do not submit "
+            "package builds (unmerged code should not consume build capacity)."
+        )
+        return
+
+    if not components:
+        print("No components need a rebuild -- skipping package-build submission.")
+        return
+
+    # ── Build payload ────────────────────────────────────────────────
+    payload: dict = {
+        "repoUri": args.repo_uri,
+        "packageTarget": args.package_target,
+        "packages": components,
+        "isScratchBuild": not args.official_build,
+        "buildReason": args.build_reason,
+    }
+    if args.commit_sha is not None:
+        payload["commitSha"] = args.commit_sha
+    if args.branch is not None:
+        payload["branch"] = args.branch
+
+    print("Calling Control Tower 'package' endpoint...")
+    print("Payload:")
+    print(json.dumps(payload, indent=2))
+
+    # ── Acquire bearer token ─────────────────────────────────────────
+    credential = DefaultAzureCredential()
+    token_holder = ct.TokenHolder(token=ct.get_token(credential, args.api_audience))
+
+    session = ct.make_session()
+
+    # ── Submit build ─────────────────────────────────────────────────
+    try:
+        build_response = ct.post_scenario(
+            session,
+            base_url,
+            "/api/Scenario/package",
+            credential,
+            args.api_audience,
+            token_holder,
+            payload,
+            context="package-build",
+        )
+    except RuntimeError as exc:
+        print(f"##[error]{exc}")
+        sys.exit(1)
+
+    print("package-build response:")
+    print(json.dumps(build_response, indent=2, default=str))
+
+    job_id = build_response.get("jobId")
+    if not job_id:
+        print(
+            "##[error]Control Tower 'package' response did not include a 'jobId'. "
+            "Cannot confirm job acceptance."
+        )
+        sys.exit(1)
+
+    # ── Brief poll — just confirm the job was accepted ───────────────
+    print(
+        f"Polling job {job_id} for up to {args.poll_timeout_seconds}s to confirm "
+        f"acceptance (not waiting for full build completion)..."
+    )
+    try:
+        final, timed_out = ct.poll_until_terminal(
+            session,
+            base_url,
+            credential,
+            args.api_audience,
+            token_holder,
+            job_id,
+            args.poll_interval_seconds,
+            args.poll_timeout_seconds,
+        )
+    except RuntimeError as exc:
+        print(f"##[error]{exc}")
+        sys.exit(1)
+
+    if timed_out:
+        # We don't wait for full build completion -- the goal of this poll
+        # is just to surface a fast-failing job. A non-terminal status at
+        # the timeout is acceptance enough; the build continues async and
+        # is monitored in the Control Tower UI.
+        last_status = final.get("status", "Unknown")
+        print(
+            f"Job {job_id} still in non-terminal status '{last_status}' "
+            f"after {args.poll_timeout_seconds}s -- build accepted. "
+            f"Monitor progress in the Control Tower UI."
+        )
+        return
+
+    ct.print_final_status(final)
+
+    status = final.get("status")
+    if status == ct.SUCCESS_STATUS:
+        print(f"Control Tower build job {job_id} completed successfully.")
+        return
+
+    # Terminal failure -- the job was accepted but failed immediately.
+    ct.report_failure(final)
+    sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
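The exit-code policy at the end of `main()` can be condensed into a small decision function, sketched here under stated assumptions. The name `exit_code_for` is hypothetical, and `SUCCESS_STATUS = "Succeeded"` and the `"Failed"` status are assumed values, since the real constants live in client.py and are not shown in this diff.

```python
# Assumed value; the real constant is ct.SUCCESS_STATUS in client.py.
SUCCESS_STATUS = "Succeeded"


def exit_code_for(final: dict, timed_out: bool) -> int:
    """Map the poll result to a process exit code.

    0: the job was accepted (still non-terminal at the local timeout) or
       finished successfully.
    1: the job reached a terminal state other than success.
    """
    if timed_out:
        # A non-terminal status at the timeout is treated as acceptance.
        return 0
    if final.get("status") == SUCCESS_STATUS:
        return 0
    return 1


if __name__ == "__main__":
    print(exit_code_for({"status": "Running"}, timed_out=True))
    print(exit_code_for({"status": "Failed"}, timed_out=False))
```

Submission errors and a missing `jobId` also exit with 1 in the script, but they occur before polling and so fall outside this function.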
