feat(linux): refactor aks-secure-tls-bootstrap-client installation to use PMC/MCR and bump to v1.1.4-1#8618

Open
cameronmeissner wants to merge 23 commits into main from cameissner/stls-client-dalec-linux

@cameronmeissner

Contributor

What this PR does / why we need it:

Refactor the aks-secure-tls-bootstrap-client installation to use PMC/MCR now that the client is built and published by dalec.

Which issue(s) this PR fixes:

Fixes #

Copilot AI review requested due to automatic review settings June 1, 2026 20:14
@github-actions (bot) added the components label ("This pull request updates cached components on Linux or Windows VHDs") Jun 1, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR refactors how aks-secure-tls-bootstrap-client is sourced for Linux images, moving away from GitHub release tarballs toward packages.microsoft.com (PMC) for Ubuntu/Azure Linux and MCR (OCI/sysext) for Flatcar/ACL, and updates Renovate ownership for related updates.

Changes:

  • Update parts/common/components.json to define distro-specific sources/versions for aks-secure-tls-bootstrap-client (PMC for Ubuntu/Azure Linux, MCR sysext for Flatcar).
  • Update VHD build dependency caching logic to use package/sysext download helpers instead of a direct tarball download.
  • Rename the “download from URL” helper in cse_install.sh for clarity and adjust its callsite; tweak Renovate assignee/reviewer rules.
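The distro-specific sourcing described in the first bullet can be pictured roughly as follows. This is a minimal sketch only: the dictionary layout, the `kind` values, and the `resolve_source` and `describe` helpers are hypothetical and do not reproduce the real parts/common/components.json schema or the repo's download helpers. The version string is a placeholder taken from the PR title.

```python
from typing import Optional

# Hypothetical, simplified stand-in for one components.json entry.
# The real schema in parts/common/components.json may look quite different.
COMPONENT = {
    "name": "aks-secure-tls-bootstrap-client",
    "sources": {
        "ubuntu": {"kind": "pmc-package", "version": "1.1.4-1"},
        "azurelinux": {"kind": "pmc-package", "version": "1.1.4-1"},
        "flatcar": {"kind": "mcr-sysext", "version": "1.1.4-1"},
    },
}


def resolve_source(component: dict, distro: str) -> Optional[dict]:
    """Return the source entry for a distro, or None if none is defined."""
    return component.get("sources", {}).get(distro)


def describe(component: dict, distro: str) -> str:
    """Summarize which download flow a distro would take in this sketch."""
    source = resolve_source(component, distro)
    if source is None:
        return f"{component['name']}: no source defined for {distro}"
    flow = "package download" if source["kind"] == "pmc-package" else "sysext download"
    return f"{component['name']} {source['version']} via {flow}"


for distro in ("ubuntu", "azurelinux", "flatcar"):
    print(distro, "->", describe(COMPONENT, distro))
```

Keeping the lookup separate from the description makes it easy to see which distros have an entry at all before any download is attempted.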

Package Update Analysis: aks-secure-tls-bootstrap-client

Version change: 1.1.2 → 1.1.3 (patch update)
OS variants affected: Ubuntu 20.04/22.04/24.04, Azure Linux 3.0, Flatcar (sysext), Windows
OS variants NOT updated: Mariner (no entry and no default fallback), which causes a silent skip on Mariner builds.
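One way to surface the kind of gap flagged for Mariner is to check coverage explicitly instead of skipping quietly. The sketch below assumes a flat mapping of distro name to version with an optional "default" key; that shape, the distro list, and the `find_uncovered_distros` helper are illustrative assumptions, not the repo's actual validation logic.

```python
def find_uncovered_distros(sources: dict, required: list) -> list:
    """Return required distros that have neither their own entry nor a "default" fallback."""
    if "default" in sources:
        # A default entry covers every distro that lacks a specific one.
        return []
    return [distro for distro in required if distro not in sources]


# Hypothetical data mirroring the analysis above: Mariner has no entry and no default.
sources = {"ubuntu": "1.1.3", "azurelinux": "1.1.3", "flatcar": "1.1.3"}
missing = find_uncovered_distros(sources, ["ubuntu", "azurelinux", "flatcar", "mariner"])
if missing:
    print(f"no entry or default fallback for: {', '.join(missing)}")
```

A build step could treat a non-empty result as an error, so a missing distro fails loudly rather than being skipped.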

Upstream changelog: Not evaluated here (not available in-repo). Manual validation recommended.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| vhdbuilder/packer/install-dependencies.sh | Switch aks-secure-tls-bootstrap-client handling to package/sysext download flow during VHD build. |
| parts/linux/cloud-init/artifacts/cse_install.sh | Rename the custom-URL download helper and update its caller. |
| parts/common/components.json | Move component metadata to distro-specific PMC/MCR sources and bump versions. |
| .github/renovate.json | Adjust Renovate assignees/reviewers and add a rule grouping for this component. |

Comment thread vhdbuilder/packer/install-dependencies.sh Outdated
Comment thread vhdbuilder/packer/install-dependencies.sh
Comment thread parts/linux/cloud-init/artifacts/cse_install.sh Outdated
Comment thread parts/common/components.json Outdated
Copilot AI review requested due to automatic review settings June 1, 2026 22:22

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comment thread vhdbuilder/packer/install-dependencies.sh
Comment thread parts/common/components.json
Copilot AI review requested due to automatic review settings June 1, 2026 23:42
Copilot AI review requested due to automatic review settings June 4, 2026 21:44

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Comment thread vhdbuilder/packer/install-dependencies.sh
Comment thread parts/linux/cloud-init/artifacts/acl/cse_install_acl.sh
Comment thread parts/common/components.json Outdated
Copilot AI review requested due to automatic review settings June 4, 2026 23:43

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Comment thread vhdbuilder/packer/install-dependencies.sh
Comment thread parts/linux/cloud-init/artifacts/cse_install.sh
Comment thread parts/common/components.json
Comment thread parts/common/components.json
Copilot AI review requested due to automatic review settings June 5, 2026 15:57

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Comment thread vhdbuilder/packer/install-dependencies.sh
Comment thread parts/linux/cloud-init/artifacts/cse_install.sh Outdated
Copilot AI review requested due to automatic review settings June 5, 2026 19:16

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings June 8, 2026 21:16
@cameronmeissner changed the title from "feat(linux): refactor aks-secure-tls-bootstrap-client installation to use PMC/MCR" to "feat(linux): refactor aks-secure-tls-bootstrap-client installation to use PMC/MCR and bump to 1.1.4-1" Jun 8, 2026
@cameronmeissner changed the title from "feat(linux): refactor aks-secure-tls-bootstrap-client installation to use PMC/MCR and bump to 1.1.4-1" to "feat(linux): refactor aks-secure-tls-bootstrap-client installation to use PMC/MCR and bump to v1.1.4-1" Jun 8, 2026

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comment thread parts/linux/cloud-init/artifacts/flatcar/cse_install_flatcar.sh
@aks-node-assistant

Contributor

AgentBaker Linux PR gate — Ubuntu 24.04 fwupd.service mass E2E failure (RECURRING main regression, NOT this PR)

  • Run: 167219726 (failed)
  • Failed task: Run AgentBaker E2E (Stage e2e → Job/Phase Run AgentBaker E2E)
  • Signature: validators.go:995: 🔴 FAIL: the following systemd units have unexpectedly entered a failed state: [fwupd.service]
  • Scope: Ubuntu 24.04 scenarios (Test_Ubuntu2404_SecureTLSBootstrapping_BootstrapToken_Fallback, Test_Ubuntu2404_NPD_Basic, and others)

This matches an active main-branch regression flagged earlier today on PR #8294 build 167206065 and re-confirmed on PR #8294 build 167221197 within the same ~1.5h window. All three runs share the same [fwupd.service] failed-unit signature across unrelated PRs (node-exporter bump, this STLS client refactor, etc.).

Build-vs-test: product/VHD regression caught by E2E (NOT a flake, NOT test-code).
This PR's exposure check: the changes refactor the aks-secure-tls-bootstrap-client install to use PMC/MCR; the failing validator is the systemd-unit health check, not the STLS install. STLS tests in this run failed because the post-install systemd-units validator trips on fwupd.service before any STLS-specific assertions run. There is no evidence that the PR introduced or worsened the fwupd state.
Confidence: HIGH that PR #8618 is not the cause; HIGH that this is a 24.04 VHD main regression around fwupd.service.
Strongest alternative (less likely): STLS PMC/MCR refactor altering boot-time package install order and breaking fwupd.service first-start — refuted: the same signature reproduces on PRs that don't touch STLS or package install order.

Recommended next action / owner: NodeSIG-dev — bisect main since the last green 24.04 E2E for anything touching fwupd or systemd unit enablement in vhdbuilder/packer/install-dependencies.sh / tool_installs_distro.sh. Likely mitigation: mask fwupd.service in the 24.04 VHD or fix the first-start dependency. PR author: do NOT block merge on this; rerun once the main fix lands. If you want to be extra safe, rebase once the fix is in to confirm 24.04 E2E goes green for this PR's diff.
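For readers unfamiliar with the failed-unit signature quoted in this report, the following is a rough sketch of that kind of health check. It is not the logic in validators.go: it only parses text in the shape printed by `systemctl list-units --state=failed --plain --no-legend`, and the function name and sample input are made up for illustration.

```python
def failed_unit_names(listing: str) -> list:
    """Extract unit names from plain, legend-free `systemctl list-units` output.

    Each non-empty line is expected to start with the unit name, followed by
    the LOAD, ACTIVE, and SUB columns and a free-text description.
    """
    names = []
    for line in listing.splitlines():
        fields = line.split()
        if fields:
            names.append(fields[0])
    return names


# Illustrative sample only; on a real node the text would come from systemctl.
SAMPLE = "fwupd.service loaded failed failed Firmware update daemon\n"
units = failed_unit_names(SAMPLE)
if units:
    print(f"systemd units in a failed state: {units}")
```

Because a check like this is generic, any failed unit trips it regardless of which component a given test scenario is meant to exercise, which is consistent with the cross-PR pattern described above.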

Posted by Clawpilot AgentBaker gate detective.

@aks-node-assistant

Contributor

AgentBaker Linux PR gate — Ubuntu 24.04 fwupd.service mass E2E failure (STILL the recurring main regression, NOT this PR)

  • Run: 167238023 (failed) — new commit df88bc2 bumping STLS client to v1.1.4-1
  • Failed task: Run AgentBaker E2E (Stage e2e → Job/Phase Run AgentBaker E2E)
  • Test summary: DONE 438 tests, 95 skipped, 17 failures in 1666.129s
  • Primary signature: validators.go:995: 🔴 FAIL: the following systemd units have unexpectedly entered a failed state: [fwupd.service] (6 hits across this run)

Failing scenarios (all Ubuntu 24.04 except one):

  • Test_LocalDNSHostsPlugin/Ubuntu2404/{default,scriptless_nbc}
  • Test_Ubuntu2404_SecureTLSBootstrapping_BootstrapToken_Fallback/default
  • Test_Ubuntu2404_CSE_CachedPerformance/default
  • Test_Ubuntu2404_CSE_FullInstallPerformance/default
  • Test_Ubuntu2404Gen2/default
  • Test_Ubuntu2404Gen2_McrChinaCloud/scriptless_nbc
  • Test_Ubuntu2204Gen2_ImagePullIdentityBinding_NetworkIsolated/{default,scriptless_nbc} ← separate ongoing NetworkIsolated infra/fixture issue, not fwupd
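The scope claim above (all Ubuntu 24.04 except one) can be checked mechanically by grouping the listed scenario names by the release token embedded in each name. This is a small illustrative sketch, not part of the gate tooling; the `count_by_ubuntu_release` helper is hypothetical, and the counts are per list entry as written above (brace groups are not expanded), so they are not expected to add up to the 17 failures in the test summary.

```python
import re
from collections import Counter

# Scenario names copied from the list above, one string per list entry.
FAILING = [
    "Test_LocalDNSHostsPlugin/Ubuntu2404/{default,scriptless_nbc}",
    "Test_Ubuntu2404_SecureTLSBootstrapping_BootstrapToken_Fallback/default",
    "Test_Ubuntu2404_CSE_CachedPerformance/default",
    "Test_Ubuntu2404_CSE_FullInstallPerformance/default",
    "Test_Ubuntu2404Gen2/default",
    "Test_Ubuntu2404Gen2_McrChinaCloud/scriptless_nbc",
    "Test_Ubuntu2204Gen2_ImagePullIdentityBinding_NetworkIsolated/{default,scriptless_nbc}",
]


def count_by_ubuntu_release(names: list) -> Counter:
    """Count entries per Ubuntu release token (e.g. "2404") found in each name."""
    counts = Counter()
    for name in names:
        match = re.search(r"Ubuntu(\d{4})", name)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


counts = count_by_ubuntu_release(FAILING)
for release, total in sorted(counts.items()):
    print(f"Ubuntu {release}: {total} failing entries")
```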

This is the same fwupd.service 24.04 main regression previously flagged on builds 167206065, 167219726, and 167221197. New STLS commit landed but failure shape and scope are unchanged.

Build-vs-test: product/VHD regression caught by E2E (NOT a flake, NOT test-code, NOT STLS-related).
This PR's exposure check: STLS install moved to PMC/MCR + bumped to v1.1.4-1. The failing validator is the generic systemd-unit health check tripping on fwupd.service before STLS-specific assertions run; STLS BootstrapToken_Fallback failure is downstream of that pre-condition, not STLS install logic. No new failure modes introduced by the bump.
Confidence: HIGH that PR #8618 is not the cause; HIGH that this is a 24.04 VHD main regression around fwupd.service; the NetworkIsolated 22.04 failure is a separate known infra issue.
Strongest alternative (less likely): STLS PMC/MCR refactor altering boot-time package install order and breaking fwupd.service first-start — refuted: identical signature reproduces on unrelated PRs (renovate node-exporter #8294) on the same main HEAD; scope is strictly 24.04.

Recommended next action / owner: NodeSIG-dev — main-branch fix still pending. Likely mitigation: mask fwupd.service in 24.04 VHD or fix the first-start dependency in vhdbuilder/packer/install-dependencies.sh / tool_installs_distro.sh. PR author: do NOT block merge on this; rebase + rerun once the main fix lands to confirm a clean 24.04 leg.

Posted by Clawpilot AgentBaker gate detective.

Labels

components This pull request updates cached components on Linux or Windows VHDs

2 participants