
Private AKS Deployment PoC — Wiki

This wiki documents the proof-of-concept validating that Azure managed identity bypasses Entra ID conditional access (CA) policies when deploying private AKS clusters. All automation lives in the main repository: aks-private-deployment.


Table of Contents

  - Overview
  - Proof: Self-Hosted Runner Execution
  - Architecture and Network Layout
  - Private Cluster Details
  - Troubleshooting
  - Key References

Overview

Organizations with strict Entra ID conditional access location policies experience deployment failures when the AKS Resource Provider authenticates via service principal credentials: the sign-in originates from Azure datacenter IPs, which fall outside the allowed network perimeters. Managed identity bypasses CA entirely because its tokens are acquired inside the VM from IMDS (169.254.169.254) and never pass through login.microsoftonline.com.
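
For illustration, the token flow can be exercised directly against IMDS from the runner VM. A minimal sketch follows; MI_CLIENT_ID is an assumed variable holding the user-assigned identity's client ID and is only needed when the VM has more than one identity attached.

```bash
# Request an ARM access token from the Azure Instance Metadata Service (IMDS).
# The call stays on the link-local endpoint and never reaches
# login.microsoftonline.com, so conditional access is not evaluated.
curl -s -H "Metadata: true" \
  "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/&client_id=${MI_CLIENT_ID}"
# The JSON response contains an access_token scoped to ARM (https://management.azure.com/).
```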

This PoC proves that pattern with a three-job GitHub Actions workflow; a command-level sketch follows the list:

  1. Job 1 (setup-runner) — runs on a GitHub-hosted runner (Azure login via OIDC) and provisions the VNet, managed identity, and self-hosted runner VM.
  2. Job 2 (deploy-and-log) — deploys private AKS from the self-hosted runner using managed identity, validates with kubectl.
  3. Job 3 (teardown-runner) — deregisters runner and deletes all resource groups.
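
A hedged sketch of the commands behind each job; resource names with <run_id> placeholders and the MI_RESOURCE_ID variable are illustrative, and the exact scripts live in the aks-private-deployment repository.

```bash
# Job 1 (setup-runner) — GitHub-hosted runner, Azure login via OIDC.
# Assumes the VNet/subnets shown under "Architecture and Network Layout" exist.
az group create --name "rg-aks-poc-infra-<run_id>" --location eastus
az identity create --resource-group "rg-aks-poc-infra-<run_id>" --name "mi-aks-poc-<run_id>"
az vm create --resource-group "rg-aks-poc-infra-<run_id>" --name "vm-runner-<run_id>" \
  --image Ubuntu2204 \
  --assign-identity "$MI_RESOURCE_ID" \
  --vnet-name "vnet-aks-poc" --subnet "subnet-runner" \
  --custom-data cloud-init.yaml   # installs and registers the GitHub runner

# Job 2 (deploy-and-log) — runs on the self-hosted runner inside the VNet.
az login --identity   # IMDS token, no CA evaluation
# az aks create --enable-private-cluster ...  (full command under "Private Cluster Details")
# az aks get-credentials + kubectl get nodes  (validation under "Private Cluster Details")

# Job 3 (teardown-runner) — GitHub-hosted runner again.
az group delete --name "rg-aks-poc-<run_id>" --yes --no-wait
az group delete --name "rg-aks-poc-infra-<run_id>" --yes --no-wait
```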

Proof: Self-Hosted Runner Execution

The following screenshots confirm that the GitHub Actions workflow is executing on a self-hosted runner VM inside the Azure VNet. The runner VM authenticates via managed identity (IMDS) and deploys the private AKS cluster from within the same VNet, bypassing conditional access.

Screenshot 1 — GitHub Actions workflow running on the self-hosted runner VM. This shows the deploy-and-log job executing on the self-hosted runner (label aks-poc-runner-<run_id>) inside subnet-runner (10.224.1.0/24). The runner authenticated via az login --identity using the user-assigned managed identity, confirming IMDS-based token acquisition without CA evaluation.

GitHub Actions deploy-and-log job running on self-hosted runner VM inside the Azure VNet, authenticated via managed identity
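
A minimal sketch of that sign-in; the MI_CLIENT_ID variable is an assumption passed in by the workflow.

```bash
# Sign in with the user-assigned managed identity via IMDS (no browser, no CA evaluation).
az login --identity --username "$MI_CLIENT_ID"

# Confirm a token can be issued for ARM without any interactive flow.
az account get-access-token --resource https://management.azure.com/ --query expiresOn -o tsv
```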

Screenshot 2 — Additional confirmation of self-hosted runner execution. Shows further workflow output or runner registration details from the VM, confirming the runner is registered and processing the AKS deployment job from within the private network.

Self-hosted runner registration and workflow execution confirmation from the runner VM in subnet-runner

Architecture and Network Layout

The PoC uses a shared VNet (10.224.0.0/16) with two subnets. The runner VM sits in subnet-runner (10.224.1.0/24) and the AKS cluster deploys into subnet-aks (10.224.0.0/24). Because both subnets share the same VNet, the runner can reach the AKS API server via its private endpoint.
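
A hedged sketch of creating that layout with the Azure CLI; the resource group and VNet names are assumptions, while the address prefixes match the values above.

```bash
# Shared VNet with one subnet for the AKS nodes and one for the runner VM.
az network vnet create \
  --resource-group "rg-aks-poc-infra-<run_id>" \
  --name "vnet-aks-poc" \
  --address-prefixes 10.224.0.0/16 \
  --subnet-name "subnet-aks" \
  --subnet-prefixes 10.224.0.0/24

az network vnet subnet create \
  --resource-group "rg-aks-poc-infra-<run_id>" \
  --vnet-name "vnet-aks-poc" \
  --name "subnet-runner" \
  --address-prefixes 10.224.1.0/24
```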

Screenshot 3 — High-level view of the deployed Azure resources. Shows the resource groups and provisioned infrastructure for the PoC run, including the VNet, managed identity, runner VM, and AKS cluster resources across the infrastructure RG (rg-aks-poc-infra-<run_id>) and AKS RG (rg-aks-poc-<run_id>).

Azure Portal or GitHub Actions overview showing provisioned PoC resources — VNet, managed identity, runner VM, and AKS cluster across infrastructure and AKS resource groups

Private Cluster Details

The AKS cluster is deployed with --enable-private-cluster, making the API server accessible only via private endpoint within the VNet. DNS resolution of the private FQDN returns a private IP (10.224.0.4) inside subnet-aks.
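
A hedged sketch of that deployment from the runner VM; the cluster name, node count, and resource groups are assumptions, while the private cluster, Azure CNI, and subnet flags match the configuration described here.

```bash
# Resolve the ID of subnet-aks so the cluster nodes land in the shared VNet.
SUBNET_ID=$(az network vnet subnet show \
  --resource-group "rg-aks-poc-infra-<run_id>" \
  --vnet-name "vnet-aks-poc" --name "subnet-aks" \
  --query id -o tsv)

# Private cluster: the API server is reachable only through its private endpoint.
az aks create \
  --resource-group "rg-aks-poc-<run_id>" \
  --name "aks-poc-<run_id>" \
  --enable-private-cluster \
  --enable-managed-identity \
  --network-plugin azure \
  --vnet-subnet-id "$SUBNET_ID" \
  --node-count 1 \
  --generate-ssh-keys
```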

Screenshot 4 — Detailed view of the private AKS cluster configuration. Shows cluster properties confirming enablePrivateCluster: true, the private FQDN, API server endpoint on the privatelink domain, and network configuration (Azure CNI, VNet subnet integration). The private FQDN resolves to 10.224.0.4 within the VNet.

Private AKS cluster configuration details showing enablePrivateCluster true, private FQDN resolving to 10.224.0.4, and Azure CNI network configuration within subnet-aks

Screenshot 5 — Kubectl validation from the self-hosted runner. Shows kubectl get nodes confirming nodes are Ready, kubectl cluster-info pointing to the private endpoint, and nslookup of the private FQDN resolving to 10.224.0.4. This confirms end-to-end connectivity from the runner VM (10.224.1.4) to the AKS API server (10.224.0.4) via the private endpoint.

Kubectl output from self-hosted runner: get nodes showing Ready status, cluster-info pointing to privatelink FQDN, nslookup resolving to private IP 10.224.0.4
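
A minimal sketch of those checks; the cluster and resource group names are assumptions.

```bash
# Pull kubeconfig for the private cluster (run from the runner VM inside the VNet).
az aks get-credentials --resource-group "rg-aks-poc-<run_id>" --name "aks-poc-<run_id>"

kubectl get nodes -o wide   # nodes should report Ready
kubectl cluster-info        # API server URL uses the privatelink FQDN

# Resolve the private FQDN; expect an address from subnet-aks (10.224.0.0/24).
PRIVATE_FQDN=$(az aks show --resource-group "rg-aks-poc-<run_id>" \
  --name "aks-poc-<run_id>" --query privateFqdn -o tsv)
nslookup "$PRIVATE_FQDN"
```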

Troubleshooting

Common issues encountered when deploying private AKS in environments with conditional access policies.

Screenshot 6 — Troubleshooting scenario. Shows a diagnostic view of a deployment issue: either the conditional access block error (AADSTS53003) returned when using service principal authentication, or Azure Activity Log entries comparing the runner VM's public IP with the ARM operation caller IPs. Both views contrast with the managed identity flow, which avoids CA evaluation entirely.

Troubleshooting view showing diagnostic information for private AKS deployment — conditional access evaluation, Activity Log IP comparison, or deployment error details

Common Issues

| Symptom | Cause | Resolution |
| --- | --- | --- |
| az aks create fails with AADSTS53003 | Conditional access location policy blocks service principal sign-in from an Azure datacenter IP | Switch to managed identity (--enable-managed-identity + az login --identity) |
| kubectl cannot reach the API server | Runner VM is not in the same VNet as the AKS private endpoint | Ensure the runner deploys into a subnet within the same VNet as subnet-aks |
| Runner never comes online | Cloud-init not complete or runner registration token expired | Check cloud-init status on the VM; regenerate the registration token |
| Activity Log shows unexpected IPs | AKS RP internal operations use Azure datacenter IPs (expected for resolvePrivateLinkServiceId) | Only customer-initiated ARM writes should match the runner VM IP |
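
A hedged sketch of checks for the last two rows; the resource group name is an assumption, and the ifconfig.me lookup is just one way to read the runner VM's public egress IP.

```bash
# Runner never comes online: watch cloud-init finish on the VM.
cloud-init status --wait
sudo tail -n 50 /var/log/cloud-init-output.log

# Unexpected IPs: list recent operations and their caller IPs for the AKS
# resource group, then compare against the runner VM's public egress IP.
az monitor activity-log list \
  --resource-group "rg-aks-poc-<run_id>" \
  --offset 1h \
  --query "[].{operation:operationName.localizedValue, caller:caller, ip:claims.ipaddr}" \
  -o table

curl -s https://ifconfig.me
```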

Key References