aks-preview: Support BYO VNet for --enable-hosted-system automatic clusters + --disable-hosted-system #9812

Open
wenhug wants to merge 11 commits into Azure:main from wenhug:wenhug/aks-preview-byo-hobo

Contributor

@wenhug wenhug commented Apr 22, 2026

Summary

Adds CLI surface in the aks-preview extension so customers can drive all five HOBO (hosted system pool) scenarios for --sku automatic clusters backed by the 2026-02-02-preview API:

  1. az aks create --sku automatic --enable-hosted-system (non-BYO, baseline)
  2. az aks create --sku automatic --enable-hosted-system --system-node-vnet-subnet-id <A> --node-vnet-subnet-id <B> --apiserver-subnet-id <C> (BYO VNet, default managedNATGateway outbound)
  3. Same as scenario 2 plus --outbound-type loadBalancer (BYO VNet + SLB outbound)
  4. az aks create --sku automatic --disable-hosted-system (deterministic opt-out once HOBO becomes the regional default for automatic)
  5. az aks update --sku base (downgrade an automatic+HOBO cluster to base)

New flags

  • --system-node-vnet-subnet-id / --node-vnet-subnet-id on az aks create: bring your own subnets for the hosted system pool and the user node pool. Must be set together with --apiserver-subnet-id and --enable-hosted-system. Values go to ManagedClusterHostedSystemProfile.{systemNodeSubnetId, nodeSubnetId}.
  • --disable-hosted-system on az aks create: mutually exclusive with --enable-hosted-system; both are gated to --sku automatic. Writes HostedSystemProfile.enabled=false without clearing agentPoolProfiles so customers keep real agent pools.

Validation (client-side, before PATCH)

  • --enable-hosted-system and --disable-hosted-system are mutually exclusive.
  • Both require --sku automatic.
  • Using any of the three BYO subnet flags requires --enable-hosted-system; omitting one of the three raises a clear error listing the missing ones.
  • --system-node-vnet-subnet-id / --node-vnet-subnet-id are validated through the existing validate_vnet_subnet_id resource-ID validator.
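
The validation rules above can be sketched as one pure function. This is a minimal illustration, not the PR's implementation: `validate_hosted_system_flags` and `FlagValidationError` are hypothetical names, and the real extension would presumably raise its own CLI error types. The sketch assumes the narrower reading in which `--apiserver-subnet-id` on its own stays valid without `--enable-hosted-system`, since the thread later describes that as the generic API-server VNet integration path.

```python
from typing import Optional


class FlagValidationError(ValueError):
    """Hypothetical stand-in for the CLI's own argument-error types."""


def validate_hosted_system_flags(
    sku: Optional[str],
    enable_hosted_system: bool = False,
    disable_hosted_system: bool = False,
    system_node_vnet_subnet_id: Optional[str] = None,
    node_vnet_subnet_id: Optional[str] = None,
    apiserver_subnet_id: Optional[str] = None,
) -> None:
    """Raise FlagValidationError when the flag combination breaks a rule."""
    if enable_hosted_system and disable_hosted_system:
        raise FlagValidationError(
            "--enable-hosted-system and --disable-hosted-system are mutually exclusive."
        )

    if (enable_hosted_system or disable_hosted_system) and (sku or "").lower() != "automatic":
        raise FlagValidationError(
            "--enable-hosted-system and --disable-hosted-system require --sku automatic."
        )

    # The two new subnet flags only make sense for a hosted system pool.
    if (system_node_vnet_subnet_id or node_vnet_subnet_id) and not enable_hosted_system:
        raise FlagValidationError(
            "--system-node-vnet-subnet-id and --node-vnet-subnet-id require --enable-hosted-system."
        )

    # With HOBO enabled, the BYO subnets are all-or-nothing.
    trio = {
        "--system-node-vnet-subnet-id": system_node_vnet_subnet_id,
        "--node-vnet-subnet-id": node_vnet_subnet_id,
        "--apiserver-subnet-id": apiserver_subnet_id,
    }
    if enable_hosted_system and any(trio.values()):
        missing = [flag for flag, value in trio.items() if not value]
        if missing:
            raise FlagValidationError(
                "BYO VNet requires all three subnet flags; missing: " + ", ".join(missing)
            )
```

Keeping the checks free of side effects makes them cheap to run before any request body is built.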

Plumbing fixes needed to make the BYO path actually work

  • aks-preview override of _get_outbound_type used to unconditionally flip automatic-SKU outbound to managedNATGateway when --vnet-subnet-id was empty. That silently overwrote --outbound-type loadBalancer on BYO HOBO (which uses the two new subnet flags instead of --vnet-subnet-id). The override now only forces managedNATGateway when the user did NOT explicitly pass --outbound-type.
  • aks-preview now overrides process_add_role_assignment_for_vnet_subnet to also grant Network Contributor on each of the three HOBO BYO subnets to the cluster identity (SP or UAMI). Without this RP returned ResourceMissingPermissionError on every BYO create.
  • set_up_api_server_access_profile is replaced by a full override (not a super()-and-patch) because base acs writes subnetId / enableVnetIntegration via msrest-style additional_properties, which the 2026-02-02-preview SDK (azure.core Model) does not expose; the override assigns typed fields instead.
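
The outbound-type fix in the first bullet can be sketched as follows. This is an assumption-laden simplification of what `_get_outbound_type` is described as doing: `default_outbound_type` is a hypothetical helper, and returning `None` here stands for "fall through to the base defaulting logic", which the sketch does not model.

```python
from typing import Optional


def default_outbound_type(
    sku: Optional[str],
    vnet_subnet_id: Optional[str],
    outbound_type: Optional[str],
) -> Optional[str]:
    """Return the outbound type to use, or None to defer to base defaults."""
    # An explicit --outbound-type always wins; this is the behavior the fix restores.
    if outbound_type:
        return outbound_type
    # Only when nothing was passed does automatic SKU without --vnet-subnet-id
    # get forced to managedNATGateway.
    if (sku or "").lower() == "automatic" and not vnet_subnet_id:
        return "managedNATGateway"
    return None
```

Under this sketch, `default_outbound_type("automatic", None, "loadBalancer")` keeps `loadBalancer`, which is the BYO HOBO case that was previously overwritten.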

Test plan

  • test_aks_automatic_sku_hosted_system_byovnet_slb (live_only): creates BYO VNet + 3 subnets + UAMI, creates automatic HOBO with --outbound-type loadBalancer, asserts provisioningState, SKU, hostedSystemProfile.enabled, hostedSystemProfile.systemNodeSubnetId, hostedSystemProfile.nodeSubnetId, apiServerAccessProfile.subnetId, networkProfile.outboundType, then downgrades to Base.
  • test_aks_automatic_sku_with_hosted_system_disabled (live_only): --disable-hosted-system asserts hostedSystemProfile.enabled=False and agent pools remain.
  • test_aks_automatic_sku_hosted_system_byovnet_natgw (live_only): BYO + default managedNATGateway outbound. Requires the RP-side HOBO support (ADO PR 15102840) before it can go green end-to-end; CLI request body validated manually.
  • Live smoke in eastus2euap against AKS Standalone Sub 1 exercised scenarios 3, 4, and 5 end-to-end.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>

Copilot AI review requested due to automatic review settings April 22, 2026 00:16

azure-client-tools-bot-prd Bot commented Apr 22, 2026

⚠️ Azure CLI Extensions Breaking Change Test

⚠️ aks-preview

| rule | cmd_name | rule_message | suggest_message |
| --- | --- | --- | --- |
| ⚠️ 1006 - ParaAdd | aks create | cmd aks create added parameter disable_hosted_system | |
| ⚠️ 1006 - ParaAdd | aks create | cmd aks create added parameter node_subnet_id | |
| ⚠️ 1006 - ParaAdd | aks create | cmd aks create added parameter system_node_subnet_id | |

@azure-client-tools-bot-prd

Hi @wenhug,
Please write the description of changes which can be perceived by customers into HISTORY.rst.
If you want to release a new extension version, please update the version in setup.py as well.

Collaborator

yonzhan commented Apr 22, 2026

Thank you for your contribution! We will review the pull request and get back to you soon.

@github-actions
Contributor

The git hooks are available for azure-cli and azure-cli-extensions repos. They could help you run required checks before creating the PR.

Please sync the latest code with latest dev branch (for azure-cli) or main branch (for azure-cli-extensions).
After that please run the following commands to enable git hooks:

pip install azdev --upgrade
azdev setup -c <your azure-cli repo path> -r <your azure-cli-extensions repo path>

@github-actions
Contributor

CodeGen Tools Feedback Collection

Thank you for using our CodeGen tool. We value your feedback, and we would like to know how we can improve our product. Please take a few minutes to fill out our codegen survey.

Contributor

github-actions Bot commented Apr 22, 2026

Hi @wenhug

Release Suggestions

Module: aks-preview

  • Update VERSION to 20.0.0b5 in src/aks-preview/setup.py

Notes

Contributor

Copilot AI left a comment

Pull request overview

Adds aks-preview CLI support for hosted-system (HOBO) automatic clusters to cover both BYO VNet scenarios (with new subnet flags) and a deterministic opt-out via --disable-hosted-system, along with plumbing fixes needed for outbound type defaults, API server access profile wiring, and subnet role assignments.

Changes:

  • Add new az aks create flags: --system-node-vnet-subnet-id, --node-vnet-subnet-id, and --disable-hosted-system, plus client-side validation for HOBO BYO subnet combinations.
  • Update managed cluster decorator logic for outbound type defaulting, hosted system profile wiring, API server access profile setup (typed fields), and role assignment across HOBO subnets.
  • Add live-only scenario tests for BYO VNet HOBO (NATGW + SLB) and --disable-hosted-system, and update docs/changelog.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/aks-preview/azext_aks_preview/tests/latest/test_aks_commands.py | Adds/updates live-only tests for HOBO automatic scenarios (BYO VNet + SLB/NATGW, disable hosted system). |
| src/aks-preview/azext_aks_preview/managed_cluster_decorator.py | Implements new flag validation/plumbing, outbound-type defaulting adjustment, typed API server access profile setup, and extended subnet role assignment. |
| src/aks-preview/azext_aks_preview/custom.py | Extends aks_create function signature to accept new CLI parameters. |
| src/aks-preview/azext_aks_preview/_validators.py | Adds resource ID validators for the new HOBO subnet flags. |
| src/aks-preview/azext_aks_preview/_params.py | Wires new CLI arguments into aks create. |
| src/aks-preview/azext_aks_preview/_help.py | Documents new flags and adds examples for HOBO BYO VNet and opt-out flows. |
| src/aks-preview/HISTORY.rst | Notes the new CLI surface and behavior in the pending changelog. |

Comment thread src/aks-preview/azext_aks_preview/tests/latest/test_aks_commands.py
Comment thread src/aks-preview/azext_aks_preview/tests/latest/test_aks_commands.py
Comment thread src/aks-preview/azext_aks_preview/tests/latest/test_aks_commands.py Outdated
Comment thread src/aks-preview/azext_aks_preview/managed_cluster_decorator.py
Comment thread src/aks-preview/azext_aks_preview/managed_cluster_decorator.py Outdated
Comment thread src/aks-preview/azext_aks_preview/tests/latest/test_aks_commands.py
Comment thread src/aks-preview/azext_aks_preview/tests/latest/test_aks_commands.py
…usters + --disable-hosted-system

Adds CLI surface for BYO VNet HOBO (hosted system pool) automatic clusters:

* `--system-node-vnet-subnet-id` and `--node-vnet-subnet-id` on `az aks create`
  to bring your own VNet for the hosted system pool and user node pool. Must
  be used together with `--apiserver-subnet-id` and `--enable-hosted-system`.
* `--disable-hosted-system` on `az aks create` to deterministically opt out of
  HOBO on automatic clusters (mutually exclusive with `--enable-hosted-system`,
  both gated to `--sku automatic`).

Supported scenarios:

1. az aks create --sku automatic --enable-hosted-system
2. ... + --system-node-vnet-subnet-id --node-vnet-subnet-id --apiserver-subnet-id (NATGW)
3. ... + --outbound-type loadBalancer for BYO VNet with SLB outbound
4. az aks create --sku automatic --disable-hosted-system
5. az aks update --sku base to downgrade an automatic+HOBO cluster

Validation (client-side, before PATCH):

* --enable-hosted-system and --disable-hosted-system are mutually exclusive.
* Both require --sku automatic.
* If --enable-hosted-system is set with any of the 3 BYO subnet flags, all
  three must be provided; otherwise a clear error lists the missing ones.
* BYO subnet flags cannot be used without --enable-hosted-system.

Live-only E2E tests cover BYO+NATGW, BYO+SLB with downgrade to base SKU, and
the disable opt-out path.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
wenhug force-pushed the wenhug/aks-preview-byo-hobo branch from 3bcaac9 to 0e2767b on April 22, 2026 00:31
* Add short alias --sys-node-subnet-id for --system-node-vnet-subnet-id
  to satisfy option_length_too_long linter rule.
* Rename skuName/isVnetSubnetIdEmpty to snake_case per PEP 8.
* Disable too-many-branches pylint warning on _get_outbound_type (overridden
  from base azure-cli and the preview-specific branches are necessary).
* Replace fixed 180s sleep before aks update --sku base with a retry loop
  that handles the RP's post-create 409 OperationNotAllowed window more
  robustly.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
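
The retry loop mentioned in this commit could look roughly like the sketch below. It is not the test code from the PR: `retry_on_conflict` and `OperationNotAllowed` are hypothetical names, and the 60-second interval and 15-minute cap are taken from the author's reply later in this thread. The `sleep` parameter is injectable only so the loop can be exercised without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class OperationNotAllowed(Exception):
    """Hypothetical stand-in for the RP's 409 OperationNotAllowed response."""


def retry_on_conflict(
    operation: Callable[[], T],
    max_wait_seconds: int = 15 * 60,
    backoff_seconds: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or the wait budget is exhausted."""
    waited = 0
    while True:
        try:
            return operation()
        except OperationNotAllowed:
            # Give up once the whole budget has been spent waiting.
            if waited >= max_wait_seconds:
                raise
            sleep(backoff_seconds)
            waited += backoff_seconds
```

Compared with a fixed sleep, the loop returns as soon as the update is accepted and still fails loudly if the conflict window never closes.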
Contributor Author

wenhug commented Apr 22, 2026

Thanks for the review. Addressed in 1e4fd37:

CI fixes

  • option_length_too_long: added short alias --sys-node-subnet-id for --system-node-vnet-subnet-id.
  • too-many-branches in _get_outbound_type: added # pylint: disable=too-many-branches. This is an override of the base azure-cli method and all branches are needed for the preview-specific defaulting behavior.

Copilot comments

  • Snake-case rename: skuName → sku_name, isVnetSubnetIdEmpty → is_vnet_subnet_id_empty. ✅
  • Fixed 180s sleep → retry loop with 60s backoff for up to 15 min on 409/OperationNotAllowed. ✅
  • apiserver_subnet_id being included unconditionally in process_add_role_assignment_for_vnet_subnet: already gated by an early return at the top of the override — if not self.context.get_enable_hosted_system(): return. So when --apiserver-subnet-id is used without --enable-hosted-system (generic API-server-VNet-integration path), the extra RBAC grant is skipped. The apiserver subnet only joins hobo_subnets in BYO HOBO mode.
  • Hard-coded Log Analytics workspace ID: matches the pre-existing test_aks_automatic_sku_with_hosted_system_enabled pattern for HOBO live-only tests. These tests are gated by @live_only() and only run in our HOBO-enabled test subscription. Will migrate to provisioning the workspace in-test in a follow-up PR to cover all HOBO live tests consistently.
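
The early-return gating described in the third bullet can be illustrated with a small helper that only computes which subnets would receive the Network Contributor grant. This is a sketch under assumptions: `hobo_subnets_needing_role` is a hypothetical name, the parameter names are illustrative, and the actual role-assignment call is deliberately left out.

```python
from typing import List, Optional


def hobo_subnets_needing_role(
    enable_hosted_system: bool,
    system_node_subnet_id: Optional[str],
    node_subnet_id: Optional[str],
    apiserver_subnet_id: Optional[str],
) -> List[str]:
    """Return the subnet IDs that would get the extra role assignment."""
    # Without HOBO, --apiserver-subnet-id follows the generic API-server
    # VNet integration path and gets no extra grant from this override.
    if not enable_hosted_system:
        return []

    subnets: List[str] = []
    for subnet_id in (system_node_subnet_id, node_subnet_id, apiserver_subnet_id):
        # Skip unset values and avoid granting twice on a repeated subnet.
        if subnet_id and subnet_id not in subnets:
            subnets.append(subnet_id)
    return subnets
```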

wenhug added 6 commits April 22, 2026 01:35
Include both --system-node-vnet-subnet-id and --sys-node-subnet-id in the
help entry name so azdev linter recognizes all option aliases and does not
report unrecognized_help_parameter_rule / missing_parameter_help.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
The azdev linter compares help parameter names against knack's HelpParameter.name
which is built by sorting options alphabetically (knack/help.py line 349).
Swap the order so --sys-node-subnet-id comes before --system-node-vnet-subnet-id.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
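
The ordering issue in this commit comes down to plain string sorting. The sketch below assumes the help parameter name is the alphabetically sorted options joined by single spaces, as the commit message describes; `help_parameter_name` is a hypothetical helper, not a knack function.

```python
from typing import Iterable


def help_parameter_name(options: Iterable[str]) -> str:
    """Build the name a help entry must match: sorted options, space-joined."""
    return " ".join(sorted(options))


name = help_parameter_name(["--system-node-vnet-subnet-id", "--sys-node-subnet-id"])
# "-" sorts before "t", so the short alias comes first.
assert name == "--sys-node-subnet-id --system-node-vnet-subnet-id"
```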
The _get_outbound_type validation previously required --vnet-subnet-id when
outbound_type is userAssignedNATGateway / userDefinedRouting. For BYO HOBO
automatic clusters the VNet is provided via --system-node-vnet-subnet-id /
--node-vnet-subnet-id instead, so treat those as satisfying the requirement.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
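
A minimal sketch of the relaxed check described in this commit, assuming the rule is "these outbound types need some customer-provided subnet". `check_outbound_type_has_vnet` is a hypothetical name, and the real method raises the CLI's own error types rather than `ValueError`.

```python
from typing import Optional

# Outbound types that, per the commit message, require a customer VNet.
REQUIRES_CUSTOM_VNET = {"userAssignedNATGateway", "userDefinedRouting"}


def check_outbound_type_has_vnet(
    outbound_type: Optional[str],
    vnet_subnet_id: Optional[str] = None,
    system_node_vnet_subnet_id: Optional[str] = None,
    node_vnet_subnet_id: Optional[str] = None,
) -> None:
    """Raise ValueError if the outbound type needs a VNet and none was given."""
    if outbound_type not in REQUIRES_CUSTOM_VNET:
        return
    # BYO HOBO subnets now count as satisfying the VNet requirement.
    has_byo_hobo_subnets = bool(system_node_vnet_subnet_id or node_vnet_subnet_id)
    if not vnet_subnet_id and not has_byo_hobo_subnets:
        raise ValueError(
            f"--outbound-type {outbound_type} requires a customer-provided subnet."
        )
```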
Adds test_aks_automatic_sku_hosted_system_byovnet_user_natgw covering
BYO VNet hosted-system automatic clusters with userAssignedNATGateway
outbound type, exercising the _get_outbound_type fix that treats BYO
HOBO subnets as satisfying the VNet requirement.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
Previously _validate_byo_hobo_subnet_trio only ran when
set_up_api_server_access_profile invoked get_apiserver_subnet_id, which
in the base construct_mc_profile_default flow runs AFTER
process_add_role_assignment_for_vnet_subnet. A malformed BYO HOBO create
(partial subnet trio, or HOBO subnet flags without --enable-hosted-system)
could therefore leave residual Network Contributor grants on customer
subnets before the CLI surface-level validation fired.

Move trio validation to the start of the role-assignment override so
misuse fails before any RBAC mutation happens.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
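
The point of this commit is ordering: validation must run before the first side effect. A toy sketch of that shape is below; `grant_subnet_roles` and its callable parameters are hypothetical and stand in for the role-assignment override and the trio validator.

```python
from typing import Callable, Iterable, List


def grant_subnet_roles(
    validate: Callable[[], None],
    assign_role: Callable[[str], None],
    subnet_ids: Iterable[str],
) -> List[str]:
    """Validate first, then grant; return the subnets that were granted."""
    # If validate() raises, no role assignment has been attempted yet,
    # so nothing is left behind on customer subnets.
    validate()
    granted: List[str] = []
    for subnet_id in subnet_ids:
        assign_role(subnet_id)
        granted.append(subnet_id)
    return granted
```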
Refactor-only (no behavior change): introduce public
AKSPreviewManagedClusterContext.has_byo_hobo_subnets() and replace three
inline duplicate 'system_node_vnet_subnet_id or node_vnet_subnet_id'
checks (in _get_outbound_type default-completion, _get_outbound_type
validation, and get_api_server_access_profile validation) with calls to
it.

Also rename _validate_byo_hobo_subnet_trio to validate_byo_hobo_subnet_trio
(drop the leading underscore) so the CreateDecorator override can call it
directly without a pylint protected-access exception.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
Comment thread src/aks-preview/azext_aks_preview/_help.py
Comment thread src/aks-preview/azext_aks_preview/_help.py Outdated
Comment thread src/aks-preview/azext_aks_preview/_help.py Outdated
wenhug added 2 commits April 22, 2026 21:59
Rename --system-node-vnet-subnet-id -> --system-node-subnet-id and
--node-vnet-subnet-id -> --node-subnet-id (with Python identifiers
system_node_subnet_id / node_subnet_id) per @zqingqing1 review
feedback on PR Azure#9812. The --sys-node-subnet-id alias is retained.

Also drop the BYO VNet combination paragraph from the
--enable-hosted-system long-summary per PM guidance.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
RP rejects the combination of BYO VNet + managedNATGateway with
"Outbound type is managedNATGateway but agent pool 'hostedpool' is
using custom VNet, which is not allowed" (by design, enforced in
natgatewayv2.go). For BYO VNet the RP auto-defaults outboundType to
loadBalancer and only accepts loadBalancer or userAssignedNATGateway.

The byovnet_slb and byovnet_user_natgw tests already cover the two
supported BYO HOBO outbound modes, so this test was attempting an
unsupported scenario and is removed.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
Contributor Author

wenhug commented Apr 23, 2026

@zhoxing-ms , @yanzhudd , could you help review this PR?

Labels

AKS Auto-Assign Auto assign by bot

7 participants