Is your feature request related to a problem? Please describe.
When a Container App has multiple entries in the registries array and one of them has a DNS resolution failure (e.g., missing private DNS zone record), the deployment succeeds (HTTP 200/201) but the pod enters CrashLoopBackOff / Pending:NotReady at runtime with no clear surface-level error.
The actual failure (no such host on token exchange) is only visible deep in platform telemetry (Kusto), not in the ARM response, provisioning error, or az containerapp show output.
This commonly happens when:
- An ACR has dedicated data endpoints enabled, creating two FQDNs (login server + data endpoint)
- A Private Endpoint is configured but the private DNS zone only has A records for one of the two FQDNs
- The registries array carries a stale entry that was valid previously but no longer resolves
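The two-FQDN case above can be checked from a machine inside the VNet with a short script. The following is only a minimal sketch: it assumes the dedicated data endpoint follows the `<registry>.<region>.data.azurecr.io` naming pattern, and the helper names `acr_fqdns` and `check_resolution` are hypothetical, not part of any Azure SDK or CLI.

```python
import socket
from typing import Callable, Dict, Iterable, List


def acr_fqdns(registry_name: str, region: str) -> List[str]:
    """Return the login server FQDN and the regional data endpoint FQDN.

    Assumption: the data endpoint uses the
    <registry>.<region>.data.azurecr.io pattern.
    """
    return [
        f"{registry_name}.azurecr.io",
        f"{registry_name}.{region}.data.azurecr.io",
    ]


def check_resolution(
    hosts: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, str]:
    """Try to resolve each host; map it to an IP or an UNRESOLVED marker."""
    results: Dict[str, str] = {}
    for host in hosts:
        try:
            results[host] = resolve(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[host] = f"UNRESOLVED ({exc})"
    return results


if __name__ == "__main__":
    # "myacr" and "westeurope" are placeholder values for illustration.
    for host, outcome in check_resolution(acr_fqdns("myacr", "westeurope")).items():
        print(f"{host}: {outcome}")
```

Run from inside the VNet, a missing A record for either FQDN would show up as an UNRESOLVED line. The `resolve` parameter is injectable so the check can be exercised without touching real DNS.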
Describe the solution you'd like
- Deployment-time validation: When a PUT/PATCH to a container app includes a registries array, the control plane should attempt DNS resolution for each server entry. If resolution fails, return a clear error in the ARM response (e.g., "Registry server 'foo.azurecr.io' could not be resolved. Verify DNS configuration and private endpoint setup."), rather than accepting the deployment and failing silently at pod scheduling.
- Provisioning error surfacing: If DNS validation at deployment time is not feasible (e.g., DNS is only resolvable from the VNet), surface the image pull failure reason in properties.provisioningState or a new properties.latestRevisionError field on az containerapp show, so customers don't need platform telemetry access to diagnose.
- Warning for unreachable registries: If a registry in the array is not referenced by any container's image field, surface a warning (non-blocking) suggesting the entry may be stale.
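To make the first and third proposals concrete, here is a rough sketch of the kind of check being requested. It is not how the control plane works today and not a proposed implementation. It assumes the request body keeps registries under properties.configuration.registries and containers under properties.template.containers. The function name `validate_registries` and the warning text are hypothetical, and init containers are ignored for brevity.

```python
import socket
from typing import Callable, List, Tuple


def validate_registries(
    app: dict,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for the registries array of a container app payload."""
    props = app.get("properties", {})
    registries = props.get("configuration", {}).get("registries") or []
    containers = props.get("template", {}).get("containers") or []

    # Registry host is taken as the part of the image reference before the first "/".
    image_hosts = {c.get("image", "").split("/", 1)[0].lower() for c in containers}

    errors: List[str] = []
    warnings: List[str] = []
    for entry in registries:
        server = entry.get("server", "")
        try:
            resolve(server)
        except OSError:  # covers socket.gaierror ("no such host")
            errors.append(
                f"Registry server '{server}' could not be resolved. "
                "Verify DNS configuration and private endpoint setup."
            )
        if server.lower() not in image_hosts:
            # Non-blocking: the entry may simply be stale.
            warnings.append(
                f"Registry server '{server}' is not referenced by any "
                "container image; the entry may be stale."
            )
    return errors, warnings
```

With a payload that lists both an active and a stale registry, the stale entry would produce one blocking error (if it no longer resolves) and one non-blocking warning (if no container image references it), which is the feedback missing from the ARM response today.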
Describe alternatives you've considered
- Customers manually validating DNS from within the VNet before deploying — error-prone and not always feasible
- Relying on az containerapp logs: these show the app crash but not the upstream token exchange / DNS failure
- Removing unused registries manually — customers often don't know which entry is stale vs. active
Additional context
- The lack of deployment-time feedback led to multiple hours of troubleshooting with Microsoft to identify the root cause.
- Related: ACR Private Endpoint creation does not warn customers that dedicated data endpoint FQDNs also need DNS records.
Component: Microsoft.App/containerApps — Registry configuration & image pull validation