feat(azure): look up BYOD DNS zone in current subscription#2156
feat(azure): look up BYOD DNS zone in current subscription#2156edwardrf wants to merge 2 commits into
Conversation
When a service sets a custom `domainname`, search the current subscription for a public Azure DNS zone that is a parent of that domain (longest-suffix match, no ownership filter). If found, record its ARM resource ID in ServiceInfo.ZoneId so the CD/Pulumi program manages records + issues a managed cert directly in that zone, mirroring AWS's findZone. Otherwise fall back to the ACME managed-cert path. Lookup failures (no subscription resolved, or missing DNS read permission) fall back to ACME rather than failing the deploy. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughAzure BYOC now looks up a matching public DNS zone before falling back to ACME, and Azure Container Apps certificate handling now distinguishes subdomain CNAME and apex A-record flows during DNS readiness and managed-certificate validation. ChangesAzure DNS Zone Discovery and Cert Handling
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)level=warning msg="The linter 'gomodguard' is deprecated (since v2.12.0) due to: new major version. Replaced by gomodguard_v2." Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/pkg/clouds/azure/dns/dns.go (1)
93-95: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winWrap client setup errors with
FindZonecontext.Returning
d.newZonesClient()errors bare makes fallback logs less actionable when auth/client construction fails before listing zones.As per coding guidelines, “Preserve cloud-provider error detail while wrapping errors with operation context in cloud SDK wrappers.”
Proposed fix
client, err := d.newZonesClient() if err != nil { - return "", err + return "", fmt.Errorf("creating DNS zones client: %w", err) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pkg/clouds/azure/dns/dns.go` around lines 93 - 95, The FindZone path returns newZonesClient setup failures without operation context, which makes Azure DNS fallback logs harder to interpret. Update FindZone in dns.go so that errors from d.newZonesClient() are wrapped with FindZone context while preserving the original error detail, following the same pattern used by other cloud SDK wrappers.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pkg/cli/client/byoc/azure/byoc.go`:
- Around line 1042-1057: The certificate selection logic in the byoc Azure flow
leaves `si.ZoneId` and `si.UseAcmeCert` partially stale because only one side is
updated in each branch. In the `setUpLocation`/`FindZone` path, explicitly clear
`UseAcmeCert` when a DNS zone is found and explicitly clear `ZoneId` whenever
falling back to ACME. Update the `si` assignment around `si.ZoneId` and
`si.UseAcmeCert` so the fields are always mutually exclusive regardless of prior
state.
---
Nitpick comments:
In `@src/pkg/clouds/azure/dns/dns.go`:
- Around line 93-95: The FindZone path returns newZonesClient setup failures
without operation context, which makes Azure DNS fallback logs harder to
interpret. Update FindZone in dns.go so that errors from d.newZonesClient() are
wrapped with FindZone context while preserving the original error detail,
following the same pattern used by other cloud SDK wrappers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 591ae40f-e9c3-4119-aac1-a834fdf0ab9a
📒 Files selected for processing (2)
src/pkg/cli/client/byoc/azure/byoc.gosrc/pkg/clouds/azure/dns/dns.go
aca.IssueCert now handles apex custom domains (e.g. example.com) in addition to subdomains, detecting apex from DNS/Azure rather than a caller hint: - waitForBYODdns accepts an A record (apex routing) as well as a CNAME→app FQDN (subdomain), still requiring the asuid TXT in both cases. - issueManagedCertificate falls back from CNAME validation to HTTP validation when Azure rejects CNAME with InvalidValidationMethod (the apex signal). HTTP validation needs no extra DNS record, so it completes unattended in the CD task — unlike the TXT _dnsauth dance, which remains a last-resort fallback. The provider (pulumi-defang #305) creates the matching A + asuid records for apex BYOD domains. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pkg/clouds/azure/aca/cert.go`:
- Around line 145-176: The readiness check in waitForBYODdns is too broad
because it marks routeOK true for any resolvable hostname IP after the CNAME
check fails. Tighten the apex-path validation by comparing the hostname’s
resolved IPs against the IPs for expectedCname, or better, thread the expected
Container Apps environment IPs into waitForBYODdns and validate against those in
this layer. Keep the existing asuid TXT check and prompt logic, but remove the
“any IP is enough” fallback so the function only proceeds when the hostname
actually points to the intended target.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f506d68c-a776-4fe9-9fc6-39b85292fa6e
📒 Files selected for processing (1)
src/pkg/clouds/azure/aca/cert.go
| // waitForBYODdns blocks until a routing record and the asuid TXT record both | ||
| // resolve, prompting the user once with the values to add. | ||
| // | ||
| // The routing record is a CNAME → app FQDN for a subdomain, or an A record for | ||
| // an apex domain (no CNAME possible at the zone apex). We don't know which the | ||
| // hostname is, so we accept either: the precise CNAME→app-FQDN check, OR any | ||
| // resolvable address at the hostname (the apex A case — the env IP isn't known | ||
| // to this layer, and the cert step validates the apex binding over HTTP). The | ||
| // asuid TXT is always required. | ||
| func waitForBYODdns(ctx context.Context, hostname, expectedCname, expectedTxt string, resolverAt func(string) dns.Resolver) error { | ||
| asuid := "asuid." + hostname | ||
| deadline := time.Now().Add(dnsWaitTimeout) | ||
| promptShown := false | ||
|
|
||
| for { | ||
| cnameOK := dns.CheckDomainDNSReady(ctx, hostname, []string{expectedCname}, resolverAt) | ||
| routeOK := dns.CheckDomainDNSReady(ctx, hostname, []string{expectedCname}, resolverAt) | ||
| if !routeOK { | ||
| // Apex domains route via an A record, not a CNAME to the app FQDN; | ||
| // accept any resolvable address at the hostname. | ||
| if ips, err := resolverAt("").LookupIPAddr(ctx, hostname); err == nil && len(ips) > 0 { | ||
| routeOK = true | ||
| } | ||
| } | ||
| txtOK, _ := dns.LookupTXTContains(ctx, asuid, expectedTxt, resolverAt("")) | ||
| if cnameOK && txtOK { | ||
| if routeOK && txtOK { | ||
| term.Infof("DNS records for %s verified", hostname) | ||
| return nil | ||
| } | ||
| if !promptShown { | ||
| term.Printf("Configure DNS records for %s:\n", hostname) | ||
| term.Printf(" CNAME %s -> %s\n", hostname, expectedCname) | ||
| term.Printf(" CNAME %s -> %s (subdomain)\n", hostname, expectedCname) | ||
| term.Printf(" A %s -> <Container Apps environment IP> (apex)\n", hostname) |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Don’t accept any resolvable IP as apex readiness.
After the exact CNAME check fails, LookupIPAddr(hostname) makes routeOK true for any existing A/AAAA record or wrong CNAME target. That can proceed while the hostname still points away from Container Apps. Compare the hostname IPs with the resolved expectedCname IPs, or pass the expected environment IPs into this owning layer and print/validate those. As per coding guidelines, “Prefer a narrow option passed into the owning package over duplicating path discovery or state reconstruction in a caller” and “file-existence and working-directory behavior should be deterministic and tested at the owning layer.”
Possible localized direction
- if ips, err := resolverAt("").LookupIPAddr(ctx, hostname); err == nil && len(ips) > 0 {
+ hostIPs, hostErr := resolverAt("").LookupIPAddr(ctx, hostname)
+ targetIPs, targetErr := resolverAt("").LookupIPAddr(ctx, expectedCname)
+ if hostErr == nil && targetErr == nil && hasCommonIP(hostIPs, targetIPs) {
routeOK = true
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/pkg/clouds/azure/aca/cert.go` around lines 145 - 176, The readiness check
in waitForBYODdns is too broad because it marks routeOK true for any resolvable
hostname IP after the CNAME check fails. Tighten the apex-path validation by
comparing the hostname’s resolved IPs against the IPs for expectedCname, or
better, thread the expected Container Apps environment IPs into waitForBYODdns
and validate against those in this layer. Keep the existing asuid TXT check and
prompt logic, but remove the “any IP is enough” fallback so the function only
proceeds when the hostname actually points to the intended target.
Source: Coding guidelines
What
When an Azure service sets a custom
domainname, search the current subscription for a public Azure DNS zone that is a parent of that domain. If one exists, record its ARM resource ID inServiceInfo.ZoneIdso the CD/Pulumi program manages records + issues a managed cert directly in that zone — mirroring AWS'sfindZone. Otherwise fall back to the ACME managed-cert path (current behaviour).Companion provider/CD change that consumes
ZoneId: DefangLabs/pulumi-defang#305.Changes
clouds/azure/dns/dns.go— newFindZone(ctx, domain): subscription-wideZonesClient.NewListPager, longest DNS-suffix match, returns the zone's ARM resource ID (or"").cli/client/byoc/azure/byoc.go—UpdateServiceInfonow callsFindZone; setssi.ZoneIdwhen found, elsesi.UseAcmeCert = true.Behaviour
Testing
go build+go test ./pkg/clouds/azure/dns/...pass.golangci-lint runon the two changed packages: 0 issues.🤖 Generated with Claude Code
Summary by CodeRabbit