Skip to content

fix(ci): connect to release DB via Tailscale FQDN instead of port-forward#449

Open
julianam-w wants to merge 6 commits into
mainfrom
fix-assets-workflow
Open

fix(ci): connect to release DB via Tailscale FQDN instead of port-forward#449
julianam-w wants to merge 6 commits into
mainfrom
fix-assets-workflow

Conversation

@julianam-w

Copy link
Copy Markdown
Collaborator

No description provided.

@github-actions

This comment was marked as outdated.

@julianam-w julianam-w changed the title Update build-reporting-assets.yml fix(ci): connect to release DB via Tailscale FQDN instead of port-forward May 17, 2026
@github-actions

github-actions Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor

Claude finished @julianam-w's task in 1m 18s —— View job


Claude Code Review

  • Read PR diff vs origin/main
  • Inspect changed workflow file
  • Check .maui submodule change
  • Post review feedback

Overview

Replaces the kubectl port-forward dance to the CNPG primary pod with a direct connection to a Tailscale-exposed FQDN (k8s-pg-release-${MINOR}.tail53aef.ts.net). Net effect: −35/+2 in the workflow, and the readiness-polling + cleanup-trap scaffolding goes away. Cleaner and more reliable — the port-forward subshell was a common source of flakes.

Code quality

  • Solid simplification. The previous block was 35 lines of bash plumbing whose only job was making 127.0.0.1:5432 work; now that the runner is already on the tailnet (Connect to Tailscale step), going to the FQDN directly is the right call.
  • The new db_host output is co-located with the other version-derived outputs (namespace, ref) — consistent with the existing pattern in the same step.
  • kubectl is still needed to read the central-db-app secret, so keeping the Tailscale + kubeconfig setup steps is correct.

Suggestions

1. Hardcoded tailnet identifiertail53aef.ts.net is embedded inline in .github/workflows/build-reporting-assets.yml:55. If the tailnet is ever renamed/migrated, this breaks silently at runtime. Consider promoting it to a repo variable (e.g. vars.TS_TAILNET) alongside the other vars.TS_* you already use. Low priority but cheap to do now.

2. Lost comment about primary-pod targeting — the removed block had a useful comment explaining the trade-off ("Pinned to the primary pod… no failover during the build"). With the FQDN approach, presumably this resolves to the CNPG -rw Service (which always routes to the current primary, and does survive failover). Worth a one-line comment near TAMANU_RL_DB_URL=... noting which Service the hostname maps to, so the next reader doesn't have to dig through Tailscale operator config to find out. Something like:

# Routes via the Tailscale-operator-exposed central-db-rw Service,
# which always points to the current CNPG primary.
TAMANU_RL_DB_URL=${{ steps.version.outputs.db_host }} \

3. No reachability pre-check — the old code had pg_isready polling. That was only needed because port-forward starts async; with a real FQDN you can drop it. Fine as-is, but be aware that if the Service isn't exposed for a given release namespace, you'll only learn about it when build_reporting_assets.py errors. Not worth adding back, but flagging.

4. Pre-existing nit (not introduced here)TAMANU_RL_DB_URL holds a hostname, not a URL. Was already misnamed before this PR; mentioning only so it's on the radar for a future cleanup.

.maui submodule bump

Pointer moves from 213a7526f29c2e. The contents of that bump aren't visible from this repo's diff alone — worth confirming it's intentional and unrelated to the workflow change, or noting in the PR description what's in it.

Risks

  • Runtime-only failure mode: if the per-namespace Service k8s-pg-release-X-Y isn't actually advertised on the tailnet for a given release (e.g. the operator annotation is missing), the workflow will fail at connect time. Verify the naming convention matches what the Tailscale operator publishes for all currently-supported release namespaces (tamanu-release-2-50, etc.).
  • No test coverage changes needed — this is workflow plumbing.
  • No security concerns: password handling (::add-mask::) is unchanged; auth still flows through Tailscale OIDC.

Verdict

LGTM with the optional tailnet-var and one-line comment suggestions above. The simplification is a clear win over the port-forward approach.
branch fix-assets-workflow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant