|
| 1 | +# ============================================================================= |
| 2 | +# MIGRATION GUIDE: Moving from Manual / CI-managed Infrastructure to Full IaC |
| 3 | +# ============================================================================= |
| 4 | +# |
| 5 | +# CURRENT STATE |
| 6 | +# ------------- |
| 7 | +# The following resources exist in GCP but are managed manually or by the CI |
| 8 | +# pipeline (not by Terraform): |
| 9 | +# - Cloud SQL instance: querypal-db (manually created) |
| 10 | +# - Cloud SQL database: querypal (manually created) |
| 11 | +# - Cloud Run services: querypal-backend/frontend (deployed by CI) |
| 12 | +# - Secrets: stored in GitHub Secrets (not in Secret Manager) |
| 13 | +# |
| 14 | +# WHAT TERRAFORM NOW MANAGES |
| 15 | +# --------------------------- |
| 16 | +# This configuration manages the infrastructure layer only. The CI pipeline |
| 17 | +# (.github/workflows/google-cloudrun-docker.yml) continues to own image builds |
| 18 | +# and Cloud Run deployments. |
| 19 | +# |
| 20 | +# Resource | Terraform action | App impact |
| 21 | +# --------------------------|--------------------|--------------------------- |
| 22 | +# VPC connector | CREATE (new) | None until next CI deploy |
| 23 | +# Secret Manager secrets | CREATE (new) | None until values added |
| 24 | +# Cloud Run service account | CREATE (new) | None until next CI deploy |
| 25 | +# IAM bindings | CREATE (new) | None |
| 26 | +# Cloud SQL instance | IMPORT (existing) | Zero — instance untouched |
| 27 | +# Cloud SQL database | IMPORT (existing) | Zero — database untouched |
| 28 | +# |
| 29 | +# ============================================================================= |
| 30 | +# MIGRATION STEPS — run once, in this order |
| 31 | +# ============================================================================= |
| 32 | +# |
| 33 | +# STEP 1 — Verify Terraform config matches your actual Cloud SQL instance. |
| 34 | +# |
| 35 | +# Before importing, check that database.tf reflects the real instance: |
| 36 | +# - tier (e.g. db-f1-micro) |
| 37 | +# - database_version (POSTGRES_15) |
| 38 | +# - region (europe-west1) |
| 39 | +# - backup settings, flags, IP config |
| 40 | +# |
| 41 | +# To inspect the current instance: |
| 42 | +# gcloud sql instances describe querypal-db --format=json |
| 43 | +# |
| 44 | +# If anything in database.tf does not match, fix it BEFORE importing. |
| 45 | +# After import, any mismatch will show as a planned change. Most settings |
| 46 | +# (tier, flags, backup) can be modified in place; a tier change and some flag |
| 47 | +# changes restart the instance. Changing database_version can force recreation; avoid it. |
| 48 | +# |
| 49 | +# STEP 2 — Import existing Cloud SQL resources into Terraform state. |
| 50 | +# |
| 51 | +# This registers the existing instance with Terraform without touching it. |
| 52 | +# No data is moved, no connections are interrupted, the application keeps |
| 53 | +# running throughout. |
| 54 | +# |
| 55 | +# cd terraform |
| 56 | +# cp terraform.tfvars.example terraform.tfvars |
| 57 | +# terraform init |
| 58 | +# ./import.sh |
| 59 | +# |
| 60 | +# After import, run `terraform plan`. The plan should show no changes (or |
| 61 | +# only safe in-place updates to settings you deliberately changed in the |
| 62 | +# config). If you see a resource scheduled for REPLACEMENT, stop and fix the |
| 63 | +# config — do not apply until the plan is clean. |
| 64 | +# |
| 65 | +# STEP 3 — Apply Terraform to create new resources. |
| 66 | +# |
| 67 | +# terraform apply |
| 68 | +# |
| 69 | +# This creates: VPC connector, Secret Manager secrets (empty), Cloud Run SA, |
| 70 | +# and all IAM bindings. Nothing here touches the running application. |
| 71 | +# |
| 72 | +# The VPC connector takes 2–5 minutes to provision. The apply will wait. |
| 73 | +# |
| 74 | +# STEP 4 — Populate Secret Manager with the actual secret values. |
| 75 | +# |
| 76 | +# The secrets created in Step 3 are empty shells. Cloud Run will refuse to |
| 77 | +# start if it tries to mount a secret with no versions. Populate them NOW, |
| 78 | +# before triggering a new CI deployment: |
| 79 | +# |
| 80 | +# for SECRET_ID in \ |
| 81 | +# querypal-azure-tenant-id \ |
| 82 | +# querypal-azure-client-id \ |
| 83 | +# querypal-azure-client-secret \ |
| 84 | +# querypal-gemini-api-key \ |
| 85 | +# querypal-db-user \ |
| 86 | +# querypal-db-pass; do |
| 87 | +# echo -n "Enter value for ${SECRET_ID}: " |
| 88 | +# read -rs VALUE |
| 89 | +# echo |
| 90 | +# printf '%s' "${VALUE}" | gcloud secrets versions add "${SECRET_ID}" --data-file=- |
| 91 | +# done |
| 92 | +# |
| 93 | +# Verify each secret has at least one version (repeat for every secret ID): |
| 94 | +# gcloud secrets versions list querypal-gemini-api-key |
| 95 | +# |
| 96 | +# STEP 5 — Trigger a CI deployment (push to production branch). |
| 97 | +# |
| 98 | +# The updated workflow now uses: |
| 99 | +# --set-secrets (reads from Secret Manager instead of env vars) |
| 100 | +# --vpc-connector (attaches both services to the VPC) |
| 101 | +# --ingress=internal (backend becomes unreachable from public internet) |
| 102 | +# --service-account (uses the new dedicated Cloud Run SA) |
| 103 | +# |
| 104 | +# The frontend will be publicly reachable as before. The backend will only |
| 105 | +# accept traffic that arrives through the VPC connector (from the frontend |
| 106 | +# nginx proxy). Direct requests to the backend Cloud Run URL from the internet |
| 107 | +# will be rejected by Google's frontend (typically with a 404). |
| 108 | +# |
| 109 | +# Monitor the deployment: |
| 110 | +# gcloud run services describe querypal-backend --region=europe-west1 |
| 111 | +# gcloud run services describe querypal-frontend --region=europe-west1 |
| 112 | +# |
| 113 | +# STEP 6 — Clean up GitHub Secrets (optional but recommended). |
| 114 | +# |
| 115 | +# Once the application is verified working with Secret Manager, delete the |
| 116 | +# now-unused GitHub Secrets from the repository settings: |
| 117 | +# AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, |
| 118 | +# GEMINI_API_KEY, DB_USER, DB_PASS |
| 119 | +# |
| 120 | +# ============================================================================= |
| 121 | +# RISKS AND THINGS TO WATCH FOR |
| 122 | +# ============================================================================= |
| 123 | +# |
| 124 | +# Cloud SQL import divergence |
| 125 | +# If `terraform plan` after import shows resource replacement (destroy + |
| 126 | +# create) for the Cloud SQL instance, do NOT apply. Terraform cannot recreate |
| 127 | +# a Cloud SQL instance in-place — it would destroy the instance and all data. |
| 128 | +# deletion_protection = true in database.tf will block the destroy, but it is |
| 129 | +# safer to fix the config discrepancy first. |
| 130 | +# |
| 131 | +# Secret versions must exist before deployment |
| 132 | +# If Step 5 runs before Step 4 completes, Cloud Run will fail to start |
| 133 | +# because the secret mount has no versions. The previous revision stays live |
| 134 | +# (Cloud Run only switches traffic after the new revision is healthy), so the |
| 135 | +# application continues to work — but the deploy will time out. Fix: add the |
| 136 | +# missing secret version, then re-deploy. |
| 137 | +# |
| 138 | +# VPC connector CIDR must not overlap existing subnets |
| 139 | +# The connector reserves 10.8.0.0/28. If your VPC already has a subnet in |
| 140 | +# that range, change vpc_connector_cidr in variables.tf before applying. |
| 141 | +# Check existing ranges: |
| 142 | +# gcloud compute networks subnets list --filter="region:europe-west1" |
| 143 | +# |
| 144 | +# Cloud Run SA permissions propagate with eventual consistency |
| 145 | +# IAM bindings usually take effect within about two minutes of `terraform |
| 146 | +# apply`, occasionally longer. If the first CI deploy fails with a permission |
| 147 | +# error right after applying Terraform, wait a few minutes and retry. |
| 148 | +# |
| 149 | +# Terraform state is local by default |
| 150 | +# The state file (terraform.tfstate) is gitignored. If you lose it, you lose |
| 151 | +# the link between Terraform and the real GCP resources, and Terraform will |
| 152 | +# try to recreate everything. Enable the GCS backend (commented out below) |
| 153 | +# before running in a team or CI environment. |
| 154 | +# |
| 155 | +# ============================================================================= |
| 156 | + |
1 | 157 | terraform { |
2 | 158 | required_version = ">= 1.5" |
3 | 159 |