Selectively disable GCP Cloud Run services when spending exceeds budget thresholds.
A Cloud Function triggered by GCP Budget Alerts via Pub/Sub. When your monthly spend exceeds the configured threshold, it scales down non-protected Cloud Run services and pages PagerDuty.
Unlike "nuclear" approaches that disable all billing (killing everything including free-tier services), this selectively targets Cloud Run while preserving your infrastructure.
GCP Budget ($500/mo) → 50%/80%/100% alerts
↓
Pub/Sub topic: billing-alerts
↓
Cloud Function: gcp-billing-kill-switch
↓ (when cost ≥ 80% of budget)
├── Scale down non-protected Cloud Run services (max instances → 0)
├── Page PagerDuty (critical — calls until acknowledged)
└── At 50%: Warning alert only (no action)
gcloud services enable \
billingbudgets.googleapis.com \
cloudfunctions.googleapis.com \
pubsub.googleapis.com \
eventarc.googleapis.com \
run.googleapis.com \
--project=YOUR_PROJECT_ID

gcloud pubsub topics create billing-alerts --project=YOUR_PROJECT_ID

gcloud billing budgets create \
--billing-account=YOUR_BILLING_ACCOUNT_ID \
--display-name="GCP Kill Switch Budget" \
--budget-amount=500 \
--threshold-rule=percent=0.5,basis=current-spend \
--threshold-rule=percent=0.8,basis=current-spend \
--threshold-rule=percent=1.0,basis=current-spend \
--notifications-rule-pubsub-topic=projects/YOUR_PROJECT_ID/topics/billing-alerts \
--filter-projects="projects/YOUR_PROJECT_ID"

gcloud functions deploy gcp-billing-kill-switch \
--gen2 \
--runtime=nodejs22 \
--region=us-central1 \
--source=. \
--entry-point=killSwitch \
--trigger-topic=billing-alerts \
--set-env-vars="GCP_PROJECT_ID=YOUR_PROJECT_ID,GCP_REGION=us-central1,KILL_THRESHOLD=0.8,NUCLEAR_MODE=false,PAGERDUTY_ROUTING_KEY=YOUR_PD_KEY,PROTECTED_SERVICES=my-critical-api" \
--memory=256MB \
--timeout=120s \
--project=YOUR_PROJECT_ID

| Variable | Default | Description |
|---|---|---|
| GCP_PROJECT_ID | Required | Your GCP project ID |
| GCP_REGION | us-central1 | Cloud Run region to monitor |
| KILL_THRESHOLD | 0.8 | Cost ratio (0-1) at which to take action |
| NUCLEAR_MODE | false | If true, disables all billing (kills everything). If false, selectively scales down Cloud Run. |
| PROTECTED_SERVICES | (empty) | Semicolon-separated Cloud Run services to never scale down |
| PAGERDUTY_ROUTING_KEY | (empty) | PagerDuty Events API v2 integration key for phone alerts |

| Budget % | Action |
|---|---|
| 50% | PagerDuty warning alert (no action) |
| 80%+ | Scale down non-protected Cloud Run services + PagerDuty critical alert |
| Nuclear mode | Disable billing entirely (not recommended) |
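
The table above can be read as a small decision function. The following is a minimal sketch, not the handler's actual code: the name `decideAction`, the returned labels, and the fixed 0.5 warning level are assumptions made for illustration.

```javascript
// Hypothetical helper that maps a budget alert to one of the actions in the table.
// WARN_THRESHOLD and the returned labels are illustrative assumptions.
const WARN_THRESHOLD = 0.5;

function decideAction(costAmount, budgetAmount, { killThreshold = 0.8, nuclearMode = false } = {}) {
  if (!(budgetAmount > 0)) {
    throw new Error('budgetAmount must be a positive number');
  }
  const costRatio = costAmount / budgetAmount;

  if (costRatio >= killThreshold) {
    // NUCLEAR_MODE swaps the selective scale-down for disabling billing.
    return nuclearMode ? 'disable-billing' : 'scale-down-and-page';
  }
  if (costRatio >= WARN_THRESHOLD) {
    return 'warn';
  }
  return 'none';
}

console.log(decideAction(250, 500)); // warn
console.log(decideAction(400, 500)); // scale-down-and-page
console.log(decideAction(400, 500, { nuclearMode: true })); // disable-billing
```
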
Services listed in PROTECTED_SERVICES will never be scaled down, even when the budget is exceeded. Use semicolons as separators:
PROTECTED_SERVICES=my-api;my-webhook-handler

The monthly-budget kill switch above only fires when monthly spend crosses a threshold. That misses runaway daily spikes: a 24-hour incident can burn thousands while the monthly total is still in the safe zone, so the kill switch does not fire until many days later.
The daily-rate.js Cloud Function closes that gap. It:
- Runs on a Cloud Scheduler trigger (recommended: hourly).
- Queries the BigQuery billing export for past-24h spend by service.
- If spend meets or exceeds DAILY_RATE_THRESHOLD_USD, publishes to the same billing-alerts Pub/Sub topic the existing kill switch listens on, so the existing handler runs the usual scale-down + PagerDuty flow.
Cloud Scheduler (hourly) → daily-rate watcher → BigQuery billing export
↓ (if past-24h ≥ threshold)
Pub/Sub: billing-alerts
↓
gcp-billing-kill-switch (existing handler)
↓
Scale down + PagerDuty
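
The middle of that flow (roll up the query rows, compare against the threshold, decide whether to publish) can be sketched as pure functions. This is an illustrative sketch rather than the code in daily-rate.js: the row shape `{ service, cost }`, the function names, and the `byService` field on the message are assumptions.

```javascript
// Roll up rows of the assumed shape { service: 'Cloud Run', cost: 12.5 }
// into a grand total and a per-service breakdown.
function summarizeSpend(rows) {
  const byService = new Map();
  let total = 0;
  for (const { service, cost } of rows) {
    const amount = Number(cost) || 0; // treat null or non-numeric cost as zero
    byService.set(service, (byService.get(service) || 0) + amount);
    total += amount;
  }
  return { total, byService: Object.fromEntries(byService) };
}

// Decide whether the window's spend is a breach. On a breach, return the
// message that would be published; otherwise return null for the message.
function evaluateDailyRate(rows, thresholdUsd) {
  const { total, byService } = summarizeSpend(rows);
  const breach = total >= thresholdUsd; // equality counts as a breach
  return {
    breach,
    message: breach ? { costAmount: total, budgetAmount: thresholdUsd, byService } : null,
  };
}

const rows = [
  { service: 'Cloud Run', cost: 40 },
  { service: 'BigQuery', cost: 15 },
];
console.log(evaluateDailyRate(rows, 50).breach); // true (55 >= 50)
console.log(evaluateDailyRate(rows, 100).breach); // false
```

Keeping the decision free of BigQuery and Pub/Sub calls makes it easy to test with plain arrays; the real clients only need to supply rows and accept the message.
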
You must have BigQuery billing export enabled. If you don't:
# Enable in the Cloud Console:
# https://console.cloud.google.com/billing → Billing account → Billing export
# Create a "Detailed usage cost" export to a BigQuery dataset (e.g. `billing_export`).
# After enabling, the standard table name format is:
# gcp_billing_export_resource_v1_<billingAccountId-with-dashes-replaced>

# From this package directory:
gcloud functions deploy daily-rate-watcher \
--gen2 \
--runtime=nodejs22 \
--region=us-central1 \
--source=. \
--entry-point=dailyRateWatcher \
--trigger-http \
--no-allow-unauthenticated \
--set-env-vars="\
GCP_PROJECT_ID=YOUR_PROJECT_ID,\
BILLING_EXPORT_DATASET=billing_export,\
BILLING_EXPORT_TABLE=gcp_billing_export_resource_v1_01ABCD_234567_8901AB,\
BILLING_ALERTS_TOPIC=billing-alerts,\
DAILY_RATE_THRESHOLD_USD=50" \
--memory=256MB \
--timeout=120s \
--project=YOUR_PROJECT_ID

The function's service account needs roles/bigquery.dataViewer and roles/bigquery.jobUser on the project that owns the export dataset, plus roles/pubsub.publisher on the billing-alerts topic.
# Get the function's HTTPS URL
FUNCTION_URL=$(gcloud functions describe daily-rate-watcher \
--gen2 --region=us-central1 \
--project=YOUR_PROJECT_ID \
--format="value(serviceConfig.uri)")
# Create a service account for the scheduler to invoke the function
gcloud iam service-accounts create daily-rate-scheduler \
--display-name="Daily Rate Scheduler" \
--project=YOUR_PROJECT_ID
# Grant it permission to invoke the function
gcloud functions add-invoker-policy-binding daily-rate-watcher \
--gen2 --region=us-central1 \
--member="serviceAccount:daily-rate-scheduler@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--project=YOUR_PROJECT_ID
# Schedule it hourly
gcloud scheduler jobs create http daily-rate-watcher-hourly \
--location=us-central1 \
--schedule="0 * * * *" \
--uri="$FUNCTION_URL" \
--http-method=POST \
--max-retry-attempts=3 \
--max-retry-duration=300s \
--oidc-service-account-email="daily-rate-scheduler@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--project=YOUR_PROJECT_ID

Without --max-retry-attempts, Cloud Scheduler defaults to 0 retries, so a single transient BigQuery hiccup silently loses that hour's check.
The watcher is deployed as a gen2 Cloud Function, which means it's also a Cloud Run service named daily-rate-watcher. When the watcher publishes a breach, the existing kill switch handler runs scaleDownCloudRunServices() against every Cloud Run service in the project — and scaleDownCloudFunctions() against every Cloud Function.
If daily-rate-watcher is not in PROTECTED_SERVICES and PROTECTED_FUNCTIONS, it will silently disable itself on the first kill — and subsequent spikes will go undetected until you manually scale it back up.
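
To see why the protection lists matter, here is a minimal sketch of how a semicolon-separated list might be applied when choosing scale-down targets. The helper names `parseProtectedList` and `selectScaleDownTargets` are hypothetical, and the real handler may filter differently.

```javascript
// Parse a semicolon-separated protection list into a Set of names,
// ignoring surrounding whitespace and empty entries.
function parseProtectedList(value) {
  return new Set(
    (value || '')
      .split(';')
      .map((name) => name.trim())
      .filter((name) => name.length > 0)
  );
}

// Return only the services that are eligible for scale-down.
function selectScaleDownTargets(serviceNames, protectedValue) {
  const protectedNames = parseProtectedList(protectedValue);
  return serviceNames.filter((name) => !protectedNames.has(name));
}

const services = ['daily-rate-watcher', 'gcp-billing-kill-switch', 'my-api', 'batch-worker'];

// With only my-api protected, the watcher and the kill switch are both targets.
console.log(selectScaleDownTargets(services, 'my-api'));

// With all three protected, only batch-worker is left to scale down.
console.log(selectScaleDownTargets(services, 'daily-rate-watcher;gcp-billing-kill-switch;my-api'));
```
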
Update your existing kill switch deployment to add the watcher (and the kill switch itself) to both protection lists:
gcloud functions deploy gcp-billing-kill-switch \
--gen2 --region=us-central1 --project=YOUR_PROJECT_ID \
--update-env-vars="\
PROTECTED_SERVICES=daily-rate-watcher;gcp-billing-kill-switch;<your-other-protected>,\
PROTECTED_FUNCTIONS=daily-rate-watcher;gcp-billing-kill-switch"

The watcher publishes costAmount = past-24h spend and budgetAmount = DAILY_RATE_THRESHOLD_USD. The existing handler then computes costRatio = costAmount / budgetAmount and only kills if costRatio >= KILL_THRESHOLD (default 0.8).
A daily-rate breach always sends costRatio ≥ 1.0, so kill always fires under default config. But if you've customized KILL_THRESHOLD > 1.0, a daily-rate breach will land in the warning bracket (PagerDuty page only, no scale-down). Keep KILL_THRESHOLD ≤ 1.0 if you want daily-rate breaches to actually trigger the kill.
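
A short worked example of that arithmetic, assuming the payload travels as base64-encoded JSON (the way Pub/Sub delivers message data). The helper names are hypothetical, and the real message may carry more fields than the two shown.

```javascript
// Encode a breach payload as base64 JSON, as it would appear in message.data.
function encodeAlert(costAmount, budgetAmount) {
  return Buffer.from(JSON.stringify({ costAmount, budgetAmount })).toString('base64');
}

// Decode the payload and compute the ratio the handler compares to KILL_THRESHOLD.
function costRatioFromMessage(base64Data) {
  const json = Buffer.from(base64Data, 'base64').toString('utf8');
  const { costAmount, budgetAmount } = JSON.parse(json);
  return costAmount / budgetAmount;
}

// $75 spent in the window against a $50 daily threshold.
const exampleRatio = costRatioFromMessage(encodeAlert(75, 50));
console.log(exampleRatio); // 1.5
console.log(exampleRatio >= 0.8); // true: kill fires under the default KILL_THRESHOLD
console.log(exampleRatio >= 2.0); // false: only a warning if KILL_THRESHOLD were 2.0
```
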
The watcher itself is binary: unlike the monthly handler (50% / 80%), it has no warning level. It stays silent below the threshold and publishes at or above it. If you want a warning-only tier, deploy a second watcher with a lower threshold and DAILY_RATE_DRY_RUN=true (logs without publishing).
| Variable | Default | Description |
|---|---|---|
| GCP_PROJECT_ID | Required | Project that owns the BigQuery billing export dataset |
| BILLING_EXPORT_DATASET | Required | BigQuery dataset name (e.g. billing_export) |
| BILLING_EXPORT_TABLE | Required | Full export table name (gcp_billing_export_resource_v1_<billingAccountId>) |
| BILLING_ALERTS_TOPIC | billing-alerts | Pub/Sub topic to publish breach events to |
| DAILY_RATE_THRESHOLD_USD | 50 | Past-24h spend ceiling. Breach when total ≥ threshold |
| DAILY_RATE_WINDOW_HOURS | 24 | Lookback window in hours |
| DAILY_RATE_DRY_RUN | (unset) | If "true", log what would be published without actually publishing |

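
As an illustration of how these variables might be read and validated at startup, here is a minimal sketch. The function name `loadConfig`, the error messages, and the choice to require a whole number of hours are assumptions, not a description of the shipped code.

```javascript
// Read watcher settings from an env-like object, applying the documented
// defaults and rejecting values that would make the check meaningless.
function loadConfig(env) {
  for (const key of ['GCP_PROJECT_ID', 'BILLING_EXPORT_DATASET', 'BILLING_EXPORT_TABLE']) {
    if (!env[key]) {
      throw new Error(`Missing required env var: ${key}`);
    }
  }

  const thresholdUsd = Number(env.DAILY_RATE_THRESHOLD_USD ?? 50);
  if (!Number.isFinite(thresholdUsd) || thresholdUsd <= 0) {
    throw new Error('DAILY_RATE_THRESHOLD_USD must be a positive number');
  }

  // Assumption: the window is a whole number of hours.
  const windowHours = Number(env.DAILY_RATE_WINDOW_HOURS ?? 24);
  if (!Number.isInteger(windowHours) || windowHours <= 0) {
    throw new Error('DAILY_RATE_WINDOW_HOURS must be a positive integer');
  }

  return {
    projectId: env.GCP_PROJECT_ID,
    dataset: env.BILLING_EXPORT_DATASET,
    table: env.BILLING_EXPORT_TABLE,
    topic: env.BILLING_ALERTS_TOPIC || 'billing-alerts',
    thresholdUsd,
    windowHours,
    dryRun: env.DAILY_RATE_DRY_RUN === 'true',
  };
}

// In a deployed function this would be called as loadConfig(process.env).
console.log(loadConfig({
  GCP_PROJECT_ID: 'my-project',
  BILLING_EXPORT_DATASET: 'billing_export',
  BILLING_EXPORT_TABLE: 'gcp_billing_export_resource_v1_01ABCD_234567_8901AB',
}));
```
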
- Start at 2× your average daily spend. Watch for a week. Adjust if it false-fires.
- For an idle / staging project: $25–$50 is a reasonable floor.
- For a busy production project: 1.5–2× p95 daily spend over the last 30 days.
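
These rules of thumb are simple to compute from a list of recent daily totals. The sketch below uses a nearest-rank percentile; the function names and the sample figures are made up for illustration.

```javascript
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Nearest-rank percentile: the smallest sample with at least p% of the
// samples at or below it.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Suggest a DAILY_RATE_THRESHOLD_USD from daily spend history.
// busy = false: 2x average daily spend (the starting point above).
// busy = true:  multiplier x p95 daily spend (1.5 to 2 per the guidance above).
function suggestThreshold(dailySpendUsd, { busy = false, multiplier = 1.5 } = {}) {
  if (dailySpendUsd.length === 0) {
    throw new Error('need at least one day of spend');
  }
  return busy ? multiplier * percentile(dailySpendUsd, 95) : 2 * mean(dailySpendUsd);
}

const lastWeek = [10, 12, 8, 10, 10, 10, 10]; // made-up daily totals in USD
console.log(suggestThreshold(lastWeek)); // 20
console.log(suggestThreshold(lastWeek, { busy: true })); // 18
```
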
npm install
npm test

The test suite uses node:test (no extra dependencies) with stubbed BigQuery + Pub/Sub clients. Tests cover:
- BigQuery SQL builder (window-hours, identifier safety, aggregation shape)
- Cost breakdown aggregation (totals, per-service rollup, edge cases)
- Handler decision flow (no breach, breach + publish, dry-run, threshold equality)
- Config-error paths (missing required env, non-numeric / zero / negative threshold, invalid window)
- Failure propagation (BigQuery query errors, Pub/Sub publish errors, SQL identifier safety)
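
To make the SQL-builder and identifier-safety items concrete, here is a sketch of the kind of function such tests would exercise. It is an assumption about the shape of the query, not the repository's actual builder: the name `buildSpendQuery`, the allow-list pattern, and the column aliases are illustrative, while `service.description`, `cost`, and `usage_start_time` are columns of the standard billing export.

```javascript
// Allow only characters expected in project, dataset, and table names, so an
// env var cannot inject SQL through the backtick-quoted table path.
const SAFE_IDENTIFIER = /^[A-Za-z0-9_-]+$/;

function buildSpendQuery({ projectId, dataset, table, windowHours }) {
  for (const [label, value] of Object.entries({ projectId, dataset, table })) {
    if (!SAFE_IDENTIFIER.test(String(value))) {
      throw new Error(`Unsafe ${label}: ${value}`);
    }
  }
  if (!Number.isInteger(windowHours) || windowHours <= 0) {
    throw new Error('windowHours must be a positive integer');
  }

  // One row per service, with the cost summed over the lookback window.
  return [
    'SELECT service.description AS service_name, SUM(cost) AS total_cost',
    `FROM \`${projectId}.${dataset}.${table}\``,
    `WHERE usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ${windowHours} HOUR)`,
    'GROUP BY service_name',
    'ORDER BY total_cost DESC',
  ].join('\n');
}

console.log(buildSpendQuery({
  projectId: 'my-project',
  dataset: 'billing_export',
  table: 'gcp_billing_export_resource_v1_01ABCD_234567_8901AB',
  windowHours: 24,
}));
```
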
This is the GCP component of the Kill Switch project. See also:
- Cloudflare Billing Kill Switch — auto-disconnect runaway Cloudflare Workers
MIT