feat(ops): add automated PostgreSQL backup script with GCS upload #295
snowfox1003 wants to merge 3 commits into
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough
Introduces
Changes
Automated PostgreSQL Backup
Sequence Diagram(s)
sequenceDiagram
participant Cron as cron / systemd timer
participant Script as backup_database.sh
participant PostgreSQL
participant GCS as GCS bucket
participant Webhook as Discord / Slack webhook
Cron->>Script: invoke (--env-file .env)
Script->>Script: load env, derive DUMP_BASENAME
Script->>Script: validate BACKUP_GCS_BUCKET, BACKUP_RETENTION_DAYS
Script->>PostgreSQL: pg_dump -Fc → local .dump file
PostgreSQL-->>Script: compressed dump
Script->>GCS: gcloud storage cp dump gs://BUCKET/bdc/
GCS-->>Script: upload OK
Script->>GCS: list objects (gcloud storage ls)
GCS-->>Script: object list
Script->>Script: compute cutoff via embedded Python
Script->>GCS: delete objects older than BACKUP_RETENTION_DAYS
Script->>Script: EXIT trap fires (success/failure)
Script->>Webhook: POST notification (if BACKUP_NOTIFICATIONS=true)
Webhook-->>Script: 200 OK
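The list, cutoff, and delete steps in the diagram can be sketched in plain bash. This is a minimal sketch, not the PR's implementation: it assumes objects are named `bdc-YYYYMMDD.dump` and compares the date embedded in the name, whereas the script computes its cutoff with embedded Python and may compare object timestamps instead. The helper names `cutoff_date` and `expired_objects` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: select expired backups by the date embedded in the object name.
# Assumption: objects are named bdc-YYYYMMDD.dump under some gs:// prefix.
set -euo pipefail

# Print the cutoff date (YYYYMMDD, UTC) for a retention window in days.
# Requires GNU date (-d). The second argument overrides "today" for testing.
cutoff_date() {
  local retention_days="$1" today="${2:-$(date -u +%Y%m%d)}"
  date -u -d "${today} ${retention_days} days ago" +%Y%m%d
}

# Read object URLs on stdin and print those strictly older than the cutoff.
expired_objects() {
  local cutoff="$1" url name stamp
  while IFS= read -r url; do
    name="${url##*/}"                                    # drop gs://bucket/prefix/
    [[ "$name" =~ ^bdc-([0-9]{8})\.dump$ ]] || continue  # skip unrelated objects
    stamp="${BASH_REMATCH[1]}"
    if (( 10#$stamp < 10#$cutoff )); then
      printf '%s\n' "$url"
    fi
  done
}

# Usage sketch (not executed here); 0 disables pruning, as in the PR summary:
#   if (( BACKUP_RETENTION_DAYS > 0 )); then
#     gcloud storage ls "gs://${BACKUP_GCS_BUCKET}/bdc/" \
#       | expired_objects "$(cutoff_date "$BACKUP_RETENTION_DAYS")" \
#       | xargs -r gcloud storage rm
#   fi
```

Filtering on a strict name pattern means unrelated objects under the same prefix are never candidates for deletion, which is a conservative default for a pruning step.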
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/Deployment.md`:
- Around line 274-290: The cron schedule in the bdc-db-backup example is
inconsistent with the recommended 03:00 UTC time mentioned in the preceding
text. The cron line showing the backup_database.sh script execution currently
uses 0 23 (23:00 UTC) but should be changed to 0 3 to match the 03:00 UTC
recommendation stated earlier in the document. Update the hour field in the cron
schedule from 23 to 3 to align the example with the documented best practice and
match the systemd timer configuration shown later.
In `@scripts/backup_database.sh`:
- Around line 109-113: The BACKUP_FILE_PREFIX is being used to derive both the
dump basename and construct paths, but when it contains nested segments (e.g.,
prod/bdc/), it creates subdirectories under BACKUP_STAGING_DIR that may not
exist, causing pg_dump to fail. Additionally, if BACKUP_GCS_PREFIX lacks a
trailing slash, concatenation at line 453 produces malformed keys. Fix this by
extracting only the basename (final segment without directory components) from
BACKUP_FILE_PREFIX when constructing DUMP_BASENAME, ensure BACKUP_STAGING_DIR
and any nested subdirectories are created before use, and normalize all path
concatenation operations (at lines 361-362 and 453) by adding trailing slashes
to prefix values before appending the dump filename to guarantee correct path
and key formation.
- Around line 365-368: The `|| true` operator inside the command substitution on
line 365 forces the assignment to always succeed, even when `gcloud storage ls`
fails, which prevents the error check on line 366 from ever triggering. Remove
the `|| true` from inside the command substitution (the one after the gcloud
storage ls call) so that real failures in listing GCS objects at the gcs_glob
path will cause the assignment to fail and properly enter the error handling
block, allowing retention outages to be detected instead of silently succeeding.
- Around line 347-351: The chmod command for setting permissions on the
BACKUP_STAGING_DIR is suppressing errors with 2>/dev/null and continuing on
failure with || true, which allows the script to proceed even if permissions
cannot be enforced on a directory containing sensitive backup data. Remove the
error suppression (2>/dev/null) and the failure handling (|| true) from the
chmod 700 "$BACKUP_STAGING_DIR" command so that permission hardening failures
become hard failures that stop script execution instead of being silently
ignored.
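The error-masking pattern behind the last two comments can be reproduced in isolation. The sketch below is illustrative only: `list_objects` is a hypothetical stand-in for the failing `gcloud storage ls` call, and `prepare_staging_dir` is an assumed helper name, not a function from the PR.

```shell
#!/usr/bin/env bash
# Sketch: how `|| true` inside a command substitution hides failures.

# Hypothetical stand-in for a failing `gcloud storage ls`.
list_objects() { echo "partial output"; return 1; }

# Masked: the last command in the substitution is `true`, so the assignment
# always succeeds and the error branch is unreachable.
masked_check() {
  local listing
  if ! listing="$(list_objects || true)"; then
    echo "error"
    return 1
  fi
  echo "ok"
}

# Unmasked: a plain assignment takes the exit status of the substitution.
# `local` is declared separately because `local x="$(cmd)"` would itself
# replace the status with that of `local`.
unmasked_check() {
  local listing
  if ! listing="$(list_objects)"; then
    echo "error"
    return 1
  fi
  echo "ok"
}

# Hard-failing permission setup: no `2>/dev/null`, no `|| true`, so a failed
# mkdir or chmod is returned to the caller (and aborts under `set -e`).
prepare_staging_dir() {
  local dir="$1"
  mkdir -p -- "$dir" && chmod 700 -- "$dir"
}
```

Calling `masked_check` prints `ok` even though the listing failed, while `unmasked_check` prints `error` and returns non-zero, which is the behavior the review asks for.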
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1ac07bd1-c296-46fb-8616-68532b51f997
📒 Files selected for processing (5)
- `.env.example`
- `docs/Deployment.md`
- `docs/GCP_Production_Checklist.md`
- `scripts/README.md`
- `scripts/backup_database.sh`
…or clarity and consistency
Summary
Adds production VM automation for PostgreSQL backups:
- `scripts/backup_database.sh` runs `pg_dump -Fc`, uploads timestamped dumps (`bdc-YYYYMMDD.dump`) to a private GCS bucket, and optionally prunes objects older than `BACKUP_RETENTION_DAYS` (default 7; `0` disables pruning).
- Credentials come from `.env` (`DATABASE_URL` / `DB_*`, with `host.docker.internal` rewritten to `127.0.0.1` for host cron).
- On success or failure, optional Discord/Slack notifications use existing `DISCORD_WEBHOOK_URL` / `SLACK_WEBHOOK_URL` (`BACKUP_NOTIFICATIONS` to disable).
- Documentation covers bucket/IAM, env vars, cron/systemd setup, log directory creation, restore from GCS, and smoke-test steps. Cross-link added in `GCP_Production_Checklist.md`.

Apps touched

Test plan
- `python -m pytest` (or scoped: `python -m pytest <app>/tests`)
- `uv run pyright` (if typed code changed)
- `lint-imports` (if imports or cross-app coupling changed)

Docs / coupling
- `python scripts/generate_service_docs.py` run (if `services.py` or `core/protocols.py` changed)
- `docs/` updated (if behavior or ops changed)
  - `docs/Deployment.md`: automated backup runbook, cron, systemd, restore, smoke test
  - `docs/GCP_Production_Checklist.md`: cross-link
  - `scripts/README.md`, `.env.example`

Close #287
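The naming and host-rewrite behavior described in the summary could look roughly like the following. This is a hedged sketch, not the script's code: the function names `dump_basename`, `gcs_url`, and `rewrite_db_host` are hypothetical, and the prefix handling reflects the review's suggestion to use only the final path segment and to normalize trailing slashes.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not taken from the PR.

# Build "<name>-YYYYMMDD.dump" from a prefix, keeping only its last path
# segment so a value like "prod/bdc/" cannot create nested staging dirs.
dump_basename() {
  local prefix="$1" stamp="$2"
  while [[ "$prefix" == */ ]]; do prefix="${prefix%/}"; done  # trim trailing slashes
  printf '%s-%s.dump\n' "${prefix##*/}" "$stamp"
}

# Join bucket, key prefix, and file name with exactly one slash between parts.
gcs_url() {
  local bucket="$1" prefix="$2" name="$3"
  prefix="${prefix#/}"                                   # keys have no leading slash
  [[ -z "$prefix" || "$prefix" == */ ]] || prefix="${prefix}/"
  printf 'gs://%s/%s%s\n' "$bucket" "$prefix" "$name"
}

# Replace the Docker-only host name so the URL resolves from host cron.
rewrite_db_host() {
  printf '%s\n' "${1//host.docker.internal/127.0.0.1}"
}

# Usage sketch (not executed here):
#   name="$(dump_basename "bdc" "$(date -u +%Y%m%d)")"
#   gcloud storage cp "${BACKUP_STAGING_DIR}/${name}" \
#     "$(gcs_url "$BACKUP_GCS_BUCKET" "bdc" "$name")"
```

Keeping these as pure string functions makes the path logic testable without `pg_dump` or `gcloud` installed.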
Summary by CodeRabbit
Release Notes
New Features
Documentation
`gcloud storage` over `gsutil`.
Chores
`.env.example` with backup configuration templates.