Skip to content

Commit 5d39cc3

Browse files
Crispy1975hunleydtomashley
authored
feat: additional config for pgbackrest (#2099)
* feat: additional config for pgbackrest - pgdata-signal: add remove-pid action to remove stale postmaster.pid via the constrained wrapper rather than a broad sudo rm entry, keeping the sudoers scope limited to this script * fix: address pgbackrest PR review feedback - pgdata-chown: simplify case; use group=postgres consistently for both ownership targets (pgbackrest:postgres and postgres:postgres) - pgdata-signal: consolidate recovery/standby case into single pattern - pgdata-signal: deploy at mode 0755 so postgres can execute via sudo -u - setup-pgbackrest.yml: combine dir creation into single task with dict loop; conf.d gets 02770 setgid, others get default 0770 - setup-pgbackrest.yml: sort logrotate task keys alphabetically * fix: add missing pgbackrest sudoers entries and pre-create log files Three gaps found by cross-referencing SAA commands against Ansible: 1. adminapi.sudoers.conf: add two entries so adminapi can call the pgbackrest binary via the wrapper. - NewRunner() path: wrapper calls sudo -u pgbackrest <real binary>, requires adminapi -> pgbackrest NOPASSWD for the real binary path. - NewRunnerAs("pgbackrest") path: SAA does sudo -n -u pgbackrest /usr/bin/pgbackrest, requires adminapi -> pgbackrest NOPASSWD for the wrapper path. 2. setup-pgbackrest.yml: add pgbackrest -> pgbackrest sudoers entry for the real binary. When NewRunnerAs runs the wrapper as the pgbackrest user, the wrapper still calls sudo -u pgbackrest internally; without this entry that inner sudo fails. 3. setup-pgbackrest.yml: pre-create the three SAA log files (saa-pgb.log, wal-push.log, wal-fetch.log) as pgbackrest:postgres 0660. SAA opens them with O_APPEND|O_WRONLY (no O_CREATE) — a missing file causes enable to fail immediately before any pgBackRest work. modification_time/access_time: preserve means the task is idempotent. * Update ansible/files/supabase_admin_agent_config/pgdata-signal Co-authored-by: Tom Ashley <tom.ashley@gmail.com> * Update ansible/files/supabase_admin_agent_config/pgdata-signal Co-authored-by: Tom Ashley <tom.ashley@gmail.com> * Update ansible/tasks/internal/supabase-admin-agent.yml Co-authored-by: Tom Ashley <tom.ashley@gmail.com> * fix: use peer auth with saa_map for supabase_admin local connections Replace trust with peer map=saa_map for the supabase_admin pg_hba rule. Add saa_map entries in pg_ident.conf mapping adminapi and root OS users to the supabase_admin PG user, as required by supabase-admin-agent. * fix: restore pgdata-signal mode to 0755 pgdata-signal is called via sudo as postgres (other user), so it must be world-executable. 0700 would prevent postgres from running it. * fix: revert supabase_admin auth to trust, remove saa_map from pg_ident peer map=saa_map cannot be cleanly implemented in pg_hba.conf.j2 while the Dockerfiles copy it raw — pg_hba include_if_exists (PG 16+) is needed to separate Docker from production, and that work is in progress on a separate branch. Trust is appropriate for this local Unix-socket-only connection; the security boundary is OS-level access to the machine. Revisit once the include directive infrastructure lands. * docs: add comment to supabase_admin trust rule in pg_hba.conf * docs: add explanatory comments to pgBackRest Ansible changes Clarify the intent and rationale behind the less-obvious changes: - adminapi.sudoers.conf: explain the two pgbackrest sudo chains (NewRunner vs NewRunnerAs) and the wrapper/real-binary split - supabase-admin-agent.yml: explain why pgdata-chown is 0700 and pgdata-signal is 0755 - setup-pgbackrest.yml: explain the pgbackrest self-sudo entry, why conf.d needs setgid (02770), and why SAA log files must be pre-created before the agent runs - pgbackrest.conf: explain expire-auto change and why the [supabase] stanza was removed (SAA owns it via conf.d to avoid error [031]) * fix: move pgbackrest logrotate config to follow repo convention - Move from pgbackrest_config/ to logrotate_config/ to match the existing pattern for all other logrotate configs in the repo - Deploy via the finalize-ami.yml loop rather than setup-pgbackrest.yml - Add size 50M cap to prevent logs growing unbounded between daily rotations during large backup/restore operations * fix: run pgBackRest directory and log file setup in stage2_nix pass The directory and log file creation tasks were gated on nixpkg_mode only, so they were skipped during the stage 2 Nix provision pass (nixpkg_mode=false, stage2_nix=true). SAA is installed in stage 2 but had no log directory or pre-created log files to write to, causing pgbackrest enable to fail immediately. Extend both conditions to nixpkg_mode or stage2_nix so the paths are created in whichever pass installs the software. * fix: set data_directory to canonical path to resolve pgBackRest [058] pgBackRest stanza-create compares the live cluster's data_directory against the DataDir stored in pg_control. initdb resolves symlinks internally so pg_control stores /data/pgdata, but postgresql.conf explicitly set data_directory to /var/lib/postgresql/data (the symlink), causing a path mismatch and [058] error on every enable attempt. Patch data_directory to /data/pgdata immediately after deploying the postgresql.conf template, scoped to the debpkg_mode/nixpkg_mode block where the /data volume and symlink are always present. Docker images and Nix test environments are unaffected as they do not use this Ansible path. * fix: allow SAA execution from postgres AppArmor confinement Remove explicit `deny /** x,` from postgres_shell and pgbackrest_shell sub-profiles — the deny keyword has absolute precedence in AppArmor, overriding all specific ix allow rules and blocking SAA execution with exit 126. Add SAA to the parent sbpostgres profile with Pix -> postgres_shell so that the archive_command path (shell staying in sbpostgres due to Pix fallback) can exec SAA and transition it into postgres_shell. Also add /nix/store/*/bin/sh to the shell transition rules to cover nix-built postgres popen() behaviour. * chore: bump postgres release versions for pgbackrest AMI * fix: align AppArmor profile with salt managed version Sync sbpostgres_apparmor with the salt repo (PR #526, merged): - Add WAL dir rwkl entries to parent and both sub-profiles - Consolidate systemd notify to single /{,var/}run/systemd/notify rule - Add /usr/bin/kill and signal hup rule to parent - Add pg_archivecleanup ix to postgres_shell --------- Co-authored-by: Douglas J Hunley <doug.hunley@gmail.com> Co-authored-by: Tom Ashley <tom.ashley@gmail.com>
1 parent 9313c07 commit 5d39cc3

12 files changed

Lines changed: 211 additions & 28 deletions

File tree

ansible/files/adminapi.sudoers.conf

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,21 @@ Cmnd_Alias PGBOUNCER = /bin/systemctl start pgbouncer.service, /bin/systemctl st
1515
%adminapi ALL= NOPASSWD: /etc/adminapi/pg_upgrade_scripts/common.sh
1616
%adminapi ALL= NOPASSWD: /etc/adminapi/pg_upgrade_scripts/pgsodium_getkey.sh
1717
%adminapi ALL= NOPASSWD: /usr/bin/systemctl daemon-reload
18+
# pgBackRest wrapper scripts: constrained helpers called by supabase-admin-agent.
19+
# pgdata-chown runs as root (default); pgdata-signal runs as postgres so it can
20+
# create/remove signal files owned by that user.
21+
%adminapi ALL= NOPASSWD: /usr/local/lib/supabase-admin-agent/pgdata-chown
22+
%adminapi ALL=(postgres) NOPASSWD: /usr/local/lib/supabase-admin-agent/pgdata-signal
23+
# pgBackRest binary entries support two sudo chains used by supabase-admin-agent:
24+
# NewRunner() — adminapi → /usr/bin/pgbackrest wrapper → sudo -u pgbackrest real_binary
25+
# NewRunnerAs() — adminapi → sudo -u pgbackrest /usr/bin/pgbackrest → sudo -u pgbackrest real_binary
26+
# Both paths require the wrapper entry; NewRunner() additionally needs the real binary entry
27+
# because the wrapper itself calls sudo to drop privileges to the pgbackrest user.
28+
%adminapi ALL=(pgbackrest) NOPASSWD: /var/lib/pgbackrest/.nix-profile/bin/pgbackrest
29+
%adminapi ALL=(pgbackrest) NOPASSWD: /usr/bin/pgbackrest
30+
# pgBackRest restore stops PostgreSQL, restores PGDATA, then starts PostgreSQL.
31+
%adminapi ALL= NOPASSWD: /usr/bin/systemctl start postgresql.service
32+
%adminapi ALL= NOPASSWD: /usr/bin/systemctl stop postgresql.service
1833
%adminapi ALL= NOPASSWD: /usr/bin/systemctl reload postgresql.service
1934
%adminapi ALL= NOPASSWD: /usr/bin/systemctl restart postgresql.service
2035
%adminapi ALL= NOPASSWD: /usr/bin/systemctl show -p NRestarts postgresql.service
Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
/var/log/pgbackrest/*.log {
2+
daily
3+
size 50M
4+
rotate 7
5+
compress
6+
delaycompress
7+
missingok
8+
notifempty
9+
create 0660 pgbackrest postgres
10+
}

ansible/files/pgbackrest_config/pgbackrest.conf

Lines changed: 8 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -4,15 +4,17 @@ archive-copy = y
44
backup-standby = prefer
55
compress-type = zst
66
delta = y
7-
expire-auto = n
7+
# expire-auto=y allows pgBackRest to expire old backups automatically after
8+
# each new backup completes, keeping retention in sync without a separate
9+
# expire command. Previously n; changed to work correctly with SAA-managed backups.
10+
expire-auto = y
811
link-all = y
912
log-level-console = info
1013
log-level-file = detail
1114
log-subprocess = y
1215
resume = n
1316
start-fast = y
14-
15-
[supabase]
16-
pg1-path = /var/lib/postgresql/data
17-
pg1-socket-path = /run/postgresql
18-
pg1-user = supabase_admin
17+
# Note: the [supabase] stanza (pg1-path, pg1-socket-path, pg1-user) has been
18+
# removed from this file. supabase-admin-agent owns that stanza and writes it
19+
# to /etc/pgbackrest/conf.d/ at enable time. Having the stanza in both places
20+
# causes pgBackRest error [031] (duplicate option).

ansible/files/postgresql_config/pg_hba.conf.j2

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -79,7 +79,10 @@
7979
# TYPE DATABASE USER ADDRESS METHOD
8080

8181
# trust local connections
82-
local all supabase_admin scram-sha-256
82+
# supabase_admin: trust over Unix socket only — no network exposure. Access
83+
# requires OS-level shell access to the host, which is already a stronger
84+
# security boundary than a database password.
85+
local all supabase_admin trust
8386
local all all peer map=supabase_map
8487
host all all 127.0.0.1/32 trust
8588
host all all ::1/128 trust

ansible/files/postgresql_config/sbpostgres_apparmor

Lines changed: 24 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -24,14 +24,13 @@ profile sbpostgres flags=(attach_disconnected) {
2424
# lock permission for pgroonga
2525
/data/pgdata/pgroonga.log k,
2626

27+
# WAL directory - needs link permission for segment recycling
28+
/data/pgdata/pg_wal/** rwkl,
29+
/data/pgdata/pg_wal/ rw,
30+
2731
# Systemd notification socket (for Type=notify services)
28-
/run/systemd/notify w,
2932
/{,var/}run/systemd/notify w,
3033

31-
# Allow disconnected paths (for mount namespaces)
32-
@{run}/systemd/notify w,
33-
/run/systemd/notify w,
34-
3534
# Full network access
3635
network inet stream,
3736
network inet6 stream,
@@ -50,6 +49,8 @@ profile sbpostgres flags=(attach_disconnected) {
5049
/usr/bin/false ix,
5150
# kill needed for postgresql to reload itself
5251
/bin/kill ix,
52+
/usr/bin/kill ix,
53+
signal (send) set=(hup) peer=unconfined,
5354

5455
# When parent executes shell, transition to restricted profile
5556
# This accounts for popen, which postgres uses for any child process
@@ -71,7 +72,15 @@ profile sbpostgres flags=(attach_disconnected) {
7172
/usr/bin/nix Pix -> pgbackrest_shell,
7273
/usr/bin/sudo Pix -> pgbackrest_shell,
7374

74-
profile postgres_shell {
75+
# nix-built postgres popen() may use a nix store sh instead of /bin/sh
76+
/nix/store/*/bin/sh Pix -> postgres_shell,
77+
78+
# SAA called directly from sbpostgres context (e.g. shell Pix fallback)
79+
/opt/supabase-admin-agent/supabase-admin-agent Pix -> postgres_shell,
80+
/opt/supabase-admin-agent/supabase-admin-agent-linux-arm64 Pix -> postgres_shell,
81+
/opt/supabase-admin-agent/supabase-admin-agent-linux-amd64 Pix -> postgres_shell,
82+
83+
profile postgres_shell {
7584
#include <abstractions/base>
7685

7786
/usr/bin/* m,
@@ -93,17 +102,23 @@ profile sbpostgres flags=(attach_disconnected) {
93102
# backup things
94103
/usr/lib/postgresql/bin/pgsodium_getkey.sh ix,
95104
/usr/bin/admin-mgr ix,
105+
/opt/supabase-admin-agent/supabase-admin-agent ix,
106+
/opt/supabase-admin-agent/supabase-admin-agent-linux-arm64 ix,
107+
/opt/supabase-admin-agent/supabase-admin-agent-linux-amd64 ix,
96108
/nix/store/*/bin/.postgres-wrapped ix,
97109
/nix/store/*/bin/wal-g-2 ix,
98110
/nix/store/*/bin/pgbackrest ix,
99111
/nix/store/*/bin/pg_dump ix,
112+
/nix/store/*/bin/pg_archivecleanup ix,
100113

101114
# file path permissions
102115
/** r,
103116
/data/wal_fetch_dir/ rw,
104117
/tmp/wal_fetch_dir/ rw,
105118
/var/lib/postgresql/data rw,
106119
/data/pgdata rw,
120+
/data/pgdata/pg_wal/** rwkl,
121+
/data/pgdata/pg_wal/ rw,
107122
/data/latest-lsn-checkpoint-v2 rw,
108123
/data/previous-lsn-checkpoint-v2 rw,
109124
/var/lib/postgresql/data/recovery.signal rw,
@@ -136,11 +151,9 @@ profile sbpostgres flags=(attach_disconnected) {
136151
# tools have network access
137152
network,
138153

139-
# Block everything else
140-
deny /** x,
141154
}
142155

143-
profile pgbackrest_shell {
156+
profile pgbackrest_shell {
144157
#include <abstractions/base>
145158

146159
/usr/bin/* m,
@@ -167,6 +180,8 @@ profile sbpostgres flags=(attach_disconnected) {
167180
/tmp/wal_fetch_dir/ rw,
168181
/var/lib/postgresql/data rw,
169182
/data/pgdata rw,
183+
/data/pgdata/pg_wal/** rwkl,
184+
/data/pgdata/pg_wal/ rw,
170185
/data/latest-lsn-checkpoint-v2 rw,
171186
/data/previous-lsn-checkpoint-v2 rw,
172187
/var/lib/postgresql/data/recovery.signal rw,
@@ -196,7 +211,5 @@ profile sbpostgres flags=(attach_disconnected) {
196211
# tools have network access
197212
network,
198213

199-
# Block everything else
200-
deny /** x,
201214
}
202215
}
Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
#!/usr/bin/env bash
2+
# pgdata-chown — transfers PGDATA ownership for pgBackRest restore operations.
3+
#
4+
# Called via sudo by supabase-admin-agent (running as adminapi). Only two
5+
# actions are accepted, and the target path must resolve to /data/pgdata or a
6+
# path beneath it. realpath(1) is used to expand symlinks before the check,
7+
# which prevents directory-traversal attacks (e.g. /data/pgdata/../../etc/sudoers).
8+
#
9+
# Usage: pgdata-chown <to-pgbackrest|to-postgres> <path>
10+
set -euo pipefail
11+
12+
if [[ $# -ne 2 ]]; then
13+
echo "usage: pgdata-chown <to-pgbackrest|to-postgres> <path>" >&2
14+
exit 1
15+
fi
16+
17+
ACTION="$1"
18+
TARGET="$2"
19+
20+
REAL=$(realpath "$TARGET")
21+
if [[ "$REAL" != "/data/pgdata" && "$REAL" != /data/pgdata/* ]]; then
22+
echo "error: '${TARGET}' resolves to '${REAL}', which is not under /data/pgdata" >&2
23+
exit 1
24+
fi
25+
26+
case "$ACTION" in
27+
to-pgbackrest|to-postgres)
28+
exec /usr/bin/chown -R "${ACTION:3}:postgres" "$REAL"
29+
;;
30+
*)
31+
echo "error: unknown action '${ACTION}'; expected to-pgbackrest or to-postgres" >&2
32+
exit 1
33+
;;
34+
esac
Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
#!/usr/bin/env bash
2+
# pgdata-signal — creates or removes PostgreSQL signal files and stale pid files.
3+
# Called via sudo (as postgres) by supabase-admin-agent (running as adminapi).
4+
#
5+
# All file paths are hardcoded to prevent path injection. No external
6+
# path argument is accepted.
7+
#
8+
# Usage: pgdata-signal <create|remove> <recovery|standby>
9+
# pgdata-signal remove-pid
10+
set -euo pipefail
11+
12+
# Special-case: remove-pid removes the stale postmaster.pid file that would
13+
# prevent PostgreSQL from starting after a restore. Handled as a single-arg
14+
# command to keep the sudoers entry scoped to this script rather than allowing
15+
# a broad "rm" entry.
16+
if [[ $# -eq 1 && "$1" == "remove-pid" ]]; then
17+
exec /usr/bin/rm -f "/data/pgdata/postmaster.pid"
18+
fi
19+
20+
if [[ $# -ne 2 ]]; then
21+
echo "usage: pgdata-signal <create|remove> <recovery|standby>" >&2
22+
echo " pgdata-signal remove-pid" >&2
23+
exit 1
24+
fi
25+
26+
ACTION="$1"
27+
SIGNAL_TYPE="$2"
28+
29+
case "$SIGNAL_TYPE" in
30+
recovery|standby) FILE="/data/pgdata/${SIGNAL_TYPE}.signal" ;;
31+
*)
32+
echo "error: unknown signal type '${SIGNAL_TYPE}'; expected recovery or standby" >&2
33+
exit 1
34+
;;
35+
esac
36+
37+
case "$ACTION" in
38+
create) exec /usr/bin/touch "${FILE}" ;;
39+
remove) exec /usr/bin/rm -f "${FILE}" ;;
40+
*)
41+
echo "error: unknown action '${ACTION}'; expected create or remove" >&2
42+
exit 1
43+
;;
44+
esac

ansible/tasks/finalize-ami.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -61,6 +61,7 @@
6161
- { file: 'logrotate-postgres-auth.conf' }
6262
- { file: 'logrotate-postgres-csv.conf' }
6363
- { file: 'logrotate-walg.conf' }
64+
- { file: 'logrotate-pgbackrest.conf' }
6465
loop_control:
6566
loop_var: 'logrotate_item'
6667

ansible/tasks/internal/supabase-admin-agent.yml

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,33 @@
3131
dest: /etc/sudoers.d/supabase-admin-agent
3232
mode: "0440"
3333

34+
- name: supabase-admin-agent - pgbackrest helper scripts dir
35+
file:
36+
path: /usr/local/lib/supabase-admin-agent
37+
state: directory
38+
owner: root
39+
group: root
40+
mode: "0755"
41+
42+
- name: supabase-admin-agent - pgdata-chown script
43+
copy:
44+
src: files/supabase_admin_agent_config/pgdata-chown
45+
dest: /usr/local/lib/supabase-admin-agent/pgdata-chown
46+
owner: root
47+
group: root
48+
# 0700: always invoked via "sudo" (as root); no other user needs execute.
49+
mode: "0700"
50+
51+
- name: supabase-admin-agent - pgdata-signal script
52+
copy:
53+
src: files/supabase_admin_agent_config/pgdata-signal
54+
dest: /usr/local/lib/supabase-admin-agent/pgdata-signal
55+
owner: root
56+
group: root
57+
# 0755: invoked via "sudo -u postgres"; the postgres OS user needs the
58+
# world-execute bit since it is neither owner nor group member.
59+
mode: "0755"
60+
3461
- name: Setting arch (amd64)
3562
set_fact:
3663
arch: "amd64"

ansible/tasks/setup-pgbackrest.yml

Lines changed: 35 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,11 @@
3030
- 'postgres ALL=(pgbackrest) NOPASSWD: /usr/bin/bash'
3131
- 'postgres ALL=(pgbackrest) NOPASSWD: /usr/bin/nix'
3232
- 'pgbackrest ALL=(pgbackrest) NOPASSWD: /usr/bin/bash'
33+
# Required for the NewRunnerAs() path: supabase-admin-agent runs the
34+
# /usr/bin/pgbackrest wrapper as the pgbackrest user, but the wrapper
35+
# internally calls "sudo -u pgbackrest <real_binary>". Without this entry
36+
# that inner sudo fails even though the outer caller is already pgbackrest.
37+
- 'pgbackrest ALL=(pgbackrest) NOPASSWD: /var/lib/pgbackrest/.nix-profile/bin/pgbackrest'
3338

3439
- name: Install pgBackRest
3540
ansible.builtin.shell: |
@@ -43,19 +48,42 @@
4348
- name: Create needed directories for pgBackRest
4449
ansible.legacy.file:
4550
group: postgres
46-
mode: '0770'
51+
mode: "{{ backrest_dir['mode'] | default('0770', true) }}"
4752
owner: pgbackrest
48-
path: "{{ backrest_dir }}"
53+
path: "{{ backrest_dir['dir'] }}"
4954
state: directory
5055
loop:
51-
- /etc/pgbackrest/conf.d
52-
- /var/lib/pgbackrest
53-
- /var/spool/pgbackrest
54-
- /var/log/pgbackrest
56+
# conf.d is setgid (02770) so files written by adminapi (postgres group)
57+
# automatically inherit the postgres group. Without setgid, pgbackrest
58+
# (running as the pgbackrest user) cannot read conf files created by adminapi.
59+
- {dir: /etc/pgbackrest/conf.d, mode: '02770'}
60+
- {dir: /var/lib/pgbackrest}
61+
- {dir: /var/spool/pgbackrest}
62+
- {dir: /var/log/pgbackrest}
5563
loop_control:
5664
loop_var: backrest_dir
5765
when:
58-
- nixpkg_mode
66+
- nixpkg_mode or stage2_nix
67+
68+
# supabase-admin-agent opens these log files with O_APPEND|O_WRONLY (no
69+
# O_CREATE). They must exist before SAA runs; a missing file causes the
70+
# pgBackRest enable command to fail before any backup work is attempted.
71+
# access_time/modification_time: preserve keeps this task idempotent.
72+
- name: Pre-create pgBackRest SAA log files
73+
ansible.builtin.file:
74+
access_time: preserve
75+
group: postgres
76+
mode: '0660'
77+
modification_time: preserve
78+
owner: pgbackrest
79+
path: "{{ item }}"
80+
state: touch
81+
loop:
82+
- /var/log/pgbackrest/saa-pgb.log
83+
- /var/log/pgbackrest/wal-push.log
84+
- /var/log/pgbackrest/wal-fetch.log
85+
when:
86+
- nixpkg_mode or stage2_nix
5987

6088
- name: Symlink pgbackrest.conf
6189
ansible.legacy.file:

0 commit comments

Comments
 (0)