Commit 2546ab6

docs(chart): medium-priority accuracy and maintainability fixes (#182)
1 parent b1d64c2 commit 2546ab6

5 files changed

Lines changed: 64 additions & 56 deletions

charts/cluster/docs/Getting Started.md

Lines changed: 8 additions & 6 deletions
@@ -24,7 +24,7 @@ helm upgrade --install cnpg \
 ## Creating a cluster configuration
 
 Once you have the operator installed, the next step is to prepare the cluster configuration. Whether this will be managed
-via a GitOps solution or directly via Helm is up to you. The following sections outlines the important steps in both cases.
+via a GitOps solution or directly via Helm is up to you. The following sections outline the important steps in both cases.
 
 ### Choosing the database type
 
@@ -88,15 +88,17 @@ There are several important cluster options. Here are the most important ones:
 `cluster.affinity.topologyKey` - The chart sets it to `topology.kubernetes.io/zone` by default which is useful if you are
 running a production cluster in a multi AZ cluster (highly recommended). If you are running a single AZ cluster, you may
 want to change that to `kubernetes.io/hostname` to ensure that cluster instances are not provisioned on the same node.
-`cluster.postgresql` - Allows you to override PostgreSQL configuration parameters example:
+`cluster.postgresql.parameters` - Allows you to override PostgreSQL configuration parameters, for example:
 ```yaml
 cluster:
   postgresql:
-    max_connections: "200"
-    shared_buffers: "2GB"
+    parameters:
+      max_connections: "200"
+      shared_buffers: "2GB"
 ```
-`cluster.initSQL` - Allows you to run custom SQL queries during the cluster initialization. This is useful for creating
-extensions, schemas and databases. Note that these are as a superuser.
+`cluster.initdb.postInitSQL` - Allows you to run custom SQL queries during cluster initialization. This is useful for creating
+extensions, schemas, and databases. Use `cluster.initdb.postInitApplicationSQL` and `cluster.initdb.postInitTemplateSQL` when
+you need application-database or template-database specific initialization.
 
 For a full list - refer to the Helm chart [configuration options](../README.md#Configuration-options).
 
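A minimal sketch of how the corrected keys in this hunk could be combined into one values override file. The temporary file, the `pgcrypto` statement, and the list form of `postInitSQL` are illustrative assumptions, not taken from the chart itself; check the chart's README before relying on the exact shape.

```shell
# Write a hypothetical override file to a temporary path.
values_file=$(mktemp)

# The keys follow the corrected documentation above. The SQL statement is only
# an example, and treating postInitSQL as a list of statements is an assumption.
cat > "$values_file" <<'EOF'
cluster:
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "2GB"
  initdb:
    postInitSQL:
      - CREATE EXTENSION IF NOT EXISTS pgcrypto;
EOF

cat "$values_file"
```

A file like this would then be passed to Helm with `-f` alongside the usual install or upgrade command.
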
charts/cluster/docs/runbooks/CNPGClusterLogicalReplicationErrors.md

Lines changed: 27 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -25,26 +25,30 @@ The `CNPGClusterLogicalReplicationErrors` alert indicates that a logical replica
2525
# Connect to the subscriber and check subscription status
2626
kubectl exec -it svc/SUBSCRIBER-CLUSTER-rw -n NAMESPACE -- psql -c "
2727
SELECT
28-
subname,
29-
subenabled,
30-
apply_error_count,
31-
sync_error_count,
32-
stats_reset
33-
FROM pg_stat_subscription
34-
WHERE apply_error_count > 0 OR sync_error_count > 0;
28+
s.subname,
29+
s.subenabled,
30+
COALESCE(sss.apply_error_count, 0) AS apply_error_count,
31+
COALESCE(sss.sync_error_count, 0) AS sync_error_count,
32+
sss.stats_reset
33+
FROM pg_subscription s
34+
LEFT JOIN pg_stat_subscription_stats sss ON s.oid = sss.subid
35+
WHERE COALESCE(sss.apply_error_count, 0) > 0 OR COALESCE(sss.sync_error_count, 0) > 0;
3536
"
3637

3738
# Check the last error message
3839
kubectl exec -it svc/SUBSCRIBER-CLUSTER-rw -n NAMESPACE -- psql -c "
3940
SELECT
40-
subname,
41-
last_msg_receipt_time,
42-
latest_end_time,
41+
s.subname,
42+
ss.last_msg_receipt_time,
43+
ss.latest_end_time,
4344
CASE
44-
WHEN apply_error_count > 0 THEN 'Apply errors detected'
45-
WHEN sync_error_count > 0 THEN 'Sync errors detected'
45+
WHEN COALESCE(sss.apply_error_count, 0) > 0 THEN 'Apply errors detected'
46+
WHEN COALESCE(sss.sync_error_count, 0) > 0 THEN 'Sync errors detected'
47+
ELSE 'No errors detected'
4648
END as error_type
47-
FROM pg_stat_subscription;
49+
FROM pg_subscription s
50+
LEFT JOIN pg_stat_subscription ss ON s.oid = ss.subid
51+
LEFT JOIN pg_stat_subscription_stats sss ON s.oid = sss.subid;
4852
"
4953
```
5054

@@ -96,21 +100,21 @@ FROM pg_publication;
 kubectl exec -it svc/SUBSCRIBER-CLUSTER-rw -n NAMESPACE -- psql -c "
 SELECT
 subname,
-srconninfo,
-srschema,
-srslotname,
-srsynccommit
+subconninfo,
+subslotname,
+subsynccommit,
+subpublications
 FROM pg_subscription;
 "
 
 # Check which tables are being replicated
 kubectl exec -it svc/SUBSCRIBER-CLUSTER-rw -n NAMESPACE -- psql -c "
 SELECT
-relid::regclass as table_name,
-srsubstate as state
-FROM pg_subscription_rel
-JOIN pg_class ON relid = oid
-WHERE srsubstate NOT IN ('r', 's'); -- Not ready or synchronizing
+sr.srrelid::regclass as table_name,
+sr.srsubstate as state
+FROM pg_subscription_rel sr
+JOIN pg_class c ON sr.srrelid = c.oid
+WHERE sr.srsubstate NOT IN ('r', 's'); -- Not ready or synchronizing
 "
 ```
 
@@ -378,4 +382,4 @@ ALTER TABLE table_name ENABLE TRIGGER trigger_name;
 - You encounter frequent constraint violations
 - The schema cannot be synchronized
 - You need to skip transactions repeatedly
-- Error rate is increasing despite fixes
+- Error rate is increasing despite fixes

charts/cluster/docs/runbooks/CNPGClusterLogicalReplicationLagging.md

Lines changed: 12 additions & 10 deletions
@@ -28,17 +28,19 @@ Connect to the subscriber and check the current state:
 ```bash
 kubectl exec -it svc/SUBSCRIBER-CLUSTER-rw -n NAMESPACE -- psql -c "
 SELECT
-subname,
-enabled,
-EXTRACT(EPOCH FROM (NOW() - last_msg_receipt_time)) as receipt_lag_seconds,
-EXTRACT(EPOCH FROM (NOW() - latest_end_time)) as apply_lag_seconds,
-pg_wal_lsn_diff(received_lsn, latest_end_lsn) as pending_bytes,
+s.subname,
+s.subenabled AS enabled,
+EXTRACT(EPOCH FROM (NOW() - ss.last_msg_receipt_time)) AS receipt_lag_seconds,
+EXTRACT(EPOCH FROM (NOW() - ss.latest_end_time)) AS apply_lag_seconds,
+COALESCE(pg_wal_lsn_diff(ss.received_lsn, ss.latest_end_lsn), 0) AS pending_bytes,
 CASE
-WHEN EXTRACT(EPOCH FROM (NOW() - last_msg_receipt_time)) > 60 THEN 'High receipt lag'
-WHEN EXTRACT(EPOCH FROM (NOW() - latest_end_time)) > 60 THEN 'High apply lag'
-WHEN pg_wal_lsn_diff(received_lsn, latest_end_lsn) > 1024^3 THEN 'High LSN distance'
+WHEN EXTRACT(EPOCH FROM (NOW() - ss.last_msg_receipt_time)) > 60 THEN 'High receipt lag'
+WHEN EXTRACT(EPOCH FROM (NOW() - ss.latest_end_time)) > 60 THEN 'High apply lag'
+WHEN COALESCE(pg_wal_lsn_diff(ss.received_lsn, ss.latest_end_lsn), 0) > 1024^3 THEN 'High LSN distance'
+ELSE 'Healthy'
 END as primary_issue
-FROM pg_stat_subscription;
+FROM pg_subscription s
+LEFT JOIN pg_stat_subscription ss ON s.oid = ss.subid;
 "
 ```
 
@@ -230,4 +232,4 @@ kubectl exec -it svc/SUBSCRIBER-CLUSTER-rw -n NAMESPACE -- psql -c "\dRs+"
 - Lag continues to increase despite optimization
 - Network issues persist between clusters
 - Resource utilization is at maximum but lag continues
-- You experience frequent replication failures
+- You experience frequent replication failures

charts/cluster/docs/runbooks/CNPGClusterLogicalReplicationStopped.md

Lines changed: 15 additions & 15 deletions
@@ -25,18 +25,18 @@ The `CNPGClusterLogicalReplicationStopped` alert indicates that a logical replic
 # Check all subscriptions and their status
 kubectl exec -it svc/SUBSCRIBER-CLUSTER-rw -n NAMESPACE -- psql -c "
 SELECT
-pg_subscription.subname,
-pg_subscription.enabled,
+s.subname,
+s.subenabled AS enabled,
 CASE
-WHEN pg_subscription.enabled = false THEN 'Explicitly disabled'
-WHEN pid IS NULL AND buffered_lag_bytes > 0 THEN 'Stuck (no worker)'
-WHEN pid IS NOT NULL THEN 'Active'
+WHEN NOT s.subenabled THEN 'Explicitly disabled'
+WHEN ss.pid IS NULL AND COALESCE(pg_wal_lsn_diff(ss.received_lsn, ss.latest_end_lsn), 0) > 0 THEN 'Stuck (no worker)'
+WHEN ss.pid IS NOT NULL THEN 'Active'
 ELSE 'Unknown'
 END as status,
-pg_wal_lsn_diff(received_lsn, latest_end_lsn) as pending_bytes,
-pid IS NOT NULL as has_worker
-FROM pg_subscription
-LEFT JOIN pg_stat_subscription ON pg_subscription.oid = pg_stat_subscription.subid;
+COALESCE(pg_wal_lsn_diff(ss.received_lsn, ss.latest_end_lsn), 0) AS pending_bytes,
+ss.pid IS NOT NULL AS has_worker
+FROM pg_subscription s
+LEFT JOIN pg_stat_subscription ss ON s.oid = ss.subid;
 "
 ```
 
@@ -63,10 +63,10 @@ WHERE application_name LIKE '%subscription%' OR backend_type = 'logical replicat
 kubectl exec -it svc/SUBSCRIBER-CLUSTER-rw -n NAMESPACE -- psql -c "
 SELECT
 subname,
-srconninfo,
-srsynccommit,
-srslotname,
-srsyncstate as sync_state
+subconninfo,
+subsynccommit,
+subslotname,
+subpublications
 FROM pg_subscription;
 "
 ```
@@ -86,7 +86,7 @@ kubectl logs -n NAMESPACE $POD --tail=200 | grep -i "subscription\|replication\|
 ```bash
 # Extract connection info from subscription
 kubectl exec -it svc/SUBSCRIBER-CLUSTER-rw -n NAMESPACE -- psql -c "
-SELECT srconninfo FROM pg_subscription WHERE subname = 'your_subscription_name';
+SELECT subconninfo FROM pg_subscription WHERE subname = 'your_subscription_name';
 " | grep -o "host=[^ ]*" | cut -d= -f2
 
 # Test connection
@@ -333,4 +333,4 @@ kubectl exec -it svc/CLUSTER-rw -n NS -- psql -c "SELECT * FROM pg_stat_activity
 - Workers fail to start despite adequate resources
 - WAL retention issues prevent catch-up
 - Frequent disconnections occur
-- Data cannot be resynchronized successfully
+- Data cannot be resynchronized successfully

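The host-extraction pipeline in the `@@ -86,7 +86,7 @@` hunk of this runbook can be tried without a cluster. The sketch below feeds it a made-up `subconninfo` value; the host, user, and database names are assumptions for illustration only.

```shell
# Sample conninfo string standing in for the subconninfo column (values are made up).
conninfo="host=publisher-rw.example.svc port=5432 user=replicator dbname=app"

# Same filter as the runbook: keep the host=... token, then drop the "host=" key.
host=$(printf '%s\n' "$conninfo" | grep -o "host=[^ ]*" | cut -d= -f2)

echo "$host"
```

Because `grep -o` prints only the matching token, the column header and row count that `psql -c` emits around the value do not affect the result.
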
charts/cluster/templates/console-statefulset.yaml

Lines changed: 2 additions & 2 deletions
@@ -55,10 +55,10 @@ spec:
 apt install -y screen curl wget jq unzip gzip nano vim util-linux less htop
 cat <<EOF > /root/.bashrc
 echo -e "\nHere are some examples for connecting and running queries on the cluster:"
-echo ' nohup psql \$DB_SUPERUSER_URI"/DB_NAME" -c "SELECT 1;" 2>&1 > command.log &'
+echo ' nohup psql "$DB_SUPERUSER_URI/<db-name>" -c "SELECT 1;" > command.log 2>&1 &'
 echo -e "\nTo check up on the command, use:"
 echo " tail -f command.log"
-echo -e "\nYou can also use 'screen' for an interactive session. See https://github.com/paradedb/charts/blob/dev/charts/paradedb/docs/long-running-tasks.md for examples."
+echo -e "\nYou can also use 'screen' for an interactive session. See https://github.com/cloudnative-pg/charts/blob/main/charts/cluster/docs/Console.md for examples."
 echo -e "\n"
 EOF
 sleep infinity

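The reordered redirection in the `nohup psql` example changes where stderr ends up, because the shell applies redirections left to right. The sketch below illustrates this with a hypothetical `emit` function standing in for `psql`; the function and log file names are assumptions made for the demonstration.

```shell
# Hypothetical stand-in for a command that writes to both streams.
emit() {
  echo "to-stdout"
  echo "to-stderr" >&2
}

logdir=$(mktemp -d)

# Old order: 2>&1 copies stderr onto the current stdout first; only afterwards
# is stdout pointed at the file, so stderr never reaches the log.
old_leak=$(emit 2>&1 > "$logdir/old.log")

# New order: stdout is pointed at the file first, then stderr follows it there.
new_leak=$(emit > "$logdir/new.log" 2>&1)

echo "old.log lines: $(grep -c '' "$logdir/old.log"), escaped: '$old_leak'"
echo "new.log lines: $(grep -c '' "$logdir/new.log"), escaped: '$new_leak'"
```

With the old order, error output from a backgrounded command would go to the terminal rather than to `command.log`, which is the file the console hint tells users to `tail`.
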