Commit 880b0ca

Merge branch 'restore-in-place' into restore-in-place-with-pg-18
2 parents 421bd6d + 9347959 commit 880b0ca

12 files changed

+1797
-28
lines changed

docs/reference/cluster_manifest.md

Lines changed: 4 additions & 0 deletions
@@ -48,6 +48,10 @@ Those parameters are grouped under the `metadata` top-level key.
4848
Labels that are set here but not listed as `inherited_labels` in the operator
4949
parameters are ignored.
5050

51+
* **annotations**
52+
A map of annotations to add to the `postgresql` resource. The operator reacts to certain annotations, for instance, to trigger specific actions.
53+
* `postgres-operator.zalando.org/action: restore-in-place`: When this annotation is set to this value, the operator triggers an automated in-place restore of the cluster. The manifest must also define a valid `clone` section with a target `timestamp`. See the [user guide](../user.md#automated-restore-in-place-point-in-time-recovery) for more details.
54+
5155
## Top-level parameters
5256

5357
These parameters are grouped directly under the `spec` key in the manifest.

docs/reference/operator_parameters.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ configuration.
99
Variable names are underscore-separated words.
1010

1111
### ConfigMaps-based
12+
1213
The configuration is supplied in a
1314
key-value configmap, defined by the `CONFIG_MAP_NAME` environment variable.
1415
Non-scalar values, i.e. lists or maps, are encoded in the value strings using
@@ -25,6 +26,7 @@ operator CRD, all the CRD defaults are provided in the
2526
[operator's default configuration manifest](https://github.com/zalando/postgres-operator/blob/master/manifests/postgresql-operator-default-configuration.yaml)
2627

2728
### CRD-based configuration
29+
2830
The configuration is stored in a custom YAML
2931
manifest. The manifest is an instance of the custom resource definition (CRD)
3032
called `OperatorConfiguration`. The operator registers this CRD during the
@@ -187,6 +189,9 @@ Those are top-level keys, containing both leaf keys and groups.
187189
* **repair_period**
188190
period between consecutive repair requests. The default is `5m`.
189191

192+
* **pitr_backup_retention**
193+
retention time for PITR (point-in-time recovery) state ConfigMaps. The operator cleans up ConfigMaps older than the configured retention. The value is a [duration string](https://pkg.go.dev/time#ParseDuration), e.g. `168h` (7 days) or `24h`. Go duration strings have no day unit, so express days in hours. The default is `168h`.
194+
190195
* **set_memory_request_to_limit**
191196
Set `memory_request` to `memory_limit` for all Postgres clusters (the default
192197
value is also increased but configured `max_memory_request` can not be
@@ -934,6 +939,7 @@ key.
934939
```yaml
935940
teams_api_role_configuration: "log_statement:all,search_path:'data,public'"
936941
```
942+
937943
The default is `"log_statement:all"`
938944
939945
* **enable_team_superuser**

docs/user.md

Lines changed: 40 additions & 0 deletions
@@ -891,6 +891,45 @@ original UID, making it possible retry restoring. However, it is probably
891891
better to create a temporary clone for experimenting or finding out to which
892892
point you should restore.
893893

894+
## Automated Restore in place (Point-in-Time Recovery)
895+
896+
The operator supports automated in-place restores, allowing you to restore a database to a specific point in time without changing connection strings on the application side. This feature orchestrates the deletion of the current cluster and the creation of a new one from a backup.
897+
898+
:warning: This is a destructive operation. The existing cluster's StatefulSet and pods will be deleted as part of the process. Ensure you have a reliable backup strategy and have tested the restore process in a non-production environment.
899+
900+
To trigger an in-place restore, you need to add a special annotation and a `clone` section to your `postgresql` manifest:
901+
902+
* **Annotate the manifest**: Add the `postgres-operator.zalando.org/action: restore-in-place` annotation to the `metadata` section.
903+
* **Specify the recovery target**: Add a `clone` section to the `spec`, providing the `cluster` name and the `timestamp` for the point-in-time recovery. The `cluster` name **must** be the same as the `metadata.name` of the cluster you are restoring. The `timestamp` must be in RFC 3339 format and point to a time in the past for which you have WAL archives.
904+
905+
Here is an example manifest snippet:
906+
907+
```yaml
908+
apiVersion: "acid.zalan.do/v1"
909+
kind: postgresql
910+
metadata:
911+
  name: acid-minimal-cluster
912+
  annotations:
913+
    postgres-operator.zalando.org/action: restore-in-place
914+
spec:
915+
  # ... other cluster parameters
916+
  clone:
917+
    cluster: "acid-minimal-cluster"  # Must match metadata.name
918+
    uid: "<original_UID>"
919+
    timestamp: "2022-04-01T10:11:12+00:00"
920+
  # ... other cluster parameters
921+
```
922+
923+
When you apply this manifest, the operator will:
924+
* See the `restore-in-place` annotation and begin the restore workflow.
925+
* Store the restore request and the new cluster definition in a temporary `ConfigMap`.
926+
* Delete the existing `postgresql` custom resource, which triggers the deletion of the associated StatefulSet and pods.
927+
* Wait for the old cluster to be fully terminated.
928+
* Create a new `postgresql` resource with a new UID but the same name.
929+
* Bootstrap the new cluster from the latest base backup taken before the given `timestamp`, then replay WAL files to recover to the specified point in time.
930+
931+
The process is asynchronous. You can monitor the operator logs and the state of the `postgresql` resource to follow the progress. Once the new cluster is up and running, your applications can reconnect.
932+
894933
## Setting up a standby cluster
895934

896935
Standby cluster is a [Patroni feature](https://github.com/zalando/patroni/blob/master/docs/replica_bootstrap.rst#standby-cluster)
@@ -1302,3 +1341,4 @@ As of now, the operator does not sync the pooler deployment automatically
13021341
which means that changes in the pod template are not caught. You need to
13031342
toggle `enableConnectionPooler` to set environment variables, volumes, secret
13041343
mounts and securityContext required for TLS support in the pooler pod.
1344+

manifests/operatorconfiguration.crd.yaml

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -120,6 +120,9 @@ spec:
120120
repair_period:
121121
type: string
122122
default: "5m"
123+
pitr_backup_retention:
124+
type: string
125+
default: "168h"
123126
set_memory_request_to_limit:
124127
type: boolean
125128
default: false

manifests/postgresql-operator-default-configuration.yaml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,7 @@ configuration:
2323
min_instances: -1
2424
resync_period: 30m
2525
repair_period: 5m
26+
pitr_backup_retention: 168h
2627
# set_memory_request_to_limit: false
2728
# sidecars:
2829
# - image: image:123

pkg/apis/acid.zalan.do/v1/operator_configuration_type.go

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,10 +3,10 @@ package v1
33
// Operator configuration CRD definition, please use snake_case for field names.
44

55
import (
6-
"github.com/zalando/postgres-operator/pkg/util/config"
7-
86
"time"
97

8+
"github.com/zalando/postgres-operator/pkg/util/config"
9+
1010
"github.com/zalando/postgres-operator/pkg/spec"
1111
v1 "k8s.io/api/core/v1"
1212
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -267,6 +267,7 @@ type OperatorConfigurationData struct {
267267
ResyncPeriod Duration `json:"resync_period,omitempty"`
268268
RepairPeriod Duration `json:"repair_period,omitempty"`
269269
MaintenanceWindows []MaintenanceWindow `json:"maintenance_windows,omitempty"`
270+
PitrBackupRetention Duration `json:"pitr_backup_retention,omitempty"`
270271
SetMemoryRequestToLimit bool `json:"set_memory_request_to_limit,omitempty"`
271272
ShmVolume *bool `json:"enable_shm_volume,omitempty"`
272273
SidecarImages map[string]string `json:"sidecar_docker_images,omitempty"` // deprecated in favour of SidecarContainers
