From 1966c50115da96a4706c36a81eb94badd411ef0d Mon Sep 17 00:00:00 2001
From: Felix Kunde
Date: Tue, 24 Aug 2021 18:34:48 +0200
Subject: [PATCH 1/4] enhance docs on clone and restore

---
 docs/administrator.md | 51 +++++++++++++++++++++++++++++--------------
 docs/user.md          | 11 +++++-----
 2 files changed, 41 insertions(+), 21 deletions(-)

diff --git a/docs/administrator.md b/docs/administrator.md
index ad424cab8..2944a701f 100644
--- a/docs/administrator.md
+++ b/docs/administrator.md
@@ -157,20 +157,26 @@ from numerous escape characters in the latter log entry, view it in CLI with
 `PodTemplate` used by the operator is yet to be updated with the default
 values used internally in K8s.
 
-The operator also support lazy updates of the Spilo image. That means the pod
-template of a PG cluster's stateful set is updated immediately with the new
-image, but no rolling update follows. This feature saves you a switchover - and
-hence downtime - when you know pods are re-started later anyway, for instance
-due to the node rotation. To force a rolling update, disable this mode by
-setting the `enable_lazy_spilo_upgrade` to `false` in the operator configuration
-and restart the operator pod. With the standard eager rolling updates the
-operator checks during Sync all pods run images specified in their respective
-statefulsets. The operator triggers a rolling upgrade for PG clusters that
-violate this condition.
-
-Changes in $SPILO\_CONFIGURATION under path bootstrap.dcs are ignored when
-StatefulSets are being compared, if there are changes under this path, they are
-applied through rest api interface and following restart of patroni instance
+The StatefulSet is replaced if the following properties change:
+- annotations
+- volumeClaimTemplates
+- template volumes
+
+The StatefulSet is replaced and a rolling update is triggered if the following
+properties differ between the old and new state:
+- container name, ports, image, resources, env, envFrom, securityContext and volumeMounts
+- template labels, annotations, service account, securityContext, affinity, priority class and termination grace period
+
+Note that changes in the `SPILO_CONFIGURATION` env variable under `bootstrap.dcs`
+path are ignored for the diff. They will be applied through Patroni's REST API
+interface, following a restart of all instances.
+
+The operator also supports lazy updates of the Spilo image. In this case the
+StatefulSet is only updated, but no rolling update follows. This feature saves
+you a switchover - and hence downtime - when you know pods are re-started later
+anyway, for instance due to node rotation. To force a rolling update, disable
+this mode by setting `enable_lazy_spilo_upgrade` to `false` in the operator
+configuration and restart the operator pod.
 
 ## Delete protection via annotations
@@ -734,8 +740,15 @@ WALE_S3_ENDPOINT='https+path://s3.eu-central-1.amazonaws.com:443'
 WALE_S3_PREFIX=$WAL_S3_BUCKET/spilo/{WAL_BUCKET_SCOPE_PREFIX}{SCOPE}{WAL_BUCKET_SCOPE_SUFFIX}/wal/{PGVERSION}
 ```
 
-If the prefix is not specified Spilo will generate it from `WAL_S3_BUCKET`.
-When the `AWS_REGION` is set `AWS_ENDPOINT` and `WALE_S3_ENDPOINT` are
+The operator sets the prefix to an empty string so that Spilo will generate it
+from the configured `WAL_S3_BUCKET`.
+
+:warning: When you overwrite the configuration by defining `WAL_S3_BUCKET` in
+the [pod_environment_configmap](#custom-pod-environment-variables) you have
+to set `WAL_BUCKET_SCOPE_PREFIX = ""`, too. Otherwise, Spilo will not find
+the physical backups on restore (next chapter).
+
+When the `AWS_REGION` is set, `AWS_ENDPOINT` and `WALE_S3_ENDPOINT` are
 generated automatically. `WALG_S3_PREFIX` is identical to `WALE_S3_PREFIX`.
 `SCOPE` is the Postgres cluster name.
@@ -817,6 +830,12 @@ on one of the other running instances (preferably replicas if they do not lag
 behind). You can test restoring backups by [cloning](user.md#how-to-clone-an-existing-postgresql-cluster)
 clusters.
 
+If you need to provide a [custom clone environment](#custom-pod-environment-variables)
+copy existing variables about your setup (backup location, prefix, access
+keys etc.) and prepend the `CLONE_` prefix to get them copied to the correct
+directory within Spilo.
+
 ## Logical backups
 
 The operator can manage K8s cron jobs to run logical backups (SQL dumps) of
diff --git a/docs/user.md b/docs/user.md
index be7b41cfe..ef3277436 100644
--- a/docs/user.md
+++ b/docs/user.md
@@ -733,20 +733,21 @@ spec:
     uid: "efd12e58-5786-11e8-b5a7-06148230260c"
     cluster: "acid-batman"
     timestamp: "2017-12-19T12:40:33+01:00"
+    s3_wal_path: "s3://<bucketname>/spilo/<cluster-name>/<uid>/wal/<pgversion>"
 ```
 
 Here `cluster` is a name of a source cluster that is going to be cloned. A new
 cluster will be cloned from S3, using the latest backup before the `timestamp`.
 Note, that a time zone is required for `timestamp` in the format of +00:00 which
-is UTC. The `uid` field is also mandatory. The operator will use it to find a
-correct key inside an S3 bucket. You can find this field in the metadata of the
-source cluster:
+is UTC. You can specify the `s3_wal_path` of the source cluster or let the
+operator try to find it based on the configured `wal_[s3|gs]_bucket` and the
+specified `uid`.
+You can find the UID of the source cluster in its metadata:
 
 ```yaml
 apiVersion: acid.zalan.do/v1
 kind: postgresql
 metadata:
-  name: acid-test-cluster
+  name: acid-batman
   uid: efd12e58-5786-11e8-b5a7-06148230260c
 ```
 
@@ -799,7 +800,7 @@ no statefulset will be created.
 
 ```yaml
 spec:
   standby:
-    s3_wal_path: "s3 bucket path to the master"
+    s3_wal_path: "s3://<bucketname>/spilo/<cluster-name>/<uid>/wal/<pgversion>"
 ```
 
 At the moment, the operator only allows to stream from the WAL archive of the

From 8878a9163251de7753477539c6c966df63332aa6 Mon Sep 17 00:00:00 2001
From: Felix Kunde
Date: Thu, 26 Aug 2021 14:50:55 +0200
Subject: [PATCH 2/4] add chapter about upgrading the operator

---
 docs/administrator.md | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/docs/administrator.md b/docs/administrator.md
index 2944a701f..7a26275b1 100644
--- a/docs/administrator.md
+++ b/docs/administrator.md
@@ -3,6 +3,21 @@ Learn how to configure and manage the Postgres Operator in your Kubernetes
 (K8s) environment.
 
+## Upgrading the operator
+
+The Postgres Operator is upgraded by changing the Docker image within the
+deployment. Before doing so, it is recommended to check the release notes
+for new configuration options or changed behavior you might want to reflect
+in the ConfigMap or config CRD. E.g. a new feature might be introduced that
+is enabled or disabled by default, and you may want to switch it to the
+opposite with the corresponding option.
+
+When using Helm, be aware that installing the new chart will not update the
+`Postgresql` and `OperatorConfiguration` CRDs. Make sure to update them
+beforehand with the provided manifests in the `crds` folder. Otherwise, you
+might face errors about new Postgres manifest or configuration options being
+unknown to the CRD schema validation.
+
 ## Minor and major version upgrade
 
 Minor version upgrades for PostgreSQL are handled via updating the Spilo Docker
@@ -835,6 +850,21 @@ copy existing variables about your setup (backup location, prefix, access
 keys etc.) and prepend the `CLONE_` prefix to get them copied to the correct
 directory within Spilo.
 
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: postgres-pod-config
+data:
+  AWS_REGION: "eu-west-1"
+  AWS_ACCESS_KEY_ID: "****"
+  AWS_SECRET_ACCESS_KEY: "****"
+  ...
+  CLONE_AWS_REGION: "eu-west-1"
+  CLONE_AWS_ACCESS_KEY_ID: "****"
+  CLONE_AWS_SECRET_ACCESS_KEY: "****"
+  ...
+```
 
 ## Logical backups
 
 The operator can manage K8s cron jobs to run logical backups (SQL dumps) of

From 687122510c75c38970a2a5c5613de4284d3859d5 Mon Sep 17 00:00:00 2001
From: Felix Kunde
Date: Thu, 26 Aug 2021 16:57:43 +0200
Subject: [PATCH 3/4] add section for standby clusters

---
 docs/administrator.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/docs/administrator.md b/docs/administrator.md
index e5ccf91f4..8da85b77e 100644
--- a/docs/administrator.md
+++ b/docs/administrator.md
@@ -688,6 +688,12 @@ if it ends up in your specified WAL backup path:
 envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
 ```
 
+You can also check if Spilo is able to find any backups:
+
+```bash
+envdir "/home/postgres/etc/wal-e.d/env" wal-g backup-list
+```
+
 Depending on the cloud storage provider different [environment variables](https://github.com/zalando/spilo/blob/master/ENVIRONMENT.rst)
 have to be set for Spilo. Not all of them are generated automatically by the
 operator by changing its configuration. In this case you have to use an
@@ -923,6 +929,15 @@ data:
   ...
 ```
 
+### Standby clusters
+
+The setup for [standby clusters](user.md#setting-up-a-standby-cluster) is very
+similar to cloning. At the moment, the operator only allows for streaming from
+the S3 WAL archive of the master specified in the manifest.
+Like with cloning, if you are using
+[additional environment variables](#custom-pod-environment-variables)
+to access your backup location, you have to copy those variables and prepend
+the `STANDBY_` prefix for Spilo to find the backups and WAL files to stream.
 
 ## Logical backups
 
 The operator can manage K8s cron jobs to run logical backups (SQL dumps) of

From 78b342a40a8710fdd22090d897e4871917f2232b Mon Sep 17 00:00:00 2001
From: Felix Kunde
Date: Fri, 27 Aug 2021 10:29:08 +0200
Subject: [PATCH 4/4] Update docs/administrator.md

Co-authored-by: Alexander Kukushkin
---
 docs/administrator.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/administrator.md b/docs/administrator.md
index 8da85b77e..9408541d0 100644
--- a/docs/administrator.md
+++ b/docs/administrator.md
@@ -691,7 +691,7 @@ envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata
 You can also check if Spilo is able to find any backups:
 
 ```bash
-envdir "/home/postgres/etc/wal-e.d/env" wal-g backup-list
+envdir "/run/etc/wal-e.d/env" wal-g backup-list
 ```
 
 Depending on the cloud storage provider different [environment variables](https://github.com/zalando/spilo/blob/master/ENVIRONMENT.rst)
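
The `STANDBY_` prefix convention added in [PATCH 3/4] mirrors the `CLONE_`
ConfigMap example from [PATCH 2/4]. A sketch of what the corresponding pod
environment ConfigMap could look like for a standby setup (the ConfigMap name
follows the earlier example; the region and key values are illustrative
placeholders, not taken from the patches):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-pod-config
data:
  # variables Spilo needs to reach the backup location of the source cluster
  AWS_REGION: "eu-west-1"
  AWS_ACCESS_KEY_ID: "****"
  AWS_SECRET_ACCESS_KEY: "****"
  # the same variables again, prefixed so Spilo uses them for the standby
  # to find the backups and WAL files of the source cluster
  STANDBY_AWS_REGION: "eu-west-1"
  STANDBY_AWS_ACCESS_KEY_ID: "****"
  STANDBY_AWS_SECRET_ACCESS_KEY: "****"
```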