add support pgbackrest #40

Merged
merged 27 commits into from
Jun 23, 2020

@vitabaks vitabaks commented May 27, 2020

For bootstrapping a Patroni cluster and creating replicas from backups.

  • With support for point-in-time recovery (PITR).

pgBackRest User Guide: https://pgbackrest.org/user-guide-index.html
pgBackRest Configuration Reference: https://pgbackrest.org/configuration.html

  • Install pgbackrest on target servers (or update pgbackrest on the repository host).
  • Configure pgbackrest options on target servers.
  • Generate and exchange SSH keys with the pgbackrest server (Dedicated Repository Host).
  • Bootstrap a cluster using pgbackrest from a Dedicated Repository Host (ssh).
  • Create replicas using pgbackrest from a Dedicated Repository Host (ssh).
  • Bootstrap a cluster using pgbackrest from S3 storage.
  • Create replicas using pgbackrest from S3 storage.


To test these changes, please clone the 'dev' branch:

git clone https://github.com/vitabaks/postgresql_cluster.git --branch dev

vitabaks added 4 commits May 28, 2020 15:15
…g_hba.conf

Previously, we generated the user pg_hba.conf file immediately after checking that PostgreSQL is running and accepting connections on the master server.
When the custom bootstrap method is used, the patroni service can overwrite our pg_hba.conf file after the initialization of the new cluster has completed.

Now we wait until the master has written the leader key to the DCS and the Patroni node is running as the leader.
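
The wait described above can be sketched in Python. This is only an illustration, not the Ansible task from this PR. It assumes that the Patroni REST API is reachable (commonly on port 8008) and that `GET /leader` answers HTTP 200 only on the node holding the leader lock; `http_status`, `wait_for_leader`, and the `probe` parameter are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Non-2xx answers still carry a status code.
        return exc.code
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return None


def wait_for_leader(url, retries=30, delay=2.0, probe=http_status, sleep=time.sleep):
    """Poll url until it answers 200; return True on success, False on timeout."""
    for attempt in range(retries):
        if probe(url) == 200:
            return True
        if attempt < retries - 1:
            sleep(delay)
    return False
```

Under these assumptions, pg_hba.conf would be generated only after `wait_for_leader("http://<master>:8008/leader")` returns True.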
Removing the epel-release package does not help resolve the "Failed to download metadata for repo 'epel'" error in GitHub Actions for CentOS 8.
Variable: patroni_cluster_point_in_time_recovery

# If true, the database cluster directory will be cleaned (for "wal-g") or overwritten (for "pgbackrest" --delta restore).
# The patroni cluster "{{ patroni_cluster_name }}" will also be removed from the DCS.
# For pgbackrest only: cluster members (replicas) will be reinitialized after a successful recovery on the master server.
# Do not use during the initial deployment of a cluster. This mode is intended only for PITR of an existing patroni cluster.

requirements:
pip3 install jmespath   # (on ansible server and master server)
pip3 install pexpect  # (on master server)

Specify in inventory file:
ansible_python_interpreter='/usr/bin/python3'   # required to use python3

Run ansible with tag for PITR:
ansible-playbook deploy_pgcluster.yml --tags point_in_time_recovery

vitabaks commented May 29, 2020

Added support for point-in-time recovery (PITR) (commits 874a074, 9735c2f, 8c4b06c)

In the default mode (initdb), data in the database directory (PGDATA) is protected from deletion, and no attempt is made to remove the cluster from the DCS.

But if the variable "patroni_cluster_bootstrap_method" is set to "pgbackrest" or "wal-g":

  1. the database cluster directory will be cleared (for "wal-g") or overwritten (for "pgbackrest" - delta restore).
  2. the cluster "{{ patroni_cluster_name }}" will be deleted from the DCS (if it exists) before recovery.
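
The rules above can be summarised as a small decision function. This is a sketch for illustration only, not code from the playbook; `pre_restore_actions` and the returned dictionary keys are hypothetical names.

```python
def pre_restore_actions(bootstrap_method):
    """Return the cleanup steps implied by patroni_cluster_bootstrap_method."""
    if bootstrap_method == "initdb":
        # Default mode: PGDATA is protected and the DCS is left untouched.
        return {"pgdata": "keep", "remove_from_dcs": False}
    if bootstrap_method == "wal-g":
        # The data directory is cleared before the restore.
        return {"pgdata": "clear", "remove_from_dcs": True}
    if bootstrap_method == "pgbackrest":
        # pgbackrest overwrites PGDATA in place via a delta restore.
        return {"pgdata": "overwrite", "remove_from_dcs": True}
    raise ValueError(f"unknown bootstrap method: {bootstrap_method!r}")
```

For example, `pre_restore_actions("pgbackrest")` reports that PGDATA is overwritten and that the cluster key is removed from the DCS.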

requirements:
(will be installed automatically from the pip repository if missing)
pip3 install pexpect
pip3 install ruamel.yaml

For PITR only - run ansible with tag:

ansible-playbook deploy_pgcluster.yml --tags point_in_time_recovery

  • Describe the details in the README.md file.

vitabaks added 2 commits May 30, 2020 22:44
The json parsing conditions are rewritten to use ansible only (without json_query).
The variable patroni_cluster_point_in_time_recovery has been deleted. Now there is no need to explicitly enable the PITR mode.

In the default mode (initdb), there is protection against deleting data from the database directory (PGDATA), and also there will be no attempt to remove the cluster from DCS.

But if the variable "patroni_cluster_bootstrap_method" is set to "pgbackrest" or "wal-g":
1) the database cluster directory will be cleared (for "wal-g") or overwritten (for "pgbackrest" - delta restore).
2) the cluster "{{patroni_cluster_name}}" will be deleted from the DCS (if it exists).

github-actions bot commented Jun 1, 2020

yamllint Failed (fixed 603efbf)

./roles/patroni/tasks/main.yml
  497:1     error    trailing spaces  (trailing-spaces)

Workflow: Yamllint, Action: karancode/yamllint-github-action, Lint: .

vitabaks added 3 commits June 1, 2020 18:57
…it the cluster members (replicas).

We cannot reinitialize replicas when recovering to a specific point in time (--target), only when restoring the latest backup, because patroni would replace the recovery configuration created earlier by pgbackrest.
I set the conditions in the wrong order, which caused this error:

    TASK [patroni : Prepare PostgreSQL | generate default postgresql config files] ***
fatal: [10.172.0.20]: FAILED! => {"msg": "The conditional check 'not postgresql_conf_file.stat.exists and (ansible_os_family == \"Debian\" and postgresql_packages is not search(\"postgrespro\"))' failed. The error was: error while evaluating conditional (not postgresql_conf_file.stat.exists and (ansible_os_family == \"Debian\" and postgresql_packages is not search(\"postgrespro\"))): 'dict object' has no attribute 'stat'\n\nThe error appears to be in '/home/runner/work/postgresql_cluster/postgresql_cluster/roles/patroni/tasks/main.yml': line 497, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n    - name: Prepare PostgreSQL | generate default postgresql config files\n      ^ here\n"}

vitabaks commented Jun 4, 2020

Tested with Debian and Ubuntu.

vitabaks added 4 commits June 5, 2020 15:46
Now you do not need to explicitly specify "ansible_python_interpreter=/usr/bin/python3" in the inventory file.
Fixed:
ERROR: [031]: option 'target-action' not valid without option 'type' in ('immediate', 'name', 'time', 'xid')
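
One way to avoid this error is to pass `--target-action` only when a compatible `--type` is given. The sketch below builds such a command line; it is an assumption about how the fix could look, not the playbook's actual task, and `restore_args` is a hypothetical helper. The list of allowed types is taken from the error message above.

```python
# Restore types for which pgbackrest accepts --target-action,
# as listed in the error message.
TARGET_ACTION_TYPES = ("immediate", "name", "time", "xid")


def restore_args(stanza, restore_type=None, target=None, target_action=None):
    """Build a pgbackrest delta-restore command line as a list of arguments."""
    args = ["pgbackrest", f"--stanza={stanza}", "--delta"]
    if restore_type is not None:
        args.append(f"--type={restore_type}")
    if target is not None:
        args.append(f"--target={target}")
    # Drop --target-action unless the restore type allows it.
    if target_action is not None and restore_type in TARGET_ACTION_TYPES:
        args.append(f"--target-action={target_action}")
    args.append("restore")
    return args
```

With no `restore_type`, the action is left out, so a plain restore of the latest backup does not trigger error 031.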

vitabaks commented Jun 8, 2020

Tested with CentOS.

vitabaks added 9 commits June 10, 2020 17:59
ansible module for modifying yaml files.
Added python modules:
pexpect - required for PITR (for ansible module "expect")
ruamel.yaml - required for PITR (for ansible module "yedit")

Removed python requests module from explicit requirements (since patroni version 1.6.2)
patroni/patroni@90a4208#diff-b4ef698db8ca845e5845c4618278f29a
It wasn't used for anything critical, but it caused a lot of problems whenever a new version of urllib3 was released.
…r from the specified.

PITR: Make sure the superuser and replication passwords do not differ from the specified ones.

If the password is different, it will be replaced with a new one.
If there is no replication user, it will be created.

See the variables (in vars/main.yml):
patroni_superuser_username
patroni_superuser_password
patroni_replication_username
patroni_replication_password

You can also restore databases from other PostgreSQL clusters.
Workaround for CentOS 8.0/8.1.
If you are using @pgbackrest on CentOS 8.1 (not RHEL), you will need to install the libzstd RPM from an archived EPEL 8.1 release. The problem will be solved when CentOS 8.2 is released.

Fixed:
TASK [pgbackrest : Install pgbackrest] ******************************************************************************************************************************************************************
fatal: [10.128.64.157]: FAILED! => {"changed": false, "failures": [], "msg": "Depsolve Error occured: \n Problem: cannot install the best candidate for the job\n  - nothing provides libzstd.so.1()(64bit) needed by pgbackrest-2.27-2.rhel8.x86_64", "rc": 1, "results": []}
…covery]

# for pgbackrest only (for use --delta restore)

This is because pg_control on the standby remembers the previous primary server's max_connections.
So you'll either have to use higher settings on the standby for at least one restart, or simply start the standby with hot_standby = off and re-enable it after it has replayed the pending WAL.

Fixed:
TASK [patroni : Start PostgreSQL for Recovery]
FATAL:  hot standby is not possible because max_connections = 100 is a lower setting than on the master server (its value was 500)
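
The condition behind this error can be expressed as a small check. The sketch below is illustrative only and is not taken from the playbook; both function names are hypothetical. It relies on the PostgreSQL rule quoted in the message: hot standby needs a local max_connections that is not lower than the value recorded from the previous primary.

```python
def hot_standby_possible(local_max_connections, primary_max_connections):
    """Hot standby requires the local setting to be at least the primary's."""
    return local_max_connections >= primary_max_connections


def recovery_start_settings(local_max_connections, primary_max_connections):
    """Choose the hot_standby value to start PostgreSQL with during recovery."""
    if hot_standby_possible(local_max_connections, primary_max_connections):
        return {"hot_standby": "on"}
    # Start without hot standby; it can be re-enabled after WAL replay.
    return {"hot_standby": "off"}
```

With the values from the log above (100 locally, 500 on the previous primary), the sketch selects `hot_standby = off`.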
…tabase list]

Fixed:
FATAL:  Peer authentication failed for user "postgres".
This role is optional.
The lines specified in the "etc_hosts" variable will be added to the hosts file for postgresql_cluster nodes.

vitabaks commented Jun 21, 2020

Tested with S3.
MinIO was used for testing. It was installed according to these instructions.

example:
vars/main.yml

patroni_cluster_bootstrap_method: "pgbackrest"

patroni_create_replica_methods:
  - pgbackrest
  - basebackup

postgresql_restore_command: "pgbackrest --stanza={{ pgbackrest_stanza }} archive-get %f %p"
# pgBackRest
pgbackrest_install: true
pgbackrest_install_from_pgdg_repo: true
pgbackrest_stanza: "s3_stanza"
pgbackrest_repo_type: "s3"
pgbackrest_conf_file: "/etc/pgbackrest.conf"
pgbackrest_conf:
  global:  # [global] section
    - {option: "log-level-file", value: "detail"}
    - {option: "log-path", value: "/var/log/pgbackrest"}
    - {option: "repo1-type", value: "{{ pgbackrest_repo_type |lower }}"}
    - {option: "repo1-path", value: "/repo"}
    - {option: "repo1-s3-endpoint", value: "minio.local"}
    - {option: "repo1-s3-bucket", value: "pgbackrest"}
    - {option: "repo1-s3-verify-tls", value: "n"}
    - {option: "repo1-s3-key", value: "accessKey"}
    - {option: "repo1-s3-key-secret", value: "superSECRETkey"}
    - {option: "repo1-s3-region", value: "eu-west-3"}
    - {option: "repo1-retention-full", value: "4"}
    - {option: "start-fast", value: "y"}
    - {option: "delta", value: "y"}
  stanza:  # [stanza_name] section
    - {option: "pg1-path", value: "{{ postgresql_data_dir }}"}
    - {option: "process-max", value: "2"}
    - {option: "recovery-option", value: "recovery_target_action=promote"}
    - {option: "log-level-console", value: "info"}
pgbackrest_patroni_cluster_restore_command:
  '/usr/bin/pgbackrest --stanza={{ pgbackrest_stanza }} --delta restore'

vars/system.yml

etc_hosts:
  - "10.128.64.143 pgbackrest.minio.local minio.local s3.eu-west-3.amazonaws.com"

See more S3 repository options: https://pgbackrest.org/configuration.html#section-repository

pgbackrest accepts the DNS name format only.
vitabaks added 3 commits June 23, 2020 17:48
An example configuration for S3 will be described separately, in order to reduce the number of lines in the variable file.
In v2.02 the default location of the pgBackRest configuration file has changed from /etc/pgbackrest.conf to /etc/pgbackrest/pgbackrest.conf.

If /etc/pgbackrest/pgbackrest.conf does not exist, the /etc/pgbackrest.conf file will be loaded instead, if it exists.
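
The lookup order described above can be sketched as follows. This only mirrors the two sentences in the commit message and is not pgBackRest's own code; `resolve_conf` and its `exists` parameter are hypothetical.

```python
import os

DEFAULT_CONF = "/etc/pgbackrest/pgbackrest.conf"  # default since v2.02
LEGACY_CONF = "/etc/pgbackrest.conf"              # default before v2.02


def resolve_conf(exists=os.path.exists):
    """Return the first configuration file that exists, or None if neither does."""
    for path in (DEFAULT_CONF, LEGACY_CONF):
        if exists(path):
            return path
    return None
```

The `exists` parameter is there only so the lookup order can be checked without touching the real filesystem.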
@vitabaks vitabaks merged commit c2de3da into master Jun 23, 2020
@vitabaks vitabaks deleted the dev branch August 3, 2020 13:22