Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the the original PostgreSQL configuration file #41

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

khaledk2
Copy link
Contributor

This role uses a fixed template for the configuration file and uses it for all the installed PostgreSQL versions
Sometimes the defaults are changed from one version to another and the configuration file is optimised for the target version. Even some of the default attributes are commented inside the configuration file but the commented attributes reflect the values of the current attributes.
I think using one configuration file for all the versions should not be used as it may misconfigure the database server and/or do not reflect the current values for the attributes for the server configuration

This PR fixes that by using the configuration file for the installed version, it reads the configuration PostgreSQL file from the remote machine. It uses a template to add postgresql_server_conf to the file's contents and copies it to the host machine.

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the overall approach is sensible i.e. keep the configuration file created by the system package and add a final Ansible-managed section allowing to override the properties.

A few inline comments from an initial round of code review.

templates/postgresql-conf.j2 Show resolved Hide resolved
tasks/initialise.yml Outdated Show resolved Hide resolved
tasks/initialise.yml Show resolved Hide resolved
@joshmoore
Copy link
Member

@khaledk2: did you identify the configuration setting that was causing the slow down?

@khaledk2
Copy link
Contributor Author

khaledk2 commented May 26, 2024

@sbesson I have restored the listen_addresses attribute.
I have added a backup file to the original file configuration which can be used later if the user wants to re-run the role with a new configuration set. I have added a test to check the creation of the backup file.
I have tested the role locally on a machine which runs Rocky Linux 9 and it worked as expected.

@joshmoore I could not identify which configuration slows down the server when using the original role template.

@joshmoore
Copy link
Member

@joshmoore I could not identify which configuration slows down the server when using the original role template.

Nothing against this PR, but that makes me worry that we might be on the wrong track. In the diff that I saw, there were no changed elements that weren't commented out. Where is the behavior difference coming from??

@sbesson
Copy link
Member

sbesson commented Jun 3, 2024

As discussed earlier today, the best candidate identified by @khaledk2 is effective_cache_size which has been set manually prod121. This property used in the Query Planning was never set in IDR previously and the defaults was always used.

Proposed next step is for @khaledk2 to open a PR against https://github.com/IDR/deployment to 1- consume this branch of the role and 2- set a reasonable value for effective_cache_size. This can then be integrated in a new test122b environment and hopefully confirm we have captured all changes allowing to deploy IDR on Rocky Linux 9 with a PSQL configuration allowing to at least match the historical query times.

Note this will also need to be tested in the context of an upgrade of a PSQL server installed by a previous version of this role (which is not the use case of IDR).

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this change is being currently evaluated in the context of IDR via IDR/deployment#424.

While I think the proposed changes work when installing a PSQL database from scratch using the new role, I have concerns the logic does not achieve the desired intent in the case of an upgrade of an existing PSQL deployments using earlier versions of the role. We should probably define and test very clearly the expectations as these will be the most common scenarios for the OMERO deployments managed by OME (IDR, learning, training) and will also impact anyone in the community. /cc @pwalczysko @jburel

templates/postgresql-conf.j2 Outdated Show resolved Hide resolved
become_user: "{{ postgresql_become_user }}"
ansible.builtin.copy:
src: "{{ postgresql_dist_confdir }}/postgresql.conf"
dest: "{{ postgresql_dist_confdir }}/postgresql.conf.backup"
Copy link
Member

@sbesson sbesson Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly the logic, this is meant to create a one-off copy of the configuration file assuming the latter has not originally been modified (and templated) by Ansible.

.backup implies to me that a copy of the modified configuration. Would a different suffix e.g. .orig would better convey that this is meant to be the version of the configuration installed by the package manager?

tasks/initialise.yml Outdated Show resolved Hide resolved
# conf_sample_file: sample configuration file, it should be copied during the installation by the installation package

# the following two variables are used to download link ofr the source to get the sample configuration file
# It will be used in case of the sample file is not find locally for any reason
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to have a hard fail in case conf_sample_file does not exist?

Are you aware of any scenario where this file could legitimately be expected to be missing?

Copy link
Contributor Author

@khaledk2 khaledk2 Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On our RHEL 9 test machine, there are 4 PostgreSQL versions have been installed (13, 14, 15, 16). Only version 16 folder has the sample file and the others do not have this file. I am not sure how these versions have been installed but I think the file should be there. I am not sure why it is missing. So, I have added the logic of downloading and extracting the file to account for similar cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting (and slightly concerning). Note all of the PSQL sample configuration file checks seem to be currently skipped in the Molecule tests on both Rocky Linux. I don't know if that's expected or indicates another issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will check for the sample file if the original file does not exist and the configuration file has been edited before, i.e. in case of installing the same PostgreSQL version with a previous role version.

@@ -25,17 +26,16 @@
- name: postgres | Check for presence of "Modified by ome postgresql ansible role" in file
ansible.builtin.lineinfile:
path: "{{ postgresql_dist_confdir }}/postgresql.conf"
line: "#Modified by ome postgresql ansible role"
line: "#Ansible managed"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the new logic is to always recreate a configuration file from the combination of the original PSQL configuration file and the role variables (in the same spirit as the previous template task), my impression is this check is redundant and the config_file_changed logic should be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The modified file check should work for the fresh installation, as the original configure file will be copied from the configuration file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous RedHat/Rocky template had # {{ ansible_managed }} so this will mark this file as not changed. Is that the expectation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the template to include #{{ ansible_managed }}.

It will check for it, i.e. looking for "#Ansible managed" in the file; if it is not found it will mark it as an alerted configuration file and start looking for the org file (postgresql.conf.org) that may exist in the current folder, if not it will invoke the tasks inside org_config_file.yml to determine the sample file.
If the configuration file does not have the "#Ansible managed" it will assume the file is the original configuration file then copy it to postgresql.conf.org and use its contents in the configuration template.

The configuration file does not have "#Ansible managed" on test122b; it will assume it is the original configuration file and copy it to org.
I think because you are in check mode, the file is not copied so it failed when trying to get the file contents.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking the example of test122b here

[sbesson@test122b-database ~]$  sudo tail -n 20 /var/lib/pgsql/16/data/postgresql.conf


#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------

# Add settings for extensions here

listen_addresses = '*'

shared_buffers = 8002MB


listen_addresses = '*'

effective_cache_size = 23880MB

shared_buffers = 7960MB

[sbesson@test122b-database ~]$ sudo head -n 2 /var/lib/pgsql/16/data/postgresql.conf
#Modified by ome postgresql ansible role
# -----------------------------

It does not have the #Ansible managed marker but it is also not the original configuration.
In that case, copying it as postgresql.conf.org is incorrect and also means that every invocation of the role will append the options at the bottom of the configuration file.

tasks/initialise.yml Outdated Show resolved Hide resolved
Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few inline comments. The org_conf_file.yml tasks are all skipped in the Molecule tests both in the converge and idempotence steps on Ubuntu 22.04 and Rocky Linux 9.

Also tested the integration in the context of test122b through IDR/deployment#424 in check-mode. The same steps are skipped as the configuration file has been modified manually during the prod121 release without including the top-level # Ansible managed comment and the later step will fail with

TASK [ome.postgresql : postgres | set permissions on data directory] ***********************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | initialise PostgreSQL cluster (skip if data directory already exists)] ***********************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | check for presence of "Modified by ome postgresql ansible role" in file] *********************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | check that the postgresql.conf.backup file exists] *******************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | copy a postgresql.conf to postgresql.conf.org] ***********************************************************************************************************************************************************************
changed: [test122b-database]

TASK [ome.postgresql : postgres | Check that the postgresql.conf.sample file exists (ubuntu)] **********************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | postgres | Check that the postgresql.conf.sample file exists (redhat)] ***********************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | copy the sample file to org] *****************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | download the source code] ********************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | extract the files] ***************************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres |  copy the file to org file] ******************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | get the postgres conf file contents] *********************************************************************************************************************************************************************************
fatal: [test122b-database]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/bin/python3.9"}, "changed": false, "msg": "file not found: /var/lib/pgsql/16/data/postgresql.conf.org"}

The failure and the absence of test coverage reinforce the concerns expressed in #41 (comment). What is the value of checking whether a configuration file has been modified and having some conditional logic for updating it?

@@ -25,17 +26,16 @@
- name: postgres | Check for presence of "Modified by ome postgresql ansible role" in file
ansible.builtin.lineinfile:
path: "{{ postgresql_dist_confdir }}/postgresql.conf"
line: "#Modified by ome postgresql ansible role"
line: "#Ansible managed"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous RedHat/Rocky template had # {{ ansible_managed }} so this will mark this file as not changed. Is that the expectation?

tasks/org_config_file.yml Outdated Show resolved Hide resolved
tasks/initialise.yml Outdated Show resolved Hide resolved
#------------------------------------------------------------------------------
# ANSIBLE OPTIONS
#------------------------------------------------------------------------------
#{{ ansible_managed }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above. If we are using this as the marker of managed file, we'll need to use the same whitespace.

Note also that postgresql-conf-10-ubuntu.j2 did not include this comment so the logic will not work with PSQL installations on Ubuntu with earlier versions of the role

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested the last commit together with IDR/deployment#424 outside check-mode against test122b

Playbook execution
PLAY [test122b-database-hosts] *********************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************
[WARNING]: Platform linux on host test122b-database is using the discovered Python interpreter at /usr/bin/python3.9, but future installation of another Python interpreter could change the meaning of that path. See
https://docs.ansible.com/ansible-core/2.12/reference_appendices/interpreter_discovery.html for more information.
ok: [test122b-database]

TASK [postgres | Include ome.postgresql_client role] ***********************************************************************************************************************************************************************************************

TASK [ome.postgresql_client : Import a key for postgres (x86_64)] **********************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql_client : Import a key for postgres (aarch64)] *********************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql_client : Postgres | setup dnf repository for arch64] **************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql_client : Postgres | setup dnf repository for amd64] ***************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql_client : Postgres | install client packages] **********************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql_client : Postgres | install gpg for apt repo] *********************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql_client : Postgres | get postgresql apt key] ***********************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql_client : Postgres | setup apt repo] *******************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql_client : Postgres | install client packages] **********************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | fail if postgresql_users_databases defined] **********************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | fail if postgresql_install_server true] **************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | install server packages] *****************************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | install extension packages] **************************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | install ansible prerequisites] ***********************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : get langpack for en] ********************************************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | set redhat dist variables] ***************************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | install packages] ************************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | install ansible prerequisites] ***********************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | set debian dist variables] ***************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | set permissions on data directory] *******************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | initialise PostgreSQL cluster (skip if data directory already exists)] *******************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | check for presence of "#Ansible Managed" in file] ****************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | check that the postgresql.conf.org file exists] ******************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | copy a postgresql.conf to postgresql.conf.org] *******************************************************************************************************************************************************************
changed: [test122b-database]

TASK [ome.postgresql : postgres | Check that the postgresql.conf.sample file exists (ubuntu)] ******************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | Check that the postgresql.conf.sample file exists (redhat)] ******************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | copy the sample file to org] *************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | download the source code] ****************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | extract the files] ***********************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres |  copy the file to org file] **************************************************************************************************************************************************************************************
skipping: [test122b-database]

TASK [ome.postgresql : postgres | get the postgres conf file contents] *****************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : set_fact] *******************************************************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | copy postgresql config file] *************************************************************************************************************************************************************************************
--- before: /var/lib/pgsql/16/data/postgresql.conf
+++ after: /Users/sbesson/.ansible/tmp/ansible-local-62773z2yh5esj/tmp86ty5vl1/postgresql-conf.j2
@@ -1,3 +1,4 @@
+#Ansible managed
#Modified by ome postgresql ansible role
# -----------------------------
# PostgreSQL configuration file
@@ -833,3 +834,9 @@
shared_buffers = 7960MB


+
+
+listen_addresses = '*'
+
+effective_cache_size = 23880MB
+shared_buffers = 7960MB

changed: [test122b-database]

TASK [ome.postgresql : postgres | configure client authorisation] **********************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | start service] ***************************************************************************************************************************************************************************************************
ok: [test122b-database]

TASK [ome.postgresql : postgres | create users] ****************************************************************************************************************************************************************************************************
ok: [test122b-database] => (item={'user': 'omero', 'password': '2NGS5rrKNqsnR6sasDLa71CG+IpoBmIoQwEVXarc2cto', 'databases': ['idr']})
ok: [test122b-database] => (item={'user': 'omeroreadonly', 'password': 'GDj2PSoCYYwt2BvWvsT3xePW42TocQ8ZaIpfFWIBjCJX', 'databases': ['idr']})

TASK [ome.postgresql : postgres | create databases] ************************************************************************************************************************************************************************************************
ok: [test122b-database] => (item={'name': 'idr', 'owner': 'omero', 'restrict': True})

TASK [ome.postgresql : postgres | revoke default permissions] **************************************************************************************************************************************************************************************
ok: [test122b-database] => (item={'name': 'idr', 'owner': 'omero', 'restrict': True})

TASK [ome.postgresql : postgres | revoke default schema permissions] *******************************************************************************************************************************************************************************
ok: [test122b-database] => (item={'name': 'idr', 'owner': 'omero', 'restrict': True})

TASK [ome.postgresql : postgres | grant database owner public schema privileges] *******************************************************************************************************************************************************************
ok: [test122b-database] => (item={'name': 'idr', 'owner': 'omero', 'restrict': True})

TASK [ome.postgresql : postgres | grant connect privileges] ****************************************************************************************************************************************************************************************
ok: [test122b-database] => (item=[{'user': 'omero', 'password': '2NGS5rrKNqsnR6sasDLa71CG+IpoBmIoQwEVXarc2cto'}, 'idr'])
ok: [test122b-database] => (item=[{'user': 'omeroreadonly', 'password': 'GDj2PSoCYYwt2BvWvsT3xePW42TocQ8ZaIpfFWIBjCJX'}, 'idr'])

TASK [ome.postgresql : postgres | grant usage privileges on default public schema] *****************************************************************************************************************************************************************
ok: [test122b-database] => (item=[{'user': 'omero', 'password': '2NGS5rrKNqsnR6sasDLa71CG+IpoBmIoQwEVXarc2cto'}, 'idr'])
ok: [test122b-database] => (item=[{'user': 'omeroreadonly', 'password': 'GDj2PSoCYYwt2BvWvsT3xePW42TocQ8ZaIpfFWIBjCJX'}, 'idr'])

RUNNING HANDLER [ome.postgresql : restart postgresql] **********************************************************************************************************************************************************************************************
changed: [test122b-database]

As mentioned in #41 (comment), the current logic appends the role to the bottom of the PostgreSQL configuration file (which has no Ansible comment at its top) on each execution.

@khaledk2
Copy link
Contributor Author

I have simplified the original file copying in case of updating the configuration of an installed Postgres server using a role previous version. In this case, the current configuration file will be cleaned by removing the added variables and the Ansible managed marker and copied to the org file.
So, it will not duplicate the variables by the end of the file as it is in #41 (comment).

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few cleanup comments.

The current version of the changes reads the portion of the configuration file delimited between the top level # Ansible managed comment and # Add settings for extensions here, stores this as postgresql.conf.org and uses it as the template for generating the new configuration file using the role variable. This feels like an approach which should cover the different scenarios above but we will definitely want some testing in a few contexts including IDR and the UoD deployments /cc @pwalczysko

In terms of testing, the biggest limitation is that when an upgrade an existing PostgreSQL cluster deployed with a prior version of the Ansible role (i.e. without an existing postgresql.conf.org file), running the role in check-mode will fail with a file not found exception. Said otherwise, I do not see an option to review the diff of the changes without executing the role outside check-mode.

On the IDR front, my plan is to execute this change against test122 once prod122 is promoted (hopefully later this week).

defaults/main.yml Outdated Show resolved Hide resolved
tasks/initialise.yml Outdated Show resolved Hide resolved
Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per IDR/deployment#424 (comment), I think the latest version of this role works as expected in the context of an IDR deployment (with a version of the configuration file created by an earlier version of this role) but I think this should definitely be tested in the context of the UoD OMERO deployments too to make sure there are no regressions.

@pwalczysko
Copy link
Member

pwalczysko commented Jul 23, 2024

I have deployed omero-server using this PR for postgres on the fresh VM where there was neither postgres or OMERO installed (ome-train-wp3). All went fine, I can log in and import an image. Next step, I will run the changes in this PR over a postgres+server which was deployed with the released version of this role.

notify:
- restart postgresql

become_user: "{{ postgresql_become_user }}"
- name: postgres | Remove org config file in check mode and file does not exist
Copy link
Member

@pwalczysko pwalczysko Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- name: postgres | Remove org config file in check mode and file does not exist
- name: postgres | Remove org config file if not in check mode and file does not exist

Really not sure what this sentence was aiming to say, so suggesting ^^^ - but this might not be right - please do check

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revise the formulation ^^^ - I am still not sure my suggestion is right

@pwalczysko
Copy link
Member

pwalczysko commented Jul 23, 2024

Tested also on a VM with pre-existing install of postgres and omero-server which was done without this PR (ome-train4-wp1).
With this PR, the rerun of the playbook made the expected changes and the functionality of the server and web was as expected. Also tried to restart the server after the playbook was run and import images before and after this restart. Everything as expected.

Copy link
Member

@pwalczysko pwalczysko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reformulate the text in #41 (comment) as appropriate - after this, ready to merge, Thank you

Copy link
Member

@pwalczysko pwalczysko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants