Allow args to be passed to Postgres during init #317

Closed

@spmason spmason commented Jul 24, 2017

..sometimes it's desirable to pass additional arguments to Postgres when it's started for the init scripts

This change allows those to be given by setting a POSTGRES_INIT_ARGS environment variable
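
To make the proposal concrete, here is a minimal sketch of how an entrypoint could append such a variable to the options of the temporary init-time server. It is not the diff from this PR: the helper name `build_init_opts`, the exact `pg_ctl` invocation, and the `listen_addresses='localhost'` default are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not taken from docker-entrypoint.sh): builds the option
# string handed to the temporary server that runs the init scripts.
build_init_opts() {
    # Assumed default; POSTGRES_INIT_ARGS is appended only when set and non-empty.
    printf '%s' "-c listen_addresses='localhost'${POSTGRES_INIT_ARGS:+ $POSTGRES_INIT_ARGS}"
}

POSTGRES_INIT_ARGS='-c fsync=off -c synchronous_commit=off'

# Print the command instead of running it, so the sketch works without Postgres.
echo pg_ctl -D "${PGDATA:-/var/lib/postgresql/data}" -o "$(build_init_opts)" -w start
```

With the variable unset, the helper returns only the default options, so existing setups would behave as before.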

@spmason
Author

spmason commented Jul 24, 2017

I'm open to changing the name of this environment variable. It was chosen to be similar to the existing POSTGRES_INITDB_ARGS variable, but it might be so similar as to be confusing.

@tianon
Member

tianon commented Jul 24, 2017

Interesting! Can you elaborate a bit on what the use case of this is? What sorts of things would you specify with this feature that you'd want enabled on the initial run of the database but not on subsequent runs?

@spmason
Author

spmason commented Jul 25, 2017

Hi @tianon - sure. The first time I wanted this feature was when our init script started giving back errors about checkpoint_segments being set too low. I wanted to be able to set it higher just for the init run.

Recently I've realised that for our integration tests I can turn fsync off and our init scripts run about twice as fast, which reduces the time taken for test runs. Obviously it's not recommended to have fsync off during normal operation.

So this would be useful to us. For the time being we've hacked it by wrapping docker-entrypoint.sh in our own entrypoint script, which runs the following before executing the actual entrypoint script:

sed -i 's/-c listen_addresses=/-F -c full_page_writes=off -c synchronous_commit=off -c checkpoint_segments=100 -c listen_addresses=/' /docker-entrypoint.sh

Obviously this isn't particularly clean and is pretty prone to breaking should the structure of docker-entrypoint change
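
A wrapper along those lines might look like the sketch below. The function name `patch_entrypoint` and the guard that aborts when the expected text is missing are assumptions added here, not part of the original hack; the guard only makes the breakage mentioned above fail loudly instead of silently. `sed -i` is used as in the command above, which assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical wrapper helper: inject extra flags into the entrypoint's
# temporary-server options by editing the script in place.
patch_entrypoint() {
    # Abort if the text we rely on is no longer present in the file.
    if ! grep -q -e '-c listen_addresses=' "$1"; then
        echo "patch_entrypoint: expected pattern not found in $1" >&2
        return 1
    fi
    sed -i 's/-c listen_addresses=/-F -c full_page_writes=off -c synchronous_commit=off -c checkpoint_segments=100 -c listen_addresses=/' "$1"
}

# In a real wrapper entrypoint (assumed layout) the calls would be:
#   patch_entrypoint /docker-entrypoint.sh || exit 1
#   exec /docker-entrypoint.sh "$@"
```

Note that the patch is not idempotent: running it twice against the same file inserts the flags twice.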

@tianon
Member

tianon commented Jul 28, 2017

Interesting, so there are parameters that are useful to tweak while you're loading your initial dataset, but then you don't want those same parameter changes at runtime? Won't you potentially run into the same errors/warnings at runtime if you then reset those parameters after the initial data load?

@spmason
Author

spmason commented Jul 28, 2017

It depends on the setting

During normal operation I wouldn't expect to be getting enough load to hit the checkpoint_segments warning, and if I did it would probably need to be more carefully tuned. The initial data load just wants to go as fast as it can.

Turning off fsync massively helps with the speed of the initial data load, but you definitely don't want to run with it off during normal operation. For the initial load it's acceptable, because if your db crashes mid-load it's very hard to write a script that can be restarted halfway through anyway (at least, it's easier and safer to just start a new container, especially if it's fast).

@rgolangh

rgolangh commented Sep 6, 2017

I'd like this change as well. In oVirt we tweak the Postgres configuration for max_connections, autovacuum, etc.

I would add to that enabling include_dir by default in postgresql.conf, so one can pass a directory with custom configuration.

@JeanMertz

JeanMertz commented Nov 3, 2017

I'm looking for the same functionality as well.

Our use-case is running two postgres containers in the same pod on Kubernetes.

To do this, we currently

  • set POSTGRES_USER to postgres and postgres2,
  • set PGDATA to /var/lib/postgresql/data/{postgres,postgres2},
  • set the args to postgres -p 5430/5431
  • and expose ports 5430 on the first postgres container and 5431 on the other

This works, except that pg_ctl still uses 5432 for both containers, and thus a port collision occurs:

waiting for server to start....LOG:  could not bind IPv6 socket: Address already in use
HINT:  Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
LOG:  could not bind IPv4 socket: Address already in use
HINT:  Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
WARNING:  could not create listen socket for "localhost"
FATAL:  could not create any TCP/IP sockets
LOG:  database system is shut down
pg_ctl: could not start server
Examine the log output.
 stopped waiting

@JeanMertz

For our use-case, I've switched to https://github.com/sameersbn/docker-postgresql#creating-databases, which supports:

Additionally, more than one database can be created by specifying a comma separated list of database names in DB_NAME. For example, the following command creates two new databases named dbname1 and dbname2.

@driverpt

@tianon, sometimes we just want to tweak some minor configuration in the Postgres server without having to create and maintain a dedicated Docker image.
See the example in #391.

@tianon
Member

tianon commented Dec 14, 2017

@driverpt right, that's completely possible today; see @yosifkit's recent documentation PR: docker-library/docs#1095

@driverpt

@tianon, that has a problem: docker-entrypoint skips all the defaults if postgres is contained in the COMMAND part.

@yosifkit
Member

yosifkit commented Dec 14, 2017

@driverpt, I don't follow. When the command contains postgres (or is just flags beginning with -), that is exactly when all the initdb stuff happens. What is "skipped"?

$ docker run -d --name some-postgres postgres:10.1 -c 'shared_buffers=256MB' -c 'max_connections=20000'
4f5db51d1fe68d9b51c0f10b4a420ec431224b815c2649d23056f86c26a50e7d
$ docker run -it --rm --link some-postgres postgres psql -h some-postgres -U postgres
psql (10.1)
Type "help" for help.

postgres=# show shared_buffers;
 shared_buffers 
----------------
 256MB
(1 row)

postgres=# show max_connections;
 max_connections 
-----------------
 20000
(1 row)

postgres=# 

Regardless, this issue is about arguments to the temporary postgres server that is run only during initialization.

@gsf

gsf commented Sep 13, 2018

I submitted #499 due to the same needs as @spmason (sorry, I missed this in my search of issues). On initialization we're loading a lot of data, and modifying fsync and max_wal_size is crucial for performance. There is currently no way to pass options into the pg_ctl start command during setup.

@tianon
Member

tianon commented Nov 1, 2018

I think #496 is really our best bet here -- functionalizing the entrypoint would allow for trivially implementing edge-case custom behavior like this without explicit official support via a new environment variable.

@gsf

gsf commented Nov 5, 2018

Breaking the entrypoint into reusable functions is totally worth doing, but in this case I want the entrypoint to do everything it already does. I only need to pass options into pg_ctl start during setup, which seems like something the entrypoint should handle without an override.

@vyskocilm

I agree with @gsf. There are two modes I would like to have:

  1. a fast one (relax WAL as much as possible) during the DB dump load (the pg_ctl command in the docker-entrypoint.sh script)
  2. reasonable defaults for production.

Right now the official image lets me tweak 2, whereas this PR adds an easy way to tweak 1.

@tianon
Member

tianon commented Jun 23, 2020

This was resolved in #496 -- it allows for two things relevant to this proposal:

  1. the flags/options passed to the main postgres server now get passed to the temporary server:

    -o "$(printf '%q ' "$@")" \

    docker_temp_server_start "$@"

  2. users needing more complicated behavior now have an explicit framework for implementing that without repeating everything from the image entrypoint script (invoking docker_temp_server_start with alternate arguments, in the case of this proposal)
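
The `printf '%q '` idiom in point 1 is worth a short illustration. As far as I understand it, `pg_ctl -o` takes the server options as one string, so each argument has to be quoted to survive being parsed again. The sketch below shows only that quoting round trip in plain bash; the function name `quote_args` and the example flag values are chosen for illustration, and no Postgres is involved.

```shell
#!/usr/bin/env bash
# Quote every argument so the whole list can travel as one string.
# %q is a bash extension to printf, so this needs bash rather than plain sh.
quote_args() {
    printf '%q ' "$@"
}

# Example flags; the second value contains spaces and brackets on purpose.
quoted="$(quote_args -c 'shared_buffers=256MB' -c 'log_line_prefix=%m [%p] ')"
echo "as one string: $quoted"

# Re-parse the string the way a shell would, recovering the original arguments.
eval "set -- $quoted"
echo "argument count after re-parsing: $#"
printf 'arg: <%s>\n' "$@"
```

Without the quoting step, a value containing spaces would be split into several arguments once the string is parsed again.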

@tianon tianon closed this Jun 23, 2020