
docker-postgis

A simple docker container that runs PostGIS

Visit our page on the docker hub at: https://hub.docker.com/r/kartoza/postgis/

There are a number of other docker postgis containers out there. This one differentiates itself with the following features:

  • Provides SSL support out of the box and enforces SSL client connections
  • Connections are restricted to the docker subnet
  • A default database gis is created for you so you can use this container 'out of the box' when it runs with e.g. QGIS
  • Streaming replication and logical replication support included (turned off by default)
  • Ability to create multiple databases when starting the container.
  • Ability to create multiple schemas when starting the container.
  • Enable multiple extensions in the database when setting it up.
  • GDAL drivers are automatically registered for PostGIS raster.
  • Support for out-of-db rasters.

We will work to add more security features to this container in the future with the aim of making a PostGIS image that is ready to be used in a production environment (though probably not for heavy load databases).

There is a nice 'from scratch' tutorial on using this docker image on Alex Urquhart's blog here - if you are just getting started with docker, PostGIS and QGIS, we recommend that you read it and try out the instructions specified on the blog.

Tagged versions

The following convention is used for tagging the images we build:

kartoza/postgis:[POSTGRES_MAJOR_VERSION]-[POSTGIS_MAJOR_VERSION].[POSTGIS_MINOR_RELEASE]

So for example:

kartoza/postgis:17-3.5 Provides PostgreSQL 17.0, PostGIS 3.5

Note: We highly recommend that you use tagged versions because successive major versions of PostgreSQL write their database clusters into different database directories. If you use persistent volumes for your database storage, an untagged upgrade will cause your database to appear to be empty.

Getting the image

There are various ways to get the image onto your system:

The preferred way (though it uses the most bandwidth for the initial image) is to pull our docker trusted build like this:

docker pull kartoza/postgis:image_version
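
The image_version placeholder stands for a tag that follows the convention from the "Tagged versions" section. As a minimal sketch, the tag can be composed from its parts in a shell script. The variable names mirror the convention, the values are only an example, and the pull command is printed rather than executed.

```shell
# Example values only; pick the versions you actually need.
POSTGRES_MAJOR_VERSION=17
POSTGIS_MAJOR_VERSION=3
POSTGIS_MINOR_RELEASE=5

# Compose the tag following the documented convention.
IMAGE="kartoza/postgis:${POSTGRES_MAJOR_VERSION}-${POSTGIS_MAJOR_VERSION}.${POSTGIS_MINOR_RELEASE}"

# Print the pull command; remove "echo" to actually pull the image.
echo docker pull "${IMAGE}"
```

Keeping the versions in variables makes it harder to fall back to an untagged image by accident.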

Building the image

Self build using Repository checkout

To build the image yourself do:

docker build -t kartoza/postgis https://github.com/kartoza/docker-postgis.git

Alternatively, clone the repository and check out any preferred branch:

git clone https://github.com/kartoza/docker-postgis.git
cd docker-postgis
git checkout branch_name

Then do:

docker build -t kartoza/postgis .

Or build against a specific PostgreSQL version, tagging the image with that version:

docker build --build-arg POSTGRES_MAJOR_VERSION=13 --build-arg POSTGIS_MAJOR=3 -t kartoza/postgis:13 .

Alternative base distributions builds

There are build args for DISTRO (=debian), IMAGE_VERSION (=buster) and IMAGE_VARIANT (=slim) which can be used to control the base image. The base image still needs to be Debian based and able to use the official PostgreSQL apt repo.

For example, to make an Ubuntu 20.04 based build (for better arm64 support), edit the .env file to change the build arguments:

DISTRO=ubuntu 
IMAGE_VERSION=focal 
IMAGE_VARIANT="" 

Then run the script

./build.sh

Locales

By default, the image build will include all locales to cover any value for locale settings such as DEFAULT_COLLATION, DEFAULT_CTYPE or DEFAULT_ENCODING.

You can use the build argument: GENERATE_ALL_LOCALE=0

This will build with the default locale only and speed up the build considerably.

Environment variables

Cluster Initializations

With a minimal setup, our image will use an initial cluster located at the path given by the DATADIR environment variable. If you want persistence, mount this location into a volume or host directory. By default, DATADIR points to /var/lib/postgresql/{major-version}. You can instead mount the parent location like this:

-v data-volume:/var/lib/postgresql

This default cluster will be initialized with the default locale setting C.UTF-8. If you want to create a new cluster with your own settings instead of using the default cluster, you need to specify a different, empty directory, like this:

-v data-volume:/opt/postgres/data \
-e DATADIR=/opt/postgres/data \
-e DEFAULT_ENCODING="UTF8" \
-e DEFAULT_COLLATION="id_ID.utf8" \
-e DEFAULT_CTYPE="id_ID.utf8" \
-e PASSWORD_AUTHENTICATION="md5" \
-e INITDB_EXTRA_ARGS="<some more initdb command args>" \
-v pgwal-volume:/opt/postgres/pg_wal \
-e POSTGRES_INITDB_WALDIR=/opt/postgres/pg_wal

The container will use the above parameters to initialize a new database cluster in the specified directory. If the directory is not empty, the initialization parameters are ignored.

The following initialization parameters are only used when a new cluster is initialized. If the container uses an existing cluster (for example, when the container restarts), they are ignored.
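
Put together, the options above might be combined into one docker run invocation as sketched below. The function name, container name and image tag are assumptions made for this example, and INITDB_EXTRA_ARGS is left out because its value depends on your needs. The DOCKER variable is a convenience of this sketch (not a feature of the image): it defaults to docker and is set to echo here so that the command is only printed.

```shell
# DOCKER may be overridden (for example DOCKER=echo for a dry run).
start_custom_cluster() {
  "${DOCKER:-docker}" run -d --name postgis-custom-cluster \
    -v data-volume:/opt/postgres/data \
    -e DATADIR=/opt/postgres/data \
    -e DEFAULT_ENCODING="UTF8" \
    -e DEFAULT_COLLATION="id_ID.utf8" \
    -e DEFAULT_CTYPE="id_ID.utf8" \
    -e PASSWORD_AUTHENTICATION="md5" \
    -v pgwal-volume:/opt/postgres/pg_wal \
    -e POSTGRES_INITDB_WALDIR=/opt/postgres/pg_wal \
    kartoza/postgis:17-3.5
}

# Dry run: prints the arguments instead of starting a container.
DOCKER=echo start_custom_cluster
```

Calling start_custom_cluster without the DOCKER=echo prefix starts the container for real.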

  • DEFAULT_ENCODING: cluster encoding
  • DEFAULT_COLLATION: cluster collation
  • DEFAULT_CTYPE: cluster ctype
  • WAL_SEGSIZE: WAL segsize option
  • PASSWORD_AUTHENTICATION: password authentication method (for example md5, as used above)
  • INITDB_EXTRA_ARGS: extra arguments that will be passed down to the initdb command
  • POSTGRES_INITDB_WALDIR: tells Postgres where the initial WAL directory is located. Note: You must always mount a persistent volume at this location. Postgres expects the directory to always be available, even though it no longer needs the environment variable after initialization. If you do not persist this location, Postgres will not be able to find the pg_wal directory and will consider the instance to be broken.

In addition, the RECREATE_DATADIR parameter can be used to force database re-initialization. Setting it to TRUE acts as explicit consent to delete DATADIR and create a new database cluster.

  • RECREATE_DATADIR: Force database re-initialization in the location DATADIR

If you used RECREATE_DATADIR and successfully created a new cluster, remember to remove the parameter afterwards. Otherwise, a new database cluster will be recreated every time the container restarts.

Postgres Encoding

The database cluster is initialized with the following encoding settings:

-E "UTF8" --lc-collate="en_US.UTF-8" --lc-ctype="en_US.UTF-8"

or, if you use the default DATADIR location:

-E "UTF8" --lc-collate="C.UTF-8" --lc-ctype="C.UTF-8"

If you need to set up a database cluster with other encoding parameters, you need to pass the following environment variables when you initialize the cluster.

  • -e DEFAULT_ENCODING="UTF8"
  • -e DEFAULT_COLLATION="en_US.UTF-8"
  • -e DEFAULT_CTYPE="en_US.UTF-8"

A new cluster can be initialized by using a different DATADIR location and mounting an empty volume, or by using the RECREATE_DATADIR parameter to forcefully delete the current cluster and create a new one. Make sure to remove RECREATE_DATADIR after creating the cluster.

See the postgres documentation about encoding for more information.

PostgreSQL extensions

The container ships with some default extensions, namely postgis, hstore, postgis_topology, postgis_raster and pgrouting.

You can use the environment variable POSTGRES_MULTIPLE_EXTENSIONS to activate a subset of these or additional extensions, e.g.

-e POSTGRES_MULTIPLE_EXTENSIONS=postgis,hstore,postgis_topology,postgis_raster,pgrouting

Note: Some extensions require extra configuration to run properly; without it they will cause the container to exit. Users should also consult the documentation for the specific extension, e.g. timescaledb, pg_cron, pgrouting.

You can also install a specific version of an extension, e.g.

POSTGRES_MULTIPLE_EXTENSIONS=postgis,pgrouting:3.4.0

Here, in pgrouting:3.4.0, the extension name is followed by the version, with a colon as the delimiter.
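
To make the name:version format concrete, the sketch below splits each entry of such a list at the colon using POSIX parameter expansion. It only illustrates the format and is not the image's own startup script; the loop variables and the "default" marker for unpinned entries are made up for this example.

```shell
POSTGRES_MULTIPLE_EXTENSIONS="postgis,pgrouting:3.4.0"

old_ifs=$IFS
IFS=,                                # split the list on commas
for entry in $POSTGRES_MULTIPLE_EXTENSIONS; do
  name=${entry%%:*}                  # text before the first colon
  case $entry in
    *:*) version=${entry#*:} ;;      # text after the first colon
    *)   version="default" ;;        # no version pinned for this entry
  esac
  echo "extension=${name} version=${version}"
done
IFS=$old_ifs                         # restore normal word splitting
```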

Note: In some cases, some versions of extensions might not be available for install. To enable them you can do the following inside the container:

wget --directory-prefix /usr/share/postgresql/15/extension/ https://raw.githubusercontent.com/postgres/postgres/master/contrib/hstore/hstore--1.1--1.2.sql

Then proceed to install it the normal way.

Shared preload libraries

Some PostgreSQL extensions require shared_preload_libraries to be specified in the configuration files. Using the environment variable SHARED_PRELOAD_LIBRARIES, you can pass comma-separated values that correspond to the extensions defined in the environment variable POSTGRES_MULTIPLE_EXTENSIONS.

The libraries loaded by default are pg_cron,timescaledb if the image is built with timescale support; otherwise only pg_cron is loaded. You can pass the environment variable like this:

  -e SHARED_PRELOAD_LIBRARIES='pg_cron,timescaledb'

Note: You cannot pass the environment variable SHARED_PRELOAD_LIBRARIES without also specifying the PostgreSQL extensions that correspond to its values. Doing so will cause the container to exit immediately.
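
Because a mismatch makes the container exit, it can help to compare the two variables before starting it. The check below is a minimal sketch that runs on the host, not part of the image. It assumes each library has the same name as its extension (as is the case for pg_cron and timescaledb) and that the extension list contains plain names without version pins.

```shell
POSTGRES_MULTIPLE_EXTENSIONS="postgis,hstore,pg_cron,timescaledb"
SHARED_PRELOAD_LIBRARIES="pg_cron,timescaledb"

missing=""
old_ifs=$IFS
IFS=,                                # split the library list on commas
for lib in $SHARED_PRELOAD_LIBRARIES; do
  # Wrap both sides in commas so that only whole names match.
  case ",${POSTGRES_MULTIPLE_EXTENSIONS}," in
    *",${lib},"*) ;;                 # library has a matching extension
    *) missing="${missing}${missing:+ }${lib}" ;;
  esac
done
IFS=$old_ifs

if [ -n "$missing" ]; then
  echo "Missing from POSTGRES_MULTIPLE_EXTENSIONS: $missing" >&2
else
  echo "All preload libraries have a matching extension"
fi
```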

Basic configuration

You can use the following environment variables to pass a username, password and/or default database name (or multiple comma-separated database names).

  • -e POSTGRES_USER=<PGUSER>

  • -e POSTGRES_PASS=<PGPASSWORD>

    Note: You should use a strong password. If you are using docker-compose, make sure docker can interpolate the password. For example, if the password contains a $, you will need to escape it as $$.

  • -e POSTGRES_DBNAME=<PGDBNAME>

  • -e SSL_CERT_FILE=/your/own/ssl_cert_file.pem

  • -e SSL_KEY_FILE=/your/own/ssl_key_file.key

  • -e SSL_CA_FILE=/your/own/ssl_ca_file.pem

  • -e DEFAULT_ENCODING="UTF8"

  • -e DEFAULT_COLLATION="en_US.UTF-8"

  • -e DEFAULT_CTYPE="en_US.UTF-8"

  • -e POSTGRES_TEMPLATE_EXTENSIONS=true Specifies whether extensions will also be installed in the template1 database.

  • -e ACCEPT_TIMESCALE_TUNING=TRUE Useful to tune the PostgreSQL conf based on timescaledb-tune. Defaults to FALSE.

  • -e TIMESCALE_TUNING_PARAMS Useful to configure non-default settings to use when running with ACCEPT_TIMESCALE_TUNING=TRUE. This defaults to empty so that the default settings provided by timescaledb-tune are used. Example:

    docker run -it --name timescale -e ACCEPT_TIMESCALE_TUNING=TRUE \
      -e POSTGRES_MULTIPLE_EXTENSIONS=postgis,hstore,postgis_topology,postgis_raster,pgrouting,timescaledb \
      -e TIMESCALE_TUNING_PARAMS="-cpus=4" kartoza/postgis:17-3.5

Note: The ACCEPT_TIMESCALE_TUNING environment variable will overwrite all configurations with those based on the timescale configurations.
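
For the POSTGRES_PASS note about docker-compose, the escaping can be done mechanically. The sketch below doubles every $ in a password with sed before it is pasted into a compose file; the password and variable names are hypothetical.

```shell
# Hypothetical password containing a dollar sign.
raw_password='s3cr$t'

# Double every "$" so that docker-compose treats it as a literal character.
compose_password=$(printf '%s' "$raw_password" | sed 's/\$/$$/g')

echo "POSTGRES_PASS=${compose_password}"
```

The single quotes around the password matter: with double quotes the shell itself would try to expand $t before sed sees it.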

Schema Initialization

  • -e SCHEMA_NAME=<PGSCHEMA> You can pass a comma-separated list of schema names which will be created when the database initializes. The default behavior is to create the schemas in the first database specified in the environment variable POSTGRES_DBNAME. If you need to create matching schemas in all the databases that will be created, set the environment variable ALL_DATABASES=TRUE.
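
As an illustration, the following sketch requests two databases and two schemas, with ALL_DATABASES=TRUE so that both schemas are created in both databases. The database names, schema names, container name and function name are made up for this example. DOCKER is a convenience of this sketch: it defaults to docker and is set to echo here so that the command is only printed.

```shell
# DOCKER may be overridden (for example DOCKER=echo for a dry run).
start_postgis_schemas() {
  "${DOCKER:-docker}" run -d --name postgis-schemas \
    -e POSTGRES_DBNAME=gis,cadastre \
    -e SCHEMA_NAME=roads,parcels \
    -e ALL_DATABASES=TRUE \
    kartoza/postgis:17-3.5
}

# Dry run: prints the arguments instead of starting a container.
DOCKER=echo start_postgis_schemas
```

Without ALL_DATABASES=TRUE, the schemas would only be created in gis, the first database listed.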

Configure archive mode

This image uses the initial PostgreSQL values, which disable archiving by default. When ARCHIVE_MODE is changed to on, the archive command will copy WAL files to /opt/archivedir.

More info: 19.5. Write Ahead Log

  • -e ARCHIVE_MODE=off
  • -e ARCHIVE_COMMAND="test ! -f /opt/archivedir/%f && cp %p /opt/archivedir/%f" More info
  • -e ARCHIVE_CLEANUP_COMMAND="pg_archivecleanup /opt/archivedir %r"
  • -e RESTORE_COMMAND='cp /opt/archivedir/%f "%p"'
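
The default ARCHIVE_COMMAND only copies a WAL segment if it is not already present in the archive, and returns a non-zero status otherwise, which PostgreSQL treats as a failed archive attempt. The sketch below replays that behavior outside the container: temporary directories stand in for the WAL directory and /opt/archivedir, and the shell variables p and f play the role of the %p and %f placeholders. The segment name is an arbitrary example.

```shell
# Temporary directories stand in for the WAL directory and /opt/archivedir.
waldir=$(mktemp -d)
archivedir=$(mktemp -d)

f=000000010000000000000001      # %f: file name of the WAL segment
p="$waldir/$f"                  # %p: path of the segment to archive
printf 'wal segment' > "$p"

# First call: the segment is not archived yet, so it is copied (status 0).
test ! -f "$archivedir/$f" && cp "$p" "$archivedir/$f"
first=$?

# Second call: the file already exists, so the command fails (non-zero)
# instead of overwriting the archived copy.
test ! -f "$archivedir/$f" && cp "$p" "$archivedir/$f"
second=$?

echo "first=$first second=$second"
rm -rf "$waldir" "$archivedir"
```

Refusing to overwrite an existing file protects the archive if two servers are accidentally pointed at the same directory.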

Configure WAL level

  • -e WAL_LEVEL=replica

    More info.

Maximum size to let the WAL grow to between automatic WAL checkpoints:

  • -e WAL_SIZE=4GB

  • -e MIN_WAL_SIZE=2048MB

  • -e WAL_SEGSIZE=1024

  • -e MAINTENANCE_WORK_MEM=128MB

Configure networking

You can open up the PG port by using the following environment variable. By default, the container will allow connections only from the docker private subnet.

  • -e ALLOW_IP_RANGE=<0.0.0.0/0>

By default, the Postgres conf is set up to listen to all connections. If you need to restrict which IP addresses PostgreSQL listens on, you can define them with the following environment variable:

  • -e IP_LIST=<*>
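
As a sketch of how the two settings combine, the command below publishes the PostgreSQL port and allows connections from a single LAN subnet. The subnet 192.168.1.0/24, the container name and the function name are hypothetical, and the port mapping assumes PostgreSQL listens on its standard port 5432 inside the container. DOCKER is a convenience of this sketch: it defaults to docker and is set to echo here so that the command is only printed.

```shell
# DOCKER may be overridden (for example DOCKER=echo for a dry run).
start_postgis_lan() {
  "${DOCKER:-docker}" run -d --name postgis-lan \
    -p 5432:5432 \
    -e ALLOW_IP_RANGE=192.168.1.0/24 \
    -e IP_LIST='*' \
    kartoza/postgis:17-3.5
}

# Dry run: prints the arguments instead of starting a container.
DOCKER=echo start_postgis_lan
```

The quotes around * keep the shell from expanding it to file names in the current directory.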

Additional configuration