Version 11 terminates unexpectedly #520

chranmat · 2018-10-30T10:04:38Z

Today, when trying to build and run my application, I discovered that Postgres terminates unexpectedly during the application's initialization.

On further investigation, I can see that the latest Postgres image is now version 11. (I hadn't pinned a specific version in my Dockerfile.)

My application does a bunch of initialization tasks on start, and Postgres terminates during one of these.

Going back to version 10.5 solves the issue for me, but there is certainly a problem either with Postgres itself or with the default pg11 configuration in the Docker image.

Here is my log output:

api_1 | Exception: SQLSTATE[HY000]: General error: 7 server closed the connection unexpectedly
api_1 | This probably means the server terminated abnormally
api_1 | before or while processing the request.
api_1 | Exception 'PDOException' with message 'SQLSTATE[HY000]: General error: 7 no connection to the server'
db_1 | 2018-10-30 09:48:50.610 UTC [1] LOG: server process (PID 71) was terminated by signal 11: Segmentation fault
db_1 | 2018-10-30 09:48:50.610 UTC [1] DETAIL: Failed process was running: DELETE FROM "dashboard_layouts" WHERE "position"=$1
db_1 | 2018-10-30 09:48:50.610 UTC [1] LOG: terminating any other active server processes
db_1 | 2018-10-30 09:48:50.610 UTC [67] WARNING: terminating connection because of crash of another server process
db_1 | 2018-10-30 09:48:50.610 UTC [67] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
db_1 | 2018-10-30 09:48:50.610 UTC [67] HINT: In a moment you should be able to reconnect to the database and repeat your command.
db_1 | 2018-10-30 09:48:50.613 UTC [1] LOG: all server processes terminated; reinitializing
db_1 | 2018-10-30 09:48:50.627 UTC [72] LOG: database system was interrupted; last known up at 2018-10-30 09:48:46 UTC
db_1 | 2018-10-30 09:48:50.944 UTC [72] LOG: database system was not properly shut down; automatic recovery in progress
db_1 | 2018-10-30 09:48:50.950 UTC [72] LOG: redo starts at 0/1654800
db_1 | 2018-10-30 09:48:50.971 UTC [72] LOG: invalid record length at 0/17C72F8: wanted 24, got 0
db_1 | 2018-10-30 09:48:50.971 UTC [72] LOG: redo done at 0/17C72D0
db_1 | 2018-10-30 09:48:50.971 UTC [72] LOG: last completed transaction was at log time 2018-10-30 09:48:50.600994+00
db_1 | 2018-10-30 09:48:51.054 UTC [1] LOG: database system is ready to accept connections
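
For anyone sifting through a longer log, the two lines that matter above are the `terminated by signal 11` report and the `DETAIL` line naming the statement that was running. Below is a minimal Python sketch that pairs them up. The function name `find_crashes` and both regular expressions are assumptions modeled only on the lines in this log; a different `log_line_prefix` or PostgreSQL version may need different patterns.

```python
import re

# Patterns modeled on the two log lines shown above (an assumption; other
# PostgreSQL versions or log_line_prefix settings may format them differently).
CRASH_RE = re.compile(r"server process \(PID (\d+)\) was terminated by signal (\d+)")
QUERY_RE = re.compile(r"DETAIL:\s+Failed process was running: (.*)$")


def find_crashes(lines):
    """Return a list of (pid, signal, statement) for each crash report.

    statement is None when no DETAIL line follows the crash report.
    """
    crashes = []
    pending = None  # (pid, signal) of a crash still waiting for its DETAIL line
    for line in lines:
        crash = CRASH_RE.search(line)
        if crash:
            if pending:
                crashes.append((pending[0], pending[1], None))
            pending = (int(crash.group(1)), int(crash.group(2)))
            continue
        query = QUERY_RE.search(line)
        if query and pending:
            crashes.append((pending[0], pending[1], query.group(1).strip()))
            pending = None
    if pending:
        crashes.append((pending[0], pending[1], None))
    return crashes


sample = [
    'db_1 | 2018-10-30 09:48:50.610 UTC [1] LOG: server process (PID 71) was terminated by signal 11: Segmentation fault',
    'db_1 | 2018-10-30 09:48:50.610 UTC [1] DETAIL: Failed process was running: DELETE FROM "dashboard_layouts" WHERE "position"=$1',
]
print(find_crashes(sample))
```

Feeding it the output of `docker-compose logs db` line by line should surface the failing statement without reading the whole recovery sequence.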

wglambert · 2018-10-30T19:33:30Z

Can you give all the commands you ran and any relevant contextual information or files for reproducing the issue?

yosifkit · 2018-10-30T22:06:30Z

The most likely culprit is that the data volume you had for postgres was created by version 10.5; PostgreSQL cannot read data directories from older major versions and cannot auto-upgrade them. Related to and duplicate of #37.

chranmat · 2018-10-31T07:38:44Z

@yosifkit, there was no previous data volume attached to the container. I've read case #37 that you refer to; how can you possibly assume this and close the case?

chranmat · 2018-10-31T07:43:45Z

@wglambert, I will see if I can reproduce it without providing all sources :)

erikdstock · 2018-11-02T13:04:29Z

Edit: postgres:10.5 worked for me as well.
I'm having what seems to be the exact same issue.

db_1   | 2018-11-02 01:00:22.248 UTC [1] LOG:  server process (PID 57) was terminated by signal 11: Segmentation fault
db_1   | 2018-11-02 01:00:22.248 UTC [1] DETAIL:  Failed process was running: COMMIT
db_1   | 2018-11-02 01:00:22.249 UTC [1] LOG:  terminating any other active server processes
db_1   | 2018-11-02 01:00:22.249 UTC [52] WARNING:  terminating connection because of crash of another server process
db_1   | 2018-11-02 01:00:22.249 UTC [52] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
db_1   | 2018-11-02 01:00:22.249 UTC [52] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
db_1   | 2018-11-02 01:00:22.254 UTC [1] LOG:  all server processes terminated; reinitializing
db_1   | 2018-11-02 01:00:22.281 UTC [58] LOG:  database system was interrupted; last known up at 2018-11-02 00:52:29 UTC
db_1   | 2018-11-02 01:00:22.282 UTC [59] FATAL:  the database system is in recovery mode
db_1   | 2018-11-02 01:00:22.758 UTC [58] LOG:  database system was not properly shut down; automatic recovery in progress
db_1   | 2018-11-02 01:00:22.762 UTC [58] LOG:  redo starts at 0/183B4E8
db_1   | 2018-11-02 01:00:22.763 UTC [58] LOG:  redo done at 0/183B4E8
db_1   | 2018-11-02 01:00:22.785 UTC [1] LOG:  database system is ready to accept connections

I can try to provide more context around this but here are a few basics:

- I had recently updated a django/wagtail cms from sqlite/docker to postgres/docker-compose.
- The error occurs in a django bootstrapping script after build: the manage.py migrate and createadminuser tasks work fine, but a custom task fails on its first db operation.
- It worked fine on a handful of development machines (all osx).
- It did not work on another 2 machines, which gave the segfault above. These two machines were building later and have a different image id.
- I also tried removing all associated volumes/images/containers and starting from a freshly cloned repo. This included a clean checkout in a new directory, since I was worried some bind mount was shadowing files in the running container.

izakp · 2018-11-02T17:29:02Z

Same problem here - getting terminated by signal 11: Segmentation fault when trying to load data, starting with no initial data or volume mounted to the /var/lib/postgresql/data directory. As with @chranmat the postgres:10.5 image works as expected. @yosifkit I don't think this is a duplicate of #37 either... will you reopen?

wglambert · 2018-11-02T17:59:24Z

So @izakp, you're getting this error from a blank startup with no data mounted? I think your example would be the most concise for reproducing the issue. Could you post any relevant files that you have?

erikdstock · 2018-11-02T18:00:34Z

(updated my comment to point out that 10.5 did work for me- I double-checked it after seeing izakp's comment)

raarts · 2018-11-05T16:39:43Z

I don't know if I have the exact same problem, but it's also in a DELETE query, and I can reproduce it with odoo on my mac:

docker run --name postgres -d postgres:11-alpine -c log_min_duration_statement=0

Enter the container and create an odoo user:

CREATE ROLE odoo with LOGIN CREATEDB PASSWORD 'odoo';

and now run odoo:

docker run -it --link postgres:postgres -p 8000:8069 -e DB_PORT_5432_TCP_ADDR=postgres -d odoo

Now:

- connect to localhost:8000
- create a database (fill in some email address/password, and choose English & United States, no demo data)
- install the 'Website' module (it's at the top).

This fires a lot of queries, and after a while postgres crashes; the odoo container shows the query it crashed on. Issuing that query by hand repeats the crash.

The crash does not happen with v10.

EDIT: I found that only deleting very long rows crashed the server.

tianon · 2018-11-05T16:49:21Z

Sounds like it's probably a bug in PostgreSQL 11 -- is it only something that can be reproduced on the Alpine variant? Can you reproduce with PostgreSQL 11 installed outside Docker? It sounds like it's probably worth trying to create a more minimal reproducer (starting all of "odoo" is a bit heavy, heh) and reporting upstream if it can be reliably reproduced on the officially-supported upstream packages as well.

izakp · 2018-11-05T17:25:03Z

@wglambert replying to your questions above...

> you're getting this error from a blank startup with no data mounted?

Sorry, not quite this... Postgres successfully initializes in the container from a blank data directory with nothing mounted. Once it is up and running, when I try and batch INSERT data into the server, it segfaults. Unfortunately, I can't post the data here as it's sensitive.

> I think your example would be the most concise for reproducing the issue, could you post any relevant files that you have

I'll try and narrow it down to the specific query that topples it during the importer.

labkey-tchad · 2018-11-08T16:41:21Z

This appears to be a known issue in Postgres 11.0.
It is fixed in 11.1.

yosifkit · 2018-11-09T01:22:53Z

11.1 will be built and pushed once docker-library/official-images#5054 merges.

You can test early by building the current 11 context: https://github.com/docker-library/postgres/tree/64bec4b1617291e3646e4e7dbbae1174404c3fd9/11.

raarts · 2018-11-09T09:59:07Z

Thanks, will test and report back.

tedivm · 2018-11-10T23:09:29Z

I'm seeing the same issue and the new 11.1 image has not resolved it.

yosifkit · 2018-11-12T22:05:27Z

I am unable to reproduce the crash on 11.1. I used the query from the linked bug. It does reliably crash 11.0 but not 11.1:

create table foo (a int primary key, b int);
create table bar (a int references foo on delete cascade, b int);
insert into foo values (1, 1);
insert into foo values (2, 2);
alter table foo add c int;
alter table foo drop c;
delete from foo;

(https://www.postgresql.org/message-id/9cb4aa1c-12ba-59c3-fd75-545fa90fb92f%40lab.ntt.co.jp)

So the linked bug does fix one crash. If you are having other crashes, then it is still probably an upstream bug and you should report a minimal reproducer to them.
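
To repeat this check against another tag, the statements above can be piped into `psql` inside a running container. This is a rough Python sketch, not a tested harness: the container name `pg11-test` and the helper names `psql_command` and `run_reproducer` are hypothetical, and it assumes the image's default `postgres` superuser and a container that is already up.

```python
import subprocess

# The statements quoted above, verbatim.
REPRODUCER_SQL = """\
create table foo (a int primary key, b int);
create table bar (a int references foo on delete cascade, b int);
insert into foo values (1, 1);
insert into foo values (2, 2);
alter table foo add c int;
alter table foo drop c;
delete from foo;
"""


def psql_command(container, user="postgres"):
    """Build a `docker exec` command that runs psql in the container, reading SQL from stdin."""
    return ["docker", "exec", "-i", container,
            "psql", "-U", user, "-v", "ON_ERROR_STOP=1"]


def run_reproducer(container):
    """Feed the statements to psql and return its exit status.

    A nonzero status means psql stopped early, for example because the
    server closed the connection; check the container logs to confirm why.
    """
    result = subprocess.run(psql_command(container), input=REPRODUCER_SQL,
                            text=True, capture_output=True)
    return result.returncode
```

Calling `run_reproducer("pg11-test")` against one container per tag gives a quick side-by-side comparison. The exit status alone does not prove a segfault, so the `terminated by signal 11` line in the server log remains the real evidence.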

raarts · 2018-11-12T22:23:55Z

Can confirm that the odoo DELETE bug is also fixed in 11.1. Thanks!

tedivm · 2018-11-12T23:51:37Z

Thanks all. We've downgraded back to 10.5 and our issues have gone away. When I can properly replicate it I'll report it upstream.

tianon · 2018-11-21T22:36:29Z

👍

JaneJeon · 2021-12-25T06:50:36Z

At least on GitHub Actions, v13 & v14 reliably fail with this issue (I have confirmed v10.5-12 works). Just stand up a container with a small app to connect to it as "test"; it will then always fail on CI. However, it doesn't fail locally, which makes debugging all the more tricky.

wglambert added the "question" label (Usability question, not directly related to an error with the image) Oct 30, 2018

yosifkit closed this as completed Oct 30, 2018

wglambert reopened this Nov 2, 2018

wglambert added the "Issue" label and removed the "question" label (Usability question, not directly related to an error with the image) Nov 2, 2018

tianon closed this as completed Nov 21, 2018

JaneJeon added a commit to JaneJeon/objection-authorize that referenced this issue Dec 25, 2021

https://github.com/docker-library/postgres/issues/520

32d656b
