Fedora Docker-CE-Engine 20.10.13 consumes all available system memory (kernel 5.16.13) #43361

Closed
kevin0x90 opened this issue Mar 11, 2022 · 23 comments · Fixed by #45534


@kevin0x90

kevin0x90 commented Mar 11, 2022

Description
The issue occurred on Fedora release 35.
Kernel Information:
Linux 5.16.12-200.fc35.x86_64 #1 SMP PREEMPT Wed Mar 2 19:06:17 UTC 2022

When starting a docker-compose project that includes MySQL with Docker-CE-Engine version 20.10.13, it consumes all available system memory. With version 20.10.10 the issue does not occur and the docker-compose project requires only ~2GB of RAM.

Steps to reproduce the issue:

  1. Set up a docker-compose project that includes mysql:5.6.
  2. Run the project with docker-compose.
  3. Monitor memory usage, for example with an activity monitor.
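
The monitoring step can also be scripted. The following is a minimal sketch and not part of the original report: `mem_available_mib` is a hypothetical helper name, and it assumes a Linux host that exposes `MemAvailable` in `/proc/meminfo`.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): print MemAvailable in MiB.
# Reads /proc/meminfo by default; another file can be passed as $1 for testing.
mem_available_mib() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Usage sketch (run manually while the compose project starts):
#   while sleep 5; do
#     echo "host MemAvailable: $(mem_available_mib) MiB"
#     docker stats --no-stream --format '{{.Name}}: {{.MemUsage}}'
#   done
```

A steadily shrinking `MemAvailable` while the container is still in its entrypoint phase would match the behaviour described here.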

Describe the results you received:
All available system memory is consumed and the system stops working at some point.

Describe the results you expected:
I would expect roughly the same memory consumption as with the old, working version 20.10.10.

Additional information you deem important (e.g. issue happens only occasionally):
The issue was reproducible and only a downgrade to 20.10.10 could solve the issue.

Version info where the issue occurred:
Client: Docker Engine - Community
Version: 20.10.13
API version: 1.41
Go version: go1.16.15
Git commit: a224086
Built: Thu Mar 10 14:08:18 2022
OS/Arch: linux/amd64
Context: default
Experimental: true

Server: Docker Engine - Community
Engine:
Version: 20.10.13
API version: 1.41 (minimum version 1.12)
Go version: go1.16.15
Git commit: 906f57f
Built: Thu Mar 10 14:06:06 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.5.10
GitCommit: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
runc:
Version: 1.0.3
GitCommit: v1.0.3-0-gf46b6ba
docker-init:
Version: 0.19.0
GitCommit: de40ad0

Additional environment details (AWS, VirtualBox, physical, etc.):
physical hardware.

@vaceletm

I got the same issue with the mysql:5.7 image and version 20.10.13. Memory consumption is so high that the system starts swapping, and the startup sequence is extremely slow. It can easily be reproduced with docker run -i mysql:5.7

The problem doesn't seem to exist with mysql:8.0 and, as a matter of fact, everything worked with the previous Docker version.

@vaceletm

I downgraded to 20.10.12, 20.10.11 and 20.10.10 (the last 3 versions available in the official repo) and I still hit the same issue. It may be a kernel issue.

@thaJeztah
Member

Thanks for reporting; so to reproduce the issue, just a docker run -i mysql:5.7 (no other options) is sufficient?

If that's the case, that's odd indeed. As a workaround to prevent the system from running out of memory, you could of course add memory constraints to the container itself (but that wouldn't fix the underlying issue, just possibly prevent it from consuming all memory).

@thaJeztah
Member

Could you perhaps also add the output of docker info ? (that contains additional information, such as kernel version, storage driver etc); of course feel free to redact information where needed.

@vaceletm

Thanks for reporting; so to reproduce the issue, just a docker run -i mysql:5.7 (no other options) is sufficient?

Yes, it's as simple as that. The output will be the following for a while (while eating all the RAM) and eventually the init will continue.

2022-03-14 12:18:46+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.37-1debian10 started.

As a workaround to prevent the system from running out of memory, you could of course add memory constraints to the container itself

Actually, setting a memory constraint makes mysql init fail:

docker run --memory 1073741824 -i mysql:5.7
2022-03-14 12:25:12+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.37-1debian10 started.
2022-03-14 12:25:16+00:00 [ERROR] [Entrypoint]: mysqld failed while attempting to check config
	command was: mysqld --verbose --help --log-bin-index=/tmp/tmp.sz9LdwWe78

(Same command works with mysql:8.0)

docker info, here it is:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.0-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 12
  Running: 0
  Paused: 0
  Stopped: 12
 Images: 80
 Server Version: 20.10.13
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc version: v1.0.3-0-gf46b6ba
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.14.10-300.fc35.x86_64
 Operating System: Fedora Linux 35 (Workstation Edition)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.39GiB
 Name: localhost.localdomain
 ID: BVZQ:2MR3:XMZ6:OCVR:RHF2:SLKM:UIVC:KELR:PYSI:PW7R:2GX5:D3FB
 Docker Root Dir: /home/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Note that I had switched back to an older kernel version (5.14.10 here; I originally identified the issue with 5.16.13).

@thaJeztah changed the title from "Fedora Docker-CE-Engine 20.10.13 consumes all available system memory" to "Fedora Docker-CE-Engine 20.10.13 consumes all available system memory (kernel 5.16.13)" Mar 14, 2022
@vaceletm

FTR, @LeSuisse identified that a rollback to containerd.io-1.4.13-3.1.fc35 solves the problem

@kaittodesk

Using different kernels (5.14.18-300.fc35, 5.16.14-200.fc35) and Docker CE Engine versions (20.10.12, 20.10.11, 20.10.10) did not resolve the issue for me. Only downgrading to containerd.io-1.4.13-3.1.fc35 resolved the memory leak.

Here's my docker info output of the stable setup:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.0-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 12
  Running: 12
  Paused: 0
  Stopped: 0
 Images: 146
 Server Version: 20.10.13
 Storage Driver: btrfs
  Build Version: Btrfs v5.16.2 
  Library Version: 102
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9cc61520f4cd876b86e77edfeb88fbcd536d1f9d
 runc version: v1.0.3-0-gf46b6ba
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.16.13-200.fc35.x86_64
 Operating System: Fedora Linux 35 (Workstation Edition)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 23.23GiB
 Name: localhost.localdomain
 ID: [REDACTED]
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

@thaJeztah
Member

Are you seeing the same happening if you run the container through containerd?

Something like;

ctr image pull docker.io/library/mysql:5.7

ctr run --env MYSQL_ALLOW_EMPTY_PASSWORD=1 -t docker.io/library/mysql:5.7 mycontainer

@kaittodesk

Running the container through containerd (both 1.4.13-3.1.fc35 and 1.5.10-3.1.fc35) does not trigger the memory leak.

However, in order to run the container I had to do some mounting trickery (hopefully this does not turn it into an apples-to-oranges comparison):

cd /home/kait
mkdir run
chmod 777 run
ctr run --rm --mount "type=bind,src=/home/kait/run,dst=/var/run/mysqld,options=rbind:rw" --env MYSQL_ALLOW_EMPTY_PASSWORD=1 docker.io/library/mysql:5.7 mycontainer

Otherwise the container initialization would fail with error:

2022-03-18T07:28:20.058885Z 0 [ERROR] Could not create unix socket lock file /var/run/mysqld/mysqld.sock.lock.
2022-03-18T07:28:20.058893Z 0 [ERROR] Unable to setup unix socket lock file.
2022-03-18T07:28:20.058897Z 0 [ERROR] Aborting

And the server would shut down.

@vaceletm

Is there anything we can do here to move this forward?
Should we report the issue to Fedora as well?

@kevin0x90
Author

Is there any update about this?

@kevin0x90
Author

Small update for people using Fedora: after upgrading to Fedora 36 there is no way to downgrade containerd. I just learned this the hard way after upgrading, and the bug still exists 😅.

@kevin0x90
Author

This may be interesting for those who have also already upgraded to Fedora 36: I found a way to still downgrade to the working versions by specifying the Fedora release version in dnf:

#!/bin/bash
sudo dnf --releasever=35 downgrade docker-ce-3:20.10.10 docker-ce-cli-3:20.10.10 containerd.io-1.4.13

@yannis-rossetto

Hello,

I've upgraded my workstation to Fedora 36 with the latest versions of containerd.io and docker-ce and the issue is still here. Only the downgrade suggested by @kevin0x90 seems to provide a running MySQL container without consuming all the memory.

How can we help you to solve this issue?

@pprishchepa

Hello,

I've upgraded my workstation to Fedora 36 with the last versions of containerd.io and docker-ce and the issue is still here. Only the downgrade suggested by @kevin0x90 seems to provide a running MySQL container without consuming all the memory.

How can we help you to solve this issue?

This is how it works for me on Fedora 36:

Downgrade containerd.io as @kevin0x90 wrote:

sudo dnf --releasever=35 downgrade docker-ce-3:20.10.10 docker-ce-cli-3:20.10.10 containerd.io-1.4.13

Then freeze containerd.io version to prevent further upgrading:

sudo dnf install 'dnf-command(versionlock)'
sudo dnf versionlock containerd.io-1.4.13

@kevin0x90
Author

A useful addition regarding the versionlock: if you use GNOME Software for updates, it will ignore the versionlock set in dnf (https://bugzilla.redhat.com/show_bug.cgi?id=1671489). I just stumbled upon this recently.
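
Since the lock can be bypassed, it may help to verify after each update that the installed package still matches the pinned version. The sketch below is only an illustration: `version_matches_pin` is a hypothetical helper name, and the pinned value 1.4.13 is taken from the comments above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the installed version string
# is exactly the pinned version string.
version_matches_pin() {
  installed="$1"
  pin="$2"
  [ "$installed" = "$pin" ]
}

# Usage sketch (run manually after an update):
#   installed=$(rpm -q --queryformat '%{VERSION}' containerd.io)
#   version_matches_pin "$installed" "1.4.13" \
#     || echo "containerd.io is now $installed; the versionlock may have been bypassed"
#   sudo dnf versionlock list   # shows the locks dnf currently knows about
```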

@vaceletm

For the record, switching from docker-ce to moby and the related packages provided by Fedora solved the issue for me.

@pprishchepa

@vaceletm could you point me in the right direction for switching from docker-ce to moby?

@vaceletm

Here is the full script of what I had to do. Some of the changes might be related to the Compose v2 switch (BuildKit by default, but I didn't track down everything):

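$> sudo dnf install moby-engine --allowerasing
$> sudo systemctl edit docker
# add the following override in the editor that opens:
[Service]
LimitNOFILE=1024
$> sudo systemctl daemon-reload
# setenforce only accepts Enforcing/Permissive (or 1/0); 0 switches to permissive until reboot
$> sudo setenforce 0
$> sudo vim /etc/selinux/config
# set the following line to make permissive mode persistent:
SELINUX=permissive
$> sudo systemctl restart docker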

Be careful: this approach switches SELinux to permissive mode, so its policies are no longer enforced on your system and you might be at risk. Evaluate the consequences beforehand.
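
To confirm that the `LimitNOFILE` override from the script above was picked up, the effective value can be read back from systemd. This is a minimal sketch, not part of the original comment: `nofile_value` is a hypothetical helper name, and it assumes the `LimitNOFILE=<value>` line format that `systemctl show <unit> -p LimitNOFILE` prints.

```shell
#!/bin/sh
# Hypothetical helper: read "LimitNOFILE=<value>" lines on stdin
# and print only the value part.
nofile_value() {
  sed -n 's/^LimitNOFILE=//p'
}

# Usage sketch (run manually after restarting the service):
#   systemctl show docker -p LimitNOFILE | nofile_value
#   docker run --rm mysql:5.7 sh -c 'ulimit -n'   # limit as seen inside a container
```

If the two numbers differ, the container may be inheriting its limit from a different unit than the one that was overridden.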

@krnhotwings

Just came across this issue yesterday. Figured I'd provide additional info and describe which solution worked best for me.

Kernel: 5.18.18-200.fc36.x86_64
Docker version:

Client: Docker Engine - Community
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:03:59 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true
Server: Docker Engine - Community
 Engine:
  Version:          20.10.17
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.11
  Git commit:       a89b842
  Built:            Mon Jun  6 23:01:39 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.7
  GitCommit:        0197261a30bf81f1ee8e6a4dd2dea0ef95d67ccb
 runc:
  Version:          1.1.3
  GitCommit:        v1.1.3-0-g6724737
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker run --rm apache/airflow:2.3.3-python3.9 scheduler worked fine.
docker run --rm apache/airflow:2.3.4-python3.9 scheduler ate up memory.

I tried uninstalling docker-ce's docker-engine and installing Fedora's moby-engine, which worked, but I ran into SELinux issues as mentioned above.

What works decently well for me is Docker Desktop for Linux. I just enable "Start Docker Desktop when you log in" (and change other settings...), and then change the CLI's context via:

docker context ls
docker context use desktop-linux

What's nice is that you can run docker commands without sudo.

Other than running into UID-related issues, things seem to be working fine.

Client: Docker Engine - Community
 Cloud integration: v1.0.28
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:03:59 2022
 OS/Arch:           linux/amd64
 Context:           desktop-linux
 Experimental:      true

Server: Docker Desktop 4.11.1 (84025)
 Engine:
  Version:          20.10.17
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.11
  Git commit:       a89b842
  Built:            Mon Jun  6 23:01:23 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.6
  GitCommit:        10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc:
  Version:          1.1.2
  GitCommit:        v1.1.2-0-ga916309
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

@sam-thibault
Contributor

potiuk added a commit to potiuk/airflow that referenced this issue Feb 25, 2023
Apparently with some recent releases of some OSes where new
containerd has been released, default docker community edition
causes Airflow to immediately consume all memory.

This happens in Breeze and Docker Compose at least.

There is a workaround described in:
https://github.com/moby/moby/issues/43361#issuecomment-1227617516
to use Docker Desktop instead.

The issue is tracked in containerd in this issue - proposing to
revert the change (as it impacts other applications run in docker,
not only Airflow):
containerd/containerd#7566
potiuk added a commit to apache/airflow that referenced this issue Feb 25, 2023
@vyeve

vyeve commented Mar 3, 2023

Hi. I have Fedora 36 and solved this issue by changing LimitNOFILE=infinity to LimitNOFILE=1048576 in /usr/lib/systemd/system/containerd.service. After a reboot, everything works.
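
Files under /usr/lib/systemd/system belong to the package and are usually replaced on update, so the same setting can instead be kept in a systemd drop-in. The following is a hedged sketch rather than the commenter's method: `write_nofile_dropin` is a hypothetical helper name, `override.conf` is the file name `systemctl edit` conventionally uses, and the value 1048576 comes from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that sets LimitNOFILE.
#   $1 = drop-in directory, $2 = limit value
write_nofile_dropin() {
  dropin_dir="$1"
  limit="$2"
  mkdir -p "$dropin_dir" || return 1
  printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$dropin_dir/override.conf"
}

# Usage sketch (as root):
#   write_nofile_dropin /etc/systemd/system/containerd.service.d 1048576
#   systemctl daemon-reload && systemctl restart containerd
```

For a single container, passing `--ulimit nofile=1024:1024` to `docker run` sets the open-file limit for that container only and may be worth trying as a less invasive alternative; the thread does not confirm that it resolves the problem.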

@thaJeztah
Member

Looks like this is effectively a duplicate of / covered by #38814, and will be addressed by #45534

@thaJeztah closed this as not planned Jun 8, 2023
ahidalgob pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Jun 12, 2023

GitOrigin-RevId: de2889c2e9779177363d6b87dc9020bf210fdd72
ahidalgob pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Nov 7, 2023
This issue was closed.
9 participants