Docker fails to start containers with cgroup memory allocation error. #841
Comments
Do you have swap active? Try to disable the swap! |
Swap is disabled on the host. |
This could be related to a bug in the RHEL/CentOS kernels where kernel-memory cgroups don't work properly. We included a workaround in later versions of Docker that disables this feature: moby/moby#38145 (backported to Docker 18.09 and up in docker-archive/engine#121). Note that Docker 18.06 has reached EOL and won't be updated with this fix, so I recommend updating to a current version. I'm closing this issue because of the above, but feel free to continue the conversation. |
Hello.
|
We are also seeing this issue in our cluster.
These are the software versions that we are on. Could you please advise?
Thanks |
@thaJeztah I'm facing the exact same issue in my environment.
|
same here, RedHat 7.7. |
This is a continuation of this kernel bug, at least on RH: |
repros on CentOS 7 |
Same issue here. CentOS 7, Docker version 19.03.5, build 633a0ea, provisioned via Nomad. Log:
|
This should be fixed in kernel-3.10.0-1075.el7. |
Same issue here: To resolve this issue we are going to replace the kernel with kernel-lt 4.4.206 from elrepo. We are still using iptables, so first we will need to reconfigure our hosts for nftables usage. |
Just so you know, we've tried with various 4.x kernels as well and had the same issue. |
Can you list affected 4.x kernels please? Thank you! |
It took me around a week to trigger the issue, until I rebooted the host. |
I am also facing this issue with the mentioned Docker (19.03.5) and kernel (kernel-3.10.0-1062) versions on RHEL 7.7. Could you also tell me where I should add this parameter? |
@kanthasamyraja edit /etc/default/grub, then update the grub config. |
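For anyone unsure what that edit looks like, here is a minimal sketch in Python. It assumes the file has a single-line, double-quoted `GRUB_CMDLINE_LINUX="..."` entry, as is typical on RHEL/CentOS 7. The helper name `add_kernel_param` is made up for illustration, and the function only transforms text; it does not write to `/etc/default/grub`.

```python
import re


def add_kernel_param(grub_text: str, param: str = "cgroup.memory=nokmem") -> str:
    """Return grub_text with `param` appended to GRUB_CMDLINE_LINUX if it is absent.

    Assumption: the entry is a single double-quoted line. Other layouts
    (single quotes, line continuations) are left untouched.
    """
    out = []
    for line in grub_text.splitlines():
        m = re.match(r'^(GRUB_CMDLINE_LINUX=)"(.*)"\s*$', line)
        if m and param not in m.group(2).split():
            args = (m.group(2) + " " + param).strip()
            line = f'{m.group(1)}"{args}"'
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"\n'
    print(add_kernel_param(sample), end="")
```

After the real file is changed, the grub config still has to be regenerated (on BIOS-booted RHEL/CentOS 7 that is usually `grub2-mkconfig -o /boot/grub2/grub.cfg`; the output path differs on UEFI systems) and the host rebooted before the parameter takes effect.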
@kanthasamyraja: Note that the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1507149 is not in Per https://bugzilla.redhat.com/show_bug.cgi?id=1507149#c131 there is possibly a different bug that affects later kernels as well, which is what this ticket was reopened for by @jpmenil . So if your kernel version was accurate, you should first upgrade to Or you can distinguish them as when the newer issue hits,
|
It is working now for me. I am using below version. (RHEL7.7.) $ sudo rpm -qa | grep kernel-3.10.0-1062 Thanks for the information. |
@thaJeztah, I think we can close (again) this one, since adding the cgroup.memory=nokmem kernel parameter does the trick. |
@jpmenil I'm running RH 7.6 3.10.0-957.1.3.el7.x86_6 and just want to be sure about applying the fix. 1 - Set the kernel parameter (cgroup.memory=nokmem) in /etc/default/grub Any additional steps not listed above? Thanks in advance. |
Hi, if you have leaked too many memory cgroups, new memory cgroups cannot be created and creation will fail with "Cannot allocate memory". |
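One way to watch for that kind of leak, as a rough sketch: `/proc/cgroups` lists a `num_cgroups` column per controller, so the value for `memory` can be sampled over time. The function name `memory_cgroup_count` is hypothetical, and the thread gives no threshold at which creation starts to fail, so this only reports the number.

```python
from typing import Optional


def memory_cgroup_count(proc_cgroups: str) -> Optional[int]:
    """Return num_cgroups for the memory controller, or None if it is not listed.

    Expects the text of /proc/cgroups, whose columns are:
    subsys_name, hierarchy, num_cgroups, enabled.
    """
    for line in proc_cgroups.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip the header line and blank lines
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "memory":
            return int(fields[2])
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cgroups") as fh:
            print("memory cgroups:", memory_cgroup_count(fh.read()))
    except OSError:
        print("/proc/cgroups is not readable on this system")
```

A count that keeps climbing while the number of running containers stays flat would be consistent with the leak described above, though it is not proof of it.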
@bamb00 Only the kernel parameter is needed; there is no need to upgrade Docker. |
@jpmenil Thanks! verified that it works when In https://bugzilla.redhat.com/show_bug.cgi?id=1507149, they mentioned that the issue has been fixed in |
Hello, today I had the same problem in my production environment. My kernel was kernel-3.10.0-1062.9.1; after upgrading to kernel-3.10.0-1062.12.1, all containers started. Does anyone have any other alternative? The affected node is part of a k8s cluster. |
I had this issue out of the blue on an otherwise idle k8s v18 cluster, with a pretty recent CentOS 7 kernel, did an upgrade to the latest packages, added cgroup.memory=nokmem to boot params with grubby and haven't seen the issue since the reboot. The upgrade was docker-ce 19.03.12-3 => 19.03.13-3 and kernel 3.10.0-1127.13.1 => 3.10.0-1127.19.1. |
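To confirm after the reboot that the parameter is actually active, the running kernel's command line can be checked. A minimal sketch, with `has_kernel_param` as an illustrative name:

```python
from pathlib import Path


def has_kernel_param(cmdline: str, param: str = "cgroup.memory=nokmem") -> bool:
    """Return True if `param` appears as a whole token in the kernel command line."""
    return param in cmdline.split()


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    if path.exists():
        # /proc/cmdline holds the parameters the running kernel was booted with.
        print("nokmem active:", has_kernel_param(path.read_text()))
    else:
        print("/proc/cmdline not found (not a Linux host?)")
```

If this prints `False` after a reboot, the grub config was probably not regenerated, or a different boot entry was used.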
I had this issue with this kernel version:
[root@master debug]# uname -a
Linux master 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@master debug]# lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.8.2003 (Core)
Release: 7.8.2003
Codename: Core
docker server version |
Are you all adding the cgroup.memory kernel parameter to master nodes as well? Seems to only apply to nodes where deployments are scheduled, but for consistency, I'm wondering about the master nodes as well. |
On all redhat related distributions, it may also be something related to the enablement of cgroupsv2. |
I'm here with this error and it's because Fedora >= 31 has moved to cgroups v2. Using podman with the podman-docker interface works OK, except of course containers need to also support cgroups v2 and CentOS 7 does not. :( |
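A quick way to tell which case a host falls into, sketched under the assumption that cgroups are mounted at `/sys/fs/cgroup`: on a unified (v2) hierarchy the mount root contains a `cgroup.controllers` file, while v1 (and hybrid) setups expose a per-controller `memory` directory, which is the path seen in the error from this issue. The function name `cgroup_mode` is made up for illustration.

```python
from pathlib import Path


def cgroup_mode(root: str = "/sys/fs/cgroup") -> str:
    """Classify the cgroup layout under `root` as 'v2', 'v1', or 'unknown'.

    'v1' here also covers hybrid setups, where a v1 memory hierarchy is mounted.
    """
    base = Path(root)
    if (base / "cgroup.controllers").is_file():
        return "v2"
    if (base / "memory").is_dir():
        return "v1"
    return "unknown"


if __name__ == "__main__":
    print("cgroup layout:", cgroup_mode())
```

On a `v2` host the `cgroup.memory=nokmem` workaround discussed earlier is probably not the relevant fix; the Docker version's cgroups v2 support is the thing to check.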
I have the same issue on Ubuntu 18.04
|
I'm facing the same issue, but I'm not sure whether it comes from cgroup memory. I tried creating and deleting cgroups myself and that works fine, but I still have the issue. Logs from the Kubernetes node
Kernel
Disabling the memory accounting with the kernel parameter |
Fedora 33 Server here.. brand new install tonight. I added the kernel parameter with the fedora supplied docker and could not get hello-world to work. https://docs.docker.com/engine/install/fedora/ , removes fedora supplied docker and replaces it.. rebooted and removed the kernel parameter, docker images needed to be rm b/c of overlay.. but after docker rm image; "things seem ok so far" (tm) |
Hi I have the same problem with a kernel version that is later than the one that is supposed to fix this bug ( Kubernetes log:
As far as I understand, the workaround to disable cgroup kernel memory accounting is not safe. Am I right here? |
Thank you! TL;DR: the version in the Fedora repository (33 as for now) is legacy. Install |
Much happiness with this Fedora system vs the Clear system I previously was running.. Just a homelab.. root@fedora ~# docker version Server: Docker Engine - Community I see the unified_cgroup_hierarchy is still an argument.. I'll remove it and confirm.. |
still present in centos 7, kernel |
Just found this by accident.. have not tried or tested.. https://wiki.voidlinux.org/Docker (void does not use systemd fwiw) Troubleshooting You may get the following error while running docker:
(void is a rolling release..) So on the current version of Void..
From Docker docs.. https://docs.docker.com/engine/install/binaries/ Prerequisites
https://github.com/tianon/cgroupfs-mount/blob/master/cgroupfs-mount My 0.02.. I'm sure there are reasons to run 'latest and greatest' docker.. 20.10.x but I'm quite happy with the minimal overhead from void paired with the usefulness of the base packages.. to get docker containers going.. fwiw, haproxy in docker gets destroyed cpu-wise for some reason.. installed haproxy in void base.. back to a sleeping giant.. Have void running in esxi and bare metal.. YMMV |
According to this https://src.fedoraproject.org/rpms/moby-engine fedora 34 has moby-engine-20.10.5, but fedora 33 and fedora 32 are still stuck with 19.03.x at the time of writing |
This issue is still reproduced in the following environment without kernel parameter
|
+1 to 3.10.0-1160.24.1.el7.x86_64 having the same issue |
Same problem here: Starting apache_db_1 ... error ERROR: for apache_db_1 Cannot start service db: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:385: applying cgroup configuration for process caused: mkdir /sys/fs/cgroup/memory/docker/dd0e00f46b0c794d48f612d717858a39a060cd1496cfe152b0844d80239da588: cannot allocate memory: unknown docker version Server: Docker Engine - Community |
Same issue too: cat /etc/centos-release
uname -r
docker version
|
Facing same issue Server: Docker Engine - Community |
good morning, i'm a linux newbie, but here the same problem, can anyone help me? Linux Debian 10 uname -r 4.19.0 docker version Client: Docker Engine - Community Server: Docker Engine - Community fails starting containers with error: For help also mail to free-radio@gmx.de Thanks in advance, |
Debian 10 has 4.19 kernel? (could be..) |
yes, greetings, Martin |
A complete log: 2022-04-18 05:13:59,633+0000 INFO [FelixStartLevel] *SYSTEM org.sonatype.nexus.pax.logging.NexusLogActivator - start |
Expected behavior
Docker should successfully start hello-world container.
Actual behavior
After a certain amount of time, docker fails to start any containers on a host with the following error:
[root@REDACTED]# docker run hello-world docker: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:279: applying cgroup configuration for process caused \"mkdir /sys/fs/cgroup/memory/docker/fe4159ed6f4ec16af63ba0c2af53ec9c6b0c0c2ac42ff96f6816d5e28a821b4e: cannot allocate memory\"": unknown. ERRO[0000] error waiting for container: context canceled
In the past, restarting the Docker daemon or rebooting the machine has cleared the problem, even though the daemon is active and running when the container is started. The machine has ample available memory and CPUs and should have no problem starting the container.
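For monitoring purposes, this specific failure can be recognized in daemon or kubelet output by its `mkdir ... cannot allocate memory` fragment. A small sketch, assuming the message keeps the shape shown in this issue; the pattern and the name `failed_cgroup_path` are illustrative, not part of Docker.

```python
import re
from typing import Optional

# Matches the cgroup directory that could not be created, e.g.
# "mkdir /sys/fs/cgroup/memory/docker/<id>: cannot allocate memory".
_PATTERN = re.compile(
    r"mkdir (/sys/fs/cgroup/memory/[^\s:]+): cannot allocate memory"
)


def failed_cgroup_path(message: str) -> Optional[str]:
    """Return the memory cgroup path from the error message, or None if absent."""
    m = _PATTERN.search(message)
    return m.group(1) if m else None


if __name__ == "__main__":
    msg = (
        "applying cgroup configuration for process caused: "
        "mkdir /sys/fs/cgroup/memory/docker/dd0e00f46b0c: "
        "cannot allocate memory: unknown"
    )
    print(failed_cgroup_path(msg))
```

A non-`None` result separates this cgroup failure from ordinary out-of-memory errors, which do not mention a `mkdir` under `/sys/fs/cgroup/memory`.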
Steps to reproduce the behavior
Output of docker version:
Output of docker info:
Additional environment details (AWS, VirtualBox, physical, etc.)
At the time of running the container, the host has 500GB of available memory and around 50+ free cores.