cgroup v1/v2 compatibility issue when setting memory below the current usage #3509
Comments
could we use |
I don't have a complete understanding at this point but are we talking about cgroup memory limit applied at the time of container creation? And if that's the case, is the difference then the fact that in cgroupv2 the kernel isn't returning an EBUSY anymore?
And then have runc parse it and fail early instead of the container being OOMKilled? |
This is when we try to update the memory limit of an already running container to a value that is less than what it is currently using. In v1, we got EBUSY, but in v2, the kernel applies the value and, if it is too low, the container is OOM killed. |
From the vertical pod autoscaler POV -- yes. Meaning, it will still have to distinguish between v1 and v2, so it does not make sense to add the flag I proposed in the description. |
I think that will have to be phase 2 with cgroups v2 in k8s. Phase 1 is just a direct mapping to v1. |
Is it possible to get the current memory usage from |
This setting can be used to mimic cgroup v1 behavior on cgroup v2, when setting the new memory limit during update operation. In cgroup v1, a limit which is lower than the current usage is rejected. In cgroup v2, such a low limit is causing an OOM kill. Ref: opencontainers/runc#3509 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Is there a similar problem with other configurations other than memory? |
Not that I know of. |
With cgroup v1, when we set the memory limit below the current usage (runc update on a running container), the kernel returns EBUSY and runc fails with a nice error message.
With cgroup v2, when we do this, the kernel OOM killer simply kills the container. This makes the behavior incompatible with cgroup v1.
One (imperfect) workaround is to add a flag to the OCI spec that disallows setting the memory limit to a value lower than the current usage. This is borderline ugly, but at least in most cases we'll return an error instead of letting the container be OOM killed.
(The other, much less serious part of the problem: when the container disappears in the middle of runc update, we get all sorts of ugly messages.)