Log something about OOMKilled containers #69676
Comments
@dims in #sig-scheduling mentioned this: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockershim/docker_container.go#L356-L359
Related discussion in the sig-scheduling Slack channel: https://kubernetes.slack.com/messages/C09TP78DV/
https://kubernetes.slack.com/archives/C09TP78DV/p1539255149000100 to be exact, but it might fall outside Slack's allowed history for the free tier pretty soon.
FYI, there is a related Prometheus metric that can surface OOMKilled: https://github.com/kubernetes/kube-state-metrics/blob/master/Documentation/pod-metrics.md
It might be a bit tricky to implement as an alert, though. Still, it would be nice if a log line was printed by k8s by default.
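For illustration, a minimal expression against that metric might look like the sketch below, assuming the `kube_pod_container_status_terminated_reason` metric name from the documentation linked above (adjust to whatever your kube-state-metrics version exposes):

```promql
# Sketch: containers whose current terminated state has reason OOMKilled
# (metric name taken from the kube-state-metrics pod-metrics documentation; an assumption here)
kube_pod_container_status_terminated_reason{reason="OOMKilled"} == 1
```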
@nikopen Hi, I just tried it. Indeed, it seems to work :) @brancz do you know why this happens? I also tried it in 1.3.1.
Because the …
@brancz my test is in 1.3.1 and the metric is there.
Strange, you're right; it looks like we have a mistake in the changelog. The code seems to be there in 1.3.1: https://github.com/kubernetes/kube-state-metrics/blob/v1.3.1/collectors/pod.go#L125
@anderson4u2 I am a bit confused by your last comment. You wrote:
But in the example below you use … So, as far as I can see, the new (very useful) metric …
This has been discussed in #sig-instrumentation on Slack and was brought up on the sig-node call yesterday to determine a path forward. There are two requests:
To summarize what's currently available in kube-state-metrics:
The issues are:
For example, given a Pod that is sometimes OOMKilled and sometimes crashing, it's desirable to be able to view the historical termination reasons over time. As a note: it was discussed, and it appears the design of kube-state-metrics prevents aggregating the reason gauge into counters, so it's preferred that this happen at the source. Implementing both of the above requests would significantly improve the ability of cluster users and monitoring vendors to debug failing Pods. Can @kubernetes/sig-node-feature-requests provide some guidance on the next steps here? CC: @dchen1107
long-term-issue (note to self)
cc @kubernetes/sig-node-feature-requests
/cc |
This query combines container restart and termination reason:
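One possible shape of such a query, as a sketch that assumes the usual kube-state-metrics metric names (`kube_pod_container_status_restarts_total` and `kube_pod_container_status_last_terminated_reason`; check which of these your version exposes):

```promql
# Sketch: containers that restarted recently and whose last termination reason was OOMKilled
increase(kube_pod_container_status_restarts_total[15m]) > 0
and on (namespace, pod, container)
kube_pod_container_status_last_terminated_reason{reason="OOMKilled"} == 1
```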
Our team came up with a custom controller to implement the idea of having an OOMKilled event tied to the Pod. Please find it here: https://github.com/xing/kubernetes-oom-event-generator (see its README for details). We would be very happy to get feedback on it.
Meta point: there have been several times when OOM events showed up in dmesg but cadvisor missed them, so they did not end up as events...
@brancz thanks! I just wanted to point out that cadvisor misses OOM events and that should be considered when relying on it...
I see. If it hasn't been already, that should be reported to cAdvisor :)
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Is this still relevant after #108004? It seems to me that it covers the gaps kube-state-metrics has with OOMKilled events.
The problem here is that a pod can disappear and there's no record of why. A metric is useful in that it lets you know something is wrong, but it doesn't actually tell you what is wrong. K8s shouldn't be killing pods without leaving a record, in an obvious place, of which pod it killed and why.
@lukeschlather for the record, the kernel kills pods, not k8s; that's the whole problem with this issue :( Please google for "oom kill kernel".
Are the memory requests and limits just cgroups under the hood?
Following, same issue here.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
In GKE, the only evidence I can find that there's been a problem is often a kernel log along the lines of "Memory cgroup out of memory".
Looking at the Kubernetes side I can see that the pod was restarted, and I'm sure it was restarted because the process was OOMKilled. But even in the metrics, I think it was killed before any metric could be created showing it was using a concerning amount of memory. Can that "Memory cgroup out of memory" log be modified to emit a log that gives the actual pod name? Or does k8s need to look for "Memory cgroup out of memory" events? Or is there some other piece of data I'm missing that makes it clear why that specific container was restarted?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
I briefly looked through cadvisor's code responsible for reporting the metric, and it looks directly into /dev/kmsg to surface the OOM kill events performed by the kernel OOM killer: https://github.com/google/cadvisor/blob/master/utils/oomparser/oomparser.go. As far as I can tell it should catch all the OOM kill events, so maybe you are facing a corner case? The parser relies on a couple of regexes that only work for Linux 5.0+ (https://github.com/google/cadvisor/blob/master/utils/oomparser/oomparser.go#L30-L33), so it would be worth checking your kernel logs and verifying if and how the log message was produced there.
A log wouldn't point you to the culprit either; the OOM killer just knows about processes' OOM scores and makes decisions based on them when memory is overloaded. If you want to know why the memory usage increased to the point that some processes had to be killed, there are other metrics you can look at. For example, the container-level memory usage metrics can tell you which pods on a particular node had a spike in memory utilization. The goal of the OOM killed metric is to tell you exactly which container got OOM killed, in case you want to know what happened to your containers, nothing more. But there are other metrics that are meant to answer your other questions.
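As an illustration of that last point, here is a sketch of such a per-container query, assuming the usual cadvisor and kube-state-metrics metric names (`container_memory_working_set_bytes` and `kube_pod_container_resource_limits`), which may differ in your setup:

```promql
# Sketch: containers whose working set is above 90% of their memory limit,
# i.e. the likely OOM-kill candidates (metric names are assumptions)
container_memory_working_set_bytes{container!=""}
  / on (namespace, pod, container) group_left()
    kube_pod_container_resource_limits{resource="memory"}
> 0.9
```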
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
@frittentheke are you seeing any gaps in how we report OOMKill that should still be covered by Kubernetes? Based on my previous comments, I would be inclined to close this issue as fixed by #108004.
Thanks a bunch for asking @dgrisonnet! I believe the OP (@sylr) originally was asking for an event to be "logged" or rather emitted by the Kubelet:
I somehow believe this is not an entirely unreasonable idea. Going OOM is quite an event for a container ;-) But the request / idea was likely due to the reasons behind the issue kubernetes/kube-state-metrics#2153 and the ephemeral nature of the last terminated reason (being the source for the metric).

Yes, I believe #108004 does indeed fix the issue of losing any OOM event by count. But coming back to the idea of (also) having an event "logged", I believe it's still a good idea to NOT rely on metrics and counters that would need to be scraped, but to also expose OOM kills via the API somehow so they are available for debugging. Currently it's just that the Pod resource only contains the most recent termination state.
Or maybe it doesn't really (I didn't actually check, only following the links): google/cadvisor#3015
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature
What happened:
Container gets killed because it tries to use more memory than allowed.
What you expected to happen:
Have an `OOMKilled` event tied to the pod and logs about this.

/sig node