Incident: Kubernetes couldn't schedule pods #287
Comments
The problem wasn't really that. `top` reveals that the Scope probes (v0.11.1) I deployed yesterday as part of #282 are consuming 70% CPU per node, peaking at 90%. I think that is the culprit.
As an action item, we should at the very least have an alarm when the cluster runs out of resources. Kubernetes already knows this, so there should be a way to trigger an alarm from the Kubernetes events.
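To illustrate the idea, here is a minimal sketch of such an alarm: scan the JSON that `kubectl get events -o json` emits and surface events whose reason indicates a scheduling failure. The function name `alert_on_scheduling_failures` and the `ALERT_REASONS` set are hypothetical, not part of any existing tooling.

```python
import json

# Event reasons we treat as alarm-worthy; illustrative choice, extend as needed.
ALERT_REASONS = {"FailedScheduling"}

def alert_on_scheduling_failures(events_json: str):
    """Return messages of events whose reason indicates a scheduling failure.

    `events_json` is expected to look like `kubectl get events -o json` output,
    i.e. an object with an "items" list of event objects.
    """
    events = json.loads(events_json)
    return [
        item["message"]
        for item in events.get("items", [])
        if item.get("reason") in ALERT_REASONS
    ]
```

In practice this would run in a loop (or against a watch stream) and page someone whenever the returned list is non-empty.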
Related: weaveworks/scope#812
I will stop the Scope probes in production until weaveworks/scope#812 is resolved.
Another contributing cause is that kubelet is consuming ~40% of CPU on the production nodes (almost 80% on the dev nodes). We should investigate this.
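A quick way to confirm figures like these is to flag any process whose CPU share crosses a threshold in `ps aux` output. The sketch below is illustrative (the `find_cpu_hogs` name and the 30% cut-off are assumptions, not an existing tool); `%CPU` is the third column of `ps aux`.

```python
CPU_THRESHOLD = 30.0  # percent; arbitrary cut-off for this sketch

def find_cpu_hogs(ps_lines):
    """Return (command, %cpu) pairs above CPU_THRESHOLD from `ps aux` lines."""
    hogs = []
    for line in ps_lines:
        # `ps aux` has 11 columns; the command (with its args) is the last one.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        try:
            cpu = float(fields[2])  # third column is %CPU
        except ValueError:
            continue  # skips the header row
        if cpu >= CPU_THRESHOLD:
            hogs.append((fields[10], cpu))
    return hogs
```

Piping `ps aux` from each node through something like this would have flagged both the Scope probes and kubelet.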
This is already tracked by other tickets, so I am closing.
Timeline
All times UTC on Wed 13 Jan 2016
Ran `kubectl get events -w` and saw lots of Pod FailedScheduling events, Failed for reason PodExceedsFreeCPU and possibly others.

Downtime
The oldest Pending container was in a bad state for a total of 18 hours.
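Durations like the 18-hour figure can be computed from a pod's `creationTimestamp` as reported by `kubectl get pods -o json`. A minimal sketch, with `pending_hours` as a hypothetical helper (the example timestamps below are illustrative, not the actual incident times):

```python
from datetime import datetime, timezone

def pending_hours(creation_timestamp: str, now: datetime) -> float:
    """Hours elapsed since a pod's creationTimestamp (RFC 3339, UTC)."""
    created = datetime.strptime(
        creation_timestamp, "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (now - created).total_seconds() / 3600.0

# Example arithmetic: a pod created 13:00 UTC one day and still Pending
# at 07:00 UTC the next day has been stuck for 18 hours.
```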
Root cause
We hit cluster resource limits faster than anticipated. Kubernetes refused to schedule new workloads.
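The PodExceedsFreeCPU rejection comes down to a fit check: a pod is schedulable on a node only if its CPU request does not exceed the node's capacity minus the requests already placed there. A simplified sketch of that check (millicore units; `pod_fits_cpu` is an illustrative name, not the scheduler's actual code):

```python
def pod_fits_cpu(node_capacity_m, scheduled_requests_m, pod_request_m):
    """Simplified free-CPU fit check, all values in millicores.

    Mirrors the condition behind PodExceedsFreeCPU: the pod fits only if
    its request is within the node's remaining unreserved CPU.
    """
    free = node_capacity_m - sum(scheduled_requests_m)
    return pod_request_m <= free
```

When every node in the cluster fails this check for a pod, it stays Pending with FailedScheduling events, which is what we observed.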
Fix
Action items