Customize node configuration: add pod-max-pids to avoid PID exhaustion #2276
Comments
Hi tdihp, AKS bot here 👋 I might be just a bot, but I'm told my suggestions are normally quite good, as such:

related: #323
Triage required from @Azure/aks-pm
Action required from @Azure/aks-pm
Issue needing attention of @Azure/aks-leads
A PR has been raised to enable this in the Azure CLI for the next release, and the documentation will also be updated. I'll keep this open until this has been completed.
Hi, I am experiencing a similar issue on Kubernetes version 1.19.11. The PID space of a random node gets exhausted, and my only solution for now is to restart that node. Do we have any updates on this feature?

Hi, this is also impacting us on Kubernetes version 1.19.11. When the PID space is exhausted it takes down our calico-node pod, which in turn impacts everything on the node. Restarting the node does seem to resolve the issue. Has this change been released yet?
Action required from @Azure/aks-pm
Issue needing attention of @Azure/aks-leads
For users affected by this, I'd suggest identifying the culprit application causing the exhaustion.
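As a rough sketch of how to find it, assuming shell access to the affected node (the cgroup path depends on the OS image and cgroup driver, so treat it as an assumption):

```sh
# On the affected node (e.g. over SSH): list processes by thread count.
# nlwp = number of lightweight processes (threads) owned by each process.
ps -eo pid,nlwp,user,comm --sort=-nlwp | head -n 15

# On cgroup v1 nodes, total PIDs currently charged to pods; the exact path
# varies with the cgroup driver (kubepods vs kubepods.slice) and OS image.
cat /sys/fs/cgroup/pids/kubepods/pids.current
```

The process at the top of the first listing is usually the pod that needs a PID cap or a fix.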
Closing as custom node configuration is now GA with
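For anyone landing on this later, a minimal sketch of capping per-pod PIDs with custom node configuration; the `--kubelet-config` flag and the `podMaxPids` key reflect my reading of the AKS custom node configuration docs, and the names and limit value below are illustrative, so verify against the current documentation:

```sh
# kubeletconfig.json: cap the number of PIDs any single pod may use.
cat > kubeletconfig.json <<'EOF'
{
  "podMaxPids": 2048
}
EOF

# Apply it when adding a node pool (cluster, resource group, and pool names
# are placeholders).
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name mynodepool \
  --kubelet-config ./kubeletconfig.json
```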
What happened:
Applications can allocate too many threads, so that kubelet/containerd hits EAGAIN when it tries to create a new thread with pthread_create. We observe PLEG failures and the node going NotReady because of a single offending application.
What you expected to happen:
Add pod-pid-limits as an option for custom node configuration. Configuring a smaller per-pod value should protect node readiness.
How to reproduce it (as minimally and precisely as possible):
Add a test Python pod in two steps and wait around 6 minutes for the node to go NotReady:
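The original two steps are not shown here; as a hypothetical stand-in, a pod that keeps as many sleeping threads alive as the node allows should produce the same symptom (pod name, image, and sleep durations are placeholders, and this should only be run against a disposable test node):

```sh
kubectl run pid-exhaust --image=python:3 --restart=Never -- python -c '
import threading, time
while True:
    try:
        # each sleeping thread holds one PID on the node
        threading.Thread(target=time.sleep, args=(3600,), daemon=True).start()
    except RuntimeError:
        # "cannot start new thread": thread/PID space exhausted; keep the pressure on
        time.sleep(0.1)
'
```

Without a per-pod PID limit, kubelet and containerd eventually fail pthread_create themselves, which is the PLEG/NotReady behavior described above.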
Environment:
- Kubernetes version (use `kubectl version`): 1.19.7