zombie processes caused by health checks #2441
Comments
Hi, thank you for the input! We will work on fixing the issue. I will open a PR as soon as I have one.
I still get those zombies with the current version, only setting
See #10002 (comment). The same happens for the redis / valkey containers, with the exception that these do not reap children regularly. Thus we run out of PIDs in the cgroup and redis/valkey crashes at some point. We were not able to fix this using a different timeoutSeconds value. I think this can only be fixed by enabling shareProcessNamespace or removing the timeout command from the readiness/liveness probes.
Which chart:
bitnami redis template version 9.0.2 and also 10.5.7
Describe the bug
On the affected Docker nodes, you'll find redis-cli zombie processes:
Those zombies seem to be caused by the readiness / liveness health checks when the slave or master cannot be reached within the configured timeout.
see also https://github.com/bitnami/bitnami-docker-redis/issues/165
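For anyone who wants to check a node for such zombies, a minimal sketch follows. It assumes a `ps` that accepts `-eo` with the `pid`, `ppid`, `stat` and `comm` columns (as procps does); the helper names `list_zombies` and `count_zombies` are made up for illustration and are not part of the chart.

```shell
#!/bin/sh
# list_zombies and count_zombies are hypothetical helper names.

# Print PID, parent PID, state and command name of every process whose
# state starts with "Z" (zombie / defunct). The header line never matches.
list_zombies() {
  ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'
}

# Print only the number of zombies found (0 when there are none).
count_zombies() {
  list_zombies | wc -l | tr -d ' '
}

list_zombies
echo "zombies: $(count_zombies)"
```

The PPID column shows which process has not reaped the zombie, which helps to tell whether the leftovers belong to the redis container.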
To Reproduce
Steps to reproduce the behavior:
Simulate a connection problem with an iptables DROP rule:
iptables -I INPUT -p tcp --dport 6379 -j DROP
Log in to the redis-master or redis-slave container (pod) and execute the health check:
Expected behavior
No zombie processes on the Docker nodes after the Docker redis service has been running for some days.
Version of Helm and Kubernetes:
helm version:
kubectl version:
Additional context
After looking at the health checks that the Helm chart mounts via a config map at /health and running some tests, I found a solution: instead of timeout -s 9, just use timeout -s 3 in the following scripts:
which are dynamically generated by https://github.com/bitnami/charts/blob/master/bitnami/redis/templates/health-configmap.yaml
When using kill signal 3 (SIGQUIT) instead of 9 (SIGKILL), no zombie processes are left behind any more.
Because the kill signal for the timeout command is hard-coded, please change it to 3 or replace it with an environment variable for more flexibility. Thanks.
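To make the request concrete, here is a rough sketch of what a configurable signal could look like. It is not the chart's actual script: the variable `PROBE_KILL_SIGNAL` and the function `run_probe` are hypothetical names, and the default of 3 simply mirrors the value that worked in my tests.

```shell
#!/bin/sh
# Sketch only: PROBE_KILL_SIGNAL and run_probe are hypothetical names,
# not existing chart values or functions.

# Signal that timeout sends when the time limit expires.
# Defaults to 3 (SIGQUIT); can be overridden from the environment.
PROBE_KILL_SIGNAL="${PROBE_KILL_SIGNAL:-3}"

# run_probe TIMEOUT_SECONDS COMMAND [ARGS...]
# Runs COMMAND under timeout(1) with the configured signal and
# returns whatever exit status timeout reports.
run_probe() {
  limit="$1"
  shift
  timeout -s "$PROBE_KILL_SIGNAL" "$limit" "$@"
}

# In a health script the wrapped command would be the redis-cli ping, e.g.:
#   run_probe 5 redis-cli -h localhost -p 6379 ping
```

With this shape, the current behaviour would remain available by setting `PROBE_KILL_SIGNAL=9`, while the default avoids the hard-coded value.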