child: use 127.0.0.1 for httpGet probes if hostNetwork field is true #434
Conversation
Hey, @BirknerAlex. Do we really need |
In my case I kept I only bound the web to 127.0.0.1 instead, so the PodIP is still the node IP (public, and used for the probes as well). So if you ask me, overriding it is required. |
Yes. It is needed for:
I mean if we just do the following:
It will do, right? Is there a case when |
Yes, you're right, that should be fine as well instead of the additional value. //edit: Okay, I might have found one reason to keep it, not in my use case but using |
What about changing the type of the readiness/liveness probes to exec and simplifying it?
- livenessProbe:
-   failureThreshold: 3
-   httpGet:
-     path: /api/v1/info
-     port: http
-     scheme: HTTP
-   periodSeconds: 30
-   successThreshold: 1
-   timeoutSeconds: 1
+ livenessProbe:
+   failureThreshold: 3
+   exec:
+     command:
+       - /usr/sbin/netdatacli
+       - ping
+   periodSeconds: 30
+   successThreshold: 1
+   timeoutSeconds: 1 |
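The comment mentions both readiness and liveness probes, while the diff shows only the liveness one. A readiness probe following the same pattern might look like the sketch below. The command is the one from the diff; the timing values simply mirror the liveness probe and are assumptions, not values taken from the chart.

```yaml
# Sketch only: readiness counterpart of the exec-based liveness probe above.
readinessProbe:
  exec:
    command:
      - /usr/sbin/netdatacli   # path as used in the diff
      - ping                   # the probe passes when the command exits with 0
  failureThreshold: 3          # assumed: same as the liveness probe
  periodSeconds: 30            # assumed: same as the liveness probe
  successThreshold: 1
  timeoutSeconds: 1
```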
@stelfrag can we use netdatacli ping/pong for liveness/readiness probes? Does a ping response mean Netdata is running and ready to serve requests? |
The command processing (cli) is initialized last so receiving a pong means the agent is ready to respond to requests at that point -- otherwise you get
This will be the case after startup and while the metrics database is initializing (if it is taking a while). Requests (queries) to "children" may not be available even if you get a pong, as those continue to initialize asynchronously and populate contexts, etc. |
This is also true for HTTP GET to |
if the
Is it also accompanied by an exit code other than 0? |
Yes
root@pve-deb-work:/opt/netdata/usr/sbin# ./netdatacli ping; echo $?
pong
0
root@pve-deb-work:/opt/netdata/usr/sbin# systemctl stop netdata
root@pve-deb-work:/opt/netdata/usr/sbin# ./netdatacli ping; echo $?
uv_pipe_connect(): no such file or directory
Make sure the netdata service is running.
255 |
@witalisoft One caveat is that "timeoutSeconds" becomes almost useless because Startup time:
|
So the implication is that we could see liveness/readiness failures more often when Netdata is responding slowly? |
We're comparing the time it takes for Kubernetes to determine an application is unhealthy based on HTTP and EXEC types of liveness/readiness probes.
My point is that users who have previously relied on the |
Probably, but a bigger Netdata installation requires you to tune those values anyway. In our use case, changing from HTTP to EXEC probes doesn't make any difference: the timeout is way below the
|
Right. I meant that it might create problems for existing installations, but I don't see it as a blocker. |
Closing in favour of #436 |
As suggested in https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#http-probes
cc @BirknerAlex
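
The linked Kubernetes documentation describes the `host` field of an `httpGet` probe, which defaults to the pod IP. A minimal sketch of the approach named in the title, assuming a pod with `hostNetwork: true`, could look like this. The path and port name are copied from the probe quoted earlier in the conversation; the container name and port number are illustrative assumptions, and this is not the chart's actual template.

```yaml
# Sketch only: pod spec fragment showing an httpGet probe pinned to loopback.
spec:
  hostNetwork: true
  containers:
    - name: netdata                # hypothetical container name
      ports:
        - name: http
          containerPort: 19999     # assumed: Netdata's default web port
      livenessProbe:
        httpGet:
          host: 127.0.0.1          # overrides the default, which is the pod IP
          path: /api/v1/info
          port: http
          scheme: HTTP
        failureThreshold: 3
        periodSeconds: 30
        successThreshold: 1
        timeoutSeconds: 1
```

With `hostNetwork: true` the pod IP is the node IP, so without the `host` override the probe would target the node's address and fail when the web server is bound only to 127.0.0.1.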