Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Watchdog: introduce endpoints for health and metrics #547

Closed
alexellis opened this issue Mar 1, 2018 · 9 comments
Closed

Watchdog: introduce endpoints for health and metrics #547

alexellis opened this issue Mar 1, 2018 · 9 comments

Comments

@alexellis
Copy link
Member

alexellis commented Mar 1, 2018

Update the OpenFaaS Watchdog to introduce two new endpoints:

  • /_/health

This should check for the lock file at /tmp/.lock - use the existing behavior for this.

  • /_/metrics

Vendor Prometheus go library and expose:

Add a Prometheus counter:

function_invocation{}

This counter should represent the requests/second.

Need to review Prometheus naming guidelines before approving the name function_invocation for the bucket. https://prometheus.io/docs/practices/naming/

Once complete - make equivalent change to the of-watchdog here:

https://github.com/openfaas-incubator/of-watchdog

Worries / other concerns

  • This may be a large chunk of work - so I would prefer two separate Pull Requests for this
  • There will be a size increase of watchdog binary when adding Prometheus go library, we need to evaluate this before merging.
  • This is the first time the watchdog will override any URL - normally these are marshaled directly to the underlying function
@alexellis alexellis changed the title Watchdog: introduce /_/ endpoints - for health and metrics Watchdog: introduce endpoints for health and metrics Mar 1, 2018
@viveksyngh
Copy link
Contributor

viveksyngh commented Mar 2, 2018

I can start looking this if it is not urgent.

@stefanprodan
Copy link
Contributor

Adding the Prometheus client will bring the watchdog binary size to 6MB

@alexellis
Copy link
Member Author

Thanks for commenting. The current size is around 3.92MB - so adding instrumentation almost doubles the size of the binary.

@viveksyngh
Copy link
Contributor

@alexellis for _/health endpoint, what if .lock suppressed with suppressLock configuration? We need to return 404 in that case.

@alexellis
Copy link
Member Author

Great question.. Can you check what Kubernetes expects to fail a health-check? (liveness probe)https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-a-liveness-http-request

@viveksyngh
Copy link
Contributor

viveksyngh commented Mar 15, 2018

Anything greater than 200 to less than 400 is a success. Any other code indicates failure. For failed, the example uses 500 response.

@alexellis
Copy link
Member Author

Sounds good to me

viveksyngh added a commit to viveksyngh/faas that referenced this issue Mar 16, 2018
Introduce new endpoint `/_/health` to watchdog for health status of
functions  which check for `/tmp/.lock` file

Fixes first part of openfaas#547 issue.

Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
viveksyngh added a commit to viveksyngh/of-watchdog that referenced this issue Mar 16, 2018
Introduce new endpoint `/_/health` to watchdog for health status of
functions which checks for `/tmp/.lock` file

Issues: openfaas/faas#547

Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
viveksyngh added a commit to viveksyngh/faas that referenced this issue Mar 18, 2018
Introduce new endpoint `/_/health` to watchdog for health status of
functions  which check for `/tmp/.lock` file

Fixes first part of openfaas#547 issue.

Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
alexellis pushed a commit that referenced this issue Mar 20, 2018
Introduce new endpoint `/_/health` to watchdog for health status of
functions  which check for `/tmp/.lock` file

Fixes first part of #547 issue.

Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
viveksyngh added a commit to viveksyngh/of-watchdog that referenced this issue Apr 3, 2018
Introduce new endpoint `/_/health` to watchdog for health status of
functions which checks for `/tmp/.lock` file

Added tests for healthHandler

Issues: openfaas/faas#547

Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
viveksyngh added a commit to viveksyngh/of-watchdog that referenced this issue Apr 15, 2018
Introduce new endpoint `/_/health` to watchdog for health status of
functions which checks for `/tmp/.lock` file

Added tests for healthHandler

Issues: openfaas/faas#547

Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
viveksyngh added a commit to viveksyngh/of-watchdog that referenced this issue Apr 15, 2018
Introduce new endpoint `/_/health` to watchdog for health status of
functions which checks for `/tmp/.lock` file

Added tests for healthHandler

Issues: openfaas/faas#547

Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
alexellis pushed a commit to openfaas/of-watchdog that referenced this issue Sep 17, 2018
Introduce new endpoint `/_/health` to watchdog for health status of
functions which checks for `/tmp/.lock` file

Added tests for healthHandler

Issues: openfaas/faas#547

Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
@kdubuc
Copy link

kdubuc commented Feb 8, 2019

Dumb question, but what is the difference betweek HEALTHCHECK Dockerfile and the /_/health endpoint ? They returns the same information. OpenFaaS cannot read the docker service HEALTHCHECK status ?

@alexellis
Copy link
Member Author

I have done this work now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants