-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ added health probes #419
✨ added health probes #419
Conversation
/assign @pwittrock |
/kind feature |
readinessEndpointName string | ||
|
||
// Liveness probe endpoint name | ||
livenessEndpointName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are pretty standard in kubernetes (e.g. /healthz
and /readyz
), so we should default them at the very least
pkg/manager/internal.go
Outdated
mux.Handle(cm.readinessEndpointName, readinessHandler) | ||
} | ||
if cm.livenessEndpointName != "" { | ||
var livenessHandler http.HandlerFunc = func(w http.ResponseWriter, r *http.Request) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should copy the k8s.io/apiserver
healthz code, and then switch to that whenever it moves to component-base (@luxas ?). It provides support for pluggable health checks, so we could have multiple in a single manager (e.g. health check per controller).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we directly use k8s.io/apiserver
? Its description is Library for writing a Kubernetes-style API server.
If it's a lib, why can't we just use it instead of copying it?
Did I miss anything?
@DirectXMan12 updated code. Now it's using healthz package from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few last comments inline, mainly around cleaning up the imported code.
pkg/healthz/healthz.go
Outdated
checks = []Checker{PingHealthz} | ||
} | ||
|
||
log.V(5).Info("Installing healthz checkers:", formatQuoted(checkerNames(checks...)...)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err... this log line isn't quite right, since it's structured logging. At an rate, I'd recommend logging once per checker anyway, like:
log.V(1).Info("installing healthz checker", "checker", check.Name
pkg/healthz/healthz.go
Outdated
} | ||
// always be verbose on failure | ||
if failed { | ||
log.V(2).Info(fmt.Sprintf("%vhealthz check failed", verboseOut.String())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no fmt
in our logging. Log messages should be constant. Use key-value pairs to attach variable information.
pkg/healthz/healthz_test.go
Outdated
contentType = "text/plain; charset=utf-8" | ||
) | ||
|
||
func TestInstallPathHandler(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please adapt these to ginkgo, like below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in fact, you can probably prune a few of these, if they overlap heavily with the ginkgo tests below.
c01da8c
to
ea95ee0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few more comments inline. sorry for the hug delay here
pkg/healthz/healthz.go
Outdated
// specified in exactly one call to InstallPathHandler. Calling | ||
// InstallPathHandler more than once for the same path and mux will | ||
// result in a panic. | ||
func InstallPathHandler(mux mux, path string, checks ...Checker) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we should have a better name for this. We can also just take an http.Handler
here, probably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, healthz code will be moved to component-base, and we planning to switch to it, right? In that case, isn't it better to leave this function as it is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're going to carry an interface in controller-runtime, I'd rather have it up to controller-runtime standards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated PR
pkg/healthz/healthz.go
Outdated
// InstallPathHandler more than once for the same path and mux will | ||
// result in a panic. | ||
func InstallPathHandler(mux mux, path string, checks ...Checker) { | ||
if len(checks) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a huge fan of this logic here -- I'd rather just have something where you call a single call per health check, like Register
with the webhook server
pkg/healthz/healthz.go
Outdated
if err := check.Check(r); err != nil { | ||
// don't include the error since this endpoint is public. If someone wants more detail | ||
// they should have explicit permission to the detailed checks. | ||
log.V(4).Info("healthz check failed", "checker", check.Name(), "error", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally don't log at anything above 1 by default in CR.
pkg/healthz/healthz.go
Outdated
// don't include the error since this endpoint is public. If someone wants more detail | ||
// they should have explicit permission to the detailed checks. | ||
log.V(4).Info("healthz check failed", "checker", check.Name(), "error", err) | ||
fmt.Fprintf(&verboseOut, "[-]%v failed: reason withheld\n", check.Name()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be %q
or %s
, since name is a string?
pkg/healthz/healthz.go
Outdated
return | ||
} | ||
fmt.Fprint(w, "healthz check passed\n") | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we not need a WriteHeader here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's already set to http.StatusOK
by default
pkg/manager/internal.go
Outdated
@@ -180,6 +201,32 @@ func (cm *controllerManager) SetFields(i interface{}) error { | |||
return nil | |||
} | |||
|
|||
// AddHealthzChecks allows you to add Healthz checkers | |||
func (cm *controllerManager) AddHealthzChecks(checks ...healthz.Checker) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably just make this AddHealthzCheck
and have it built up the list
ea95ee0
to
1692d51
Compare
pkg/healthz/healthz.go
Outdated
// specified in exactly one call to InstallPathHandler. Calling | ||
// InstallPathHandler more than once for the same path and mux will | ||
// result in a panic. | ||
func InstallPathHandler(mux mux, path string, checks ...Checker) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're going to carry an interface in controller-runtime, I'd rather have it up to controller-runtime standards.
1692d51
to
b28d04b
Compare
5d06f04
to
864a936
Compare
/ok-to-test |
864a936
to
0085b01
Compare
@DirectXMan12 Looks solid to me. Do we intend to plug in in scaffolded manifests in KB ? (opt-in may be through a kustomize patch ?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but with a few nits
cm.healthzHandler = &healthz.Handler{Checks: map[string]healthz.Checker{}} | ||
} | ||
|
||
cm.healthzHandler.Checks[name] = check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a check to see if the name already presents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷♀️ we don't have a way to replace otherwise, so I'm not sure. Let's leave it like this for the moment, and see what people do with it.
cm.readyzHandler = &healthz.Handler{Checks: map[string]healthz.Checker{}} | ||
} | ||
|
||
cm.readyzHandler.Checks[name] = check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto above
Changes: - Add Helm Chart testing scripts - Fix LABEL entry on our Dockerfiles - Add license boilerplate for our clients pkg - Fix helm chart installation (hardcoded URL for etcd cluster) - Add missing entries in the chart to be compliant with official chart-testing validators - Remove duplicated `platform` directory (content moved under /internal/platform pkg) - Remove `contrib` dir, and leave only `hack` folder, reasons: Contrib meaning: > It is for software that has been contributed to the project, but which might not actually be maintained by the core developers. Naming it "contrib" or "Contrib" is a long-established convention - remove the `heath` package from `pkg` directory, why: - In `pkg` we should put library code that's ok to use by external applications (e.g., /pkg/mypubliclib). So it's good to have there our `apis` and `clients` because it's our contract and we are supporting those libs and it's normal that someone will vendor that in own project. In case of `health` pkg we shouldn't expose such a contract because is not helm-broker domain and w do not want to support that. What's more, the `health` pkg will be removed when this will be merge: kubernetes-sigs/controller-runtime#419. **We should think twice before putting something here :-)**
This refactors the health probes, bringing the style a bit more in line with CR. The health probes themselves are now their own handler, and you add checks simply by using a map from check name to checker (which is just a function). The internal structure should now be more condusive to JSON output as well.
@mengqiy ready for another round |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: GrigoriyMikhalkin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@@ -304,6 +332,19 @@ func defaultNewClient(cache cache.Cache, config *rest.Config, options client.Opt | |||
}, nil | |||
} | |||
|
|||
// defaultHealthProbeListener creates the default health probes listener bound to the given address | |||
func defaultHealthProbeListener(addr string) (net.Listener, error) { | |||
if addr == "" || addr == "0" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we intend for health probes to not be served by default? By my reading, since Options.HealthProbeBindAddress
has no default value set apart from the empty string, we get no probe server unless we set it explicitly.
Is this documented somewhere? It took me a while to figure how to enable it, and I had to read the code... |
@omerlh Does |
Generally, the godoc: https://godoc.org/sigs.k8s.io/controller-runtime/pkg/healthz https://godoc.org/sigs.k8s.io/controller-runtime is really useful IMO |
Attempt to solve #297. Adds server, that can serve readiness and liveness probes. Probes return response based on manager's fields
isReady
andisAlive
, that can be changed via methodsSetReady
/SetAlive