Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance health check endpoint to support serializable request #13399

Merged
merged 1 commit into from
Nov 15, 2021

Conversation

ahrtr
Copy link
Member

@ahrtr ahrtr commented Oct 7, 2021

Fix #13340

Please let me know whether the solution is accepted. If yes, then I can continue to add some unit test for V3 server.

@ahrtr ahrtr force-pushed the serializable_health_check branch 2 times, most recently from d9cd065 to 554d319 Compare October 8, 2021 00:42
@ahrtr
Copy link
Member Author

ahrtr commented Oct 12, 2021

@serathius and others, any comments please?

@ahrtr
Copy link
Member Author

ahrtr commented Oct 26, 2021

Could anyone take a look at this PR?

@serathius
Copy link
Member

cc @hexfusion @ptabor

@ahrtr ahrtr force-pushed the serializable_health_check branch from 058cfdd to 754cce1 Compare November 14, 2021 20:33
@ahrtr
Copy link
Member Author

ahrtr commented Nov 14, 2021

Rebased the PR to resolve the conflict.

Copy link
Contributor

@hexfusion hexfusion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahrtr thanks overall this is an improvement one small nit. Can you please also update the 3.6 changelog?

@@ -65,7 +67,15 @@ func NewHealthHandler(lg *zap.Logger, hfunc func(excludedAlarms AlarmSet) Health
return
}
excludedAlarms := getExcludedAlarms(r)
h := hfunc(excludedAlarms)
// Kubernetes Probes (i.e. livenessProbe) use "/health" endpoint to make a decision whether to restart a specific container.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: Can we condense the docs a bit, this feels overly verbose. Perhaps it would be good for folks revisiting this PR to add the full docs to the PR description above. Perhaps something like below?

Passing the query parameter "serializable=true" ensures that the health of the local etcd is checked vs the health of the cluster. This is useful for probes attempting to validate the liveness of the etcd process vs readiness of the cluster to serve requests. 

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hexfusion Thanks for the comment. Fixed.

I also squashed all the commits.

@ahrtr ahrtr force-pushed the serializable_health_check branch from 754cce1 to 09ff051 Compare November 14, 2021 21:59
@ahrtr
Copy link
Member Author

ahrtr commented Nov 14, 2021

@ahrtr thanks overall this is an improvement one small nit. Can you please also update the 3.6 changelog?

Thanks & Done!

Copy link
Contributor

@hexfusion hexfusion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you

Copy link
Member

@spzala spzala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
Thanks @ahrtr

@pacoxu
Copy link
Contributor

pacoxu commented Feb 17, 2022

See kubernetes/kubeadm#2567 (comment).

Could this be cherry-picked to etcd 3.5 and release a patch? Or Kubernetes has to wait for etcd 3.6 for this liveness enhancement.

@ahrtr
Copy link
Member Author

ahrtr commented Feb 17, 2022

See kubernetes/kubeadm#2567 (comment).

Could this be cherry-picked to etcd 3.5 and release a patch? Or Kubernetes has to wait for etcd 3.6 for this liveness enhancement.

Sure, let me backport the fix to 3.5.

brandond added a commit to brandond/rke2 that referenced this pull request Jun 16, 2022
Refs:
* kubernetes/kubernetes#110072
* etcd-io/etcd#13399

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond added a commit to brandond/rke2 that referenced this pull request Jun 16, 2022
Refs:
* kubernetes/kubernetes#110072
* etcd-io/etcd#13399

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond added a commit to brandond/rke2 that referenced this pull request Jun 16, 2022
Refs:
* kubernetes/kubernetes#110072
* etcd-io/etcd#13399

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond added a commit to rancher/rke2 that referenced this pull request Jun 16, 2022
Refs:
* kubernetes/kubernetes#110072
* etcd-io/etcd#13399

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond added a commit to rancher/rke2 that referenced this pull request Jun 16, 2022
Refs:
* kubernetes/kubernetes#110072
* etcd-io/etcd#13399

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond added a commit to rancher/rke2 that referenced this pull request Jun 16, 2022
Refs:
* kubernetes/kubernetes#110072
* etcd-io/etcd#13399

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Provide a better liveness probe for when etcd runs as a Kubernetes pod
5 participants