MHC should surface unhealthy machines even if maxUnhealthy has been reached #3163

Closed
benmoss opened this issue Jun 8, 2020 · 9 comments
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments


benmoss commented Jun 8, 2020

User Story

As an operator, I would like to see which machines in my cluster are unhealthy even when I have configured a maxUnhealthy limit on my MHC and it has been triggered, so that I have help manually diagnosing the problems.

Detailed Description

Right now, when maxUnhealthy is hit, we log a message and emit an event on the MachineHealthCheck object. Since we are introducing conditions, and they overlap a lot with events, we could replace the event with a condition or at least supplement it.

Anything else you would like to add:

We need to agree on what the condition should be. @fabriziopandini proposed using OwnerRemediated=false, Reason=RemediationRestricted, but that seems a little kludgy to me.
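
To make the proposal concrete, here is a minimal sketch of how every unhealthy machine could be surfaced through a condition even when the limit blocks remediation. It is not the Cluster API implementation: the `Condition` and `Machine` types, the `markUnhealthy` helper, and the `WaitingForRemediation` reason are hypothetical stand-ins, and maxUnhealthy is assumed to be an absolute count that permits remediation while the number of unhealthy machines does not exceed it. Only `OwnerRemediated` and `RemediationRestricted` come from the proposal above.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a condition on a Machine.
type Condition struct {
	Type   string
	Status bool
	Reason string
}

// Machine is a hypothetical, minimal machine record used only in this sketch.
type Machine struct {
	Name       string
	Healthy    bool
	Conditions []Condition
}

// markUnhealthy adds an OwnerRemediated=false condition to every unhealthy
// machine and reports whether remediation is allowed. Assumption: maxUnhealthy
// is an absolute count, and remediation is allowed while the number of
// unhealthy machines does not exceed it.
func markUnhealthy(machines []Machine, maxUnhealthy int) bool {
	unhealthy := 0
	for _, m := range machines {
		if !m.Healthy {
			unhealthy++
		}
	}
	allowed := unhealthy <= maxUnhealthy

	for i := range machines {
		if machines[i].Healthy {
			continue
		}
		// Hypothetical default reason for machines that may be remediated.
		c := Condition{Type: "OwnerRemediated", Status: false, Reason: "WaitingForRemediation"}
		if !allowed {
			// Reason proposed in this issue for the blocked case.
			c.Reason = "RemediationRestricted"
		}
		machines[i].Conditions = append(machines[i].Conditions, c)
	}
	return allowed
}

func main() {
	machines := []Machine{
		{Name: "m-0", Healthy: true},
		{Name: "m-1", Healthy: false},
		{Name: "m-2", Healthy: false},
	}
	allowed := markUnhealthy(machines, 1)
	fmt.Println("remediation allowed:", allowed)
	for _, m := range machines {
		for _, c := range m.Conditions {
			fmt.Printf("%s: %s=%t reason=%s\n", m.Name, c.Type, c.Status, c.Reason)
		}
	}
}
```

With this shape, an operator could list machines by condition to find the unhealthy ones, whether or not the MHC was allowed to act on them.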

/kind feature

xref #3108

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 8, 2020
@vincepri
Member

/milestone Next

@vincepri vincepri added this to the Next milestone Jun 10, 2020
@k8s-ci-robot
Contributor

@vincepri: The label(s) priority/important-longerm cannot be applied, because the repository doesn't have them

In response to this:

/priority important-longerm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vincepri
Member

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jun 10, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 8, 2020
@fabriziopandini
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 9, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 8, 2020
@fabriziopandini
Member

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 9, 2020
@vincepri
Member

/close
in favor of #5451

@k8s-ci-robot
Contributor

@vincepri: Closing this issue.

In response to this:

/close
in favor of #5451

