MachineHealthCheck documentation should clarify which use cases covers #2861
Comments
As I wrote in Slack, MHC currently only reconciles Machines that have an OwnerReference to a MachineSet. We will be adding control plane remediation in a future release.
If this isn't super clear in the docs, we should make it so. I also got confused at first and assumed it would work with any Machine.
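To illustrate the limitation described above, here is a minimal sketch of the kind of ownership check involved. It is not the actual Cluster API controller code: `OwnerReference`, `Machine`, and `hasMachineSetOwner` are simplified, hypothetical stand-ins that only model the fields needed for the example.

```go
package main

import "fmt"

// OwnerReference is a simplified, hypothetical stand-in for the Kubernetes
// owner reference type; only the fields needed here are modeled.
type OwnerReference struct {
	Kind string
	Name string
}

// Machine is a hypothetical, minimal stand-in for a Cluster API Machine.
type Machine struct {
	Name            string
	OwnerReferences []OwnerReference
}

// hasMachineSetOwner reports whether any owner reference on the Machine
// has Kind "MachineSet".
func hasMachineSetOwner(m Machine) bool {
	for _, ref := range m.OwnerReferences {
		if ref.Kind == "MachineSet" {
			return true
		}
	}
	return false
}

func main() {
	owned := Machine{
		Name:            "worker-0",
		OwnerReferences: []OwnerReference{{Kind: "MachineSet", Name: "workers"}},
	}
	standalone := Machine{Name: "standalone-0"} // created directly, no owner

	for _, m := range []Machine{owned, standalone} {
		fmt.Printf("%s eligible for remediation: %t\n", m.Name, hasMachineSetOwner(m))
	}
}
```

Under this assumption, a Machine created directly (without a MachineSet or MachineDeployment behind it) would be skipped, which matches the behavior reported in this issue.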
/kind documentation
related #2836
/retitle MachineHealthCheck documentation should clarify which use cases covers
@jayunit100 I just had a quick read through this and noticed a couple of things about your examples. I'm not sure whether this was mentioned in Slack, so I'll post here for posterity.
@vincepri Re docs: This is already mentioned in the limitations and caveats section of the docs, but it has come up a few times, so I suspect not many people are making it to the bottom of the page. Do you think moving the limitations section higher up the page would be better?
Yeah, either a warning or informational sign at the top would be great.
I've added a PR to highlight the limitation at the top of the MHC docs #2875
Although this is a bug report, I think it might also correspond to a feature request: adding a status field to MachineHealthChecks that allows easy inspection of whether the MHC is targeting a non-empty set of nodes.
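As a rough sketch of what such a status could report, the snippet below counts how many machines a label selector matches. Everything here is hypothetical: `matchesLabels` and `countTargets` are illustrative helpers, only exact-match labels are modeled, and the label keys are made up for the example. It is not an existing MachineHealthCheck API.

```go
package main

import "fmt"

// matchesLabels reports whether every key/value pair in selector is also
// present in labels. An empty selector matches any label set.
func matchesLabels(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// countTargets returns how many of the given machine label sets are
// matched by the selector.
func countTargets(selector map[string]string, machines []map[string]string) int {
	n := 0
	for _, labels := range machines {
		if matchesLabels(selector, labels) {
			n++
		}
	}
	return n
}

func main() {
	// Hypothetical label keys and values, for illustration only.
	selector := map[string]string{"nodepool": "smoke-test-1"}
	machines := []map[string]string{
		{"nodepool": "smoke-test-1"},
		{"nodepool": "smoke-test-2"},
		{},
	}
	fmt.Printf("machines targeted: %d\n", countTargets(selector, machines))
}
```

A count of zero surfaced in a status field or a log line would make it obvious when a health check's selector is not matching anything.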
What steps did you take and what happened:
cluster-name
What did you expect to happen:
A log message showing what nodes my MHC was targeting, and maybe another log message saying something along the lines of "this machine exceeded its failure timeout, recreating!".
But I saw neither...
Anything else you would like to add:
I actually also saw all logs for capi-controller-manager freeze during this time and restarted it.
First time MHC user, so forgive me if I did something wrong.
Here's the list of clusters; the smoke-test-1... machine is definitely out of commission.
The health check which targeted the smoke-test-1 machines:
The machine that I deleted manually, which I expected to be cleaned up and recreated:
Environment:
Kubernetes version (use kubectl version): 1.17.3
/kind bug