fix: annotate nodes for reboot before aborting due to blocked #749
This PR moves the checking of "reboot blockers" (e.g., matching podSelectors running on the node to-be-rebooted) further down in the conditional reboot flow so that those checks happen after we (conditionally) annotate the nodes.
This enables the following when a reboot is detected (by default, the presence of the /var/run/reboot-required file):

- If --annotate-nodes is true, we annotate the nodes with the following annotations:
  - "weave.works/kured-reboot-in-progress" (indicates that this node will be rebooted at some point in the near future)
  - "weave.works/kured-most-recent-reboot-needed" (marks a timestamp of when this need for reboot was detected by kured)

I'm simplifying a bit above; there is a bit more complexity, but the key point is that we are now marking "this node is definitely going to be rebooted" prior to checking blockers. This allows other, complementary tooling to know about the state of the node ("gonna be rebooted soon") and do things like extra, manual stateful application migration off of that node, add additional taints to better prevent future scheduling, etc.
Fixes #702