Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Taint nodes with PreferNoSchedule during rollouts #7631

Closed
wants to merge 1 commit into from

Conversation

hiromu-a5a
Copy link
Contributor

What this PR does / why we need it:

If the MachineSet or MachineDeployment has many replicas, and each node has many pods, changes to an existing MachineDeployment or MachineSet infrastructure template can result in unnecessary pod churn. As the first node is drained, pods previously running on that node may be scheduled onto nodes who have yet to be replaced, but will be torn down soon. This can happen over and over again.

To avoid the above problem, this PR changes the machine controller to add PreferNoSchedule taint to nodes in old MachineDeployment. As mentioned in #7043 (comment), tainting should be triggered by MachineDeployment controller. In the MachineDeployment controller, the old MachineSets are given an annotation, then the annotation is propagated from MachineSet to Machines, and finally Machine with the annotation sets taint to Node.

This behavior may be related to #493

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #7043

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 28, 2022
@k8s-ci-robot
Copy link
Contributor

Welcome @hiromu-a5a!

It looks like this is your first PR to kubernetes-sigs/cluster-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 28, 2022
@k8s-ci-robot
Copy link
Contributor

Hi @hiromu-a5a. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 28, 2022
@hiromu-a5a
Copy link
Contributor Author

/cc @enxebre

@sbueringer
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 29, 2022
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
@enxebre
Copy link
Member

enxebre commented Dec 5, 2022

This approach is effectively implementing in place propagation of an annotation->taint. If we continue that direction this should be blocked until https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20221003-In-place-propagation-of-Kubernetes-objects-only-changes.md is implemented so we can have a clearer path.

Alternatively I can see this being built-in business logic as suggested in https://github.com/kubernetes-sigs/cluster-api/pull/7631/files#r1039744048

cc @sbueringer @fabriziopandini thoughts?

@fabriziopandini
Copy link
Member

This approach is effectively implementing in place propagation of an annotation->taint. If we continue that direction this should be blocked until https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20221003-In-place-propagation-of-Kubernetes-objects-only-changes.md is implemented so we can have a clearer path.

I kind of agree that we should wait for label propagation from machine set to machines to be implemented in the context of the work above. instead, annotation to -> taint is something specific of this PR and it won't be addressed by the work above

Copy link
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hiromu-a5a thanks for taking up this issue, this will be super valuable for users!
MD/MS code base is not ideal for first-time contributors, but it seems you are navigating it well; however we should sync with the label propagation effort in order to avoid duplicating/conflicting work
cc @ykakarap who is starting to look into this

internal/controllers/machine/machine_controller_noderef.go Outdated Show resolved Hide resolved
@ykakarap
Copy link
Contributor

ykakarap commented Dec 9, 2022

This is a really good issue to solve. The node labels proposal is also looking at using taints to solve the problem of workloads getting scheduled to nodes that it should not. I will take a closer look at it in a day or two and see if there are any conflicts and how we can get proceed further.

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Dec 14, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: hiromu-a5a / name: Hiromu Asahina (b049fc4)

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 14, 2022
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Dec 14, 2022
@k8s-ci-robot k8s-ci-robot removed the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 10, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jul 10, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 10, 2023
@vincepri
Copy link
Member

/retitle ✨ Taint nodes with PreferNoSchedule during rollouts

@k8s-ci-robot k8s-ci-robot changed the title 🌱 Add Taint during rolling update ✨ Taint nodes with PreferNoSchedule during rollouts Jul 10, 2023
@hiromu-a5a
Copy link
Contributor Author

@enxebre Yes. I did.

@vincepri
Copy link
Member

Changes LGTM

/assign @sbueringer

@sbueringer
Copy link
Member

I'll try to get to this soon

@enxebre
Copy link
Member

enxebre commented Nov 22, 2023

I wonder if it is really the right approach to propagate down the information via the MD controller to MachineSets and then to Machines. This requires a lot of API calls across the entire hierarchy.

Tangential discussion, but If we supported API input for taint propagation then MD could use computeDesiredMachineSet to just apply the opinionated taints to old machinesets while reusing already required api calls to perform a rollout.

Code changes lgtm.
Note the the PR is including watcher on MSs, not clear to me if wanted to include that initially based on #7631 (comment)

Copy link
Member

@vincepri vincepri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/assign @sbueringer

This seems all very reasonable to get in

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 29, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 2b1ef82b6213fdd832b79c253676dbe451a90b62

@sbueringer
Copy link
Member

Sorry for the delay. Taking a look now

@fabriziopandini
Copy link
Member

I will try to get a look as well, it seems a nice change to have

Copy link
Member

@sbueringer sbueringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipped reviewing unit tests for now.

Thank you very much for your continued work on this. Mostly smaller findings from my side.

I'll try to review this PR quicker from now on, so we can get it merged soon :)

util/util.go Show resolved Hide resolved
util/util.go Show resolved Hide resolved
test/e2e/data/test-extension/deployment.yaml Show resolved Hide resolved
@sbueringer
Copy link
Member

@hiromu-a5a Do you have time to follow-up on the review comments?

@sbueringer
Copy link
Member

@hiromu-a5a Do you have time to follow-up? Otherwise we could take over the remaining work. (we would like to get this feature into the next CAPI minor release)

@chrischdi
Copy link
Member

/close

Superseeded by:

Kudos to @hiromu-a5a for the awesome work!

@k8s-ci-robot
Copy link
Contributor

@chrischdi: Closed this PR.

In response to this:

/close

Superseeded by:

Kudos to @hiromu-a5a for the awesome work!

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CAPI controller should taint outdated nodes with PreferNoSchedule
9 participants