
✨ Use Out-of-service taint in Node remediation in place of deletion #1808

Merged

Conversation

clobrano
Contributor

What this PR does / why we need it:
Currently, Metal3Remediation deletes the Node object to speed up remediation. However, starting from Kubernetes 1.28 the out-of-service taint is GA, and CAPM3 can use it in place of deleting the node (see the illustrative sketch below).

Which issue(s) this PR fixes:
Fixes #1725
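
For context, the sketch below shows what using the taint could look like with client-go. The taint key and NoExecute effect come from the Kubernetes non-graceful node shutdown feature; the helper function, its name, and the error handling are illustrative assumptions, not the PR's actual code.

package remediation

import (
    "context"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// outOfServiceTaint is the taint defined by the Kubernetes non-graceful
// node shutdown feature (GA since 1.28). The value "nodeshutdown" follows
// the upstream documentation; only key and effect matter for matching.
var outOfServiceTaint = corev1.Taint{
    Key:    "node.kubernetes.io/out-of-service",
    Value:  "nodeshutdown",
    Effect: corev1.TaintEffectNoExecute,
}

// addOutOfServiceTaint is a hypothetical helper: it fetches the node,
// appends the taint if it is not already present, and updates the node
// instead of deleting the Node object as the current remediation does.
func addOutOfServiceTaint(ctx context.Context, cs kubernetes.Interface, nodeName string) error {
    node, err := cs.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
    if err != nil {
        return err
    }
    for _, t := range node.Spec.Taints {
        if t.MatchTaint(&outOfServiceTaint) {
            return nil // already tainted, nothing to do
        }
    }
    node.Spec.Taints = append(node.Spec.Taints, outOfServiceTaint)
    _, err = cs.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
    return err
}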

@metal3-io-bot metal3-io-bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 27, 2024
@metal3-io-bot
Contributor

Hi @clobrano. Thanks for your PR.

I'm waiting for a metal3-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@metal3-io-bot metal3-io-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jun 27, 2024
@tuminoid
Member

/ok-to-test

@metal3-io-bot metal3-io-bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 28, 2024
@tuminoid
Member

/cc @kashifest @Rozzii @mboukhalfa

@clobrano
Contributor Author

@clobrano: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command
unit | 60009dc | link | true | /test unit

Full PR test history. Your PR dashboard.

I must have forgotten one last change. I'll fix it right away.

@clobrano clobrano force-pushed the feature/out-of-service-taint-0 branch 3 times, most recently from f0ca462 to 97888f7 Compare June 28, 2024 11:12
@Rozzii Rozzii added this to the 1.8.0 milestone Jun 28, 2024
@metal3-io-bot metal3-io-bot added the needs-rebase Indicates that a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2024
@clobrano clobrano force-pushed the feature/out-of-service-taint-0 branch from 97888f7 to a76c074 Compare July 1, 2024 15:16
@metal3-io-bot metal3-io-bot removed the needs-rebase Indicates that a PR cannot be merged because it has merge conflicts with HEAD. label Jul 1, 2024
@adilGhaffarDev
Member

/test ?

@metal3-io-bot
Contributor

@adilGhaffarDev: The following commands are available to trigger required jobs:

  • /test build
  • /test generate
  • /test gomod
  • /test manifestlint
  • /test markdownlint
  • /test metal3-centos-e2e-integration-test-main
  • /test metal3-ubuntu-e2e-integration-test-main
  • /test shellcheck
  • /test test
  • /test unit

The following commands are available to trigger optional jobs:

  • /test metal3-centos-e2e-basic-test-main
  • /test metal3-centos-e2e-feature-test-main
  • /test metal3-e2e-1-26-1-27-upgrade-test-main
  • /test metal3-e2e-1-27-1-28-upgrade-test-main
  • /test metal3-e2e-1-28-1-29-upgrade-test-main
  • /test metal3-e2e-clusterctl-upgrade-test-main
  • /test metal3-ubuntu-e2e-basic-test-main
  • /test metal3-ubuntu-e2e-feature-test-main

Use /test all to run the following jobs that were automatically triggered:

  • build
  • generate
  • gomod
  • manifestlint
  • unit

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@adilGhaffarDev
Member

/test metal3-ubuntu-e2e-feature-test-main

Member

@kashifest kashifest left a comment

Thanks for adding the feature. I am in two minds about whether this feature should come from the CAPI side or from the infra provider's side, especially since it is about k8s node behavior. I will probably pose it as an open question for now. To be honest, I don't see any harm in doing it from the infra provider's side, but we don't usually do node operations like tainting nodes or checking whether they have drained from CAPM3. Anyway, a couple of comments inline for now.

func (r *RemediationManager) RemoveOutOfServiceTaint(ctx context.Context, clusterClient v1.CoreV1Interface, node *corev1.Node) error {
newTaints := []corev1.Taint{}

var IsPopOutOfServiceTaintServiceTaint bool
Member

The variable name seems to use ServiceTaint twice; is that intentional, or shall we just keep it simpler, like outOfServiceTaintDropped?

Contributor Author

This is definitely a typo, thanks!
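
For illustration, with the simpler name the filtering could read as below; the function name and signature are assumptions based on the snippet above, not the final code.

package remediation

import corev1 "k8s.io/api/core/v1"

// filterOutOfServiceTaint returns the node's taints without the
// out-of-service taint, plus a flag reporting whether it was present.
func filterOutOfServiceTaint(node *corev1.Node) ([]corev1.Taint, bool) {
    newTaints := []corev1.Taint{}
    outOfServiceTaintDropped := false
    for _, taint := range node.Spec.Taints {
        // Match on the well-known out-of-service key.
        if taint.Key == "node.kubernetes.io/out-of-service" {
            outOfServiceTaintDropped = true
            continue
        }
        newTaints = append(newTaints, taint)
    }
    return newTaints, outOfServiceTaintDropped
}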

}

// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;update;delete;deletecollection
Member

Do we need all the verbs, like update, delete, and deletecollection? As far as I can see in the code, list and watch should suffice, or maybe only list?

Member

I see the pod list on L545 and the annotation get on L170, but I fail to see where the client would need any of the other verbs. Can you elaborate on why they'd be needed?

Contributor Author

You both are correct. I need to delete some verbs.
I will test if get is also necessary

Contributor Author

It seems that listing VAs does need the get verb:

pkg/mod/k8s.io/client-go@v0.29.5/tools/cache/reflector.go:229: Failed to watch *v1.VolumeAttachment: unknown (get volumeattachments.storage.k8s.io)
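
Based on this thread, a trimmed set of markers could look roughly like the sketch below; the exact verbs are only a guess here and have to be verified against what the controller and its informers actually call.

// Hypothetical reduced RBAC markers (verbs to be confirmed in the code):
// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch
// +kubebuilder:rbac:groups=storage.k8s.io,resources=volumeattachments,verbs=get;list;watch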

}

// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;update;delete;deletecollection
// +kubebuilder:rbac:groups=storage.k8s.io,resources=volumeattachments,verbs=get;list;watch;update;delete;deletecollection
Member

Same question: do we need all the verbs?

Member

I see List on L561, but nothing else being used. Same question: where do we need them?

@kashifest
Member

kashifest commented Jul 2, 2024

/cc @zaneb @dtantsur @lentzi90 @tuminoid @mboukhalfa @honza it would be good to have your feedback on this

@metal3-io-bot
Contributor

@kashifest: GitHub didn't allow me to request PR reviews from the following users: it, would, be, your, on, good, top, have, this.

Note that only metal3-io members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @zaneb @dtantsur @lentzi90 @tuminoid @mboukhalfa @honza it would be good top have your feedback on this

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Member

@tuminoid tuminoid left a comment

RBAC scope questions.

}

// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;update;delete;deletecollection
Member

I see the pod list on L545 and the annotation get on L170, but I fail to see where the client would need any of the other verbs. Can you elaborate on why they'd be needed?

}

// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;update;delete;deletecollection
// +kubebuilder:rbac:groups=storage.k8s.io,resources=volumeattachments,verbs=get;list;watch;update;delete;deletecollection
Member

I see List on L561, but nothing else being used. Same question: where do we need them?

@clobrano clobrano force-pushed the feature/out-of-service-taint-0 branch from a76c074 to 11a74e9 Compare July 3, 2024 13:47
@zaneb
Member

zaneb commented Jul 4, 2024

I was going to tag @clobrano as the expert on this subject, but I see that he is already the author of the PR 🙂

@clobrano clobrano force-pushed the feature/out-of-service-taint-0 branch from 11a74e9 to 608174e Compare July 5, 2024 09:02
@metal3-io-bot
Contributor

metal3-io-bot commented Jul 5, 2024

@clobrano: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
metal3-ubuntu-e2e-feature-test-main | a76c074 | link | false | /test metal3-ubuntu-e2e-feature-test-main

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@Rozzii
Member

Rozzii commented Jul 8, 2024

@clobrano could you please squash the commits and sign the resulting commit off, to be compliant with the DCO enforcement policy: https://github.com/metal3-io/cluster-api-provider-metal3/pull/1808/checks?check_run_id=27076588793

@clobrano
Contributor Author

clobrano commented Jul 8, 2024

@clobrano could you please squash the commits and sign the resulting commit off, to be compliant with the DCO enforcement policy: https://github.com/metal3-io/cluster-api-provider-metal3/pull/1808/checks?check_run_id=27076588793

sure, I'll do it right now

If the out-of-service taint (OOST) is supported (k8s server version >=
1.28), enable Metal3RemediationController to set the OOST on the
target node instead of deleting it.

Ensure the target node is drained (no stateful pod running) before
moving on to the waiting state and powering the host on again.

When the host is powered on, remove the OOST from the target node.

Signed-off-by: Carlo Lobrano <c.lobrano@gmail.com>
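
As a rough illustration of the version gate mentioned above, the check could be done with the discovery client as sketched below; the function name and the exact detection logic are assumptions, not necessarily what the PR implements.

package remediation

import (
    "k8s.io/apimachinery/pkg/util/version"
    "k8s.io/client-go/discovery"
)

// minOutOfServiceTaintVersion is the first release where the
// out-of-service taint is GA.
var minOutOfServiceTaintVersion = version.MustParseSemantic("1.28.0")

// isOutOfServiceTaintSupported asks the API server for its version and
// reports whether the out-of-service taint can replace node deletion.
func isOutOfServiceTaintSupported(dc discovery.DiscoveryInterface) (bool, error) {
    info, err := dc.ServerVersion()
    if err != nil {
        return false, err
    }
    v, err := version.ParseSemantic(info.GitVersion)
    if err != nil {
        return false, err
    }
    return v.AtLeast(minOutOfServiceTaintVersion), nil
}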
@clobrano clobrano force-pushed the feature/out-of-service-taint-0 branch from 8be7da9 to c6cd725 Compare July 8, 2024 13:48
@Rozzii
Member

Rozzii commented Jul 9, 2024

/test metal3-centos-e2e-integration-test-main
/test metal3-ubuntu-e2e-integration-test-main

Member

@Rozzii Rozzii left a comment

/lgtm
Just writing down my understanding, because I was unsure at first about this topic:
The logic for detecting the "success" of the draining is a bit messy IMO, but it is a Kubernetes thing. AFAIK there is no clear indication of whether draining has completed for a node, because as far as I understand there is no such process as "draining" from the K8s API perspective. What actually happens is pod deletion and/or pod eviction at node scale: each eviction+deletion is done in the context of a pod, not in the context of the node, but it is done for all pods of the node. Because of this, the only way to detect the "status of the draining" is to check whether any Pod (of the draining Node) is still around with a deletion timestamp. Please correct me if I am wrong!

@metal3-io-bot metal3-io-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 9, 2024
@Rozzii
Member

Rozzii commented Jul 9, 2024

Thanks for adding the feature. I am in two minds about whether this feature should come from the CAPI side or from the infra provider's side, especially since it is about k8s node behavior. I will probably pose it as an open question for now. To be honest, I don't see any harm in doing it from the infra provider's side, but we don't usually do node operations like tainting nodes or checking whether they have drained from CAPM3. Anyway, a couple of comments inline for now.

After a bit of contemplation, I came to the conclusion that IMO CAPM3 is an acceptable location for this feature, because it only enhances the existing remediation process, and remediation is the responsibility of the infra provider.

@clobrano
Contributor Author

clobrano commented Jul 9, 2024

/lgtm Just writing down my understanding, because I was unsure at first about this topic: The logic for detecting the "success" of the draining is a bit messy IMO, but it is a Kubernetes thing. AFAIK there is no clear indication of whether draining has completed for a node, because as far as I understand there is no such process as "draining" from the K8s API perspective. What actually happens is pod deletion and/or pod eviction at node scale: each eviction+deletion is done in the context of a pod, not in the context of the node, but it is done for all pods of the node. Because of this, the only way to detect the "status of the draining" is to check whether any Pod (of the draining Node) is still around with a deletion timestamp. Please correct me if I am wrong!

correct :)
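
For readers following along, here is a minimal sketch of that check, under the assumption that a pod still carrying a deletion timestamp means the drain has not finished; the helper name and field selector are illustrative, not the PR's exact code.

package remediation

import (
    "context"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// nodeDrainCompleted lists the pods scheduled on the node and reports
// whether none of them is still pending deletion: once the out-of-service
// taint kicks in, remaining pods carry a deletion timestamp until they are
// forcefully removed.
func nodeDrainCompleted(ctx context.Context, cs kubernetes.Interface, nodeName string) (bool, error) {
    pods, err := cs.CoreV1().Pods("").List(ctx, metav1.ListOptions{
        FieldSelector: "spec.nodeName=" + nodeName,
    })
    if err != nil {
        return false, err
    }
    for _, pod := range pods.Items {
        if pod.DeletionTimestamp != nil {
            // A pod is still being torn down, so draining is not done yet.
            return false, nil
        }
    }
    return true, nil
}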

@adilGhaffarDev
Member

/approve
Thank you for working on this.

Let's also update our book: https://github.com/metal3-io/metal3-docs/blob/main/docs/user-guide/src/capm3/remediaton.md#metal3-remediation Can you open a PR for that too?

I am running the e2e feature again to be on the safe side.
/test metal3-ubuntu-e2e-feature-test-main

@metal3-io-bot
Contributor

@adilGhaffarDev: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test build
  • /test generate
  • /test gomod
  • /test manifestlint
  • /test markdownlint
  • /test metal3-centos-e2e-integration-test-main
  • /test metal3-ubuntu-e2e-integration-test-main
  • /test shellcheck
  • /test test
  • /test unit

The following commands are available to trigger optional jobs:

  • /test metal3-centos-e2e-basic-test-main
  • /test metal3-centos-e2e-feature-test-main-features
  • /test metal3-centos-e2e-feature-test-main-pivoting
  • /test metal3-centos-e2e-feature-test-main-remediation
  • /test metal3-e2e-1-26-1-27-upgrade-test-main
  • /test metal3-e2e-1-27-1-28-upgrade-test-main
  • /test metal3-e2e-1-28-1-29-upgrade-test-main
  • /test metal3-e2e-clusterctl-upgrade-test-main
  • /test metal3-ubuntu-e2e-basic-test-main
  • /test metal3-ubuntu-e2e-feature-test-main-features
  • /test metal3-ubuntu-e2e-feature-test-main-pivoting
  • /test metal3-ubuntu-e2e-feature-test-main-remediation

Use /test all to run the following jobs that were automatically triggered:

  • build
  • generate
  • gomod
  • manifestlint
  • unit

In response to this:

/approve
Thank you for working on this.

Let's also update our book: https://github.com/metal3-io/metal3-docs/blob/main/docs/user-guide/src/capm3/remediaton.md#metal3-remediation Can you open a PR for that too?

I am running the e2e feature again to be on the safe side.
/test metal3-ubuntu-e2e-feature-test-main

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@metal3-io-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adilGhaffarDev

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@metal3-io-bot metal3-io-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 25, 2024
@adilGhaffarDev
Member

/test metal3-centos-e2e-feature-test-main-remediation

@adilGhaffarDev
Member

/hold
let's wait for the tests to pass

@metal3-io-bot metal3-io-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 25, 2024
@clobrano
Contributor Author

/approve Thank you for working on this.

Let's also update our book: https://github.com/metal3-io/metal3-docs/blob/main/docs/user-guide/src/capm3/remediaton.md#metal3-remediation Can you open a PR for that too?

I am running the e2e feature again to be on the safe side. /test metal3-ubuntu-e2e-feature-test-main

sure no problem

🤞 for the last tests

@adilGhaffarDev
Member

/hold cancel

@metal3-io-bot metal3-io-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 26, 2024
@metal3-io-bot metal3-io-bot merged commit a599911 into metal3-io:main Jul 26, 2024
17 checks passed
@clobrano clobrano deleted the feature/out-of-service-taint-0 branch July 26, 2024 07:11
clobrano added a commit to clobrano/metal3-docs that referenced this pull request Jul 29, 2024
Update the Metal3 Remediation section with the new step Out-of-Service
taint replacing node deletion.

See metal3-io/cluster-api-provider-metal3#1808
clobrano added a commit to clobrano/metal3-docs that referenced this pull request Jul 29, 2024
Update the Metal3 Remediation section with the new step Out-of-Service
taint replacing node deletion.

See metal3-io/cluster-api-provider-metal3#1808

Signed-off-by: Carlo Lobrano <c.lobrano@gmail.com>