
Add CP node deletion param configurable #6571

Open · wants to merge 2 commits into master from prachigandhi/make-cp-node-deletion-configurable

Conversation

@gandhipr (Contributor) commented Feb 27, 2024

What type of PR is this?

feature

What this PR does / why we need it:

Make the cloud-provider node deletion timeout configurable so that each cloud provider can set it to suit its needs.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

maxCloudProviderNodeDeletionTime (max-cloud-provider-node-deletion-time) can now be configured by users; the default of 5 minutes preserves the current behavior.
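
For example, an operator whose infrastructure routinely needs more than the default 5 minutes to tear down a node could pass --max-cloud-provider-node-deletion-time=10m; the flag is a Go time.Duration, so values such as 10m or 1h parse as expected.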

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the cncf-cla: yes and size/M labels on Feb 27, 2024
@k8s-ci-robot added the needs-rebase label on Mar 16, 2024
@tallaxes (Contributor) left a comment

I don't see maxCloudProviderNodeDeletionTime propagated into / captured in AutoscalingOptions?
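
For readers following along, the wiring being asked about would look roughly like the sketch below. This is a minimal, self-contained illustration; the AutoscalingOptions field name is assumed from the flag name and is not taken from the PR's diff.

    package main

    import (
        "flag"
        "fmt"
        "time"
    )

    // AutoscalingOptions mirrors, in miniature, the struct the question refers
    // to. The MaxCloudProviderNodeDeletionTime field name is an assumption
    // derived from the flag name.
    type AutoscalingOptions struct {
        MaxCloudProviderNodeDeletionTime time.Duration
    }

    var maxCloudProviderNodeDeletionTime = flag.Duration(
        "max-cloud-provider-node-deletion-time", 5*time.Minute,
        "Maximum time needed by cloud provider to delete a node.")

    func main() {
        flag.Parse()
        // Propagate the parsed flag into the options struct so downstream
        // consumers read the configured value rather than a hard-coded constant.
        opts := AutoscalingOptions{
            MaxCloudProviderNodeDeletionTime: *maxCloudProviderNodeDeletionTime,
        }
        fmt.Println("node deletion timeout:", opts.MaxCloudProviderNodeDeletionTime)
    }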

cluster-autoscaler/main.go — review thread (outdated, resolved)
@tallaxes (Contributor)

@gandhipr, please also update the user-facing section in the description. I'm pretty sure it becomes part of the release notes, so it needs to document the new flag (even if it has a default value that preserves current behavior).

@gandhipr force-pushed the prachigandhi/make-cp-node-deletion-configurable branch from cc5b031 to dabc007 on April 2, 2024 00:52
linux-foundation-easycla bot commented Apr 2, 2024

CLA Signed

The committers are authorized under a signed CLA.

@k8s-ci-robot added the area/provider/azure, cncf-cla: no, and size/XXL labels and removed the cncf-cla: yes and size/M labels on Apr 2, 2024
@gandhipr force-pushed the prachigandhi/make-cp-node-deletion-configurable branch from dabc007 to 4c0950e on April 2, 2024 00:54
@k8s-ci-robot added the cncf-cla: yes and size/M labels and removed the cncf-cla: no, needs-rebase, and size/XXL labels on Apr 2, 2024
cluster-autoscaler/main.go — review thread (resolved)
@jackfrancis (Contributor)

/test ls

@k8s-ci-robot

@jackfrancis: The specified target(s) for /test were not found.
The following commands are available to trigger optional jobs:

  • /test pull-cluster-autoscaler-e2e-azure

In response to this:

/test ls

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jackfrancis (Contributor)

/test pull-cluster-autoscaler-e2e-azure

@jackfrancis (Contributor) left a comment

/lgtm

This seems sensible to me.

cc @MaciekPytel @gjtempleton @elmiko

I was going to request a helm chart update to include this new config option, but I noticed there are a bunch of options not covered by the chart. I can do a follow-up PR covering all of that work, inclusive of this new change.

Also, is there additional documentation reinforcement we can do to ensure that the configurable surface area is fully documented? I see gaps there as well.

@k8s-ci-robot added the lgtm label on May 1, 2024
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gandhipr, jackfrancis, rakechill, tallaxes
Once this PR has been reviewed and has the lgtm label, please assign towca for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jackfrancis (Contributor)

ping @MaciekPytel @gjtempleton @towca for approval as this touches the general cloudprovider code (not just Azure)

@gjtempleton (Member) left a comment

I feel like I'm maybe missing something here: the PR description describes allowing cloud providers to set this as necessary, but it's being exposed as a user-configurable flag (and the mention of following up with helm chart changes backs this up).

What's the motivation here? Users with nodes which may require extended times for safe node shutdowns?

    frequentLoopsEnabled             = flag.Bool("frequent-loops-enabled", false, "Whether clusterautoscaler triggers new iterations more frequently when it's needed")
    provisioningRequestsEnabled      = flag.Bool("enable-provisioning-requests", false, "Whether the clusterautoscaler will be handling the ProvisioningRequest CRs.")
    maxCloudProviderNodeDeletionTime = flag.Duration("max-cloud-provider-node-deletion-time", 5*time.Minute, "Maximum time needed by cloud provider to delete a node.")

If we're adding this in as a new flag we should update the ridiculously long table of options in the FAQ as well.

(We should really be auto-generating that; I've raised an issue in the hope someone picks it up.)

+1 (Contributor)

@elmiko (Contributor) left a comment

this makes sense to me, but i agree with @gjtempleton's comments about the description and the updates to the FAQ.

    frequentLoopsEnabled             = flag.Bool("frequent-loops-enabled", false, "Whether clusterautoscaler triggers new iterations more frequently when it's needed")
    provisioningRequestsEnabled      = flag.Bool("enable-provisioning-requests", false, "Whether the clusterautoscaler will be handling the ProvisioningRequest CRs.")
    maxCloudProviderNodeDeletionTime = flag.Duration("max-cloud-provider-node-deletion-time", 5*time.Minute, "Maximum time needed by cloud provider to delete a node.")
+1 (Contributor)

@jackfrancis (Contributor)

thx @gjtempleton @elmiko

What's the motivation here? Users with nodes which may require extended times for safe node shutdowns?

Right. I think the triggers that fire after the node-deletion timeout don't cover the rare edge cases where infrastructure plus node deletion takes longer than 5 minutes, so making this configurable lets providers tune the limit for those use cases.
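
To make the edge case concrete, here is a minimal sketch of the kind of check such a timeout drives; the helper name and the exact place cluster-autoscaler applies the limit are assumptions, not the PR's code:

    package main

    import (
        "fmt"
        "time"
    )

    // nodeDeletionStalled is a hypothetical helper (not from the PR) showing
    // how a configurable deletion timeout is consumed: a node whose
    // cloud-provider deletion began more than maxDeletionTime ago is treated
    // as stalled. With a hard-coded 5m limit, providers whose teardown
    // legitimately takes longer get false positives; a flag-driven limit
    // avoids that.
    func nodeDeletionStalled(deletionStart, now time.Time, maxDeletionTime time.Duration) bool {
        return now.Sub(deletionStart) > maxDeletionTime
    }

    func main() {
        now := time.Now()
        deletionStart := now.Add(-7 * time.Minute)
        // Default 5m: a deletion running for 7 minutes is flagged as stalled.
        fmt.Println(nodeDeletionStalled(deletionStart, now, 5*time.Minute)) // true
        // With --max-cloud-provider-node-deletion-time=10m it is still in budget.
        fmt.Println(nodeDeletionStalled(deletionStart, now, 10*time.Minute)) // false
    }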

Sounds like the action items are:

  1. update the description
  2. helm chart integration
  3. docs update

I'm not sure whether @gandhipr is able to move this forward; if not, we'll move this to a new PR with the above feedback incorporated. Stay tuned!

@k8s-ci-robot

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the needs-rebase label on Aug 22, 2024
Labels: area/cluster-autoscaler, area/provider/azure, cncf-cla: yes, lgtm, needs-rebase, size/M