Add support for all-or-nothing scale-up strategy #6821
Conversation
Can you add some test cases?
			return stopAllOrNothingScaleUp(podEquivalenceGroups, skippedNodeGroups, nodeGroups)
		}
	}

	// Execute scale up.
	klog.V(1).Infof("Final scale-up plan: %v", scaleUpInfos)
	aErr, failedNodeGroups := o.scaleUpExecutor.ExecuteScaleUps(scaleUpInfos, nodeInfos, now)
We should also add an allOrNothing variable here, and use the AtomicIncreaseSize() method if allOrNothing=true.
Done
@@ -79,9 +80,9 @@ func (o *WrapperOrchestrator) ScaleUp(
	}

	if o.scaleUpRegularPods {
-		return o.podsOrchestrator.ScaleUp(regularPods, nodes, daemonSets, nodeInfos)
+		return o.podsOrchestrator.ScaleUp(regularPods, nodes, daemonSets, nodeInfos, allOrNothing)
I think for podsOrchestrator allOrNothing should be false.
It is false and set in static_autoscaler.go; I'm just passing the variable here.
I wanted this to be set to whatever is the default for the autoscaler, and the custom orchestrators to override it only as needed (like atomic scale-up for ProvisioningRequests will).
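For reference, a minimal sketch of that default path (not the exact upstream diff; the variable names from static_autoscaler.go are assumed):

	// Hedged sketch: the regular scale-up path in static_autoscaler.go passes
	// allOrNothing=false, so only custom orchestrators (e.g. atomic scale-up for
	// ProvisioningRequests) opt into the all-or-nothing behaviour.
	scaleUpStatus, typedErr := a.scaleUpOrchestrator.ScaleUp(
		unschedulablePodsToHelp, readyNodes, daemonsets, nodeInfosForGroups,
		false, // allOrNothing: keep the default behaviour that allows partial scale-ups.
	)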
Force-pushed from 0fd691d to accc515.
/lgtm
	if allOrNothing && bestOption.NodeGroup.MaxSize() < newNodes {
		klog.V(1).Infof("Can only create a new node group with max %d nodes, need %d nodes", bestOption.NodeGroup.MaxSize(), newNodes)
		return stopAllOrNothingScaleUp(podEquivalenceGroups, skippedNodeGroups, nodeGroups)
	}
Is this needed? Will it not be checked in 155-156 and ComputeExpansionOption?
- In 155-156 we only exclude cases where there's at least one pod that can't be scheduled on nodes belonging to this group at all. If all pods are compatible with that node group, we'll just get a higher node count (possibly higher than the group's max size, but we can still balance across similar groups later).
- In computeExpansionOption we only cap the node count for a special case (groups that scale from 0 to max).

We don't check the group's max size in all cases in either of those places because of balancing across similar groups, which we only do just before executing the scale-up.
So without this check, for a scale-up not affected by the above two cases, we could create a node group and then decide not to scale it up.
I don't think MaxSize() is checked before balancing node groups below, because the amount over max can be distributed to the similar node groups.
But why check newNodes against the max size of just one group, if we're balancing between groups below (and verifying that the total capacity is big enough) anyway?
But why check newNodes against the max size of just one group, if we're balancing between groups below (and verifying that the total capacity is big enough) anyway?
This is checked only for an autoprovisioned group, just before it's created. I think we currently don't have information about how many similar groups there'll be. I suspect estimating this may require changes on the cloud provider side (for example, if creating a group in one zone implicitly creates groups in other zones with identical taints/labels/resources).
We agreed with @kisieland offline that it's acceptable to initially limit autoprovisioning to the max size of one group, and to leave an open issue for handling this better in cases where an autoprovisioned group will have non-obvious similar groups.
Do I understand correctly that the final total capacity check would still catch this (because the balancing processor also caps all the requests to MaxSize()), but this lets us avoid an unnecessary node group creation in that case?
In any case, this seems okay. As a nit, maybe leave a TODO for the issue when you have it.
Do I understand correctly that the final total capacity check would still catch this (because the balancing processor also caps all the requests to MaxSize()), but this lets us avoid an unnecessary node group creation in that case?
Yes, that's exactly the case.
As a nit, maybe leave a TODO for the issue when you have it.
Will do.
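For illustration, a hedged sketch of the kind of final total-capacity check being discussed here (the helper and variable names appear elsewhere in this PR; the ScaleUpInfo fields are assumed):

	// After balancing across similar node groups, verify the combined increase
	// still covers every pod; otherwise give up instead of scaling up partially.
	if allOrNothing {
		totalCapacity := 0
		for _, info := range scaleUpInfos {
			totalCapacity += info.NewSize - info.CurrentSize
		}
		if totalCapacity < newNodes {
			return stopAllOrNothingScaleUp(podEquivalenceGroups, skippedNodeGroups, nodeGroups)
		}
	}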
Please add some test cases
All of my new test cases are in the other PR: #6824
I can add some here as well to have more granularity.
Added a test case which fails if allOrNothing is false. More extensive test cases remain in #6824.
	var err error
	if allOrNothing {
		err = info.Group.AtomicIncreaseSize(increase)
	} else {
		err = info.Group.IncreaseSize(increase)
	}
	if err != nil {
Also, this should fall back to IncreaseSize(increase) if AtomicIncreaseSize(increase) is not implemented.
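Roughly along these lines, sketched under the assumption that providers signal an unimplemented method with the cloudprovider.ErrNotImplemented sentinel:

	var err error
	if allOrNothing {
		// Try the atomic resize first, then fall back to a regular resize if the
		// cloud provider doesn't implement it (sentinel error assumed).
		err = info.Group.AtomicIncreaseSize(increase)
		if err == cloudprovider.ErrNotImplemented {
			err = info.Group.IncreaseSize(increase)
		}
	} else {
		err = info.Group.IncreaseSize(increase)
	}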
Done
Force-pushed from accc515 to 60ffbf1.
Resolved (outdated) review comment on cluster-autoscaler/provisioningrequest/orchestrator/orchestrator.go.
Force-pushed from 5524ddd to 4a9d939.
@@ -89,6 +89,7 @@ func (o *ScaleUpOrchestrator) ScaleUp(
	nodes []*apiv1.Node,
	daemonSets []*appsv1.DaemonSet,
	nodeInfos map[string]*schedulerframework.NodeInfo,
+	allOrNothing bool, // Either request enough capacity for all unschedulablePods, or don't request it at all.
nit: Could you also copy the comment to the Orchestrator interface (e.g. in the follow-up PR)?
Will do in follow-up PR
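A hedged sketch of what copying the comment to the interface could look like (interface shape and return types are assumed from the surrounding code; other methods omitted):

	type Orchestrator interface {
		// ScaleUp tries to scale the cluster up so that the given pods can be scheduled.
		ScaleUp(
			unschedulablePods []*apiv1.Pod,
			nodes []*apiv1.Node,
			daemonSets []*appsv1.DaemonSet,
			nodeInfos map[string]*schedulerframework.NodeInfo,
			// Either request enough capacity for all unschedulablePods, or don't request it at all.
			allOrNothing bool,
		) (*status.ScaleUpStatus, errors.AutoscalerError)
		// ...
	}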
	// Can't execute a scale-up that will accommodate all pods, so nothing is considered schedulable.
	klog.V(1).Info("Not attempting scale-up due to all-or-nothing strategy: not all pods would be accommodated")
	markedEquivalenceGroups := markAllGroupsAsUnschedulable(podEquivalenceGroups, AllOrNothingReason)
	return buildNoOptionsAvailableStatus(markedEquivalenceGroups, skippedNodeGroups, nodeGroups), nil
Could you use the helper function for the 2 other ScaleUpNoOptionsAvailable returns from this method?
This can be fixed in the follow-up PR as well if you prefer
Will do!
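For context, a hedged sketch of how the "no options available" exits could share the helpers shown above (parameter types assumed; the real stopAllOrNothingScaleUp may differ):

	func stopAllOrNothingScaleUp(egs []*equivalence.PodGroup, skipped map[string]status.Reasons, ngs []cloudprovider.NodeGroup) (*status.ScaleUpStatus, errors.AutoscalerError) {
		// Mark every equivalence group as unschedulable and return the shared
		// "no options available" status instead of building it inline.
		klog.V(1).Info("Not attempting scale-up due to all-or-nothing strategy: not all pods would be accommodated")
		marked := markAllGroupsAsUnschedulable(egs, AllOrNothingReason)
		return buildNoOptionsAvailableStatus(marked, skipped, ngs), nil
	}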
Left 3 nits, otherwise LGTM! Feel free to unhold if you don't agree with the nits or if you prefer to address them in the follow-up PR.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: aleksandra-malinowska, towca
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Stuff left to address in follow-up PR:
/unhold
* Add support for all-or-nothing scale-up strategy
* Review fixes
/cherry-pick cluster-autoscaler-release-1.30
@yaroslava-serdiuk: #6821 failed to apply on top of branch "cluster-autoscaler-release-1.30":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
* Add support for all-or-nothing scale-up strategy
* Review fixes
/cherry-pick cluster-autoscaler-release-1.30
@yaroslava-serdiuk: new pull request created: #7015
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Implements the all-or-nothing scaling strategy. This is needed to support atomic-scale-up.kubernetes.io ProvisioningRequests: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/provisioning-request.md

Issue: #6815
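As a hedged illustration of the intended split (names assumed, not actual call sites): the regular pod path keeps allOrNothing=false, while an atomic ProvisioningRequest scale-up passes true.

	// Default path: partial scale-ups are allowed.
	scaleUpStatus, err := orchestrator.ScaleUp(regularPods, nodes, daemonSets, nodeInfos, false)
	// Atomic ProvisioningRequest path: either capacity for all pods is requested, or none at all.
	scaleUpStatus, err = orchestrator.ScaleUp(provReqPods, nodes, daemonSets, nodeInfos, true)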
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
/cc @yaroslava-serdiuk @towca