Convert scale-down checks to drainability rules #6164

Merged (9 commits) into kubernetes:master on Oct 11, 2023

Conversation

@artemvmin (Contributor) commented Sep 30, 2023:

What type of PR is this?

/kind feature

What this PR does / why we need it:

Convert replicated pod, kube-system, not-safe-to-evict annotation, local storage, and pdb scale-down checks to drainability rules.

Related: #6135

Does this PR introduce a user-facing change?

None
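
For context, the drainability rule shape that the scale-down checks are converted into looks roughly like the following minimal sketch, inferred from the diff excerpts quoted in the review threads below. The package layout is an assumption; Drainable's signature matches the quoted snippets.

```go
// Sketch of the rule interface implied by the snippets in this PR.
package rules

import (
	apiv1 "k8s.io/api/core/v1"

	"k8s.io/autoscaler/cluster-autoscaler/simulator/drainability"
)

// Rule decides whether a pod allows, blocks, or is skipped during a
// simulated node drain. Returning an undefined status defers the
// decision to the remaining rules.
type Rule interface {
	Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status
}
```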

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 30, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 30, 2023
@artemvmin (Contributor Author):

/assign x13n

@artemvmin changed the title from "Scale down drainability" to "Convert scale-down checks to scalability rules" on Sep 30, 2023
@artemvmin changed the title from "Convert scale-down checks to scalability rules" to "Convert scale-down checks to drainability rules" on Sep 30, 2023
@artemvmin force-pushed the scale-down-drainability branch 2 times, most recently from 00ba312 to 1775757, on September 30, 2023 07:26

```go
// Drainable decides what to do with replicated pods on node drain.
func (Rule) Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status {
	if drain.IsPodLongTerminating(pod, drainCtx.Timestamp) {
```
@x13n (Member):

Instead of having checks like this which are not relevant to the specific rule (why does replicated rule check for things like whether the pod is long terminating? Or - below - why does it check whether safe-to-evict annotation is added?), I'd just make sure the rules are used in the right order. Otherwise it may be hard to reason about them and even harder to make changes since now the same check appears in many different rules.

@artemvmin (Contributor Author):

The original logic only performs checks (replicated, safe-to-evict, system, etc.) if the pod is not long terminating. Therefore, all of these rules have to check this and exit with nil (i.e. UndefinedStatus) if the pod is not worth considering. Indenting error flow is the recommended style pattern: https://google.github.io/styleguide/go/decisions#indent-error-flow

All checks that return non-nil (i.e. BlockingStatus) are correctly distributed among rules.

Do you have ideas for how to simplify this?
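
For reference, the early-exit pattern being debated looks roughly like this sketch extending the snippet above (imports as in the interface sketch earlier; the NewUndefinedStatus constructor name comes from later snippets in this thread):

```go
// Sketch: each rule first bails out with an undefined status for pods
// it should not consider, keeping the rule-specific checks at minimal
// indentation, per the style guide linked above.
func (Rule) Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status {
	if drain.IsPodLongTerminating(pod, drainCtx.Timestamp) {
		// Not this rule's concern; defer to the remaining rules.
		return drainability.NewUndefinedStatus()
	}
	// ... replicated-pod checks that may return a blocking status ...
	return drainability.NewUndefinedStatus()
}
```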

@x13n (Member):

My thinking was to have a Rule for handling long terminating pods and just skipping them. This is exactly what happens with mirror pods already - they could be combined into a single rule to reduce logic-to-boilerplate ratio - or kept as separate super-simple rules.

@artemvmin (Contributor Author):

This is a great idea. Done.

Unlike the mirror pods, we can't use skip directly, as these pods are usually not skipped. I added a field to the drainability.Status object to interrupt remaining rule checks without causing a skip condition.
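
The field referred to here shows up in a later diff hunk. Reconstructed roughly as a sketch (the field comment is truncated in the hunk, so its completion and the OutcomeType name are assumptions; the field was ultimately removed again, see the thread below):

```go
// OutcomeType enumerates rule outcomes; the name is an assumption.
type OutcomeType int

// Status is the result of evaluating a drainability rule for a pod
// (sketch reconstructed from the hunk quoted later in this thread).
type Status struct {
	// Outcome of the evaluation, e.g. UndefinedOutcome, DrainOk or
	// SkipDrain.
	Outcome OutcomeType
	// BlockingReason contains the reason why the pod blocks the drain.
	BlockingReason drain.BlockingPodReason
	// Error contains an optional error message.
	Error error
	// Interrupted means that the Rule returning the status exited early
	// and remaining rules should not be evaluated (completion of the
	// truncated comment is an assumption).
	Interrupted bool
}
```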

(Resolved review thread on cluster-autoscaler/simulator/drainability/context.go, marked outdated.)
```go
	replicated.New(),
	system.New(),
	notsafetoevict.New(),
	localstorage.New(),
```
@x13n (Member):

localstorage doesn't make sense to be used at all when DeleteOptions.SkipNodesWithLocalStorage is false. Can DeleteOptions be passed here to determine which rules are used instead?

Similarly, system is only useful when SkipNodesWithSystemPods is true.

@artemvmin (Contributor Author):

That trades a tiny amount of computation for coupling between rules. With the current implementation, the rules framework just registers known rules and doesn't have to have any understanding of rule conditions. Putting any amount of validation higher up in the drainability rules framework is a bad idea for modularity.

One option is to add an Enabled(drainCtx, deleteOptions) method to the rules interface that will be run when generating rules. However, I would argue the overhead isn't worth it. A single Drainable() function that returns early is minimal computation and keeps all logic in a single place.

@x13n (Member):

rules.Default already has to pass enablement flags to specific rules - it's just that now the rules need to handle that state themselves, which makes each one more complex - some rules now have to handle a case in which they behave as if they didn't exist at all. It's not really about saving compute time - you're right that's negligible - but rather about keeping the rules logic focused on what they're supposed to be checking. I'd expect something else (now it is rules.Default) to collect all rules that should be used.

That being said, I don't have a strong opinion on this, so leaving it up to you.

@artemvmin (Contributor Author):

I implemented this in the latest commit. I'm still not convinced.

What I like about this implementation is that it attempts to keep DeleteOptions at the framework level and not pass them down to the rules. Unfortunately, we still need to pass the legacy custom controller option to the replicated rule until the option is deprecated, which nullifies this argument.

What I dislike about it is that it adds complexity to the framework without much benefit. While not having a rule is conceptually better than the enabled field within select rules, the field is actually well contained within a rule and uses minimal compute to exit early. I can see this evolving into more business logic living in the DefaultRules construction and that turning into spaghetti code. Simple registration and delegation for all business logic is a simpler model.

Please take a look and let me know what you think.

Another alternative is to add an Enabled() field to every rule, as previously suggested, but this is the worst of both worlds (all options still passed to the rules + more boilerplate).

@x13n (Member):

As mentioned before, I don't have a strong preference here - current version looks good to me, but adding enabled field within specific rules is also acceptable.
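
The settled approach keeps DeleteOptions at the framework level and selects rules during construction. A rough sketch of what that could look like: the exact signature, option struct, and rule constructors are assumptions based on the comments above, and per the discussion the replicated rule still receives the legacy custom-controller option directly.

```go
// Default assembles the drainability rules to run, honoring scale-down
// options at construction time rather than inside each rule. Sketch
// only, under the assumptions stated above.
func Default(deleteOptions options.NodeDeleteOptions) []Rule {
	defaults := []Rule{
		mirror.New(),
		longterminating.New(),
		// Legacy option that still has to reach the rule itself
		// until it is deprecated.
		replicated.New(deleteOptions.SkipNodesWithCustomControllerPods),
		notsafetoevict.New(),
		pdb.New(),
	}
	if deleteOptions.SkipNodesWithSystemPods {
		defaults = append(defaults, system.New())
	}
	if deleteOptions.SkipNodesWithLocalStorage {
		defaults = append(defaults, localstorage.New())
	}
	return defaults
}
```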

@artemvmin force-pushed the scale-down-drainability branch 3 times, most recently from 68c7829 to 26afed5, on October 4, 2023 01:19
```diff
@@ -48,6 +48,23 @@ type Status struct {
 	BlockingReason drain.BlockingPodReason
 	// Error contains an optional error message.
 	Error error
+	// Interrupted means that the Rule returning the status exited early and that
```
@x13n (Member):

This has the same semantics as Outcome != UndefinedOutcome. Can you clarify why you think it is needed?

@artemvmin (Contributor Author):

It was necessary to deal with the branched logic, but it's no longer needed with the below refactor.

```diff
@@ -74,188 +70,44 @@ const (
 	UnexpectedError
 )

-// GetPodsForDeletionOnNodeDrain returns pods that should be deleted on node drain as well as some extra information
-// about possibly problematic pods (unreplicated and DaemonSets).
+// GetPodsForDeletionOnNodeDrain returns pods that should be deleted on node
```
@x13n (Member):

Can't this function just be removed at this point?

@artemvmin (Contributor Author):

I didn't do this because it's a public function and may have downstream dependencies. But since you think it's a good idea, I merged the logic and migrated the tests.

@artemvmin force-pushed the scale-down-drainability branch 3 times, most recently from d0d4456 to b3e3eeb, on October 7, 2023 00:25
```diff
 			pods = append(pods, podInfo.Pod)
-		case drainability.DrainOk:
+		case drainability.UndefinedOutcome, drainability.DrainOk:
 			if drain.IsPodLongTerminating(pod, timestamp) {
```
@x13n (Member):

Whether to skip a pod or not should be determined by rules, here you are effectively overriding the decision to treat both UndefinedOutcome and DrainOk as SkipDrain for long terminating pods. Why?

@artemvmin (Contributor Author) commented Oct 9, 2023:

No, SkipDrain doesn't get added to the drainable pods list (notice how there is no switch case for SkipDrain). UndefinedOutcome used to pass pods through to GetPodsForDeletionOnNodeDrain, but these pods are now integrated into the drainability checks. UndefinedOutcome and DrainOk are treated identically after the refactor.

@x13n (Member):

Apologies if I wasn't clear. I didn't mean to treat DrainOk/UndefinedOutcome and SkipDrain the same here, just that the special case for long terminating pods effectively causes DrainOk to be treated as if it was SkipDrain. Since now longterminating rule simply returns SkipDrain, the whole special casing is really a no-op.

@artemvmin (Contributor Author):

Got it. You want this condition removed. Agreed that this can be delegated to the longterminating rule.
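
With the special case removed, the consuming switch reduces to something like this sketch (outcome names are the ones quoted in the snippet above; other cases and surrounding variables follow that excerpt):

```go
// Sketch: after the refactor, UndefinedOutcome and DrainOk are handled
// identically, and skipping is fully owned by the rules.
switch status.Outcome {
case drainability.UndefinedOutcome, drainability.DrainOk:
	// The pod is drainable; no long-terminating special case here
	// anymore, since the longterminating rule returns SkipDrain itself.
	pods = append(pods, podInfo.Pod)
case drainability.SkipDrain:
	// Deliberately empty: skipped pods are never added to the
	// drainable list, which is why SkipDrain has no append above.
}
```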

```go
			return drainability.NewBlockedStatus(drain.MinReplicasReached, fmt.Errorf("replication controller for %s/%s has too few replicas spec: %d min: %d", pod.Namespace, pod.Name, rc.Spec.Replicas, r.minReplicaCount))
		}
	} else if pod_util.IsDaemonSetPod(pod) {
		if refKind == "DaemonSet" {
```
@x13n (Member):

This should be !=. Might be worth adding a unit test that would catch this...

@artemvmin (Contributor Author):

Nice catch. Added tests.
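
A test along the lines suggested would pin down the owner-kind comparison. The sketch below is illustrative only: the podWithOwner helper, the New constructor, and the expected outcomes are assumptions, based on the reading that only genuine DaemonSet owners are reference-checked, so flipping == and != fails one of the two cases.

```go
// Sketch of a table-driven test that would catch the ==/!= mix-up in
// the DaemonSet owner-kind check. Helper names and expected outcomes
// are assumptions, as noted above.
func TestDrainableDaemonSetOwnerKind(t *testing.T) {
	for name, tc := range map[string]struct {
		ownerKind string
		want      drainability.Status
	}{
		// Owner is a genuine DaemonSet that is absent from the
		// listers: the reference check should block the drain.
		"missing DaemonSet owner": {
			ownerKind: "DaemonSet",
			want:      drainability.NewBlockedStatus(drain.ControllerNotFound, nil),
		},
		// DaemonSet-like pod with a custom owner kind: there is no
		// lister to check against, so the rule stays undefined.
		"custom DaemonSet-like owner": {
			ownerKind: "CustomDaemonSet",
			want:      drainability.NewUndefinedStatus(),
		},
	} {
		t.Run(name, func(t *testing.T) {
			pod := podWithOwner(tc.ownerKind) // assumed test helper
			got := New().Drainable(&drainability.DrainContext{}, pod)
			if got.Outcome != tc.want.Outcome {
				t.Errorf("Drainable().Outcome = %v, want %v", got.Outcome, tc.want.Outcome)
			}
		})
	}
}
```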

```go
limitations under the License.
*/

package customcontroller
```
@x13n (Member):

nit: Would package replicacount better reflect the logic here?

@artemvmin (Contributor Author):

Done.

```go
// Drainable decides what to do with long terminating pods on node drain.
func (r *Rule) Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status {
	if drain.IsPodLongTerminating(pod, drainCtx.Timestamp) {
		return drainability.NewDrainableStatus()
```
@x13n (Member):

This should be SkipDrain to be consistent with previous logic.

@artemvmin (Contributor Author):

Ah, you're right. The continue doesn't add it to the list. Fixed.
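
The corrected rule then becomes, per this exchange (a sketch; NewSkipStatus matches the test expectation quoted further below):

```go
// Drainable decides what to do with long terminating pods on node drain.
// Corrected per the review: such pods are skipped rather than marked
// drainable, matching the old behavior where `continue` never added
// them to the drainable list.
func (r *Rule) Drainable(drainCtx *drainability.DrainContext, pod *apiv1.Pod) drainability.Status {
	if drain.IsPodLongTerminating(pod, drainCtx.Timestamp) {
		return drainability.NewSkipStatus()
	}
	return drainability.NewUndefinedStatus()
}
```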

```go
ObjectMeta: metav1.ObjectMeta{
	Name:              "bar",
	Namespace:         "default",
	DeletionTimestamp: &metav1.Time{Time: testTime.Add(-2 * time.Duration(extendedGracePeriod) * time.Second)},
```
@x13n (Member):

The original test case had DeletionTimestamp extendedGracePeriod/2 seconds ago, not 2*extendedGracePeriod seconds ago and so wasn't considered long terminating - it should result in drainability.NewUndefinedStatus().

@artemvmin (Contributor Author):

That was a test that I believe was incorrectly constructed and that I fixed during the refactor. We want to test that the extended grace period specified in the spec is respected.

@artemvmin (Contributor Author):

Added the test case you described in addition to the existing one.

(Resolved review thread on cluster-autoscaler/utils/drain/drain.go.)
@artemvmin force-pushed the scale-down-drainability branch 3 times, most recently from 6de45a3 to 6c71099, on October 9, 2023 17:04
```go
	},
	want: drainability.NewSkipStatus(),
},
"long terminating pod with expired grace period": {
```
@x13n (Member):

nit: this one actually doesn't have "expired" grace period - it is extended grace period that didn't expire yet. "expired" better matches the previous test case actually.

@artemvmin (Contributor Author):

Okay, I understand now. We want to skip pods whose grace period has expired. Also added another case for zero extended duration.
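
Taken together, the test table discussed here ends up with cases roughly like the following sketch (the podTerminatingFor helper is an assumption, the zero-extended-duration case mentioned above is omitted, and the expectations follow the thread):

```go
// Sketch of the long-terminating rule test cases settled on above.
// podTerminatingFor is an assumed helper returning a pod whose
// DeletionTimestamp lies the given duration before testTime and whose
// spec carries the given termination grace period in seconds.
for name, tc := range map[string]struct {
	pod  *apiv1.Pod
	want drainability.Status
}{
	// Deleted two extended grace periods ago: long terminating, skip.
	"long terminating pod with expired grace period": {
		pod:  podTerminatingFor(2*time.Duration(extendedGracePeriod)*time.Second, extendedGracePeriod),
		want: drainability.NewSkipStatus(),
	},
	// Deleted half an extended grace period ago: not yet long
	// terminating, so the rule stays undefined.
	"long terminating pod within extended grace period": {
		pod:  podTerminatingFor(time.Duration(extendedGracePeriod)*time.Second/2, extendedGracePeriod),
		want: drainability.NewUndefinedStatus(),
	},
} {
	t.Run(name, func(t *testing.T) {
		got := New().Drainable(&drainability.DrainContext{Timestamp: testTime}, tc.pod)
		if got.Outcome != tc.want.Outcome {
			t.Errorf("Drainable().Outcome = %v, want %v", got.Outcome, tc.want.Outcome)
		}
	})
}
```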

@x13n (Member) left a review:

Thanks! Looks good to me now.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 11, 2023
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: artemvmin, x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 11, 2023
@k8s-ci-robot k8s-ci-robot merged commit 133fdc7 into kubernetes:master Oct 11, 2023
6 checks passed
Labels
approved - Indicates a PR has been approved by an approver from all required OWNERS files.
area/cluster-autoscaler
cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
kind/feature - Categorizes issue or PR as related to a new feature.
lgtm - "Looks good to me", indicates that a PR is ready to be merged.
size/XXL - Denotes a PR that changes 1000+ lines, ignoring generated files.

3 participants