
Subtract toBeDeleted nodes from number of upcoming nodes; cleanup toBeDeleted taints from all nodes, not only ready ones #4211

Conversation

alfredkrohmer
Contributor

Fixes #3949

…eDeleted taints from all nodes, not only ready ones
k8s-ci-robot added the cncf-cla: yes and size/S labels on Jul 21, 2021
Member

@x13n x13n left a comment


Can you add some unit tests that would fail without this change?

k8s-ci-robot added the size/M label and removed the size/S label on Sep 14, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: alfredkrohmer
To complete the pull request process, please assign feiskyer after the PR has been reviewed.
You can assign the PR to them by writing /assign @feiskyer in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alfredkrohmer
Contributor Author

@x13n Added a unit test in TestUpcomingNodes. Without the fix, the test is failing with:

--- FAIL: TestUpcomingNodes (0.00s)
    clusterstate_test.go:518: 
                Error Trace:    clusterstate_test.go:518
                Error:          "map[ng1:%!s(int=6) ng2:%!s(int=1) ng3:%!s(int=2) ng5:%!s(int=1)]" should not contain "ng5"
                Test:           TestUpcomingNodes
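
For anyone skimming, here is a highly simplified, self-contained sketch of the kind of check the new test case makes. This is not the actual clusterstate_test.go code; the helper below only mimics the upcoming-nodes arithmetic, and the ng5 scenario assumes a node group of size 1 whose only node carries the ToBeDeleted taint:

package clusterstate

import (
    "testing"

    "github.com/stretchr/testify/assert"
)

// readiness mirrors the per-node-group counters kept by the cluster state
// registry (field names are illustrative).
type readiness struct {
    Target, Ready, Unready, LongUnregistered, Deleted int
}

// upcomingNodes returns, per node group, how many nodes CA believes are still
// being provisioned. subtractDeleted toggles the behavior proposed in this PR.
func upcomingNodes(groups map[string]readiness, subtractDeleted bool) map[string]int {
    result := map[string]int{}
    for id, r := range groups {
        upcoming := r.Target - r.Ready - r.Unready - r.LongUnregistered
        if subtractDeleted {
            upcoming -= r.Deleted
        }
        if upcoming > 0 {
            result[id] = upcoming
        }
    }
    return result
}

func TestUpcomingNodesSketch(t *testing.T) {
    groups := map[string]readiness{
        // ng5: target size 1, its single node is tainted ToBeDeleted and is
        // therefore counted as Deleted instead of Ready.
        "ng5": {Target: 1, Deleted: 1},
    }

    // Without the fix the deleted node is not subtracted, so CA thinks a node
    // is still coming up in ng5.
    assert.Contains(t, upcomingNodes(groups, false), "ng5")

    // With the fix ng5 has no upcoming nodes - the same property asserted by
    // the failing check above ("... should not contain \"ng5\"").
    assert.NotContains(t, upcomingNodes(groups, true), "ng5")
}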

alfredkrohmer force-pushed the bugfix/include-tobedeletednodes-in-calculation-of-upcoming-nodes branch from 76b3d5f to a1c801f on September 14, 2021
@x13n
Member

x13n commented Sep 17, 2021

I read the original issue and while I understand the need to clean up taints on all nodes instead of just ready ones, I don't understand why we're reducing the number of nodes that should come up by subtracting the nodes that are being deleted.

Suppose:

  1. there are 10 nodes in the node group
  2. one of them starts being scaled down, target size is now 9
  3. before all pods are successfully evicted, new unschedulable pods appear in the cluster, increasing the target size back to 10
  4. new node won't get scheduled now, because we're going to subtract the deleted node from the target size
  5. pods will stay pending until the old node is deleted - only after that happens, the new node will get provisioned

@alfredkrohmer
Contributor Author

I don't understand why we're reducing the number of nodes that should come up by subtracting the nodes that are being deleted

That's not the case here: it's not being subtracted from the number of nodes that should come up, it's subtracted from the number of nodes that are supposedly coming up right now (i.e. the nodes that cluster-autoscaler thinks are coming up right now).

@x13n
Member

x13n commented Oct 25, 2021

That's not the case here, this is not being subtracted from the nodes that should come up, it's subtracted from the number of nodes that are supposedly coming up right now (that cluster-autoscaler thinks are coming up right now).

The number of nodes that are supposedly coming up right now will later impact the number of nodes that should come up: CA will think that there's no need to create more nodes, since they are coming up already.

@alfredkrohmer
Contributor Author

The number of nodes that are supposedly coming up right now will later impact the number of nodes that should come up

Exactly

CA will think that there's no need to create more nodes, since they are coming up already.

My PR reduces the number of nodes that CA thinks are coming up by subtracting the number of nodes being deleted - otherwise CA thinks these nodes that are being deleted are actually coming up right now, which is not true. This actually increases (not decreases) the number of nodes that should come up.
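
To put numbers on it (values chosen for illustration only): with a target size of 10, 9 Ready nodes and 1 node tainted ToBeDeleted (and therefore counted as Deleted), the current code computes 10 - 9 = 1 upcoming node, so CA believes a replacement is already on its way and holds off scale-up. With the Deleted nodes subtracted it computes 10 - 9 - 1 = 0, and CA is free to scale up for the pending pods.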

@x13n
Member

x13n commented Nov 9, 2021

Hm, you're right, this will increase, not decrease the number of upcoming nodes. That brings about another concern though: if some nodes are tainted for deletion, but don't actually end up deleted, we may exceed the max size. Nodes that are being deleted can end up staying around e.g. if CA is restarted in the meantime.

@alfredkrohmer
Contributor Author

Nodes that are being deleted can end up staying around e.g. if CA is restarted in the meantime.

That's the other part of the PR where deletion taints are properly removed on CA startup.

@x13n
Member

x13n commented Nov 10, 2021

That's the other part of the PR where deletion taints are properly removed on CA startup.

Proper removal of deletion taints is not my concern here. We may still exceed the max number of nodes in the following scenario:

  1. Node X is tainted for deletion
  2. Another node Y starts getting created as a part of scale up
  3. CA crashes before X is actually deleted
  4. New CA process clears deletion taint from node X

Now we have both X and Y up and running even though they might be exceeding the limit.

@alfredkrohmer
Contributor Author

alfredkrohmer commented Nov 11, 2021

I don't see how this behavior could be triggered by my PR. My PR doesn't touch the scale-up logic; again, it only changes the number of nodes CA thinks are coming up right now, it doesn't change the number of nodes it thinks are already there. I don't see how this would cause the max number of nodes in a NG to be exceeded.

@alfredkrohmer
Contributor Author

@x13n I think I found what you mean. Here:

if context.MaxNodesTotal > 0 && len(nodes)+newNodes+len(upcomingNodes) > context.MaxNodesTotal {
    klog.V(1).Infof("Capping size to max cluster total size (%d)", context.MaxNodesTotal)
    newNodes = context.MaxNodesTotal - len(nodes) - len(upcomingNodes)

the number of upcoming nodes is considered when checking whether the maximum number of nodes in the whole cluster would be exceeded. If you want, I can add/subtract the deleted nodes here as well; then it's the same behavior as before. However, I'm not sure what the best course of action would actually be here (allow the max to be exceeded while nodes are being deleted vs. risk that scale-up is not possible when the max would be exceeded).
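
For illustration, a small self-contained sketch of what accounting for deleted nodes in that cap could look like; the deletedNodes parameter is hypothetical and nothing equivalent is currently passed into ScaleUp:

// cappedNewNodes is a sketch, not autoscaler code: it mirrors the cap from the
// snippet above and additionally counts nodes that are tainted for deletion
// but still present in the cluster.
func cappedNewNodes(maxNodesTotal, currentNodes, upcomingNodes, deletedNodes, newNodes int) int {
    if maxNodesTotal > 0 && currentNodes+upcomingNodes+deletedNodes+newNodes > maxNodesTotal {
        newNodes = maxNodesTotal - currentNodes - upcomingNodes - deletedNodes
    }
    if newNodes < 0 {
        newNodes = 0
    }
    return newNodes
}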

As for the per-node group max number of nodes, this shouldn't be affected by my change, as this check considers only the actual number of nodes in the node group (independent of the state of the node in k8s):

totalCapacity := 0
for _, ng := range groups {
    currentSize, err := ng.TargetSize()
    if err != nil {
        return []ScaleUpInfo{}, errors.NewAutoscalerError(
            errors.CloudProviderError,
            "failed to get node group size: %v", err)
    }
    maxSize := ng.MaxSize()
    if currentSize == maxSize {
        // group already maxed, ignore it
        continue
    }
    info := ScaleUpInfo{
        Group:       ng,
        CurrentSize: currentSize,
        NewSize:     currentSize,
        MaxSize:     maxSize}
    scaleUpInfos = append(scaleUpInfos, info)
    if maxSize-currentSize > 0 {
        totalCapacity += maxSize - currentSize
    }
}
if totalCapacity < newNodes {
    klog.V(2).Infof("Requested scale-up (%v) exceeds node group set capacity, capping to %v", newNodes, totalCapacity)
    newNodes = totalCapacity
}

@x13n
Member

x13n commented Nov 23, 2021

Apologies for the late response. Yes, I meant the logic in

if context.MaxNodesTotal > 0 && len(nodes)+newNodes+len(upcomingNodes) > context.MaxNodesTotal {
    klog.V(1).Infof("Capping size to max cluster total size (%d)", context.MaxNodesTotal)
    newNodes = context.MaxNodesTotal - len(nodes) - len(upcomingNodes)

I think deleted nodes should also be considered here. In edge cases where deletion takes an arbitrarily long time, we could easily exceed the maximum if deleted nodes are not accounted for in the scale-up logic.

@alfredkrohmer
Contributor Author

alfredkrohmer commented Nov 26, 2021

I tried to get the number of deleted nodes at this point in the code so I could add/subtract them, and while going through the call graph up to where ScaleUp is called, I suspect I found more errors in the surrounding implementation.

One thing upfront that I found: len(nodes) (here in the ScaleUp function) does contain the deleted nodes - but only if they are not cordoned; hence:

  • for non-cordoned "deleted" nodes: the current calculation is actually broken, it's off by n (the number of deleted nodes) in the other direction, i.e. it already prevents scale-up at max-n, because these nodes are counted twice in the sum, in len(nodes) and in len(upcomingNodes); my change would fix this.
  • for cordoned "deleted" nodes: the current calculation is correct and my change would break it by allowing scale-up to max+n; see the following points for a proposed solution.

In the RunOnce function here:

allNodes, readyNodes, typedErr := a.obtainNodeLists(a.CloudProvider)

allNodes and readyNodes are retrieved (allNodes includes unready and cordoned nodes, readyNodes doesn't).

In the same function, readyNodes is used three times in a relevant way.

First time to derive node group infos from the ready nodes:

nodeInfosForGroups, autoscalerError := a.processors.TemplateNodeInfoProvider.Process(autoscalingContext, readyNodes, daemonsets, a.ignoredTaints)

Second time for another check that the max number of nodes in the cluster is not yet reached:

} else if a.MaxNodesTotal > 0 && len(readyNodes) >= a.MaxNodesTotal {

Third time to finally call the ScaleUp function:

scaleUpStatus, typedErr = ScaleUp(autoscalingContext, a.processors, a.clusterStateRegistry, unschedulablePodsToHelp, readyNodes, daemonsets, nodeInfosForGroups, a.ignoredTaints)

I believe it would be more correct to pass allNodes instead in these three uses of readyNodes.

  1. In the first use, I think it makes sense to build node group info from unready nodes as well (e.g. if an unready node is the only one in its node group).
  2. In the second use, as per your argument, cordoned and unready nodes should count into the max number of nodes.
  3. In the third use (the actual problem I was trying to solve), this would make len(nodes) (here) contain the cordoned nodes and hence they would also be considered correctly for the max number of nodes in the cluster - so this would also make my previous change correct for cordoned "deleted" nodes.

Any thoughts?

@x13n
Member

x13n commented Dec 21, 2021

Apologies again for the super slow response. I had to take some time to go through the existing code. I think the proposal makes sense; I'm just a bit worried that I didn't think of some edge case - historically, clusterstate changes have been a source of "interesting" bugs - and would prefer to keep this change behind a feature flag because of this. It can be on by default and can be removed in a few versions if there are no issues, but it might be safer to have a quick rollback mechanism. WDYT?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Mar 21, 2022
@x13n
Member

x13n commented Mar 22, 2022

/remove-lifecycle stale

Maybe a feature flag is an overkill, but would appreciate another pair of eyes on this. @MaciekPytel - can you take a look?

k8s-ci-robot removed the lifecycle/stale label on Mar 22, 2022
fookenc added a commit to fookenc/autoscaler that referenced this pull request Apr 7, 2022
…des from being considered from available capacity to schedule Pending Pods to.
@MaciekPytel
Contributor

Sorry for missing this for so long. I understand the problem being solved. I'm worried that this PR may fix it, but at the same time introduce an 'opposite' issue.

  1. Today we count a node that is intended to be deleted as "Deleted". Because Deleted is mutually exclusive with Ready/Unready, and because of how the upcoming nodes calculation works, CA overestimates the number of upcoming nodes and doesn't trigger scale-up.
  2. It can take time for the node controller to remove the Kubernetes node object after the underlying VM has already been deleted. During this time the node also shows up in clusterstate as Deleted. If we subtracted Deleted from TargetSize as proposed in this PR, we would underestimate the number of upcoming nodes (we'd be subtracting a node that no longer counts towards TargetSize, as the VM no longer exists on the cloudprovider side). Underestimating the number of upcoming nodes would most likely lead to too much scale-up.
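
As a quick illustration of case 2 (numbers assumed for illustration): suppose a node group is at target size 11 after a scale-up, with 9 Ready nodes, 2 VMs actually booting, and 1 stale node object whose VM was already removed (counted as Deleted). The current code computes 11 - 9 = 2 upcoming nodes, which is correct; subtracting Deleted would give 11 - 9 - 1 = 1, so CA would think only one node is on its way and could add an extra, unneeded node.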

The description of Deleted in the Readiness struct makes it pretty clear to me that the intent of a 'Deleted' node in Readiness is case 2 above. And I think the current implementation addresses this problem correctly. If Deleted nodes are defined as the nodes that no longer exist on the provider side, then we shouldn't subtract them from TargetSize.

I think the bug is that we're incorrectly identifying nodes as Deleted, and the way to fix both problems 1 and 2 would be to fix that. Unfortunately that would probably require a slightly bigger code change: I think we should check if a particular instance exists on the cloudprovider side as part of the readiness calculation. This is very similar to how we identify unregistered nodes, just doing the check in the other direction (check if there is a matching instance for a given node).

WDYT @alfredkrohmer @x13n ?

@fookenc
Contributor

fookenc commented Apr 27, 2022

Hi @MaciekPytel, thank you for taking a look! I wouldn't want to introduce new issues into the autoscaler, but I'm concerned that the readiness calculation isn't aware of nodes being drained & deleted.

To be exact, the readiness calculation doesn't consider taints that may exist on the nodes. If a node has the ToBeDeleted taint, it has an effect of NoSchedule. Pods can't be scheduled onto the node, but the autoscaler believes that enough resources exist to handle the pending pods. Unfortunately this blocks scale-ups indefinitely while the node is being drained and deleted. This can be observed in the logs attached to bug #4456.

I1105 11:07:58.593642 1 filter_out_schedulable.go:157] Pod default.app-1-cf767bfc7-m5672 marked as unschedulable can be scheduled on node template-node-for-node-group-2-az2-6211936831103862438-0. Ignoring in scale up.
I1105 11:07:58.593798 1 filter_out_schedulable.go:157] Pod default.app-1-cf767bfc7-jtcc4 marked as unschedulable can be scheduled on node template-node-for-node-group-2-az2-6211936831103862438-0. Ignoring in scale up.
I1105 11:07:58.593818 1 filter_out_schedulable.go:170] 0 pods were kept as unschedulable based on caching
I1105 11:07:58.593825 1 filter_out_schedulable.go:171] 2 pods marked as unschedulable can be scheduled.

CA seems to think that the above node can be used to schedule the unschedulable pods, but it is actually being deleted due to under-utilisation. Due to a max termination grace period of 3 hours (and because the pods already running on the node failed to gracefully terminate before that), CA had to wait 3 hours for all pods to be removed from the affected node. During this interval, no scale-up was triggered by CA even though there were pods that needed to be scheduled.

I have found that scaling up the pods enough to exceed the capacity/resources available on the tainted node will result in a new node being added. Given that the Deleted field is the count of nodes with the ToBeDeleted taint, I assumed that this value would typically only be zero or 1.

One alternative that I've been exploring is to have a separate controller monitor for the ToBeDeleted taint and trigger over-provisioning of a new node with pause pods, to prevent scale-ups from being blocked. This moves the burden of overprovisioning outside of the autoscaler instead of adjusting the readiness check to exclude upcoming nodes that shouldn't be considered viable.

Please let me know your thoughts! Thanks!

@MaciekPytel
Contributor

Given that the Deleted field is the count of nodes with the ToBeDeleted taint

I think this is the problem. It's not intended to be the number of nodes with the ToBeDeleted taint. It's intended to be the number of nodes that don't have a matching VM on the cloudprovider side, because the VM has already been deleted.

My guess is that looking at a taint seemed like an easy proxy for nodes that have already been deleted by CA [1]. As we can see, it isn't. I think the right way to fix this issue without introducing new problems is to stop identifying nodes that are being deleted as already deleted. This should fix your scenario, as a node that is being deleted would not be counted as Deleted, meaning that it would be counted as either Ready or Unready, both of which are subtracted from upcoming nodes.

It also means that nodes that have actually been deleted (in the sense that the VM no longer exists and it's just a Kubernetes object that hasn't been cleaned up yet) would still not be included in the upcoming nodes calculation. This part is correct - those nodes no longer count towards NodeGroup.TargetSize() and so should not be subtracted from this number.

We already get the list of cloudprovider instances in UpdateNodes() before the updateReadinessStats() call. I think we could use this data in updateReadinessStats() to only count a node as Deleted if there is no corresponding instance.

The potentially tricky part would be understanding if there is any negative interaction with node_instances_cache (do we need to call InvalidateNodeInstancesCacheEntry() on scale-down? I'm guessing yes, but I haven't really dived into the code deeply enough to know).
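A minimal sketch of that idea, assuming the instance data is already available at that point (the helper and its parameters are hypothetical, not existing autoscaler code):

import (
    apiv1 "k8s.io/api/core/v1"
)

// countAsDeleted sketches the classification discussed above: a node is only
// treated as Deleted when no cloudprovider instance matches it anymore.
// instanceExists is a hypothetical lookup built from the instance list CA
// already fetches in UpdateNodes().
func countAsDeleted(node *apiv1.Node, instanceExists func(providerID string) bool) bool {
    // A node whose VM still exists (e.g. one that is merely tainted
    // ToBeDeleted and being drained) stays Ready/Unready, so it keeps being
    // subtracted in the upcoming-nodes calculation.
    return !instanceExists(node.Spec.ProviderID)
}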

Footnotes

  1. This makes more sense if you consider when this logic was implemented. Back then node deletion was synchronous (i.e. CA would completely block for the entire process), so from CA's perspective tainting and deleting the node was effectively atomic.

@x13n
Member

x13n commented Apr 28, 2022

That part makes sense to me. I'd still clear node taints from all nodes on startup though - I don't see a reason to keep a (possibly intermittently) unready node tainted when CA restarts. It wouldn't automatically resume deletion anyway.

@fookenc
Contributor

fookenc commented Apr 29, 2022

I made a commit based on the suggestions provided, but I'm not certain that I got it right. Within updateReadinessStats, I switched to using the Cluster State Registry's latest list of nodes unregistered with the cloud provider.

https://github.com/fookenc/autoscaler/blob/049e8ad0fc79bf2d6b9ef18b933efbb3d7d8415f/cluster-autoscaler/clusterstate/clusterstate.go#L541-L548

I also made a change to the GetReadinessState function so that any node with the ToBeDeleted taint key is marked as Unready. I don't know if this is necessary, but it felt like it made sense to mark it that way.

https://github.com/fookenc/autoscaler/blob/049e8ad0fc79bf2d6b9ef18b933efbb3d7d8415f/cluster-autoscaler/utils/kubernetes/ready.go#L72-L77

After building and deploying the changes from this commit [1] to a test cluster, scale-ups were no longer blocked when a node was tainted with ToBeDeleted. However, I'm not sure that these code changes exactly match what was described. Additionally, I didn't make any attempt to address the scale-down path that may need to call InvalidateNodeInstancesCacheEntry.
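
For context, a rough sketch of what such a taint check looks like (the taint key constant is quoted from memory of utils/deletetaint and may differ from the linked commit):

import (
    apiv1 "k8s.io/api/core/v1"
)

// ToBeDeletedTaint is the taint key cluster-autoscaler puts on nodes it is
// draining (key name quoted from memory; see utils/deletetaint for the
// authoritative definition).
const ToBeDeletedTaint = "ToBeDeletedByClusterAutoscaler"

// isTaintedForDeletion sketches the extra readiness check: a node carrying the
// ToBeDeleted taint is reported as not ready, so CA stops treating it as
// schedulable capacity for pending pods.
func isTaintedForDeletion(node *apiv1.Node) bool {
    for _, taint := range node.Spec.Taints {
        if taint.Key == ToBeDeletedTaint {
            return true
        }
    }
    return false
}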

Footnotes

  1. https://github.com/fookenc/autoscaler/commit/049e8ad0fc79bf2d6b9ef18b933efbb3d7d8415f#

@MaciekPytel
Contributor

I'd still clear node taints from all nodes on startup though

+1. No reason to keep taints. I don't think we ever made a conscious decision to skip any nodes when un-tainting; we just implicitly assumed readyNodes ~= allNodes, as normally nodes removed by CA don't become unready until they are deleted on the cloudprovider side. I'm guessing this is an issue when using --cordon-node-before-terminating (which would explain why we haven't noticed it before - we don't set this in any of our testing).

@MaciekPytel
Contributor

https://github.com/fookenc/autoscaler/blob/049e8ad0fc79bf2d6b9ef18b933efbb3d7d8415f/cluster-autoscaler/clusterstate/clusterstate.go#L541-L548

I don't think that implements what I described. An "unregistered node" is a VM that has no matching Node object in k8s. I think a "deleted node" should be a Node object in k8s that has no VM. I don't think we can infer this information from unregistered nodes; we need something similar to getNotRegisteredNodes() working "in the other direction" (i.e. for each node, see if there is a matching instance).

I also made a change to the GetReadinessState function to include the ToBeDeleted taint key to mark any node as Unready. I don't know if this is necessary, but felt that it made sense to mark it that way.

Hmm, good point. It certainly seems logical, but I'm not sure whether this has some undesired side effects (e.g. marking a nodepool as unhealthy during a large scale-down?).

@fookenc
Contributor

fookenc commented May 19, 2022

Hi @MaciekPytel & @x13n,

I've created a new pull request that incorporates some of the changes (tests for UpcomingNodes, clearing taints from ALL nodes on startup) and new changes to identify nodes that no longer have a corresponding VM on the cloud provider. Please let me know your feedback on the proposed changes. If possible, I'd like to have a discussion regarding InvalidateNodeInstancesCacheEntry during scale-down.

Thanks!

k8s-ci-robot added the needs-rebase label on Jul 4, 2022
k8s-ci-robot removed the needs-rebase label on Jul 13, 2022
@x13n
Member

x13n commented Aug 11, 2022

This has been obsoleted by #5054 now.

/close

@k8s-ci-robot
Contributor

@x13n: Closed this PR.

In response to this:

This has been obsoleted by #5054 now.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@alfredkrohmer
Contributor Author

@x13n Correct me if I'm wrong, but I believe #5054 doesn't actually address the main problem described in #3949. Our problem is not nodes that were deleted in the cloud provider but still have a k8s node object, but rather that nodes being drained by cluster-autoscaler for binpacking are counted as "upcoming" nodes. Or is the PR that you linked fixing the problem by rewiring a bunch of logic so that it gets fixed "on the way" too?

@x13n
Member

x13n commented Oct 6, 2022

The "right way" to address #3949 was discussed above in this PR. In order to get rid of the bug, CA needs to only identify deleted (as in - VM removed) nodes as Deleted. That requires CA to be able to determine whether a given node has a matching instance in a given cloud provider (which is what #5054 is about).

Once proper counting of deleted instances is implemented, the case you described in #3949 will stop happening: an existing instance will be considered by CA as Unready rather than Deleted, so it will be subtracted when counting upcoming nodes.

Does that make sense?

Labels: area/cluster-autoscaler, cncf-cla: yes, size/M
Successfully merging this pull request may close these issues: Node groups get "stuck" during node deletion.