feat: set IgnoreDaemonSetsUtilization per nodegroup for AWS #5672
Conversation
@drmorr0, @feiskyer: there are many cloud providers which have `GetOptions` implemented (see `grep -R ") GetOptions" ./cloudprovider -A 2 --exclude-dir=vendor`).
 }

-type utilizationThresholdGetter interface {
+type nodeGroupConfigGetter interface {
	return simulator.UnexpectedError, nil
}

gpuConfig := context.CloudProvider.GetNodeGpuConfig(node)
The code from lines 141 to 146 here is just a copy-and-paste of lines 121 to 126. I have moved it down because I want to use `nodeGroup` when calling `GetIgnoreDaemonSetsUtilization` to get the per-ASG value for `IgnoreDaemonSetsUtilization`, and then pass it to `utilization.Calculate` on line 142 below.
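For reference, a minimal sketch of the resulting flow (variable and field names here are assumptions for illustration, not necessarily the exact code in this PR):

```go
// Resolve the per-nodegroup (per-ASG) value first, then feed it into the
// utilization calculation for this node instead of the global flag.
ignoreDaemonSetsUtilization, err := s.configGetter.GetIgnoreDaemonSetsUtilization(context, nodeGroup)
if err != nil {
	klog.Warningf("Couldn't retrieve IgnoreDaemonSetsUtilization option for node %s: %v", node.Name, err)
	return simulator.UnexpectedError, nil
}

// the moved block: GPU config lookup followed by the utilization calculation
gpuConfig := context.CloudProvider.GetNodeGpuConfig(node)
utilInfo, err := utilization.Calculate(nodeInfo, ignoreDaemonSetsUtilization, context.IgnoreMirrorPodsUtilization, gpuConfig, timestamp)
```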
allTestCases := testCases

// run all test cases again with `IgnoreDaemonSetsUtilization` set to true
This is so that we test against both `IgnoreDaemonSetsUtilization: true` and `IgnoreDaemonSetsUtilization: false`.
This will cause different test cases to have the same `desc`, making them harder to debug. For the sake of readability, I'd just have one long list of test cases to check, but if you have to generate them programmatically like this, please update `desc` as well.
Thank you for the suggestion. I have started adding a suffix now, like `<test-case-desc> IgnoreDaemonSetsUtilization=true` and `<test-case-desc> IgnoreDaemonSetsUtilization=false`.
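Roughly, the duplication plus suffix could look like this (a sketch only; the test-case struct fields are assumptions based on the discussion above):

```go
allTestCases := testCases

// run every case again with IgnoreDaemonSetsUtilization set to true, and
// suffix desc so both variants can be told apart in test output
// (assumes testCases holds struct values, so tc is a copy)
for _, tc := range testCases {
	tc.ignoreDaemonSetsUtilization = true
	tc.desc = tc.desc + " IgnoreDaemonSetsUtilization=true"
	allTestCases = append(allTestCases, tc)
}
```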
@@ -137,14 +158,17 @@ func TestFilterOutUnremovable(t *testing.T) {
 	}
 }

-type staticThresholdGetter struct {
-	threshold float64
+type staticNodeGroupConfigProcessor struct {
I have replaced `staticThresholdGetter` here with `staticNodeGroupConfigProcessor` because I think it is a better implementation: we don't hard-code the threshold value to 0.5 but use the default values set here.
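Something along these lines, for illustration (names and fields are assumptions; the point is that values come from the defaults in `config` rather than a hard-coded 0.5):

```go
// test-only processor that answers from default NodeGroupAutoscalingOptions
type staticNodeGroupConfigProcessor struct {
	opts config.NodeGroupAutoscalingOptions
}

func (p *staticNodeGroupConfigProcessor) GetScaleDownUtilizationThreshold(ctx *context.AutoscalingContext, ng cloudprovider.NodeGroup) (float64, error) {
	return p.opts.ScaleDownUtilizationThreshold, nil
}

func (p *staticNodeGroupConfigProcessor) GetIgnoreDaemonSetsUtilization(ctx *context.AutoscalingContext, ng cloudprovider.NodeGroup) (bool, error) {
	return p.opts.IgnoreDaemonSetsUtilization, nil
}
```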
// from NodeGroupConfigProcessor interface
type actuatorNodeGroupConfigGetter interface {
	// GetIgnoreDaemonSetsUtilization returns IgnoreDaemonSetsUtilization value that should be used for a given NodeGroup.
	GetIgnoreDaemonSetsUtilization(context *context.AutoscalingContext, nodeGroup cloudprovider.NodeGroup) (bool, error)
}
This interface is to limit the functions that can be used from the `processors.NodeGroupConfigProcessor` interface. Since I only need `GetIgnoreDaemonSetsUtilization` here, I don't see the need to expose all the functions from the `processors.NodeGroupConfigProcessor` interface.
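For illustration, the consumer side could look like this (a simplified sketch; the constructor signature is not the real one). Since Go interfaces are satisfied implicitly, the full `processors.NodeGroupConfigProcessor` can still be passed in while the actuator only sees `GetIgnoreDaemonSetsUtilization`:

```go
type Actuator struct {
	// narrow dependency: only GetIgnoreDaemonSetsUtilization is reachable
	configGetter actuatorNodeGroupConfigGetter
}

func NewActuator(configGetter actuatorNodeGroupConfigGetter) *Actuator {
	// callers pass the full NodeGroupConfigProcessor; it satisfies the
	// narrow interface implicitly
	return &Actuator{configGetter: configGetter}
}
```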
I think this PR is becoming too big. I will be supporting this tag only for AWS in this PR.
},
{
	nodeGroup: testNg2,
	testCases: getStartDeletionTestCases(testNg2),
I am running the test cases twice, once for each nodegroup defined above: the first nodegroup has `IgnoreDaemonSetsUtilization: false` and the second has `IgnoreDaemonSetsUtilization: true`. This is to test the `IgnoreDaemonSetsUtilization` option on the nodegroup, which is read here and used by `StartDeletion` (the function we are trying to test) via `deleteAsyncDrain` and `deleteAsyncEmpty`.
- fix: test cases failing for actuator and scaledown/eligibility - abstract default values into `config`
- refactor: rename global `IgnoreDaemonSetsUtilization` -> `GlobalIgnoreDaemonSetsUtilization` in code - there is no change in the flag name - rename `thresholdGetter` -> `configGetter` and tweak it to accommodate `GetIgnoreDaemonSetsUtilization`
- refactor: reset help text for `ignore-daemonsets-utilization` flag - because per nodegroup override is supported only for AWS ASG tags as of now
- docs: add info about overriding `--ignore-daemonsets-utilization` per ASG - in AWS cloud provider README
- refactor: use a limiting interface in actuator in place of `NodeGroupConfigProcessor` interface - to limit the functions that can be used - since we need it only for `GetIgnoreDaemonSetsUtilization`
- fix: tests failing for actuator - rename `staticNodeGroupConfigProcessor` -> `MockNodeGroupConfigGetter` - move `MockNodeGroupConfigGetter` to test/common so that it can be used in different tests
- fix: go lint errors for `MockNodeGroupConfigGetter`
- test: add tests for `IgnoreDaemonSetsUtilization` in cloud provider dir
- test: update node group config processor tests for `IgnoreDaemonSetsUtilization`
- test: update eligibility test cases for `IgnoreDaemonSetsUtilization`
- test: run actuation tests for 2 NGs - one with `IgnoreDaemonSetsUtilization`: `false` - one with `IgnoreDaemonSetsUtilization`: `true`
- test: add tests for `IgnoreDaemonSetsUtilization` in actuator - add helper to generate multiple ds pods dynamically - get rid of mock config processor because it is not required
- test: fix failing tests for actuator
- refactor: remove `GlobalIgnoreDaemonSetUtilization` autoscaling option - not required
- fix: warn message `DefaultScaleDownUnreadyTimeKey` -> `DefaultIgnoreDaemonSetsUtilizationKey`
- refactor: use `generateDsPods` instead of `generateDsPod`
- refactor: `globaIgnoreDaemonSetsUtilization` -> `ignoreDaemonSetsUtilization`
- instead of passing all the processors (we only need `NodeGroupConfigProcessor`)
- add suffix to tests with `IgnoreDaemonSetsUtilization` set to `true` and `IgnoreDaemonSetsUtilization` set to `false`
for _, testSet := range testSets {
	for tn, tc := range testSet {
		t.Run(tn, func(t *testing.T) {
The code from this line and below is just a copy-and-paste of the older code (shifted by some lines). I haven't changed anything in the code until https://github.com/kubernetes/autoscaler/pull/5672/files#r1174841347
@x13n I've addressed all the review comments. Can you please review this PR again? 🙏
Thanks for the changes, looks good to me!
/lgtm
@@ -67,6 +67,15 @@ func BuildTestPod(name string, cpu int64, mem int64) *apiv1.Pod {
 	return pod
 }

+// BuildDSTestPod creates a DaemonSet pod with cpu and memory.
+func BuildDSTestPod(name string, cpu int64, mem int64) *apiv1.Pod {
Separate PR makes a lot of sense, thanks!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: MaciekPytel, vadasambar, x13n.
Oh, and the hold was meant for me, so removing: /hold cancel
I read #5399 first, and then the discussion here, and it's still not clear to me what problem you encounter by ignoring DaemonSet utilization globally. The description notes that counting daemonsets works well for nodegroups with large nodes, where their contribution to utilization is small.
If the nodes are large enough and the contribution from DaemonSets is proportionally small, what harm comes from ignoring their contribution on these large nodes too? What is the benefit of noting their contribution there?
If you look at really small consumption by daemonsets, I think ignoring daemonset utilization makes sense; e.g., only 1 CPU consumed by daemonsets on a 32-CPU node might not matter as much. But it's different if you look at relatively larger consumption by daemonsets, say 1 CPU on a 10-CPU node. For example, imagine a scenario like this:
In case 2, we want the node to get scaled down. But in case 1, we might not want the node to get scaled down. Ignoring daemonset utilization for nodegroup 1 means CA would scale down nodes in nodegroup 1 even though the node is actually using 70% of its CPU (if you include daemonset utilization). This means the pods would need to find a new home, which can create potential disruption (application downtime until a new node is brought in by CA).
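To put rough numbers on this (illustrative figures only, based on the 1-CPU-on-a-10-CPU-node example above and an assumed `ScaleDownUtilizationThreshold` of 0.7):

```go
package main

import "fmt"

func main() {
	// illustrative only: 10-CPU node, 1 CPU used by daemonsets, 6 CPU by other workload
	const nodeCPU, dsCPU, workloadCPU, threshold = 10.0, 1.0, 6.0, 0.7

	countingDS := (workloadCPU + dsCPU) / nodeCPU // 0.70 -> not below the threshold, node is kept
	ignoringDS := workloadCPU / nodeCPU           // 0.60 -> below the threshold, node becomes a scale-down candidate

	fmt.Printf("counting daemonsets: %.2f, ignoring daemonsets: %.2f, threshold: %.2f\n",
		countingDS, ignoringDS, threshold)
}
```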
Thank you for the explanation. Regarding case one, the target utilization of 0.7 (70%) is higher than I've ever used, and I still don't see why you want to count the DaemonSet's contribution there. If you want the node to survive your 70% threshold, it either needs 10% more "real" workload, or your actual threshold should be closer to 60% to get what you want. Counting the DaemonSet as part of that utilization is falsifying the baseline atop which you sum the rest of the non-daemon workload's utilization. You could adjust the threshold instead.
What type of PR is this?
/kind feature
What this PR does / why we need it:
We want to support the ability to specify `IgnoreDaemonSetsUtilization` per nodegroup through ASG tags. Why? It is to cater for the following case (among other possible cases):
- `IgnoreDaemonSetsUtilization` is set to `false` globally because we want to consider daemonsets when calculating node resource utilization. This works well for ASGs/nodegroups which have large nodes, where daemonsets utilization doesn't contribute much to node resource utilization. Due to this, we have no problem during node scale down.
- For ASGs/nodegroups with smaller nodes, however, daemonsets contribute a significant share of node resource utilization, so utilization doesn't drop below `ScaleDownUtilizationThreshold` and such nodes don't get scaled down.

To solve this problem, we want to support setting `IgnoreDaemonSetsUtilization` per ASG/nodegroup through a tag like `k8s.io/cluster-autoscaler/node-template/autoscaling-options/ignoredaemonsetsutilization: true`, which would override the global value of `IgnoreDaemonSetsUtilization` (which is `false`).

More details: #5399
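For illustration only (the ASG name is a placeholder and the command assumes the AWS CLI; the tag itself is what matters), setting the override on an ASG could look like:

```sh
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/autoscaling-options/ignoredaemonsetsutilization,Value=true,PropagateAtLaunch=false"
```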
Which issue(s) this PR fixes:
Fixes #5399
Special notes for your reviewer:
Nothing in particular.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: