
koord-descheduler: enhance LowNodeLoad scorer #2092

Merged

Conversation

LY-today
Contributor

@LY-today LY-today commented Jun 8, 2024

Ⅰ. Describe what this PR does

What is your proposal:
When the descheduler evaluates which pods to evict to bring a high-load node back to a reasonable utilization level, can it match each pod's actual usage against the amount of resources that needs to be freed in advance?

Why is this needed:
This reduces how many times a single node re-enters the eviction logic and how many pods are evicted from it, which indirectly improves the accuracy of the eviction results and the stability of the workloads while still resolving the hotspot.

Is there a suggested solution, if so, please add it:
Evict-pod priority selector: podNowResourceUsages >= nodeNowResourceUsages - nodeHighResourceThresholds * nodeResourceAllocatable
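
As a rough illustration of that selector, here is a minimal Go sketch of the check; the function name, argument types, and the percentage-based threshold are assumptions for this example, not the actual koord-descheduler API:

```go
package loadhints

import (
	"k8s.io/apimachinery/pkg/api/resource"
)

// podCoversReduction reports whether evicting this one pod would, by itself,
// bring the node back below its high-load threshold, i.e. whether
//   podNowResourceUsage >= nodeNowResourceUsage - nodeHighResourceThreshold * nodeResourceAllocatable
// holds for a single resource. All names here are illustrative only.
func podCoversReduction(podUsage, nodeUsage, allocatable resource.Quantity, highThresholdPercent int64) bool {
	// highThreshold = nodeResourceAllocatable * nodeHighResourceThreshold
	highThreshold := resource.NewMilliQuantity(
		allocatable.MilliValue()*highThresholdPercent/100, resource.DecimalSI)

	// required reduction = nodeNowResourceUsage - highThreshold
	required := nodeUsage.DeepCopy()
	required.Sub(*highThreshold)

	// A pod whose current usage is at least the required reduction can
	// resolve the hotspot with a single eviction, so it should be preferred.
	return podUsage.Cmp(required) >= 0
}
```

Pods that satisfy this condition can be ranked ahead of others so that one eviction, rather than several, clears the hotspot.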

Ⅱ. Does this pull request fix one issue?

Issue link: #1975

Ⅲ. Describe how to verify it

I added unit tests

Ⅳ. Special notes for reviews

none

V. Checklist

✅ I have written necessary docs and comments
✅ I have added necessary unit tests and integration tests
✅ All checks passed in make test


codecov bot commented Jun 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.71%. Comparing base (4c143e6) to head (dd39981).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2092      +/-   ##
==========================================
+ Coverage   68.66%   68.71%   +0.04%     
==========================================
  Files         430      430              
  Lines       40053    40085      +32     
==========================================
+ Hits        27503    27543      +40     
+ Misses      10185    10184       -1     
+ Partials     2365     2358       -7     
Flag Coverage Δ
unittests 68.71% <100.00%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown.


@LY-today
Contributor Author

LY-today commented Jun 8, 2024

/assign @eahydra

@LY-today LY-today force-pushed the descheduler-pod-scorer branch 2 times, most recently from 673f651 to bd7d556 on June 8, 2024 16:07
@LY-today
Contributor Author

LY-today commented Jun 9, 2024

/assign @FillZpp

@saintube saintube changed the title feat: enhance scorer decheduler: enhance LoadAware scorer Jun 11, 2024
@LY-today
Contributor Author

/assign @ZiMengSheng

@LY-today
Contributor Author

/assign @songtao98

@LY-today
Contributor Author

@saintube Hi, the tide process seems to be stuck. What information do I need to add to move it forward?

@songchh

songchh commented Jun 11, 2024

Agreed, eviction based solely on the pods with the highest usage will indeed lead to stability problems. In the simplest scenario, after the pod with the largest usage is evicted, few nodes in the cluster can meet its resource usage requirements, so it stays pending. Even if it can be scheduled onto certain nodes, a single-node hotspot is more likely to be triggered when multiple pods peak at the same time. Therefore, I agree this should be addressed by evicting pods whose usage is just enough to meet the required reduction, rather than simply the ones with the maximum usage.
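
A small sketch of what such a "just enough" ordering could look like; the candidate type, function name, and tie-breaking below are assumptions for illustration, not the scorer actually implemented in this PR:

```go
package loadhints

import (
	"sort"

	"k8s.io/apimachinery/pkg/api/resource"
)

// candidate pairs a pod name with its current usage of the dominant resource.
type candidate struct {
	name  string
	usage resource.Quantity
}

// orderByBestFit sorts candidates so that pods whose usage covers the required
// reduction come first, with the smallest such pod leading; pods that cannot
// cover it on their own follow, largest first. This prefers evicting a pod
// whose usage is just enough, instead of blindly evicting the biggest pod.
func orderByBestFit(cands []candidate, required resource.Quantity) {
	sort.SliceStable(cands, func(i, j int) bool {
		iCovers := cands[i].usage.Cmp(required) >= 0
		jCovers := cands[j].usage.Cmp(required) >= 0
		if iCovers != jCovers {
			return iCovers // pods that can clear the hotspot alone come first
		}
		if iCovers {
			return cands[i].usage.Cmp(cands[j].usage) < 0 // smallest sufficient pod first
		}
		return cands[i].usage.Cmp(cands[j].usage) > 0 // otherwise, largest first
	})
}
```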

@LY-today
Contributor Author

@ZiMengSheng Hi, the tide process seems to be stuck. What information do I need to add to move it forward?
@songtao98 Hi, the tide process seems to be stuck. What information do I need to add to move it forward?

@LY-today
Contributor Author

@saintube @eahydra Could you please arrange for a colleague to follow up on this PR? It is currently affecting the stability of our workloads on an internal cluster. Thank you all.

@songtao98
Contributor

songtao98 commented Jun 12, 2024

Nothing else to do for now; it just needs review and approval.

@LY-today I'm reviewing this, and sorry for the delay. Please wait a little bit; I'll do my best to review it ASAP.

@LY-today
Contributor Author

Thank you for your reply.

@LY-today
Contributor Author

@songchh Yes, we seem to be facing the same scenario.

Contributor

@songtao98 songtao98 left a comment


The basic logic looks good to me. However, let's figure out how to make it cooperate with PR #2066. The score function needs to be discussed.

pkg/descheduler/utils/sorter/scorer.go (outdated review comment, resolved)
@songtao98
Contributor

@LY-today Please check and resolve my review comment here: #2092 (comment).

To be short, prodLowThreshold and prodHighThreshold are just like lowThreshold and highThreshold, but they only count the usage of Prod pods. They are used to balance the distribution of Prod pods. The newest logic first balances pods on nodes that are overutilized against highThreshold, and then balances pods on nodes that are only overutilized against prodHighThreshold.

When we do this, the function sortPodsOnOneOverloadedNode is called from balancePods. balancePods has an arg prod bool and is called first as balancePods(prod: false), then as balancePods(prod: true). You can see the related code lines.

I suggest you add an arg prod bool to sortPodsOnOneOverloadedNode too, and pass the value of prod from balancePods down to it. If prod is true, you should call usedCopy.Sub(*srcNode.thresholds.prodHighResourceThreshold[or]) in sortPodsOnOneOverloadedNode, as sketched below. This is what I need you to modify.

If you run into any further problems, feel free to comment. :)
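
For reference, here is a rough sketch of that change; the struct shapes, field names, and loop below are simplified assumptions, not the actual koord-descheduler signatures:

```go
package loadhints

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// Simplified, hypothetical shapes of the per-node data used below.
type nodeThresholds struct {
	highResourceThreshold     map[corev1.ResourceName]*resource.Quantity
	prodHighResourceThreshold map[corev1.ResourceName]*resource.Quantity
}

type nodeInfo struct {
	usage      map[corev1.ResourceName]*resource.Quantity
	thresholds nodeThresholds
}

// sortPodsOnOneOverloadedNode receives the prod flag that balancePods was
// called with (balancePods(prod: false) runs first, then balancePods(prod: true))
// and subtracts the matching high threshold from the node's current usage.
func sortPodsOnOneOverloadedNode(srcNode nodeInfo, removablePods []*corev1.Pod, resourceNames []corev1.ResourceName, prod bool) {
	for _, resourceName := range resourceNames {
		usedCopy := srcNode.usage[resourceName].DeepCopy()
		if prod {
			// Balance against the Prod-only high threshold.
			usedCopy.Sub(*srcNode.thresholds.prodHighResourceThreshold[resourceName])
		} else {
			usedCopy.Sub(*srcNode.thresholds.highResourceThreshold[resourceName])
		}
		// usedCopy now holds the usage that must be freed for this resource;
		// scoring and sorting of removablePods against it is omitted here.
	}
}
```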

@songtao98
Contributor

songtao98 commented Jun 13, 2024

Also, remember to check and modify the related code carefully. I'll re-review this PR after your modification. As for the failed lint and unit-test checks, I'm fixing them in #2103.

Finally, remember to rebase all commits in your PR and keep only one commit.

@LY-today
Contributor Author

Received, I am making it compatible now.

@LY-today
Contributor Author

@songtao98 make test passes locally, but the CI stage is failing. Can it be re-run manually?

@songtao98
Contributor

I noticed that was caused by the lint check. You can run make test, make lint, etc. in your local environment to make sure your commit can pass the checks. Check the Makefile to see what is needed.

@songtao98
Contributor

songtao98 commented Jun 14, 2024

@LY-today There is a typo in your PR's title. Consider changing it to koord-descheduler: enhance LowNodeLoad scorer.

@LY-today LY-today changed the title decheduler: enhance LoadAware scorer koord-descheduler: enhance LowNodeLoad scorer Jun 14, 2024
@LY-today
Contributor Author

done

Signed-off-by: LY-today <724102053@qq.com>
@LY-today
Contributor Author

@songtao98 Everything has been modified and has passed the CI stage; please check.

@songtao98
Contributor

/lgtm
@LY-today You can mark all conversations as resolved. Then we can continue to merge this PR.

@koordinator-bot koordinator-bot bot added the lgtm label Jun 14, 2024
@LY-today
Contributor Author

How do I mark it as resolved?

@songtao98 done

@LY-today
Contributor Author

@songtao98 What other steps do we need to keep this PR moving toward merge?

@songtao98
Contributor

An approved label is needed. Maintainers will do a final check ASAP, and if there are no other problems, this PR will be approved and auto-merged.

@LY-today
Contributor Author

Received, thank you for your hard work. We hope to speed things up and resolve the production issues in our cluster.

@LY-today
Contributor Author

@songtao98 Hello, are there any plans to move this PR forward today?

@songtao98
Contributor

songtao98 commented Jun 17, 2024

PTAL @hormes @ZiMengSheng

@ZiMengSheng
Contributor

/lgtm

@ZiMengSheng
Contributor

/approve

@LY-today
Contributor Author

LY-today commented Jun 17, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ZiMengSheng. To complete the pull request process, please ask for approval from eahydra after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

@eahydra Please check out this PR and let's work together to solve the problem

@LY-today
Contributor Author

@hormes Please check out this PR and let's work together to solve the problem

@koordinator-bot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hormes, ZiMengSheng

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@koordinator-bot koordinator-bot bot merged commit 96ca38d into koordinator-sh:main Jun 17, 2024
22 checks passed