Proposal of introducing a rebalance mechanism to actively trigger rescheduling of resource #4698
Codecov Report
All modified and coverable lines are covered by tests ✅
Coverage Diff (master vs. #4698)
            master    #4698    +/-
Coverage    53.12%    53.33%   +0.20%
Files       251       252      +1
Lines       20417     20482    +65
Hits        10847     10924    +77
Misses      8856      8836     -20
Partials    714       722      +8
This PR mixes fault self-healing with rescheduling. I think fault self-healing includes rescheduling: for example, when a node crashes, the workload that owns the pods on that node regenerates them. This is done by multiple controllers working together, including the scheduler. If the goal is self-healing, then coordination among multiple components needs to be considered. If it is only rescheduling, then only the eviction targets and the conditions for stopping eviction need to be considered. Could we consider the design concept of the community's Descheduler project?
/assign
I have worked hard to thoroughly improve this proposal. Everyone is welcome to go through it again; I look forward to your suggestions~
@wu0407 Hello, I have updated this proposal. Actually, this proposal is about rescheduling in general; cluster failover is only one of its user stories. For more information, please see the latest proposal. Thank you for your comments~
docs/proposals/scheduling/workload-rebalancer/workload-rebalancer.md
…cheduling of resource. Signed-off-by: chaosi-zju <chaosi@zju.edu.cn>
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: RainbowMango
What type of PR is this?
/kind design
/kind documentation
What this PR does / why we need it:
Proposal of introducing a rebalance mechanism to actively trigger rescheduling of resource.
Assuming the user has propagated workloads to member clusters, in some scenarios the current replica distribution is no longer the most desirable, for example:
Under the Aggregated schedule strategy, replicas were initially distributed across multiple clusters due to resource constraints, but now one cluster is enough to accommodate all replicas.
Therefore, users want a way to trigger rescheduling so that the replica distribution can be rebalanced.
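
To make the Aggregated scenario above concrete, here is a minimal sketch of why a re-run of scheduling can consolidate replicas. It is not Karmada's scheduler; the function `aggregate_replicas`, the cluster names, and the capacity numbers are all hypothetical, and the placement rule (fill the cluster with the most free capacity first) is only an assumption standing in for an aggregated strategy.

```python
from typing import Dict


def aggregate_replicas(replicas: int, capacity: Dict[str, int]) -> Dict[str, int]:
    """Hypothetical aggregated placement: fill the clusters with the most
    free capacity first, so replicas land in as few clusters as possible."""
    assignment: Dict[str, int] = {}
    remaining = replicas
    # Largest free capacity first; cluster name breaks ties deterministically.
    for name, free in sorted(capacity.items(), key=lambda kv: (-kv[1], kv[0])):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            assignment[name] = take
            remaining -= take
    if remaining > 0:
        raise ValueError(f"insufficient capacity: {remaining} replicas unplaced")
    return assignment


# Initial scheduling: neither (hypothetical) cluster can hold all 5 replicas.
initial = aggregate_replicas(5, {"member1": 3, "member2": 3})

# Later, member1 has gained capacity. Nothing changes until rescheduling is
# triggered; re-running the placement then consolidates the replicas.
rebalanced = aggregate_replicas(5, {"member1": 10, "member2": 3})

print(initial)     # {'member1': 3, 'member2': 2}
print(rebalanced)  # {'member1': 5}
```

The point of the sketch is that the improved placement only appears when the placement function is run again, which is the trigger the proposal asks for.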
Which issue(s) this PR fixes:
Fixes part of #4840
Special notes for your reviewer:
Does this PR introduce a user-facing change?: