Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propose binding priority and preemption #4993

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

whitewindmills
Copy link
Member

@whitewindmills whitewindmills commented May 28, 2024

What type of PR is this?
/kind design

What this PR does / why we need it:
Propose binding priority and preemption

Which issue(s) this PR fixes:
Fixes #4938

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@karmada-bot karmada-bot added the kind/design Categorizes issue or PR as related to design. label May 28, 2024
@karmada-bot karmada-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 28, 2024
@whitewindmills
Copy link
Member Author

@codecov-commenter
Copy link

codecov-commenter commented May 28, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 48.34%. Comparing base (d4c2793) to head (468bef1).
Report is 1138 commits behind head on master.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4993      +/-   ##
==========================================
- Coverage   53.33%   48.34%   -4.99%     
==========================================
  Files         252      666     +414     
  Lines       20482    54858   +34376     
==========================================
+ Hits        10924    26521   +15597     
- Misses       8836    26614   +17778     
- Partials      722     1723    +1001     
Flag Coverage Δ
unittests 48.34% <ø> (-4.99%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@XiShanYongYe-Chang
Copy link
Member

Thanks~
/assign

@whitewindmills
Copy link
Member Author

/assign @RainbowMango

@RainbowMango
Copy link
Member

Hi @whitewindmills I guess it's a good chance to introduce this proposal at tomorrow's community meeting, what do you think?

@whitewindmills
Copy link
Member Author

hi all, can we continue this proposal?

@Monokaix
Copy link
Contributor

hi all, can we continue this proposal?

+1

@seanlaii
Copy link
Contributor

seanlaii commented Oct 15, 2024

Hi all,

Thank you all for the amazing feedback today! In summary, our design currently only targets the resource that is scheduled in a single cluster, and the preemption only happens for the bindings in one cluster.

Here are the points we agree with in this proposal:

  1. Need to redesign the scheduler queue so that it is aware of the priority and it will pop out the resource binding with highest priority first.
  2. Reuse the native priorityClass and create a new API if we need to extend the functionality in the future.
  3. Resolving the priority value and preemptionPolicy/preemptionBehavior to the resource binding so that the scheduler won’t need to query the priorityClass to find the values.
  4. If scheduling fails due to insufficient resources (or can not find feasible clusters), we should continue attempting to schedule other pending ResourcesBindings instead of blocking the whole scheduling process.
  5. Reschedule the preempted bindings. (Need to carefully consider the backoff time for the preempted bindings to avoid reschedule the preempted bindings before the preemptor binding.)

These are the points where we have different views or additional questions and thoughts:

  1. Could you please clarify if there are any use cases that require binding the priorityClassName with the PropagationPolicy/ClusterPropagationPolicy? Maybe because you don't want to enforce that all the resources must have a priorityClassName filed? We plan to put the priority and preemptionBehavior/preemptionPolicy in the replicaRequirements. So the user can just specify the priorityClassName in the resource and create the resource the same as what they do in the single cluster. We can use custom resource interpreter to propagate the priority and preemptionPolicy to the replicaRequirements.
  2. We propose using scheduler-estimators to find the victim bindings since they have the accurate information of the clusters, and the scheduler will use the victim bindings to decide which bindings to preempt and perform preemption.
  3. If possible, we would like to have an option or feature gate or a field in the binding for controlling if preempted bindings should be rescheduled.

Thanks for your time again! Please feel free to provide any comment.
Reference: Our Proposal - https://docs.google.com/document/d/1MixmgLwnmiRrukyFP25JvE2hqiqB3f_JKT12tXiOsbc/edit?tab=t.0
cc @RainbowMango @kevin-wangzefeng @wengyao04 @LeonZh0u

@RainbowMango RainbowMango modified the milestones: v1.12, v1.13 Nov 28, 2024
@whitewindmills whitewindmills force-pushed the binding-priority-preemption-proposal branch from f1b5c03 to f2f7cb9 Compare November 28, 2024 09:45
@karmada-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from rainbowmango. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@whitewindmills whitewindmills force-pushed the binding-priority-preemption-proposal branch from f2f7cb9 to b0070de Compare December 18, 2024 09:18
@whitewindmills
Copy link
Member Author

@RainbowMango can this proposal be merged?

@RainbowMango
Copy link
Member

I want to give it another look and see if any comments from the others.

@whitewindmills whitewindmills force-pushed the binding-priority-preemption-proposal branch from b0070de to 0246005 Compare December 24, 2024 07:38
@whitewindmills whitewindmills force-pushed the binding-priority-preemption-proposal branch from 0246005 to c1dbe3f Compare January 7, 2025 03:46
@whitewindmills whitewindmills force-pushed the binding-priority-preemption-proposal branch from c1dbe3f to b711230 Compare January 13, 2025 09:04
@whitewindmills whitewindmills force-pushed the binding-priority-preemption-proposal branch 3 times, most recently from fb8f4f4 to 328750b Compare January 16, 2025 09:22
Copy link
Member

@RainbowMango RainbowMango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any other comments regarding the priority scheduling part so far except the feature gate name.

Next, I would like to ask @kevin-wangzefeng to review this design again. If no further comments, I think we can start coding. Additionally, the preemption part is relatively complex and needs further discussion.

Signed-off-by: whitewindmills <jayfantasyhjh@gmail.com>
@whitewindmills whitewindmills force-pushed the binding-priority-preemption-proposal branch from 328750b to 468bef1 Compare January 21, 2025 08:39
Copy link
Member

@RainbowMango RainbowMango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

As I mentioned on #4993 (review), looks great for the priority scheduling part.

@karmada-bot karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/design Categorizes issue or PR as related to design. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Scheduler] about priority scheduling