WIP: Fix preemption blocked by low priority job #1020
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: alculquicondor
68aa415 to 3ad09c2
@alculquicondor: The following tests failed, say
/close
I think this would introduce a regression on #475
@alculquicondor: Closed this PR.
What type of PR is this?
/kind bug
/kind regression
What this PR does / why we need it:
#805 fixed over-provisioning by restricting admission when more than one CQ had nominated workloads.
This check ended up being excessive in a scenario like the following:
- CQ beta, in a cohort, with StrictFIFO strategy, has a low priority job pending that can't be accommodated by borrowing.
- CQ alpha, in the same cohort, gets a new high priority job.

Since job priorities don't influence ordering across multiple CQs, the low priority job ends up blocking any admission in the cohort.
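To make the scenario concrete, here is a toy model of one scheduling cycle. It is only a sketch and not Kueue's scheduler: the `workload` type and `admitCycle` function are hypothetical, and it assumes that heads are ordered by creation time only and that the first head considered claims the cohort for the cycle, even when it cannot be admitted.

```go
package main

import (
	"fmt"
	"sort"
)

// workload is a hypothetical, simplified stand-in for the head of a ClusterQueue.
type workload struct {
	queue     string
	name      string
	priority  int
	createdAt int  // lower value means older
	fits      bool // whether it could be admitted right now
}

// admitCycle models one scheduling cycle over the heads of all CQs in one cohort.
// Assumptions of this toy model: heads are ordered by creation time only
// (priority is ignored across CQs), and the first head considered claims the
// cohort for the cycle even if it cannot be admitted.
func admitCycle(heads []workload) []string {
	// Copy so the caller's slice is not reordered.
	ordered := append([]workload(nil), heads...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].createdAt < ordered[j].createdAt
	})

	var admitted []string
	cohortClaimed := false
	for _, w := range ordered {
		if cohortClaimed {
			continue // skipped: another CQ in the cohort already nominated a workload
		}
		cohortClaimed = true
		if w.fits {
			admitted = append(admitted, w.name)
		}
	}
	return admitted
}

func main() {
	heads := []workload{
		{queue: "beta", name: "low-priority-job", priority: 0, createdAt: 1, fits: false},
		{queue: "alpha", name: "high-priority-job", priority: 100, createdAt: 2, fits: true},
	}
	// Under the assumptions above nothing is admitted, although alpha's job fits.
	fmt.Println(admitCycle(heads))
}
```

In this model the older low priority head from beta is considered first, fails to fit, and still prevents alpha's job from being looked at in the same cycle. If the situation repeats every cycle, the cohort stays blocked.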
Side bug that made this issue hard to reproduce: The preemption logic was removing CQs from a cohort in the snapshot. Fixed by cloning the set.
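The kind of aliasing bug described here can be sketched as follows. This is an illustration under assumptions, not the code from this PR: the `set` type and both helper functions are hypothetical, and the snapshot is reduced to a single set of CQ names.

```go
package main

import "fmt"

// set is a hypothetical minimal string set, standing in for the cohort's members.
type set map[string]struct{}

// clone returns an independent copy of the set.
func (s set) clone() set {
	c := make(set, len(s))
	for k := range s {
		c[k] = struct{}{}
	}
	return c
}

// othersAliased returns the members other than self, but deletes from the
// shared map, so the caller's snapshot loses an entry as a side effect.
func othersAliased(members set, self string) set {
	delete(members, self)
	return members
}

// othersCloned does the same work on a copy and leaves the snapshot intact.
func othersCloned(members set, self string) set {
	c := members.clone()
	delete(c, self)
	return c
}

func main() {
	snapshot := set{"alpha": {}, "beta": {}}

	others := othersCloned(snapshot, "alpha")
	fmt.Println(len(others), len(snapshot)) // snapshot still has both members

	othersAliased(snapshot, "alpha")
	fmt.Println(len(snapshot)) // snapshot has lost "alpha"
}
```

Maps in Go are reference types, so passing one to a function and deleting from it changes the caller's data. Copying before mutating keeps the snapshot stable for later steps in the same cycle.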
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
WIP because I want to add a unit test for this.
Does this PR introduce a user-facing change?