WIP: Fix preemption blocked by low priority job #1020

Closed
wants to merge 3 commits into from

alculquicondor
Contributor

@alculquicondor alculquicondor commented Jul 27, 2023

What type of PR is this?

/kind bug
/kind regression

What this PR does / why we need it:

#805 fixed over-provisioning by restricting admission when more than one CQ had nominated workloads.

This check turned out to be too restrictive in a scenario like the following:

  1. A CQ beta in a cohort, with the StrictFIFO strategy, has a pending low-priority job that can't be accommodated by borrowing.
  2. A CQ alpha in the same cohort gets a new high-priority job.

Since job priorities don't influence ordering across multiple CQs, the low-priority job ends up blocking any admission in the cohort.
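The scenario can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kueue's scheduler code: the `head` type, its fields, and `admitCycle` are hypothetical names, all heads are assumed to belong to one cohort, and the cross-CQ ordering is assumed to depend on creation time only.

```go
package main

import (
	"fmt"
	"sort"
)

// head is a hypothetical stand-in for the workload at the head of a ClusterQueue.
type head struct {
	CQ       string
	Name     string
	Priority int  // deliberately ignored by the ordering below
	Created  int  // smaller means older
	Fits     bool // whether it could be admitted this cycle (possibly via preemption)
}

// admitCycle models one scheduling cycle for a single cohort (assumption).
// Heads are ordered by age only; once a head that does not fit is reached,
// every later head in the cohort is skipped.
func admitCycle(heads []head) []string {
	ordered := append([]head(nil), heads...) // copy so the caller's slice is not reordered
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Created < ordered[j].Created
	})
	admitted := []string{}
	blocked := false
	for _, h := range ordered {
		if blocked {
			continue
		}
		if !h.Fits {
			blocked = true // the older job that doesn't fit blocks the cohort
			continue
		}
		admitted = append(admitted, h.Name)
	}
	return admitted
}

func main() {
	heads := []head{
		{CQ: "alpha", Name: "high-priority-job", Priority: 100, Created: 2, Fits: true},
		{CQ: "beta", Name: "low-priority-job", Priority: 1, Created: 1, Fits: false},
	}
	// The older low-priority job in beta comes first and blocks alpha's job.
	fmt.Println(admitCycle(heads))
}
```

In this model the high-priority job in alpha is never admitted while the older job in beta stays pending, because `Priority` plays no part in the ordering.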

Side bug that made this issue hard to reproduce: the preemption logic was removing CQs from a cohort in the snapshot. This is fixed by cloning the set.
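The kind of aliasing bug described here can be sketched as follows. The names `cqSet`, `othersAliased`, and `othersCloned` are hypothetical and do not come from the Kueue code base; the sketch only assumes that cohort members are held in a map-based set that is shared with the snapshot.

```go
package main

import "fmt"

// cqSet is a hypothetical set of ClusterQueue names belonging to a cohort.
type cqSet map[string]struct{}

// cloneSet returns an independent copy of the set.
func cloneSet(s cqSet) cqSet {
	out := make(cqSet, len(s))
	for k := range s {
		out[k] = struct{}{}
	}
	return out
}

// othersAliased removes self from the caller's set. Maps are reference
// types in Go, so the snapshot that owns the set is modified as well.
func othersAliased(members cqSet, self string) cqSet {
	delete(members, self)
	return members
}

// othersCloned works on a copy, leaving the snapshot's set untouched.
func othersCloned(members cqSet, self string) cqSet {
	out := cloneSet(members)
	delete(out, self)
	return out
}

func main() {
	snapshot := cqSet{"alpha": {}, "beta": {}}

	othersCloned(snapshot, "alpha")
	fmt.Println(len(snapshot)) // the snapshot still has both CQs

	othersAliased(snapshot, "alpha")
	fmt.Println(len(snapshot)) // alpha is gone from the snapshot
}
```

Cloning costs one extra allocation per call, but it keeps later steps in the same cycle from seeing a cohort with missing members.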

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

WIP because I want to add a unit test for this.

Does this PR introduce a user-facing change?

Fix preemption being blocked by an old job from a different ClusterQueue that doesn't fit.

@k8s-ci-robot added the release-note, do-not-merge/work-in-progress, kind/bug, kind/regression, and cncf-cla: yes labels on Jul 27, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@netlify

netlify bot commented Jul 27, 2023

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 3ad09c2
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/64c2cfa9d3e3df0009b7d001

@k8s-ci-robot k8s-ci-robot requested review from tenzen-y and trasc July 27, 2023 19:47
@k8s-ci-robot added the approved and size/S labels on Jul 27, 2023
@k8s-ci-robot added the size/M label and removed the size/S label on Jul 27, 2023
@k8s-ci-robot
Contributor

@alculquicondor: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kueue-test-unit-main | 3ad09c2 | link | true | /test pull-kueue-test-unit-main |
| pull-kueue-test-integration-main | 3ad09c2 | link | true | /test pull-kueue-test-integration-main |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@alculquicondor
Contributor Author

/close

I think this would introduce a regression on #475

@k8s-ci-robot
Contributor

@alculquicondor: Closed this PR.

In response to this:

/close

I think this would introduce a regression on #475

Labels
approved, cncf-cla: yes, do-not-merge/work-in-progress, kind/bug, kind/regression, release-note, size/M