autoscaler: fix premature end of binpacking #6165

Merged Oct 10, 2023 (1 commit)

pohly (Contributor) commented Sep 30, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:

When PermissionToAddNode gets called without actually adding a new node, the node counter in the thresholdBasedEstimationLimiter gets out of sync with the actual number of new nodes. This can happen when the "is the last node empty" check triggers.

The solution used here is to rearrange the checks so that PermissionToAddNode is followed by adding a new node. Alternatively, it might also be possible to pass the current number of nodes as a parameter.
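To illustrate the ordering problem described above, here is a minimal sketch in Go. It is not the cluster-autoscaler code: `countingLimiter`, `estimate`, `place`, and the pod sizes are hypothetical names and values made up for this example, and pods are modeled as plain integers packed onto nodes of equal capacity. Only the method name `PermissionToAddNode` comes from the PR description. The sketch assumes the limiter counts every granted permission as one new node.

```go
package main

import "fmt"

// countingLimiter is a hypothetical, simplified stand-in for a threshold-based
// limiter: every granted permission is counted as one new node.
type countingLimiter struct {
	maxNodes int
	granted  int
}

// PermissionToAddNode grants permission until maxNodes permissions were given.
func (l *countingLimiter) PermissionToAddNode() bool {
	if l.granted >= l.maxNodes {
		return false
	}
	l.granted++
	return true
}

// place puts the pod on the first node with enough free capacity.
func place(free []int, pod int) bool {
	for i := range free {
		if free[i] >= pod {
			free[i] -= pod
			return true
		}
	}
	return false
}

// estimate packs pods onto nodes and returns the number of nodes added.
// checkEmptyFirst selects the order of the two checks (true = rearranged).
func estimate(pods []int, capacity, maxNodes int, checkEmptyFirst bool) int {
	limiter := &countingLimiter{maxNodes: maxNodes}
	var free []int // remaining capacity of each added node

	for _, pod := range pods {
		if place(free, pod) {
			continue
		}
		// If the last node is still empty, another node would not help.
		lastEmpty := len(free) > 0 && free[len(free)-1] == capacity
		if checkEmptyFirst {
			// Rearranged order: permission is only requested when a
			// node is really going to be added.
			if lastEmpty {
				continue
			}
			if !limiter.PermissionToAddNode() {
				break
			}
		} else {
			// Problematic order: the counter grows even when the
			// "last node empty" check then skips adding a node.
			if !limiter.PermissionToAddNode() {
				break
			}
			if lastEmpty {
				continue
			}
		}
		free = append(free, capacity)
		place(free, pod) // may fail, leaving the new node empty
	}
	return len(free)
}

func main() {
	pods := []int{8, 8, 20, 20, 20, 8, 8} // made-up pod sizes
	fmt.Println(estimate(pods, 10, 4, false), estimate(pods, 10, 4, true))
	// prints "3 4"
}
```

With a threshold of 4, the problematic order stops after three nodes because one permission was spent on a pod that was then skipped. With the rearranged order, the counter matches the number of nodes actually added.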

Special notes for your reviewer:

Found while investigating scale up scenarios with dynamic resource allocation support, see https://kubernetes.slack.com/archives/C09R1LV8S/p1695991972759879.

Does this PR introduce a user-facing change?

CA might have created fewer nodes than desired, logging the message "Capping binpacking after exceeding threshold of 4 nodes", even though it had not actually added four new nodes.

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 30, 2023
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 30, 2023
x13n (Member) commented Oct 10, 2023

/assign

x13n (Member) commented Oct 10, 2023

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 10, 2023
k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pohly, x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 10, 2023
@k8s-ci-robot k8s-ci-robot merged commit bc0e288 into kubernetes:master Oct 10, 2023
6 checks passed
Labels
- approved (Indicates a PR has been approved by an approver from all required OWNERS files.)
- area/cluster-autoscaler
- cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
- kind/bug (Categorizes issue or PR as related to a bug.)
- lgtm ("Looks good to me", indicates that a PR is ready to be merged.)
- size/S (Denotes a PR that changes 10-29 lines, ignoring generated files.)