🌱 MS: improve replica defaulting for autoscaler #9649

Merged

aiden-von
Contributor

What this PR does / why we need it: MachineSet.spec.replicas defaulting should take the autoscaler min/max size into account if they are defined. This PR applies to MachineSet (MS) the same default-replicas policy that MachineDeployment (MD) already uses.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #8085
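
As a rough illustration of the policy described above, here is a minimal Go sketch. It is not the code from this PR. The `machineSet` struct and the `defaultReplicas` function are hypothetical stand-ins, the annotation keys are assumed to match the autoscaler min/max size annotations in Cluster API, and the clamping rules and the fallback of 1 are assumptions based on the MD behavior this PR says it copies. The `DefaultReplicasAnnotation` mentioned in the function's notes is left out.

```go
package main

import (
	"fmt"
	"strconv"
)

// Assumption: these keys mirror the autoscaler min/max size annotations
// defined in Cluster API; verify them against the release you use.
const (
	minSizeAnnotation = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
	maxSizeAnnotation = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"
)

// machineSet is a hypothetical stand-in for clusterv1.MachineSet that keeps
// only the fields this sketch reads.
type machineSet struct {
	Annotations map[string]string
	Replicas    *int32
}

// defaultReplicas sketches the defaulting policy. oldMS is nil on create.
func defaultReplicas(oldMS, newMS *machineSet) (int32, error) {
	// An explicitly set value always wins.
	if newMS.Replicas != nil {
		return *newMS.Replicas, nil
	}

	minStr, hasMin := newMS.Annotations[minSizeAnnotation]
	maxStr, hasMax := newMS.Annotations[maxSizeAnnotation]
	if !hasMin || !hasMax {
		// No autoscaler bounds: assumed fallback of a single replica.
		return 1, nil
	}

	minSize, err := strconv.ParseInt(minStr, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("parsing %s: %w", minSizeAnnotation, err)
	}
	maxSize, err := strconv.ParseInt(maxStr, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("parsing %s: %w", maxSizeAnnotation, err)
	}

	switch {
	case oldMS == nil:
		// New MachineSet: start at the autoscaler minimum.
		return int32(minSize), nil
	case oldMS.Replicas == nil:
		// Existing MachineSet without replicas: also the minimum.
		return int32(minSize), nil
	case int64(*oldMS.Replicas) < minSize:
		return int32(minSize), nil
	case int64(*oldMS.Replicas) > maxSize:
		return int32(maxSize), nil
	default:
		// Old value is already within [min, max]: keep it.
		return *oldMS.Replicas, nil
	}
}

func main() {
	bounds := map[string]string{minSizeAnnotation: "3", maxSizeAnnotation: "10"}
	got, err := defaultReplicas(nil, &machineSet{Annotations: bounds})
	fmt.Println(got, err)
}
```

Under these assumptions, a new MachineSet with bounds 3 and 10 and no replicas set would start at 3, which is the behavior the later review comments discuss.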

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area PR is missing an area label labels Oct 31, 2023
@k8s-ci-robot
Contributor

Welcome @aiden-von!

It looks like this is your first PR to kubernetes-sigs/cluster-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 31, 2023
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 31, 2023
@k8s-ci-robot
Contributor

Hi @aiden-von. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Contributor

@killianmuldoon killianmuldoon left a comment

Thanks!

/ok-to-test

/area machineset

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. area/machineset Issues or PRs related to machinesets and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. do-not-merge/needs-area PR is missing an area label labels Oct 31, 2023
@aiden-von aiden-von force-pushed the pr-ms-autoscaler-defaulting branch 3 times, most recently from cb12306 to 8d7ca40 Compare November 1, 2023 00:41
Contributor

@elmiko elmiko left a comment

this generally makes sense to me and the code looks good.

my main concern is the replicas automatically getting set to minimum, but i think the documentation with this PR is good so, hopefully, users won't be too surprised.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 7, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 34ee03e67f19ed5cd7d97b7e1e1322cfa55bac1d

@aiden-von
Contributor Author

@killianmuldoon, @elmiko
Hello, this is my first PR. It has been stuck for a while; is there any process I missed?

@sbueringer
Member

sbueringer commented Nov 20, 2023

> @killianmuldoon, @elmiko Hello, this is my first PR. It has been stuck for a while; is there any process I missed?

No worries, you didn't miss anything. We've just been pretty busy recently, so nobody has gotten around to reviewing the PR yet.

@aiden-von
Contributor Author

aiden-von commented Nov 20, 2023

> No worries, you didn't miss anything. We've just been pretty busy recently, so nobody has gotten around to reviewing the PR yet.

Thank you for the quick response, @sbueringer :)
Understood. Please review it at your convenience.

Member

@sbueringer sbueringer left a comment

Just a few nits.

The replica calculation largely duplicates the one for MD, but maybe it's easier to copy it than to refactor into one generic func (so fine for me as is).

webhooks/alias.go Outdated Show resolved Hide resolved
internal/webhooks/machineset.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 21, 2023
@aiden-von aiden-von force-pushed the pr-ms-autoscaler-defaulting branch 2 times, most recently from 88f3d1a to 63c1ee1 Compare November 21, 2023 13:52
@sbueringer
Member

In general looks good. Looks like the gci linter reports some findings: https://github.com/kubernetes-sigs/cluster-api/actions/runs/6944695786/job/18892668156

  webhooks/alias.go:20: File is not `gci`-ed with --skip-generated -s standard -s default -s prefix(sigs.k8s.io/cluster-api) --custom-order (gci)
  	"sigs.k8s.io/cluster-api/internal/webhooks"
  webhooks/alias.go:22: File is not `gci`-ed with --skip-generated -s standard -s default -s prefix(sigs.k8s.io/cluster-api) --custom-order (gci)
  	"sigs.k8s.io/controller-runtime/pkg/client"

(you can check whether `make lint-fix` just fixes them)

Please also directly squash when you resolve these issues

@aiden-von
Contributor Author

Thank you for the review and the fast response.
The lint issue has been resolved.
If all tests pass except api-diff, I will squash the commits.

@sbueringer
Member

/lgtm

so far :)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 21, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: eaae21e402eb42ec8576f1be964159f478b8c268

// Notes:
// - While the min size and max size annotations of the autoscaler provide the best UX, other autoscalers can use the
// DefaultReplicasAnnotation if they have similar use cases.
func calculateMachineSetReplicas(ctx context.Context, oldMS *clusterv1.MachineSet, newMS *clusterv1.MachineSet, dryRun bool) (int32, error) {
Member

@enxebre enxebre Nov 21, 2023

Have we considered passing oldReplicas and newReplicas through the signature of this function instead of the MachineSet resources, so that it can be reused for both MD and MS? It seems the objects are only needed for the `if oldMD == nil {` check, which would actually be the same code path as `case oldMD.Spec.Replicas == nil:`.
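
To make the suggestion concrete, here is a minimal sketch of what such a shared helper might look like. `replicasFromBounds` is a hypothetical name. The sketch assumes the caller has already parsed the autoscaler min and max sizes, and that values outside the bounds are clamped as in the MD logic. A caller would pass a nil `oldReplicas` both when there is no old object and when the old object has no replicas set, which is exactly the merge of code paths being proposed.

```go
package main

import "fmt"

// replicasFromBounds is a hypothetical helper that both the MD and MS
// webhooks could call. It takes replica pointers instead of full objects.
func replicasFromBounds(oldReplicas, newReplicas *int32, minSize, maxSize int32) int32 {
	switch {
	case newReplicas != nil:
		// An explicitly set value always wins.
		return *newReplicas
	case oldReplicas == nil:
		// Covers both "no old object" and "old object without replicas".
		return minSize
	case *oldReplicas < minSize:
		return minSize
	case *oldReplicas > maxSize:
		return maxSize
	default:
		// Old value is already within [minSize, maxSize]: keep it.
		return *oldReplicas
	}
}

func main() {
	// On create there is no old object, so oldReplicas is nil.
	fmt.Println(replicasFromBounds(nil, nil, 3, 10))

	// On update, an old value above the maximum is clamped.
	old := int32(12)
	fmt.Println(replicasFromBounds(&old, nil, 3, 10))
}
```

The trade-off is that the helper can no longer tell a create apart from an update whose old object had no replicas; both simply arrive as a nil pointer.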

Member

@sbueringer sbueringer Nov 21, 2023

If we just use one generic function instead of duplicating, I would prefer to pass in oldObject (or oldMDorMS) of type runtime.Object instead of merging the two code paths of `oldMS == nil` and `oldMS.Spec.Replicas == nil` in the func.

(Those are actually different cases, even if the result, using minSize, is the same, and I think it would be better to keep that logic clear in our code.)

Member

ok cool, let's keep current implementation since it seems to have general lgtm and this code is unlikely to change often.

@k8s-ci-robot
Contributor

@aiden-von: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cluster-api-apidiff-main | addb065 | link | false | `/test pull-cluster-api-apidiff-main` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@sbueringer
Member

(The failing apidiff job is fine. It's just surfacing a change to an exported type, and I think this one is fine.)

@enxebre
Member

enxebre commented Nov 27, 2023

/lgtm

@sbueringer
Member

Thank you!!

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 27, 2023
@k8s-ci-robot k8s-ci-robot merged commit d1ff0b5 into kubernetes-sigs:main Nov 27, 2023
18 of 20 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.7 milestone Nov 27, 2023
Successfully merging this pull request may close these issues.

MachineSet.spec.replicas defaulting should take into account autoscaler min/max size if defined
6 participants