Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support azure vmss flex #6633

Closed

Conversation

Daniel-Redeploy
Copy link
Contributor

@Daniel-Redeploy Daniel-Redeploy commented Mar 13, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

Adds support for Azure VMSS Flexible MachinePool deployments.

Which issue(s) this PR fixes:

Fixes #6454

Special notes for your reviewer:

Removes normalization of all azure resources, rather than just for Azure VMSS resources. The reason for this is that VMSS Flexible provisions VM's (Microsoft.Compute/virtualMachines) rather than instances as a sub-resource of the VMSS (Microsoft.Compute/virtualMachineScaleSets).

Should supersede #6632

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 13, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Daniel-Redeploy
Once this PR has been reviewed and has the lgtm label, please assign elmiko for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/provider/cluster-api Issues or PRs related to Cluster API provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 13, 2024
@mwielgus mwielgus requested a review from elmiko March 20, 2024 10:44
@elmiko
Copy link
Contributor

elmiko commented Mar 20, 2024

i would like some time to study this as we prefer not to encode provider specific information in the clusterapi provider. this information should be discoverable through the generic mechanisms of clusterapi.

also, i would like @jackfrancis to review this as well.

please do not merge this until we have had time to review
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 20, 2024
@jackfrancis
Copy link
Contributor

/assign @manishsat

@Daniel-Redeploy what is additive in this PR on top of the original PR that introduced VMSS Flex for the Azure provider? Ref:

cc @gandhipr

@k8s-ci-robot
Copy link
Contributor

@jackfrancis: GitHub didn't allow me to assign the following users: manishsat.

Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @manishsat

@Daniel-Redeploy what is additive in this PR on top of the original PR that introduced VMSS Flex for the Azure provider? Ref:

cc @gandhipr

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jackfrancis
Copy link
Contributor

O.K. I read through this PR changeset a bit more and it seems like the best way to describe this is "enable first class MachinePoolMachine support in clusterapi provider". Is that about right? If so, I would hope we can enable this in such a way where we don't have to reference AzureMachinePoolMachines (or any other provider-specific implementation) in the clusterapi provider code. Is there something that prevents such a clean separation of concerns?

@Daniel-Redeploy
Copy link
Contributor Author

/assign @manishsat

@Daniel-Redeploy what is additive in this PR on top of the original PR that introduced VMSS Flex for the Azure provider? Ref:

cc @gandhipr

Somehow I totally missed the VMSS Flex docs, feature and how to toggle it when I worked on this last November. I remember I found references to Flex but couldn't get the cluster autoscaler to remove instances without the code changed in this PR. In short CA found the VMSS, scaled out the instance count, but couldn't scale in since it couldn't find the individual instance.

I haven't tested if the AZURE_ENABLE_VMSS_FLEX env will work as expected for our scenario with mixed deployments models (MachineDeployment + VMSS Uniform + VMSS Flex). But if it does this PR should be seen as redundant.

@Daniel-Redeploy
Copy link
Contributor Author

O.K. I read through this PR changeset a bit more and it seems like the best way to describe this is "enable first class MachinePoolMachine support in clusterapi provider". Is that about right? If so, I would hope we can enable this in such a way where we don't have to reference AzureMachinePoolMachines (or any other provider-specific implementation) in the clusterapi provider code. Is there something that prevents such a clean separation of concerns?

Makes perfect sense to me and it would make the clusterapi provider flexible against more MachinePool providers than Azure. 👍

@x13n
Copy link
Member

x13n commented May 7, 2024

/assign @elmiko @jackfrancis

Assigning - since I understand you're reviewing this one?

@jackfrancis
Copy link
Contributor

@Daniel-Redeploy have you had a chance to determine if we still need to preserve anything from this PR?

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 5, 2024
@elmiko
Copy link
Contributor

elmiko commented Aug 6, 2024

it seems like we can close this one, but i will leave that decision up to @jackfrancis and @Daniel-Redeploy

@jackfrancis
Copy link
Contributor

/close

@elmiko agree

@Daniel-Redeploy feel free to reopen if I'm premature here!

@k8s-ci-robot
Copy link
Contributor

@jackfrancis: Closed this PR.

In response to this:

/close

@elmiko agree

@Daniel-Redeploy feel free to reopen if I'm premature here!

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cluster-autoscaler area/provider/cluster-api Issues or PRs related to Cluster API provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
6 participants