refactor: upstream most of Azure managed CAS changes in cloudprovider/azure for 1.28 #7067

comtalyst
Contributor

@comtalyst comtalyst commented Jul 18, 2024

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

"Refactor" as part of the fork-upstream (managed vs. self-hosted) realignment. It should not introduce any breaking changes.
Realigning the two codebases will simplify the logistics between them and cut a significant portion of the maintenance cost.

A separate effort will focus on improving code quality rather than on realigning the codebases.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

All changes are in cloudprovider/azure, except for go.mod, go.sum, and vendor.
Preferably, review the 1.29 PR first, then compare it to this one.
Unless otherwise specified, the differences are just linter changes and cloud-provider-azure version differences.

Overall:

  • Bump cloud-provider-azure library to v1.28
    • The change of the default VmType noted here does not apply to us, because we have a forked layer of cloud-provider-azure that already sets the default of this field to vmss
  • Trivial code improvements (e.g., using constants for reused string literals, method interfaces, etc.)
  • Support cloud provider AAD certificate authentication
  • Add a new GPU label; the old one is deprecated and will be switched out soon
  • Temporarily add new configuration options: EnableForceDelete (already in master), EnableDetailedCSEMessage, GetVmssSizeRefreshPeriod. These are primarily for the managed offering and are not yet supported for self-hosted, so don't rely on them until they are documented
  • Add retries when creating AzureManager in case of throttled requests (already in master)
  • Add support for edge zones (already in master)

Does this PR introduce a user-facing change?

- Azure: update cloud-provider-azure library version to v1.28.0
- Azure: support cloud provider AAD certificate authentication
- Azure: add support for edge zones

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 18, 2024
@k8s-ci-robot
Contributor

Hi @comtalyst. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 18, 2024
@k8s-ci-robot k8s-ci-robot requested a review from feiskyer July 18, 2024 05:37
@k8s-ci-robot k8s-ci-robot added the area/provider/azure Issues or PRs related to azure provider label Jul 18, 2024
@k8s-ci-robot k8s-ci-robot requested a review from gandhipr July 18, 2024 05:37
@comtalyst comtalyst force-pushed the comtalyst/azure-changes-from-managed-1.28 branch 4 times, most recently from 0e8d28f to e2ffa0c Compare July 18, 2024 06:43
return nil, staticErr
}
// fetch vmssType information
vmssType, err := getVMSSType(template, manager.azureCache, enableDynamicInstanceList)
Contributor Author

I believe the linter forces some of the logic to be moved to separate functions (e.g., getVMSSType)

Contributor Author

In fact, I found several instances of potentially questionable implementations here. But the evaluation/refactor will be a separate effort.

@comtalyst comtalyst force-pushed the comtalyst/azure-changes-from-managed-1.28 branch from e2ffa0c to c3af59e Compare July 19, 2024 22:04
@comtalyst comtalyst force-pushed the comtalyst/azure-changes-from-managed-1.28 branch from c3af59e to b099e19 Compare July 19, 2024 22:09
@comtalyst comtalyst marked this pull request as ready for review July 21, 2024 18:45
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 21, 2024
@k8s-ci-robot k8s-ci-robot requested a review from nilo19 July 21, 2024 18:45
@tallaxes
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 26, 2024
@k8s-ci-robot
Contributor

@comtalyst: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-cluster-autoscaler-e2e-azure | 7603a15 | link | false | /test pull-cluster-autoscaler-e2e-azure |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@tallaxes
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 31, 2024
@feiskyer
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: comtalyst, feiskyer, tallaxes

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 31, 2024
@k8s-ci-robot k8s-ci-robot merged commit 1ba4f0b into kubernetes:cluster-autoscaler-release-1.28 Jul 31, 2024
6 of 7 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
4 participants