Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for Data Disks in Azure and AliCloud #397

Merged
merged 13 commits into from
Apr 26, 2020

Conversation

guydc
Copy link
Contributor

@guydc guydc commented Jan 29, 2020

What this PR does / why we need it:

Enable MCM support for Data Disks in AliCloud and Azure. Aligns MCM’s capability to create and attach multiple devices across all major IaaS providers, for use cases such as memory expansion, resource isolation and encryption.

Use a test suite for all driver tests.

Which issue(s) this PR fixes:
Fixes #396
Part of gardener/gardener#1743

Special notes for your reviewer:

Release note:

Added support for multiple Data Disks in Azure and AliCloud. 

alexwo and others added 2 commits January 29, 2020 20:01
@guydc guydc requested review from ggaurav10 and a team as code owners January 29, 2020 18:18
@hardikdr
Copy link
Member

thanks @guydaichs for the PR, we shall review it soon.

@hardikdr hardikdr added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 6, 2020
@gardener-robot-ci-1 gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 6, 2020
pkg/driver/driver_alicloud.go Outdated Show resolved Hide resolved
pkg/driver/driver_azure.go Outdated Show resolved Hide resolved
@prashanth26 prashanth26 added area/scalability kind/api-change API change with impact on API users kind/enhancement Enhancement, improvement, extension size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) topology/seed Affects Seed clusters labels Feb 6, 2020
@hardikdr
Copy link
Member

hardikdr commented Feb 6, 2020

Thanks @guydaichs for PR/feature. As this does not fall under our existing CI scope[multiple-disks], we shall test it out separately soon, and update the thread here.

@jia-jerry @dkistner Can you please take a brief look for the Alicloud and Azure part respectively :-)

@hardikdr
Copy link
Member

hardikdr commented Feb 6, 2020

cc @rewiko
As you had enabled support for multiple-disks recently in other PR.

@prashanth26
Copy link
Contributor

Looks like there is an issue here -
pkg/apis/machine/validation/azuremachineclass.go:144: Sprintf format %s has arg lun of wrong type int32

@dkistner
Copy link
Member

dkistner commented Feb 7, 2020

I will try to check the azure specific part soon.

@guydc
Copy link
Contributor Author

guydc commented Feb 10, 2020

@rewiko - I added a verification for block device names in awsmachineclass, based on the recommended naming documented by AWS. I changed a few unit tests that were using a device named /dev/vdga. Is this ok from your point of view?

@rewiko
Copy link
Contributor

rewiko commented Feb 10, 2020

LGTM just a few comments

pkg/apis/machine/types.go Outdated Show resolved Hide resolved
@ghost ghost removed the platform/az label Mar 7, 2020
Copy link
Member

@hardikdr hardikdr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the nice PR @guydaichs

pkg/apis/machine/v1alpha1/types.go Show resolved Hide resolved
pkg/apis/machine/v1alpha1/types.go Outdated Show resolved Hide resolved
pkg/driver/driver_alicloud.go Show resolved Hide resolved
pkg/driver/driver_azure.go Outdated Show resolved Hide resolved
@guydc
Copy link
Contributor Author

guydc commented Mar 30, 2020

How are the luns managed by the PVC/PV controller in k8s ? Is it possible that, kubernetes allocates the same lun value for a pvc-disk and we set the same lun value for the data-disk here, what will be the behaviour then

@hardikdr - It looks like the attachment process in azure finds an available lun : https://github.com/kubernetes-sigs/azuredisk-csi-driver/blob/master/vendor/k8s.io/legacy-cloud-providers/azure/azure_controller_common.go#L280.

My assumptions are:

  • MCM will provision and attach data-disks before any PV is attached.
  • If an existing PV is (re-)attached to a node, an available lun would be selected on-demand.
  • If the data-disk definition for an azure machine will change, the node would be destroyed and re-provisioned.

Do you think that we need an additional safety mechanism?

Copy link
Member

@hardikdr hardikdr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me, I have few minor comments inline, thanks for the much-needed feature.

pkg/apis/machine/validation/alicloudmachineclass.go Outdated Show resolved Hide resolved
pkg/driver/driver_alicloud.go Show resolved Hide resolved
@hardikdr
Copy link
Member

hardikdr commented Apr 7, 2020

Do you think that we need an additional safety mechanism?

Looks good to me, but you might want to take a call on the suggestion made here by @dkistner , if you feel to make the lun optional, and enumerate from zero if not specified.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ guydaichs
❌ alexwo
You have signed the CLA already but the status is still pending? Let us recheck it.

@hardikdr
Copy link
Member

hardikdr commented Apr 7, 2020

Overall, thanks a lot to @guydaichs for the PR, and thanks to everyone for reviewing it.

I'd suggest following, to further complete the review and go ahead merging it.

@guydaichs Can you please address the remaining review-comments on the PR. Also, Resolve Conversations wherever you feel it's taken care of already. There is also a conflicting file, you might want to consider.

  • Overall, an apology from my side for the delayed review.

@dkistner @jia-jerry Can you please follow-up (resolve the conversations if taken care of or concern is not valid anymore) and approve the PR from the Azure's and Alicloud's perspective respectively when it's ready. Thanks a lot in advance.

Copy link
Contributor

@jia-jerry jia-jerry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one minor issue

@@ -63,6 +66,43 @@ func validateAlicloudMachineClassSpec(spec *machine.AlicloudMachineClassSpec, fl
allErrs = append(allErrs, field.Required(fldPath.Child("keyPairName"), "KeyPairName is required"))
}

const dataDiskNameFmt string = `[a-zA-Z0-9\.\-_:]+`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should starts with a letter. The regexp looks like [a-zA-Z][a-zA-Z0-9\.\-_:]+

Copy link
Member

@dkistner dkistner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

We will tackle the things with optional luns for Azure later in a subsequent issue and pr.
Thanks @guydaichs :)

Copy link
Contributor

@jia-jerry jia-jerry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@hardikdr
Copy link
Member

@ggaurav10 @prashanth26 @rewiko Can you please take one final look at PR, and approve if possible?

@hardikdr hardikdr added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 23, 2020
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 23, 2020
Copy link
Contributor

@rewiko rewiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice PR @guydaichs :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/auto-scaling Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related kind/api-change API change with impact on API users kind/enhancement Enhancement, improvement, extension needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) platform/alicloud Alicloud platform/infrastructure platform/azure Microsoft Azure platform/infrastructure size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) topology/seed Affects Seed clusters
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Data Disks in AliCloud and Azure