Add gcp local ssd disk support #414

Merged: 3 commits, Apr 14, 2020

@tennix tennix commented Feb 17, 2020

What this PR does / why we need it:

Which issue(s) this PR fixes:
This PR enables users to add local SSD disks on GCP. Local SSD disks support two interfaces: NVMe and SCSI. See the documents below for details:

Special notes for your reviewer:

I've tested this on GCP and created 4 local NVMe SSD disks. Note that a GCE instance can attach at most 8 local SSD disks, so validation of the total number of local SSD disks may be needed.
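
The validation mentioned above could look roughly like the following. This is only a sketch under stated assumptions: the `disk` type, `maxLocalSSDs`, and `validateLocalSSDCount` are hypothetical names, not code from this PR, and the limit of 8 is taken from the note above.

```go
package main

import (
	"errors"
	"fmt"
)

// maxLocalSSDs is the per-instance limit quoted in the PR description.
const maxLocalSSDs = 8

// disk is a hypothetical, trimmed-down stand-in for the MachineClass disk spec.
type disk struct {
	Type      string // "SCRATCH" marks a local SSD, as in this PR's diff
	Interface string // "NVME" or "SCSI"
}

// errTooManyLocalSSDs is a hypothetical sentinel error for this sketch.
var errTooManyLocalSSDs = errors.New("too many local SSD disks")

// validateLocalSSDCount rejects a disk list with more than maxLocalSSDs scratch disks.
func validateLocalSSDCount(disks []disk) error {
	n := 0
	for _, d := range disks {
		if d.Type == "SCRATCH" {
			n++
		}
	}
	if n > maxLocalSSDs {
		return fmt.Errorf("%w: got %d, limit %d", errTooManyLocalSSDs, n, maxLocalSSDs)
	}
	return nil
}

func main() {
	// Nine scratch disks exceed the limit, so an error is printed.
	disks := make([]disk, 9)
	for i := range disks {
		disks[i] = disk{Type: "SCRATCH", Interface: "NVME"}
	}
	fmt.Println(validateLocalSSDCount(disks))
}
```

Only disks of type SCRATCH are counted, so persistent disks in the same list do not affect the check.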

Release note:

Add GCP local ssd disk support

@tennix tennix requested review from ggaurav10 and a team as code owners February 17, 2020 10:27
@hardikdr
Member

@tennix Thanks a lot for the PR, we'll take a look soon.

Contributor

@prashanth26 prashanth26 left a comment

LGTM

@prashanth26 prashanth26 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 18, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 18, 2020
@prashanth26 prashanth26 added platform/gcp Google cloud platform/infrastructure reviewed/lgtm Has approval for merging needs/review Needs review and removed reviewed/lgtm Has approval for merging labels Feb 28, 2020
@prashanth26
Contributor

@hardikdr @ggaurav10 - Do you see any concerns here?

var attachedDisk compute.AttachedDisk
if disk.Type == "SCRATCH" {
	attachedDisk = compute.AttachedDisk{
		AutoDelete: disk.AutoDelete,
Contributor

This defaults to false, right?
@hardikdr @prashanth26 - Wouldn't it lead to issues similar to the orphan disks we recently saw when a VM is deleted?

Contributor

Good catch. What would be the best way to handle this? Default the values to true on MCM or Gardener maybe?

Contributor

IMO, defaulting should be done in MCM, keeping in mind that it can be used independently of Gardener.

Contributor

@tennix - Can we default this flag to true in the MCM?

Contributor Author

I don't think it's related to this. Unlike persistent disks, local SSD disks are deleted once the instance is deleted. But I can change the default value to true if you like.

Member

I'd actually prefer to delete these disks with the VM by default, unless specified otherwise in the MachineClass. Since the autoscaler constantly creates and deletes machines, we may end up with many such orphan disks, and we don't have active automation to keep track of them later.

Member

According to the documentation the local-ssds are expected to be deleted with the VM termination. - https://cloud.google.com/compute/docs/disks#localssds .

  • @tennix please correct if that's not the case.

Though, to guard against possible changes in GCP's behaviour, I would prefer to set it to true.

  • The issue is that the API type for AutoDelete is defined as bool, which leaves no good way to check whether it was set in the MachineClass.
  • @tennix Please feel free to change the type from bool to *bool. That lets you check AutoDelete == nil (meaning not set in the MachineClass) and default to true.
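
The *bool suggestion above can be illustrated with a minimal sketch. The `gcpDisk` type and the `autoDeleteOrDefault` helper are hypothetical stand-ins for illustration, not the actual MCM types or functions.

```go
package main

import "fmt"

// gcpDisk is a hypothetical stand-in for the MachineClass disk type discussed above.
type gcpDisk struct {
	// AutoDelete is a pointer so that "not set" (nil) can be told apart
	// from an explicit false.
	AutoDelete *bool
}

// autoDeleteOrDefault returns the user's choice, or true when the field was left unset.
func autoDeleteOrDefault(d gcpDisk) bool {
	if d.AutoDelete == nil {
		return true
	}
	return *d.AutoDelete
}

func main() {
	keep := false
	fmt.Println(autoDeleteOrDefault(gcpDisk{}))                  // field unset: falls back to true
	fmt.Println(autoDeleteOrDefault(gcpDisk{AutoDelete: &keep})) // explicit false is respected
}
```

With a plain bool, the zero value false would be indistinguishable from a user explicitly asking to keep the disk, which is why the pointer type is needed for a true default.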

Contributor Author

@tennix tennix Apr 9, 2020

Okay, I've changed the GCPDisk AutoDelete type to *bool and set the AttachedDisk.AutoDelete default value to true if users do not specify the field. @hardikdr @prashanth26 @ggaurav10 PTAL again.
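
For readers following the thread, the change described above might be sketched as follows. This is not the code from the PR: `attachedDisk` and `initializeParams` are local stand-ins that mirror a few fields of `compute.AttachedDisk` and `compute.AttachedDiskInitializeParams`, and `gcpDisk`, `buildAttachedDisk`, and the `zones/<zone>/diskTypes/local-ssd` disk type path are assumptions made for illustration.

```go
package main

import "fmt"

// initializeParams and attachedDisk are local stand-ins mirroring a few fields
// of the GCP compute API types, so the sketch runs without external packages.
type initializeParams struct {
	DiskSizeGb int64
	DiskType   string
}

type attachedDisk struct {
	AutoDelete       bool
	Boot             bool
	Type             string
	Interface        string
	InitializeParams *initializeParams
}

// gcpDisk is a hypothetical stand-in for the MachineClass disk spec.
type gcpDisk struct {
	AutoDelete *bool
	Boot       bool
	SizeGb     int64
	Type       string // "SCRATCH" for local SSD, otherwise a persistent disk type
	Interface  string // "NVME" or "SCSI", only meaningful for local SSD
}

// buildAttachedDisk maps a disk spec to an attached disk, defaulting
// AutoDelete to true when the user did not set it.
func buildAttachedDisk(d gcpDisk, zone string) attachedDisk {
	autoDelete := true
	if d.AutoDelete != nil {
		autoDelete = *d.AutoDelete
	}
	if d.Type == "SCRATCH" {
		// Local SSD: no size is set here; the disk type path is an assumption.
		return attachedDisk{
			AutoDelete: autoDelete,
			Type:       "SCRATCH",
			Interface:  d.Interface,
			InitializeParams: &initializeParams{
				DiskType: fmt.Sprintf("zones/%s/diskTypes/local-ssd", zone),
			},
		}
	}
	// Persistent disk: size and disk type come from the spec.
	return attachedDisk{
		AutoDelete: autoDelete,
		Boot:       d.Boot,
		InitializeParams: &initializeParams{
			DiskSizeGb: d.SizeGb,
			DiskType:   fmt.Sprintf("zones/%s/diskTypes/%s", zone, d.Type),
		},
	}
}

func main() {
	ssd := buildAttachedDisk(gcpDisk{Type: "SCRATCH", Interface: "NVME"}, "europe-west1-b")
	fmt.Printf("%s %s %s autoDelete=%t\n",
		ssd.Type, ssd.Interface, ssd.InitializeParams.DiskType, ssd.AutoDelete)
}
```

Both branches share the same defaulting step, so a user who wants to keep a persistent disk after VM deletion only needs to set AutoDelete to false explicitly.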

	}
} else {
	attachedDisk = compute.AttachedDisk{
		AutoDelete: disk.AutoDelete,
Contributor

same as above

Contributor

@tennix - Can we default this flag to true in the MCM?

Contributor Author

For persistent disks, wouldn't it be better to default to false, so that if the VM is deleted there is still a chance to recover the data on the persistent disk, especially in production environments?

Member

> @tennix - Can we default this flag to true in the MCM?

Actually, at the moment Gardener sets AutoDelete to true for all GCPMachineClasses, so the current code path should not impact us immediately.

> For persistent disks, is it better to default to false so if the VM is deleted there is still a chance to recover the data in persistent disk especially for production environment?

I would actually suggest setting AutoDelete=true even for persistent disks, and letting the user change the behaviour via the MachineClass if needed.

  • The change of the API type to *bool would be useful here as well.

@prashanth26 prashanth26 added needs/changes Needs (more) changes and removed needs/review Needs review labels Mar 3, 2020
@CLAassistant
CLA assistant check
All committers have signed the CLA.

Member

@hardikdr hardikdr left a comment

Overall, thanks a lot @tennix for the PR, and sorry for the delay in reviewing. We got held up in a few other immediate tasks.

There are a few minor comments related to defaulting disk deletion with the VM.
As it seems, local SSDs are deleted on VM termination [needs your confirmation], and this PR doesn't alter the behaviour for persistent disks [defaults to false in master].

  • I'll leave it to you to decide whether to change the overall default for both local SSDs and persistent disks to true, so that they are deleted along with the VM.

Also it would be very nice if you could add an example here, as a second disk with local-ssd type.

@hardikdr hardikdr added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 9, 2020
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 9, 2020
@hardikdr
Member

hardikdr commented Apr 9, 2020

@prashanth @ggaurav10 can you please take a quick look and approve if you are fine with it?

Contributor

@prashanth26 prashanth26 left a comment

One small change. Otherwise looks fine to me.

pkg/apis/machine/v1alpha1/types.go (resolved)
Contributor

@prashanth26 prashanth26 left a comment

LGTM.

pkg/apis/machine/v1alpha1/types.go (resolved)
@hardikdr hardikdr added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 14, 2020
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 14, 2020
Labels
needs/changes Needs (more) changes needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) platform/gcp Google cloud platform/infrastructure