Google compute instance recreate on disk resize #6087

Open
jdziat opened this issue Apr 9, 2020 · 18 comments

@jdziat

jdziat commented Apr 9, 2020

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

Terraform v0.11.14

  • provider.google v2.20.3
  • provider.null v2.1.2
  • provider.tls v2.1.1

Affected Resource(s)

  • google_compute_instance

Terraform Configuration Files

resource "google_compute_instance" "manager" {
  name  = "manager"

  boot_disk {
    auto_delete = "${var.manager_disk_auto_delete==""? var.disk_auto_delete : var.manager_disk_auto_delete}"

    initialize_params {
      size  = "${var.manager_disk_size_gb==""? var.disk_size_gb : var.manager_disk_size_gb}"
      type  = "${var.manager_disk_type==""? var.disk_type : var.manager_disk_type}"
    }
  }

 
  service_account {
    scopes = ["userinfo-email", "compute-ro", "storage-full","logging-write","monitoring-write"]
  }
}

Expected Behavior

Terraform should resize the boot disk

Actual Behavior

Terraform wants to recreate the instance in order to increase the size of the boot disk:

google_compute_instance.manager (new resource required)
      id:                                                  "manager" => <computed> (forces new resource)
      boot_disk.0.initialize_params.0.size:                "100" => "150" (forces new resource)

Steps to Reproduce

  1. terraform plan

Important Factoids

References

  • #0000

b/321386804

@ghost ghost added the bug label Apr 9, 2020
@jdziat jdziat changed the title from "Terraform V11.14 Google-Provider v2.20.3" to "Google compute instance recreate on disk resize" Apr 9, 2020
@venkykuberan venkykuberan self-assigned this Apr 9, 2020
@venkykuberan
Contributor

initialize_params are the parameters of a disk at creation; they're shorthand for actually creating a disk resource. So the only way to change the params a disk was created with is to actually recreate it (and therefore the instance).

I tried the API and Cloud Console, and neither allowed changing the boot disk size property. Please let me know if this helps.

@jdziat
Author

jdziat commented Apr 15, 2020

We use this without incident across many different instances. We can also increase the disk size without issue from the API or UI. We shouldn't need to recreate an entire instance simply because we need to increase the disk size.

@ghost ghost removed the waiting-response label Apr 15, 2020
@venkykuberan
Contributor

You can change additional disks (for example, increase their size) without recreating the instance, but you can't do that on the boot disk. The boot_disk belongs to a specific instance and can't be shared. It looks like what you are looking for is an additional disk.

resource "google_compute_attached_disk" "default" {
  disk     = google_compute_disk.default.id
  instance = google_compute_instance.default.id
}
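For context, a fuller sketch of the additional-disk pattern suggested above might look like the following. The resource names, zone, image, and sizes are hypothetical placeholders and do not come from the reporter's configuration. The `lifecycle` block follows the provider documentation's guidance for combining `google_compute_attached_disk` with an instance, so that the instance does not try to detach a disk it does not manage inline.

```hcl
# Standalone data disk. In this pattern the disk is its own resource,
# so growing `size` is applied to the disk rather than the instance.
resource "google_compute_disk" "data" {
  name = "manager-data" # hypothetical name
  type = "pd-ssd"
  zone = "us-central1-a"
  size = 150 # GB
}

resource "google_compute_instance" "manager" {
  name         = "manager"
  machine_type = "n1-standard-1"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11" # placeholder image
    }
  }

  network_interface {
    network = "default"
  }

  # Disks attached through google_compute_attached_disk are not declared
  # inline, so tell Terraform not to reconcile the attached_disk list.
  lifecycle {
    ignore_changes = [attached_disk]
  }
}

resource "google_compute_attached_disk" "data" {
  disk     = google_compute_disk.data.id
  instance = google_compute_instance.manager.id
}
```

Note that growing the disk only enlarges the block device; the filesystem inside the guest typically still has to be extended separately.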

@jdziat
Author

jdziat commented Apr 16, 2020

We use attached disks as well, but there are times when the boot disk needs to grow. Having to completely recreate the instance seems excessive when we can simply increase the disk size.

@venkykuberan
Contributor

Are you able to increase the boot_disk size of an instance in gcloud or Cloud Console without stopping the instance? If so, and the Terraform provider lacks that feature, we can work on a feature request for it.

@jdziat
Author

jdziat commented Apr 16, 2020

Yes, you can increase the disk size from the GCP UI, gcloud CLI, or API without stopping or destroying the instance.

@ghost ghost removed the waiting-response label Apr 16, 2020
@venkykuberan
Contributor

With the current state of the provider code, we can't resize the boot_disk (internal) without recreating the instance, because this resource deals only with the instance APIs. We may need to interact with the Disk API directly to alter the size (boot_disk.initialize_params.size), which is more of a feature request. Can you please close this issue and raise an enhancement ticket?

@jdziat
Author

jdziat commented Apr 22, 2020

I understand that it may require an additional API call, but changing the boot disk size should not cause a recreation of the instance, so I don't think this is appropriate for an RFE. The Terraform documentation also refers to these initialization options in the default documentation.

@danawillow
Contributor

Hey @jdziat, this is one reason why we support the source parameter on the boot disk block. This way, you can create a disk in Terraform that can be updated whenever you want, and we can let initialize_params be something that truly means exactly what it says: parameters that are set when the disk is initialized.

Here's a sample config from our test suite:

data "google_compute_image" "my_image" {
  family  = "debian-9"
  project = "debian-cloud"
}

resource "google_compute_disk" "foobar" {
  name  = "my-disk"
  zone  = "us-central1-a"
  // only use an image data source if you're ok with the disk recreating itself with a new image periodically
  image = data.google_compute_image.my_image.self_link
}

resource "google_compute_instance" "foobar" {
  name         = "my-instance"
  machine_type = "n1-standard-1"
  zone         = "us-central1-a"

  boot_disk {
    source = google_compute_disk.foobar.name
  }

  network_interface {
    network = "default"
  }
}

If you aren't able to create a new resource at this time, I believe that importing the VM+disk should set the appropriate values in state such that you could seamlessly migrate to a config that looks like the one I posted, though I'd recommend trying it somewhere throwaway first before trying it on a real config. Do any of these ideas sound like something you could do?
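As a rough illustration of the import-based migration described above, Terraform 1.5 and later accept declarative `import` blocks, which postdate this thread (the reporter was on v0.11.14, where only the `terraform import` CLI command is available). The sketch below assumes the `foobar` resource blocks from the sample config are already in the configuration; the project ID `my-project` is a hypothetical placeholder, and the ID strings follow the `projects/{{project}}/zones/{{zone}}/...` form given in the provider documentation.

```hcl
# Adopt the existing boot disk into state as a standalone disk resource.
import {
  to = google_compute_disk.foobar
  id = "projects/my-project/zones/us-central1-a/disks/my-disk"
}

# Adopt the existing VM that boots from that disk.
import {
  to = google_compute_instance.foobar
  id = "projects/my-project/zones/us-central1-a/instances/my-instance"
}
```

After importing, `terraform plan` should be reviewed for any remaining "forces replacement" entries before applying, ideally on a throwaway instance first, as suggested above.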

@jdziat
Author

jdziat commented Apr 25, 2020

@danawillow I appreciate that it supports the source block, but it's not clear from the examples or documentation what the best practice should be in this case. I don't see the use case for having the initialization params as locked-in values. I may be biased, though, because I have a few hundred machines that could be impacted by this.

@danawillow danawillow added this to the Goals milestone Apr 27, 2020
@venkykuberan venkykuberan removed their assignment Apr 28, 2020
@gdubicki

How likely is it that this will be implemented within the next few weeks, @danawillow?

I am asking because we are also affected by this issue on hundreds of machines and we would like to know if we should work around this limitation or wait for the fix.

@danawillow
Contributor

Not very likely. Our team still isn't quite convinced that we want to support updating something that's specifically meant to be for initialization.

modular-magician added a commit to modular-magician/terraform-provider-google that referenced this issue Jun 9, 2022
…ashicorp#6087)

Signed-off-by: Modular Magician <magic-modules@google.com>
modular-magician added a commit that referenced this issue Jun 9, 2022
…6087) (#11857)

Signed-off-by: Modular Magician <magic-modules@google.com>
@AarshDhokai
Contributor

b/261963160

@snehal610

snehal610 commented Dec 22, 2022

Boot disk resizing still does not work with the GCP Terraform provider. It is impacting a number of VMs: the Terraform state file will be corrupted, and that will block us from making any further changes.

@mwarkentin

I'm going to add a +1 here as well. It feels strange that disk configurations like labels (literal metadata) and size can be updated in place without stopping or recreating anything if we use the API or console, but there is no way to do it in Terraform.

We are trying to add cost tracking labels to our infrastructure and there is no way to do so without hacking around outside of terraform. We will definitely not be recreating boot disks across dozens of kafka, postgres, and clickhouse data storage clusters to add a label as that seems ripe for causing an incident, not to mention a ton of work.

Please consider:

  • Making parameters that can be updated without interruption in the API act the same way via terraform resource
  • If you don't want to allow modification of the initialize_params block then consider supporting reconfiguration of those params outside of the initialize_params block
  • Or dropping initialize_params entirely and following normal terraform conventions for determining if a change should require recreation or update in place (basically update in place wherever possible in the API) - I assume this may be a breaking / major provider change?
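For comparison with the requests above, here is a minimal sketch of the pattern the maintainers point to elsewhere in this thread: declaring the boot disk as its own `google_compute_disk` and referencing it through `boot_disk.source`. In that arrangement `size` and `labels` are attributes of the disk resource rather than of `initialize_params`, so changing them is planned against the disk and not the instance. All names, the image, and the label values are hypothetical. It does not help instances that were already created with inline `initialize_params` unless they are migrated.

```hcl
# Boot disk managed as a first-class resource.
resource "google_compute_disk" "boot" {
  name  = "kafka-0-boot" # hypothetical name
  zone  = "us-central1-a"
  image = "debian-cloud/debian-11" # placeholder image
  size  = 150 # GB

  # Cost-tracking labels live on the disk resource itself.
  labels = {
    cost_center = "data-platform"
  }
}

resource "google_compute_instance" "kafka" {
  name         = "kafka-0"
  machine_type = "n1-standard-1"
  zone         = "us-central1-a"

  # Reference the pre-created disk instead of using initialize_params.
  boot_disk {
    source = google_compute_disk.boot.name
  }

  network_interface {
    network = "default"
  }
}
```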

@MrTrustworthy

I'd like to add another voice of support for what's been mentioned here previously.

To have a label change try to recreate disks is unexpected, unintuitive, and quite frankly outright dangerous.

Most of the examples you find online (officially and from blogs etc.) show the "embedded" variant of creating the boot disk, and don't prominently mention the potential issue of being unable to change it without recreating.

I understand that the parameter has "initialise" in its name, so there's the semantic argument about it technically not being correctly used if someone wants to change something.
However, from a semantic perspective the current behaviour still seems arbitrary: Why does "initialise" actually imply recreation? Why does it not just ignore any changes, or throw an error when you try to change it? That would be equally unintuitive, but at least not as outright dangerous.
On the other hand, and this is IMHO more significant, this small semantic argument really doesn't seem to outweigh the actual issues that people have mentioned here, and the risk this poses to people's data.

It seems a bit like the provider has built a UX trap, and then blames the user for falling into it.

I strongly urge to consider rethinking this behaviour and changing it to something that is more intuitive and less dangerous. @mwarkentin mentioned a few options that seem like they would be a good fit for a long-term solution.
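The "ignore changes or throw an error" behaviour discussed in this comment can be approximated today with Terraform's generic `lifecycle` meta-arguments. This is a sketch of a possible guard rail, not something the provider does on its own; the instance definition is a hypothetical stand-in, and the `ignore_changes` path uses the index syntax of Terraform 0.12 and later.

```hcl
resource "google_compute_instance" "manager" {
  name         = "manager"
  machine_type = "n1-standard-1"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11" # placeholder image
      size  = 100
    }
  }

  network_interface {
    network = "default"
  }

  lifecycle {
    # Fail the plan with an error instead of destroying the instance
    # if any change would force replacement.
    prevent_destroy = true

    # Ignore config drift on the inline boot disk size, for example
    # after the disk was resized out of band via gcloud or the console.
    # Other initialize_params fields could be listed the same way.
    ignore_changes = [
      boot_disk[0].initialize_params[0].size,
    ]
  }
}
```

With `ignore_changes` in place, editing `size` in the configuration has no effect on the existing disk, so the resize itself still has to happen outside Terraform.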

@JakeCooper

We ran into this issue and it caused an outage with production downtime: we had manually worked around it, only for an out-of-date Terraform run to recreate all of the instances.

We're already moving off Google Cloud, but for the poor souls who happen to be stuck with y'all, I would implore you to have a look at modifying this behavior; it's actively dangerous.

https://blog.railway.app/p/incident-august-27-2024

@mwarkentin

Save us from the stalebot: #17044 (comment)
