
Cannot delete instance group because it's being used by a backend service #6376

Open
kustodian opened this issue May 14, 2020 · 26 comments
Labels
forward/linked new-resource persistent-bug Hard to diagnose or long lived bugs for which resolutions are more like feature work than bug work service/compute-l7-load-balancer service/compute-networking-ig size/m

Comments

@kustodian

kustodian commented May 14, 2020

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

Terraform v0.12.24

  • provider.google v3.21.0
  • provider.google-beta v3.21.0

Affected Resource(s)

  • google_compute_region_backend_service
  • google_compute_instance_group

Terraform Configuration Files

locals {
  project         = "<project-id>"
  network         = "<vpc-name>"
  network_project = "<vpc-project>"
  zones           = ["europe-west1-b", "europe-west1-c", "europe-west1-d"]
  s1_count        = 3
}

provider "google" {
  project = local.project
  version = "~> 3.0"
}

data "google_compute_network" "network" {
  name    = local.network
  project = local.network_project
}

resource "google_compute_region_backend_service" "s1" {
  name = "s1"

  dynamic "backend" {
    for_each = google_compute_instance_group.s1
    content {
      group = backend.value.self_link
    }
  }
  health_checks = [
    google_compute_health_check.default.self_link,
  ]
}

resource "google_compute_health_check" "default" {
  name = "s1"
  tcp_health_check {
    port = "80"
  }
}

resource "google_compute_instance_group" "s1" {
  count   = local.s1_count
  name    = format("s1-%02d", count.index + 1)
  zone    = element(local.zones, count.index)
  network = data.google_compute_network.network.self_link
}

I'm not sure if this is a general TF problem or a Google provider problem, but here it goes.
Currently it's not possible to lower the number of google_compute_instance_group resources that are used in a google_compute_region_backend_service. In the code above, if we lower the number of google_compute_instance_group resources and try to apply the configuration, TF will first try to delete the no-longer-needed instance groups and then update the backend configuration. That order doesn't work, because you cannot delete an instance group that is still used by the backend service; the order should be the other way around.

So to sum it up, when I lower the number of the instance group resources TF does this:

  1. delete surplus google_compute_instance_group -> this fails
  2. update google_compute_region_backend_service

It should do this the other way around:

  1. update google_compute_region_backend_service
  2. delete the surplus google_compute_instance_group

Here is the output it generates:

google_compute_instance_group.s1[2]: Destroying... [id=projects/<project-id>/zones/europe-west1-d/instanceGroups/s1-03]

Error: Error deleting InstanceGroup: googleapi: Error 400: The instance_group resource 'projects/<project-id>/zones/europe-west1-d/instanceGroups/s1-03' is already being used by 'projects/<project-id>/regions/europe-west1/backendServices/s1', resourceInUseByAnotherResource

Expected Behavior

TF should first update the google_compute_region_backend_service, then delete the instance group.

Actual Behavior

TF tried to delete the instance group first, which resulted in an error.

Steps to Reproduce

  1. terraform apply
  2. Set s1_count = 2
  3. terraform apply

Important Factoids

This is not a simple thing to fix. One "workaround" is to change the dynamic for_each to use the slice() function like this:

  dynamic "backend" {
    for_each = slice(google_compute_instance_group.s1, 0, 2)
    content {
      group = backend.value.self_link
    }
  }

So you first set the upper bound of slice() to the new number of instance groups and run apply, then lower s1_count to that same number and run apply again, but that's just too complicated for such a simple task.

b/308569276

@c2thorn
Collaborator

c2thorn commented May 19, 2020

Unfortunately, this is an upstream Terraform issue. The provider doesn't have access to the update/destroy order. This is similar to the scenario outlined here: #3008
I believe multiple applies are the only way to go for this case.

@c2thorn c2thorn closed this as completed May 19, 2020
@kustodian
Author

kustodian commented May 19, 2020 via email

@c2thorn
Collaborator

c2thorn commented May 19, 2020

Sorry, that's what I meant. We don't have access to enable a solution for just one apply.

@pdecat
Contributor

pdecat commented May 20, 2020

Hi, here's a partial workaround for this specific use case using an intermediate data source (it needs two applies):

provider "google" {
  version = "3.22.0"
  region  = "europe-west1"
  project = "myproject"
}

locals {
  #zones = []
  zones = ["europe-west1-b"]
}

data "google_compute_network" "network" {
  name = "default"
}

data "google_compute_instance_group" "s1" {
  for_each = toset(local.zones)
  name     = format("s1-%s", each.key)
  zone     = each.key
}

resource "google_compute_region_backend_service" "s1" {
  name = "s1"

  dynamic "backend" {
    for_each = [for group in data.google_compute_instance_group.s1 : group.self_link if group.self_link != null]
    content {
      group = backend.value
    }
  }
  health_checks = [
    google_compute_health_check.default.self_link,
  ]
}

resource "google_compute_health_check" "default" {
  name = "s1"
  tcp_health_check {
    port = "80"
  }
}

resource "google_compute_instance_group" "s1" {
  for_each = toset(local.zones)
  name     = format("s1-%s", each.key)
  zone     = each.key
  network  = data.google_compute_network.network.self_link
}

@kustodian
Author

@pdecat your suggestion removes the dependency between google_compute_region_backend_service and google_compute_instance_group so this will probably always require two applies, even when starting from scratch.

@pdecat
Contributor

pdecat commented May 20, 2020

so this will probably always require two applies, even when starting from scratch.

I can confirm it does.

But at least it does not need manual intervention out of band to fix the situation.

@pdecat
Contributor

pdecat commented May 20, 2020

Maybe something the google provider could do to fix this situation would be to manage backends of a google_compute_region_backend_service as a separate resource:

# NOT A WORKING EXAMPLE
locals {
  project         = "<project-id>"
  network         = "<vpc-name>"
  network_project = "<vpc-project>"
  zones           = ["europe-west1-b", "europe-west1-c", "europe-west1-d"]
  s1_count        = 3
}

provider "google" {
  project = local.project
  version = "~> 3.0"
}

data "google_compute_network" "network" {
  name    = local.network
  project = local.network_project
}

resource "google_compute_region_backend_service" "s1" {
  name = "s1"

  health_checks = [
    google_compute_health_check.default.self_link,
  ]
}

# WARNING: this resource type does not exist
resource "google_compute_region_backend_service_backend" "s1" {
  for_each = google_compute_instance_group.s1

  backend_service = google_compute_region_backend_service.s1.self_link
  group = each.value.self_link
}

resource "google_compute_health_check" "default" {
  name = "s1"
  tcp_health_check {
    port = "80"
  }
}

resource "google_compute_instance_group" "s1" {
  count   = local.s1_count
  name    = format("s1-%02d", count.index + 1)
  zone    = element(local.zones, count.index)
  network = data.google_compute_network.network.self_link
}

As a side note, I feel like hashicorp/terraform#8099 is not really about the same issue. It is about replacing or updating a resource when another resource it depends on changes, not about one being destroyed.

@StephenWithPH

I added a comment on the Terraform core issue (hashicorp/terraform#25010 (comment))

Based on that comment (terraform taint up the dependency chain until a single-pass apply works), I think there's a provider-specific fix.

If ForceNew was part of the schema here ...

https://github.com/terraform-providers/terraform-provider-google/blob/c87e414b028becc33f64183a9bd52c92c9b49737/google/resource_compute_region_backend_service.go#L173-L179

... wouldn't that have the same effect as my manual terraform taint?
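For reference, the manual workaround referred to above, applied to the configuration from the original report, would presumably look something like this (it recreates the backend service, so expect a brief interruption; on newer Terraform versions, terraform apply -replace=... does the same thing):

# Mark the backend service for recreation so that its reference to the
# surplus instance group is dropped before the group itself is deleted,
# all within a single apply.
terraform taint google_compute_region_backend_service.s1
terraform apply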

@c2thorn c2thorn added persistent-bug Hard to diagnose or long lived bugs for which resolutions are more like feature work than bug work and removed upstream-terraform bug labels May 26, 2020
@c2thorn
Collaborator

c2thorn commented May 26, 2020

@pdecat that should work, and requires implementing a new fine-grained resource google_compute_region_backend_service_backend.

Reopening the issue since a solution is possible, and this will be tracked similarly to other feature-requests.

@c2thorn c2thorn reopened this May 26, 2020
@c2thorn
Collaborator

c2thorn commented May 26, 2020

@StephenWithPH ForceNew would have the same effect, but it would make every change (addition as well as removal) to the backend set destructive. Providing a new fine-grained resource is the cleaner option here.

@freeseacher

The lack of pretty essential features, and bugs like this, makes me very disappointed with the whole Terraform and GCP experience.

@cagataygurturk
Contributor

Providing a new fine-grained resource is the cleaner option here.

The question is when :)

@derhally

This issue is actually quite problematic.

I get these errors when trying to destroy the whole module. It requires multiple targeted terraform destroys to complete:


Error: Error when reading or editing HealthCheck: googleapi: Error 400: The health_check resource 'projects/test-proj/global/healthChecks/atlantis-healthcheck' is already being used by 'projects/test-proj/global/backendServices/atlantis-backend-service', resourceInUseByAnotherResource

Error: Error waiting for Deleting SecurityPolicy: The security_policy resource 'projects/test-proj/global/securityPolicies/atlantis-security-policy' is already being used by 'projects/test-proj/global/backendServices/atlantis-backend-service'

Error: Error deleting InstanceGroup: googleapi: Error 400: The instance_group resource 'projects/test-proj/zones/us-central1-a/instanceGroups/instance-group-all' is already being used by 'projects/test-proj/global/backendServices/atlantis-backend-service', resourceInUseByAnotherResource

@konturn

konturn commented Jul 11, 2021

I actually just ran into this issue a couple of days ago, and I was able to resolve it by appending a random string to the end of the group manager's name and using the create_before_destroy lifecycle policy for the instance group manager resource. For whatever reason, doing so leads Terraform to modify the backend service before destroying the original instance group. Still not the prettiest hack in the world, but better than having to issue multiple applies.

@husseyd

husseyd commented Oct 5, 2021

This has been driving me nuts for months.
Using Cloud Run behind external GCLB. Backend services for the Serverless NEGs are in use by the URL map.

Once all this config/infra is in place, the service / backend service cannot be deleted even if the URL map is removed in the same change. It becomes a two-step process: first remove the URL map, then remove the service and backend service.

In an enterprise setting with ~10 environments, each receiving different releases on different schedules, having to run repeat CI pipelines is not okay and is basically unmanageable.

@bluemalkin

I can relate to this, GCP doesn't update the URL map before destroying backend services. Very frustrating.

@c2thorn c2thorn removed their assignment Apr 19, 2022
modular-magician added a commit to modular-magician/terraform-provider-google that referenced this issue Aug 4, 2022
modular-magician added a commit that referenced this issue Aug 4, 2022
@PranavSathy

Can confirm that this is the case with a manual global load balancing setup on the Google provider as well. It's definitely super annoying that we need to manually:

  1. Update our terraform config to remove a desired deployment region (e.g. `us-central1`).
  2. Run the following command manually:
$ gcloud beta compute backend-services remove-backend --global revere-backend \
    --network-endpoint-group-region=<region> \
    --network-endpoint-group=revere-neg-<region>
  3. Run terraform apply to achieve the desired state.

This means that any time we turn down a region, some administrator is going to have to do this instead of simply relying on CI/CD. What's worse is that it makes proving certain security/compliance certifications harder: our CI/CD + pull request process is audited and logged, but random CLI commands from an administrator's shell environment are harder to track (i.e. we need to involve GCP Audit Logging in the business justifications).

Looking forward to an elegant solution by the provider here.

@pedromiranda-telus

pedromiranda-telus commented Oct 21, 2022

I can relate to this, GCP doesn't update the URL map before destroying backend services. Very frustrating.

I had the same problem. My workaround was to run the following command (WARNING: IT CAUSES UNAVAILABILITY):

# This will delete the URL map, then the backend service and finally create them again
terraform apply -replace="google_compute_region_url_map.name_of_your_url_map"

Hope it helps.

@Unichron

I think it's fundamentally a Terraform core issue, but it could be fixed in the provider if there were a standalone resource to manage a backend of a backend service. In that case, deleting the instance group/NEG/whatever would naturally involve deleting the backend resource, and the deletes would be properly ordered. Of course, the same would then have to be done for all analogous cases, which is a hassle and spans most Terraform providers (and may even be impossible in some cases), but such resources would provide extra flexibility on top of being a workaround for this issue.

@m00lecule

m00lecule commented Feb 5, 2023

Keeping my fingers crossed for somebody to solve this issue. Today I faced it when trying to extend the google_compute_region_instance_group_manager.distribution_policy_zones field with an additional zone. I have learned that such common operations are not possible in GCP.

@luismendezescobar

I actually just ran into this issue a couple of days ago, and I was able to resolve it by appending a random string to the end of the group manager's name and using the create_before_destroy lifecycle policy for the instance group manager resource. For whatever reason, doing so leads Terraform to modify the backend service before destroying the original instance group. Still not the prettiest hack in the world, but better than having to issue multiple applies.

Hi, could you paste an example of what you did with create_before_destroy?

@djsmiley2k

Disappointing that this has existed for 2+ years and there's still no fix.

How come Terraform doesn't understand that it can't delete a managed instance group without first removing the load balancer (i.e. the backend) that depends on it? It seems like a pretty simple idea, which for some reason isn't implemented.

@levid0s

levid0s commented Jul 21, 2023

I'm having the same issue.

I tried fixing it by adding a manual dependency using lifecycle.replace_triggered_by, but you have to do this on every single dependent resource, otherwise I keep getting the 'resource already used by' error.
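
For anyone wanting to try the same thing, a minimal sketch of that approach on the configuration from the original report might look like the following (replace_triggered_by requires Terraform >= 1.2, replacing the backend service briefly takes it out of service, and I'm not certain it triggers when an instance group is only being destroyed, which may be why every dependent resource needs it):

resource "google_compute_region_backend_service" "s1" {
  name = "s1"

  dynamic "backend" {
    for_each = google_compute_instance_group.s1
    content {
      group = backend.value.self_link
    }
  }

  health_checks = [
    google_compute_health_check.default.self_link,
  ]

  lifecycle {
    # Recreate the whole backend service whenever the set of instance
    # groups changes, so the old references are dropped before the
    # surplus groups are destroyed.
    replace_triggered_by = [google_compute_instance_group.s1]
  }
}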

@github-actions github-actions bot added forward/review In review; remove label to forward service/compute-l7-load-balancer labels Oct 25, 2023
@roaks3 roaks3 removed the forward/review In review; remove label to forward label Oct 27, 2023
@cen1

cen1 commented May 14, 2024

So... this is a top-11 issue by likes, and 4 years later we still have to do painful workarounds. create_before_destroy is not always feasible if you run a singleton...

@plexus

plexus commented Jun 4, 2024

This seems to work reasonably well as a workaround:

resource "random_id" "group-manager-suffix" {
  byte_length = 4
}

resource "google_compute_instance_group_manager" "my-group" {
  name = "my-instance-group-manager-${random_id.group-manager-suffix.hex}"

  ...

  lifecycle {
    create_before_destroy = true
  }
}

resource "google_compute_backend_service" "my-backend" {
  ...
  backend {
    group = google_compute_instance_group_manager.my-group.instance_group
    ...
  }
}

By randomizing the name it's possible to use create_before_destroy, so Terraform will first create a second instance_group_manager, update the backend, and then destroy the first instance_group_manager. Single-pass apply and no intermediate downtime.

@maxi-cit

maxi-cit commented Jun 5, 2024

Hello folks, I started working on adding this new resource, google_compute_region_backend_service_backend. Hopefully this should be enough to close this issue. I am opening a PR in a few days (just making sure the tests work fine).
