resourceInUseByAnotherResource when deleting health check and autohealing from instance group manager #1883

Open
danisla opened this issue Aug 15, 2018 · 6 comments

Contributor

danisla commented Aug 15, 2018

The API returns the error resourceInUseByAnotherResource when Terraform attempts to delete a health check that is still referenced by an instance group manager resource.

Terraform Version

Terraform v0.11.7

  • provider.google v1.16.2

Affected Resource(s)

Please list the resources as a list, for example:

  • google_compute_instance_group_manager
  • google_compute_region_instance_group_manager

Terraform Configuration Files

resource "google_compute_health_check" "autohealing" {
  name                = "autohealing-health-check"
  check_interval_sec  = 5
  timeout_sec         = 5
  healthy_threshold   = 2
  unhealthy_threshold = 10                         # 50 seconds

  http_health_check {
    request_path = "/healthz"
    port         = "8080"
  }
}

resource "google_compute_instance_group_manager" "appserver" {
  name = "appserver-igm"

  base_instance_name = "app"
  instance_template  = "${google_compute_instance_template.appserver.self_link}"
  update_strategy    = "NONE"
  zone               = "us-central1-a"

  target_pools = ["${google_compute_target_pool.appserver.self_link}"]
  target_size  = 2

  named_port {
    name = "customHTTP"
    port = 8888
  }

  auto_healing_policies {
    health_check      = "${google_compute_health_check.autohealing.self_link}"
    initial_delay_sec = 300
  }
}

Now change the HCL to the following, deleting the health check resource and its reference in the google_compute_instance_group_manager at the same time.

resource "google_compute_instance_group_manager" "appserver" {
  name = "appserver-igm"

  base_instance_name = "app"
  instance_template  = "${google_compute_instance_template.appserver.self_link}"
  update_strategy    = "NONE"
  zone               = "us-central1-a"

  target_pools = ["${google_compute_target_pool.appserver.self_link}"]
  target_size  = 2

  named_port {
    name = "customHTTP"
    port = 8888
  }
}

Error Output

Error: Error applying plan:

1 error(s) occurred:

* google_compute_health_check.autohealing (destroy): 1 error(s) occurred:

* google_compute_health_check.autohealing: Error deleting HealthCheck: googleapi: Error 400: The health_check resource 'projects/REDACTED/global/healthChecks/autohealing-health-check' is already being used by 'projects/REDACTED/zones/us-central1-a/instanceGroupManagers/appserver-igm', resourceInUseByAnotherResource

Expected Behavior

The instance group manager should be updated to remove the health check before deletion of the health check is attempted.

Actual Behavior

Terraform attempted to delete the health check while the instance group manager was still using it, and the API rejected the request.

References

Might be related to: hashicorp/terraform#8099

@nat-henderson
Contributor

I see, you're saying we need to perform the update before the delete. Interesting. That seems to be best fixed in the Terraform core planner; I'm not sure we can control that in the provider. I suppose we could see if the GCP API exposes a list of resources that are using a health check and, as we do with Disk, remove the health check from its users during the Delete() call.

@nat-henderson
Contributor

@paddycarver is looking into this on the core side! Thanks, Paddy! :)

I looked into the API and there doesn't seem to be anything tricksy we can do to find a list of all resources that depend on the health check.

I'm assigning to @paddycarver - when you have an answer, please feel free to unassign / assign-to-me!

@paddycarver
Contributor

So, I've talked about this with the core team. At the moment, there are a few things at play. When a config is present for a resource, the config controls its dependencies and the order of resource actions. When a config is not present for a resource, state controls them. So our problem here is that a config is present for the instance group manager, but not for the health check. The IGM's state, which tracks the dependency, is therefore not used. So in this situation, Terraform has no idea these resources are related, and tries to process them in parallel.

Ironically, were you to try to delete both at once, they should delete in the correct order, because then the state would be used for both, and the dependency would be noticed.

I talked about solutions for this--we talked about considering both config and state when determining dependencies, and we talked about more involved solutions like defining types of dependencies resources have, which the provider could then control behavior on--but honestly, neither of those is likely to happen in the next few weeks.

Like @ndmckinley, I too am, unfortunately, a bit stumped as to what to do here, as I'm not sure a workaround is available to us. The best I can do is suggest one of two options. The first is a two-phase commit: remove the health check from the IGM and apply, then delete the health check and apply again. The second is to add upstream labels here and wait, either for Terraform's core to become more sophisticated about this or for the GCP API to let us list everything a health check is attached to, so we can detach it before destroy. Sorry I don't have a better answer for you.
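
For anyone following along, the two-phase approach can be sketched with the configuration from the original report. This is only an illustration built from the resources already shown in this issue, not an official fix: in phase 1 the health check resource stays in the config and only the auto_healing_policies block is dropped from the IGM.

```hcl
# Phase 1: keep the health check resource, but remove its reference from the IGM.
# Run "terraform apply" with this configuration first, so the IGM is updated
# while the health check still exists.
resource "google_compute_health_check" "autohealing" {
  name                = "autohealing-health-check"
  check_interval_sec  = 5
  timeout_sec         = 5
  healthy_threshold   = 2
  unhealthy_threshold = 10 # 50 seconds

  http_health_check {
    request_path = "/healthz"
    port         = "8080"
  }
}

resource "google_compute_instance_group_manager" "appserver" {
  name = "appserver-igm"

  base_instance_name = "app"
  instance_template  = "${google_compute_instance_template.appserver.self_link}"
  update_strategy    = "NONE"
  zone               = "us-central1-a"

  target_pools = ["${google_compute_target_pool.appserver.self_link}"]
  target_size  = 2

  named_port {
    name = "customHTTP"
    port = 8888
  }

  # The auto_healing_policies block has been removed here.
}

# Phase 2: delete the google_compute_health_check.autohealing block above and
# run "terraform apply" again. Nothing references the health check any more,
# so the delete should no longer hit resourceInUseByAnotherResource.
```

Splitting the change this way means the only operation in the second apply is the health check delete, so Terraform's ordering of update versus destroy no longer matters.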

Contributor Author

danisla commented Aug 21, 2018

Thanks for looking into this, Paddy. It's unfortunately a case that may become more common with GCP resources as folks use Terraform more and more to iterate on changes and upgrade/patch configs. I can think of other cases with dependencies like this around the load balancer components, where Terraform can get stuck trying to modify/delete something before it's been detached.

I like the solution around defining types of dependencies like a "used by" similar to the discussion in hashicorp/terraform#8099.

I'll just hang tight for now and try to give explicit apply instructions when I know an upgrade is going to break something, as I did in one of my module's release notes.

Thanks again!

@kewei5zhang

Running terraform refresh before terraform destroy solved the problem for me.


jakewan commented Mar 2, 2024

I encountered a similar error when trying to change a google_compute_region_instance_group_manager's distribution_policy_target_shape setting to BALANCED. It appears this is only a valid option when there's already an autoscaler in place. However, updating this setting on an existing MIG forces replacement, leading to a conflict because it can't be recreated as long as there's an autoscaler referencing it.

In this case, the two-phase commit approach works with one slight modification. After setting up the MIG with default settings and an autoscaler, I manually updated the distribution policy target shape in the GCP UI, then updated the Terraform configuration to match. A subsequent plan shows no changes to be made.
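
As a rough sketch of the end state described above (after the manual change in the GCP UI), the configuration might look something like the following. The resource names, region, and replica counts are hypothetical placeholders, the instance template is assumed to be defined elsewhere, and the syntax assumes Terraform 0.12+ with a recent google provider that uses the version block.

```hcl
# Hypothetical end state: regional MIG with an autoscaler attached and the
# target shape already set to BALANCED out of band, so the plan shows no diff.
resource "google_compute_region_instance_group_manager" "appserver" {
  name               = "appserver-rigm" # placeholder name
  base_instance_name = "app"
  region             = "us-central1"    # placeholder region

  version {
    # Assumes an instance template resource defined elsewhere in the config.
    instance_template = google_compute_instance_template.appserver.self_link
  }

  # Added to the config only after the value was changed in the GCP UI,
  # so Terraform does not try to replace the MIG.
  distribution_policy_target_shape = "BALANCED"
}

resource "google_compute_region_autoscaler" "appserver" {
  name   = "appserver-autoscaler" # placeholder name
  region = "us-central1"
  target = google_compute_region_instance_group_manager.appserver.id

  autoscaling_policy {
    min_replicas    = 2  # placeholder values
    max_replicas    = 4
    cooldown_period = 60
  }
}
```

The point of this ordering is that the autoscaler keeps referencing the same MIG throughout, so no replacement is ever planned.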
