
container_cluster resources error on create but leave dangling resources #3875

Closed
chrisst opened this issue Jun 18, 2019 · 3 comments · Fixed by GoogleCloudPlatform/magic-modules#2030
Assignees
Labels
bug, forward/review (In review; remove label to forward), service/container

Comments

@chrisst
Contributor

chrisst commented Jun 18, 2019

Affected Resource(s)

  • google_container_cluster

Terraform Configuration Files

Problem

Several categories of Create failures will result in the cluster existing in GCP without being persisted to state. This can include timeouts while waiting for the create operation, or exhausting retries due to quota issues. In these situations the cluster is often created, but a subsequent terraform apply will attempt to create it again, usually resulting in a conflict because it already exists.

@chrisst chrisst self-assigned this Jun 18, 2019
@chrisst chrisst added the bug label Jun 18, 2019
@nimahak

nimahak commented Jun 19, 2019

An example of such failed cluster:

$ gcloud beta container clusters describe xxx --region us-central1 --project xxx
...
status: ERROR
statusMessage: 'Try a different location, or try again later: Google Compute Engine
  does not have enough resources available to fulfill request: us-central1-b.'

A subsequent terraform apply or destroy is bound to fail since this cluster is not persisted in the state. Example from destroy which fails while trying to delete the network that was part of the config:

1 error occurred:                                                                                                                             
        * google_compute_subnetwork.default (destroy): 1 error occurred:                                                                      
        * google_compute_subnetwork.default: Error reading Subnetwork: googleapi: 
Error 400: The subnetwork resource 'projects/xxx/regions/us-central1/subnetworks/xxx' is already
being used by 'projects/xxx/zones/us-central1-a/instances/gke-xxx-sg7j', resourceInUseByAnotherResource   

@chrisst
Contributor Author

chrisst commented Jul 9, 2019

I haven't been able to find any examples in our test suite where a dangling cluster was left after a stockout. I have found a couple of examples of stockouts failing to create the cluster, but so far our cleanup logic has handled things correctly and removed the cluster.

Since the cleanup doesn't retry the delete call, I suspect that what is happening is that the call to clean up the cluster fails, at which point the cluster is still removed from state.

I've added handling for that condition but it's only speculative at this point. @nimahak if you see this again and are able to capture the debug log output it would help me confirm for sure.

@ghost

ghost commented Aug 10, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Aug 10, 2019
@github-actions github-actions bot added service/container forward/review In review; remove label to forward labels Jan 15, 2025