Possible flaky CI with remove_default_node_pool = true #1227

Open
apeabody opened this issue Apr 15, 2022 · 4 comments
Labels
blocked (Blocked by some other work), bug (Something isn't working), triaged (Scoped and ready for work), upstream (Work required on Terraform core or provider)

Comments

@apeabody
Contributor

TL;DR

Possible flaky CI with remove_default_node_pool = true

Error: Error deleting default node pool: googleapi: Error 400: Operation operation-{} is currently creating a node pool for cluster node-pool-cluster-{}. Please wait and try again once it is done.

Expected behavior

No response

Observed behavior

No response

Terraform Configuration

converge node-pool-local

Terraform Version

1.1.8

Additional information

No response

@apeabody
Contributor Author

Upstream provider issue hashicorp/terraform-provider-google#10366

@apeabody apeabody added the blocked Blocked by some other work label May 24, 2022
@github-actions

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the Stale label Jul 23, 2022
@apeabody apeabody reopened this Aug 1, 2022
@apeabody apeabody added upstream Work required on Terraform core or provider and removed Stale labels Aug 1, 2022
@github-actions

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the Stale label Sep 30, 2022
@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Oct 7, 2022
@bharathkkb bharathkkb reopened this Oct 18, 2022
@bharathkkb bharathkkb added the triaged Scoped and ready for work label Oct 18, 2022
@github-actions github-actions bot removed the Stale label Oct 18, 2022
@jacknif3

jacknif3 commented Nov 7, 2022

Hello everyone,

I saw this issue and wanted to share my findings and ask whether there is any way to work around the error, even temporarily.

At the moment I can reproduce the 400 error about 7 out of 10 times when provisioning a GKE cluster via Terraform with remove_default_node_pool = true.

The Terraform module I am running creates 4 resources:

google_compute_network
google_compute_subnetwork
google_container_cluster
google_container_node_pool

Each time, it errors out on google_container_cluster, about 6 minutes into the resource creation.

If I set remove_default_node_pool = false, the module completes every time, since creating the cluster takes just under 6 minutes.
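
For readers following along, a module with these four resources might look roughly like the sketch below. This is not the configuration from this report (that was not posted in the thread); the resource names, region, CIDR range, node count, and machine type are all placeholder assumptions, and a google provider already configured with a project is assumed.

```hcl
# Hypothetical sketch only: every name and value here is a placeholder,
# not taken from the configuration that produced the error.

resource "google_compute_network" "vpc" {
  name                    = "example-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "subnet" {
  name          = "example-subnet"
  region        = "us-central1"
  network       = google_compute_network.vpc.id
  ip_cidr_range = "10.10.0.0/24"
}

resource "google_container_cluster" "primary" {
  name       = "example-cluster"
  location   = "us-central1"
  network    = google_compute_network.vpc.id
  subnetwork = google_compute_subnetwork.subnet.id

  # The setting under discussion: the cluster is created with a default
  # node pool, which the provider then deletes.
  remove_default_node_pool = true
  initial_node_count       = 1
}

# Separately managed node pool attached to the cluster above.
resource "google_container_node_pool" "nodes" {
  name       = "example-pool"
  location   = "us-central1"
  cluster    = google_container_cluster.primary.name
  node_count = 1

  node_config {
    machine_type = "e2-medium"
  }
}
```

In a layout like this, flipping remove_default_node_pool between true and false is the only change needed to compare the two behaviors described above.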

I am sharing the last few lines of terraform apply debug output including the received error message: https://gist.githubusercontent.com/jacknif3/062dce1ae3d23e01d1295f5470e0091f/raw/264bf02ad59959ce50d081f116a1c36ed4028c51/gistfile1.txt

Please let me know if you need to see how I'm creating the resources or anything else; I'll be glad to try anything to resolve this issue.

Additional info:

Terraform version: 1.3.4
Google provider version: 4.42.1
