Cloud storage bucket retries slowing state refreshes #10423

Closed
DSchrupert opened this issue Oct 27, 2021 · 11 comments · Fixed by GoogleCloudPlatform/magic-modules#5542, hashicorp/terraform-provider-google-beta#3938 or #10781

@DSchrupert

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

v3.90.0_x5

Affected Resource(s)

terraform-google-modules/cloud-storage/google//modules/simple_bucket

Debug Output

https://gist.github.com/NicholasAzar/dbd961da7df00699a07f5a5f0517600a

Expected Behavior

The provider should not retry bucket lookups during state refresh.

Actual Behavior

The provider retries the bucket lookup during refresh, which makes the apply much slower.

Steps to Reproduce

  1. terraform apply

Important Factoids

Only noticed this behaviour since yesterday.

@DSchrupert DSchrupert added the bug label Oct 27, 2021
@DSchrupert
Author

Potentially related to #10287?

@edwardmedia edwardmedia self-assigned this Oct 27, 2021
@edwardmedia
Contributor

@NicholasAzar can you share your config?

@DSchrupert
Author

There's not much I can share publicly, but it should be reproducible with any new/moved bucket:

module "a_bucket" {
  source     = "terraform-google-modules/cloud-storage/google//modules/simple_bucket"
  version    = "~> 1.4"
  name       = "bucket_name"
  project_id = module.project.project_id
  location   = "us-central1"
}

The logs suggest that the state refresh phase of the apply hits the 20-minute timeout while checking each bucket that has changed.

Step #2 - "Apply": 2021-10-27T19:31:29.386Z [INFO]  plugin.terraform-provider-google_v3.90.0_x5: 2021/10/27 19:31:29 [DEBUG] Retry Transport: Returning after 1 attempts: timestamp=2021-10-27T19:31:29.386Z
Step #2 - "Apply": /10/27 19:31:29 [DEBUG] Dismissed an error as retryable. Retry 404s for bucket creation - googleapi: Error 404: The specified bucket does not exist., notFound: timestamp=2021-10-27T19:31:29.386Z
Step #2 - "Apply": 2021-10-27T19:31:29.386Z [INFO]  plugin.terraform-provider-google_v3.90.0_x5: 2021/10/27 19:31:29 [TRACE] Waiting 10s before next try: timestamp=2021-10-27T19:31:29.386Z
Step #2 - "Apply": 2021-10-27T19:31:33.075Z [INFO]  plugin.terraform-provider-google_v3.90.0_x5: 2021/10/27 19:31:33 [WARN] WaitForState timeout after 20m0s: timestamp=2021-10-27T19:31:33.074Z
Step #2 - "Apply": 2021-10-27T19:31:33.075Z [INFO]  plugin.terraform-provider-google_v3.90.0_x5: 2021/10/27 19:31:33 [WARN] WaitForState starting 30s refresh grace period: timestamp=2021-10-27T19:31:33.074Z
Step #2 - "Apply": 2021-10-27T19:31:33.075Z [INFO]  plugin.terraform-provider-google_v3.90.0_x5: 2021/10/27 19:31:33 [WARN] Removing Storage Bucket "bucket_name" because it's gone: timestamp=2021-10-27T19:31:33.074Z
Step #2 - "Apply": 2021/10/27 19:31:33 [WARN] Provider "registry.terraform.io/hashicorp/google" produced an unexpected new value for module.a_bucket.google_storage_bucket.bucket during refresh.

We can also see that terraform plan started taking 20 minutes whenever there's a bucket change:

2021/10/27 18:28:20 Running: [terraform plan -out plan-for-import-766615410.tfplan]
2021/10/27 18:48:34 Running: [terraform show -json plan-for-import-766615410.tfplan]
2021/10/27 18:48:41 Found importable resource: "module.a_bucket.google_storage_bucket.bucket"
2021/10/27 18:48:41 Skipping module.a_bucket.google_storage_bucket.bucket, not in list of specific resource types to import

@edwardmedia
Contributor

edwardmedia commented Oct 28, 2021

@NicholasAzar using your config, I can't repro the issue. Does it fail on the very same code? Can you share your debug log (export TF_LOG=DEBUG)?

@edwardmedia
Contributor

@NicholasAzar is this still an issue?

@edwardmedia
Contributor

@NicholasAzar closing this assuming it is no longer an issue

@DSchrupert
Author

An update on this: while I haven't yet been able to work on a standalone reproduction case, we've found that the issue was introduced in 3.89.0 and is not reproducible in 3.88.0.

Vague steps are (in case they're helpful):

  1. Create a bucket with Terraform-tracked state
  2. Manually delete the bucket from GCP
  3. Rerun apply/plan

Hopefully I'll be able to work on some better reproduction steps in the near future. For now we're just pinning to 3.88.0 to meet a deadline.
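
For anyone wanting the same workaround, a minimal sketch of such a pin is below. It assumes the provider is sourced from hashicorp/google (the registry path that appears in the logs above) and uses the explicit source/version form rather than whatever our actual config looks like:

```hcl
terraform {
  required_providers {
    google = {
      # Assumption: registry source taken from the log line
      # "registry.terraform.io/hashicorp/google".
      source = "hashicorp/google"

      # Exact pin to the last version where the slow refresh was not observed.
      version = "= 3.88.0"
    }
  }
}
```

After changing the constraint, terraform init -upgrade is needed so the lock file and the downloaded provider match the pin.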

@DSchrupert
Author

Yeah this is definitely reproducible with this standalone example:

Define main.tf similar to this with the values filled in:

terraform {
  required_version = ">=0.14"
  required_providers {
    google = "= 3.89.0"
  }
  backend "gcs" {
    bucket = "{STATE BUCKET}"
  }
}

module "a_bucket" {
  source     = "terraform-google-modules/cloud-storage/google//modules/simple_bucket"
  version    = "~> 1.4"
  name       = "{NEW BUCKET}"
  project_id = "{PROJECT ID}"
  location   = "us-central1"
}

Commands:

  1. terraform init
  2. terraform plan -> see that it returns the plan immediately.
  3. terraform apply
  4. Delete the bucket from GCP manually and wait for the delete to complete.
  5. Run terraform plan again -> see that it takes 20 minutes.
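
To rule out the module itself, the same steps could presumably be run against a bare resource instead. This is an untested sketch: it assumes the module ultimately manages a google_storage_bucket (the logs reference module.a_bucket.google_storage_bucket.bucket), the resource name "repro" is hypothetical, and the placeholders are the same as in main.tf above:

```hcl
# Hypothetical module-free variant of the repro; replaces the
# module "a_bucket" block in main.tf.
resource "google_storage_bucket" "repro" {
  name     = "{NEW BUCKET}"
  project  = "{PROJECT ID}"
  location = "us-central1"
}
```

If plan is still slow after deleting this bucket by hand, that would point at the provider's read path rather than anything in the module.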

@DSchrupert
Author

@edwardmedia did you have a chance to take another look at this?

@turkenh

turkenh commented Dec 7, 2021

Having this issue as well, tested both with v4.0.0 and v4.2.1.

@edwardmedia could we reopen this?

@github-actions

github-actions bot commented Jan 7, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 7, 2022