
google provider is rejected by the GCP API due to frequent requests #6021

Closed
toshitanian opened this issue Apr 1, 2020 · 18 comments

Comments

@toshitanian

toshitanian commented Apr 1, 2020

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.

Description

While applying resources with the google provider against GCP, terraform plan failed with an API timeout:

Error: Error reading Container NodePool xxx: Get https://container.googleapis.com/v1beta1/projects/xxx/locations/us-central1/clusters/xxx/nodePools/xxx?alt=json&prettyPrint=false: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

I eventually found that I had to reduce Terraform's parallelism:

$ terraform plan  --parallelism=1

If possible, I would like the google provider to handle request parallelism by itself.

Version

Terraform v0.12.24

  • provider.google v3.15.0
  • provider.google-beta v3.15.0

New or Affected Resource(s)

  • google_XXXXX

Potential Terraform Configuration

# Propose what you think the configuration to take advantage of this feature should look like.
# We may not use it verbatim, but it's helpful in understanding your intent.
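A purely hypothetical sketch of what provider-level throttling could look like. Neither argument below exists in the provider today; the names are illustrative only:

provider "google" {
  # Hypothetical knobs, shown only to illustrate the request:
  max_concurrent_requests = 2      # cap the number of in-flight API calls
  request_retry_backoff   = "30s"  # back off when the API starts rejecting requests
}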

References

  • #0000
@ghost ghost added the enhancement label Apr 1, 2020
@danawillow
Contributor

@toshitanian what version of the provider are you on? Recent versions include several improvements that address these sorts of issues.

If you're on an up-to-date version, it would also help to get some of the information that we ask for in the bug template, since this feels more like a bug to me.

@toshitanian
Author

I have added the version information above.

@ghost ghost removed the waiting-response label Apr 7, 2020
@danawillow
Contributor

Does this happen consistently or only sometimes? Does it happen with every resource / data source, or just GKE?

@toshitanian
Author

It happened only on GKE clusters for me, though I don't manage other resources with Terraform.
In terms of frequency, it happens sooner when I apply again and again.
So I suspect this error is related to a GCP API request limit.
I'm not sure, but there may be something like a quota of 10 requests per minute on the GCP API.

@danawillow
Contributor

@emilymye any idea whether there's anything that can be done here? You know more than I do about this type of error.

In the meantime, @toshitanian, keep us posted on whether this seems to happen often or just during certain times. Normally, quota errors show up as successful requests but with an error code (rather than this, where the request doesn't seem to have actually made it to a GCP backend).

@danawillow
Contributor

@toshitanian, if this happens again, can you post debug logs (env var TF_LOG=DEBUG)? I just ran into this myself where it started returning that error after retrying on several 503s in a row, and I'm curious whether yours are the same or if it happens on the very first request.
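For anyone capturing these, Terraform's standard TF_LOG_PATH environment variable can write the debug output to a file (the file name here is just an example):

$ TF_LOG=DEBUG TF_LOG_PATH=./terraform-debug.log terraform plan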

@toshitanian
Author

@danawillow Yes, when it happens again I will report more detailed logs. Thank you for your help.

@bharathkkb

bharathkkb commented Aug 13, 2020

@ghost ghost removed the waiting-response label Aug 13, 2020
@danawillow
Contributor

Interesting, thanks @bharathkkb. You don't happen to have debug logs (showing the full request/response) for those, do you? I'd be curious to see what sorts of responses we're getting back from the API and retrying on before we finally get that error.

@bharathkkb

Unfortunately not, since it's in CB (Cloud Build); I will post if I encounter it locally.
We have had much more frequent failures recently, with at least one in three runs failing due to this.

@bharathkkb

@danawillow I have collected debug logs and attached them to b/164189639

@danawillow
Contributor

Thank you, those are super helpful! I have some changes I'd like to make that should improve this (it might not fix it 100% but should make it better in some cases). Will keep you informed.

@tvvignesh

@danawillow @bharathkkb This is happening for me too, quite frequently, when running tf plan:

Error: Error creating service account: Post "https://iam.googleapis.com/v1/projects/<projectid>/serviceAccounts?alt=json&prettyPrint=false": oauth2: cannot fetch token: Post "https://oauth2.googleapis.com/token": dial tcp: lookup oauth2.googleapis.com on 127.0.0.53:53: read udp 127.0.0.1:57976->127.0.0.53:53: i/o timeout

Run tf plan again

Error: Error when reading or editing Instance Template "bastion-instance-template-<id>": Get "https://www.googleapis.com/compute/beta/projects/projects/<projectid>/global/instanceTemplates/bastion-instance-template-<id>?alt=json&prettyPrint=false": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Run tf plan again

Error: Error reading instance group manager returned as an instance group URL: Get "https://compute.googleapis.com/compute/beta/projects/<projetid>/zones/asia-southeast1-c/instanceGroupManagers/<poolid>-grp?alt=json&prettyPrint=false": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Run tf plan again

It works

It appears randomly; I keep running tf plan again and again and it works at some point. Running Terraform v0.13.4 with registry.terraform.io/hashicorp/google v3.41.0 and registry.terraform.io/hashicorp/google-beta v3.41.0.

@danawillow
Contributor

@tvvignesh can you try increasing the value for request_timeout in your provider block? https://www.terraform.io/docs/providers/google/guides/provider_reference.html#request_timeout-1
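For reference, a minimal sketch of that provider block (the project name and timeout value here are illustrative; request_timeout takes a duration string per the linked provider reference):

provider "google" {
  project         = "my-project"  # illustrative project ID
  request_timeout = "240s"        # raise this if requests time out while awaiting headers
}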

@tvvignesh

tvvignesh commented Oct 8, 2020

@danawillow Ahh, I did not know that a timeout option existed; my bad. I tried it and it works great most of the time now (it does still fail in some instances, though the error changed from a timeout to network unreachable). Thanks.

PS: The interesting thing is that I am running Terraform within a GCP VM and applying only GCP resources, so basically everything stays within the Google network, yet there are still timeouts and failures.

@danawillow
Contributor

No worries! It actually took me a decent amount of time debugging this a month or two back to realize that that specific error would be fixed by increasing the timeout. And yeah, these failures sometimes happen for non-networking-related reasons (sometimes things just queue up on the service end for whatever reason).

@rileykarson
Collaborator

Given the broad scope of this issue, its age, and the fact that I believe it has been addressed over time (e.g. with the changes Dana outlined above), I'm going to close it out. If folks are still running into this, please open a new issue!

I recognise that having open-source issues closed without a clear resolution can be annoying, and I'm sorry about that. The reality is that this issue is pretty far into our backlog and won't realistically have action taken on it; a newer issue is much more likely to drive progress on a solution if the problem is still present.

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 25, 2022