Bigtable: Create table in create_table test, deadline exceed error fix #7

Closed
wants to merge 1 commit into from

Conversation

sangramql
Owner

[Issue 8480]
Bigtable: 'test_bigtable_create_table' snippet flakes with '504 Deadline Exceeded'.

@sangramql
Owner Author

@mf2199 please review

@mf2199 mf2199 left a comment

Although this is a very simple and straightforward approach, I'm not sure whether it is the best way of dealing with flakes. We need to know and understand the root cause of the flakes. Maybe instead of retries, there's a deadline parameter that can be modified, in which case all these lines will be redundant.

You probably know the issue better than anybody, since you spent many hours on it. Please write it up and share your knowledge. As I said before, this information is crucial not only for our Wiki but also for the entire Google developers community in general.

@sangramql
Owner Author

Although this is a very simple and straightforward approach, I'm not sure whether it is the best way of dealing with flakes. We need to know and understand the root cause of the flakes. Maybe instead of retries, there's a deadline parameter that can be modified, in which case all these lines will be redundant.

Agreed, flakes can be very inconsistent and unpredictable. I was not able to reproduce this issue with any of the approaches listed below.

I tried several approaches to trigger the issue:

  1. Created multiple tables (>100) in the test function.
  2. Created an instance and immediately created multiple tables (>100) in the class setup.
  3. Tried to create multiple instances (>10), each with a table, but got the following error after creating 6 instances. It was a quota error, not a 'deadline exceeded' error.

—error—
E grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
E status = StatusCode.RESOURCE_EXHAUSTED
E details = "Insufficient node quota. You requested a node count of 3 nodes for your cluster, but this request would exceed your project's node quota of 30 nodes total across all clusters in this zone. Contact us to request a quota increase: https://goo.gl/pf2Su8"
E debug_error_string = "{"created":"@1564490401.065480000","description":"Error received from peer ipv4:216.58.203.138:443","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"Insufficient node quota. You requested a node count of 3 nodes for your cluster, but this request would exceed your project's node quota of 30 nodes total across all clusters in this zone. Contact us to request a quota increase: https://goo.gl/pf2Su8","grpc_status":8}"
E >
.nox/snippets-3-7/lib/python3.7/site-packages/grpc/_channel.py:467: _Rendezvous
—/error—

You probably know the issue better than anybody, since you spent many hours on it. Please write it up and share your knowledge. As I said before, this information is crucial not only for our Wiki but also for the entire Google developers community in general.

We used to observe this issue when we worked on Bigtable issues. As I understand it, it occurred intermittently when Google Cloud took longer than expected to create an instance or table, so we added a cool-down time.
I have now added retry logic that applies an exponentially increasing cool-down between attempts if table creation fails. The maximum number of retries is currently 4, but it can be configured.
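
For illustration, here is a minimal sketch of that kind of retry loop. It is not the code from this PR: `create_with_retry`, its parameters, and the stand-in `DeadlineExceeded` class are hypothetical names chosen for the example, and the delay values are assumptions. In a real test the exception would presumably be the client library's deadline-exceeded error rather than a local class.

```python
import time


class DeadlineExceeded(Exception):
    """Hypothetical stand-in for the '504 Deadline Exceeded' error.

    Defined locally so the sketch is self-contained.
    """


def create_with_retry(create_fn, max_retries=4, initial_delay=1.0,
                      multiplier=2.0, sleep=time.sleep):
    """Call create_fn, retrying on DeadlineExceeded with exponential cool-down.

    max_retries counts retries after the first attempt, so create_fn is
    called at most max_retries + 1 times.
    """
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return create_fn()
        except DeadlineExceeded:
            if attempt == max_retries:
                raise  # out of retries; surface the original error
            sleep(delay)         # cool down before the next attempt
            delay *= multiplier  # grow the cool-down exponentially
```

A test could wrap the table-creation call, for example `create_with_retry(lambda: table.create())`, where `table` is assumed to be the table object the snippet already builds. Only the deadline error is retried; other failures, such as the quota error above, propagate immediately.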

@mf2199

mf2199 commented Jul 30, 2019

Tried to create multiple instances (>10)

We have to be careful about how many instances we open, as the API may place hard limits on that!

@mf2199

mf2199 commented Jul 30, 2019

This looks like a good write-up to me. We can submit this PR as a "possible" solution, along with all your comments.

@sangramql
Owner Author

Opened in googleapis#8889, hence closing this.

@sangramql sangramql closed this Aug 5, 2019