Implement retries in BQ adapter #1963

kconvey · 2019-11-27T13:14:09Z

Uses the google.api_core.retry library to retry exceptions within the context of the exception handler, raising an exception to the handler once the error is unretryable or the configured/default retry quota has been exhausted.

As a side-effect of this, timeout is now correctly enforced.

#1579
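A minimal sketch of the flow described above, not the code in this PR: it swaps google.api_core.retry for a hand-rolled loop so it runs with the standard library alone. The names `TransientError`, `DatabaseError`, `is_retryable`, `retry_call`, `exception_handler`, `run_with_retries`, and `make_flaky` are all hypothetical.

```python
import contextlib


class TransientError(Exception):
    """Hypothetical stand-in for a retryable BigQuery client error."""


class DatabaseError(Exception):
    """Hypothetical stand-in for the error the handler raises to the caller."""


def is_retryable(exc):
    # Assumption: only TransientError is worth retrying.
    return isinstance(exc, TransientError)


def retry_call(fn, retries, predicate=is_retryable):
    """Call fn, retrying up to `retries` extra times on retryable errors."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception as exc:
            if not predicate(exc) or attempt >= retries:
                raise  # unretryable or quota exhausted: hand it to the handler
            attempt += 1


@contextlib.contextmanager
def exception_handler():
    """Translate whatever finally escapes the retry loop."""
    try:
        yield
    except Exception as exc:
        raise DatabaseError(str(exc)) from exc


def run_with_retries(fn, retries=1):
    # Retrying happens inside the handler, so the handler only sees
    # the final exception.
    with exception_handler():
        return retry_call(fn, retries)


def make_flaky(failures):
    """Build a function that fails `failures` times, then returns its call count."""
    state = {"calls": 0}

    def flaky():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("try again")
        return state["calls"]

    return flaky
```

With this shape, `run_with_retries(make_flaky(2), retries=3)` succeeds on the third call, while a function that keeps failing surfaces as a single `DatabaseError` whose `__cause__` is the last `TransientError`.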

kconvey · 2019-12-02T17:58:51Z

Could use a hand with the commented out assertion that the retry handling correctly logs. Not sure if it has to do with the test environment, but I'm getting:
E AssertionError: no logs of level INFO or higher triggered on dbt
These unit tests are passing for me locally.

I also noticed that since 0.14.2 or 0.14.3, there are a few extra hoops to jump through when instantiating a BigQueryConnectionManager (helpful for unit testing). Before, it was possible to create a connection manager with an empty dict or a very simple credentials object; now the code in query_headers breaks when it tries to access 'query_comment' through dot notation, possibly because no reasonable default is set (https://github.com/fishtown-analytics/dbt/blob/dev/0.15.1/core/dbt/adapters/base/query_headers.py#L95). Curious if there is a reasonable way to get back to easier connection manager creation, which would make it easier to add unit tests for the connection manager.

Tagging @beckjake since the query_headers code was written by him.

beckjake · 2019-12-03T14:34:42Z

Hey @kconvey - sorry, I've been away for a bit. I don't quite follow the problem here, but your mock isn't quite right. The object you pass to the connection manager __init__ should have at least two attributes - credentials and query_comment. Instead, you are passing in a credentials mock and giving it a query_comment attribute. So this might work better:

profile = Mock(credentials=credentials, query_comment=None)
self.connections = BigQueryConnectionManager(profile)

In our unit tests we tend to just use dicts and .from_dict() methods on these things to do this - see BaseTestBigQueryAdapter.get_adapter, for example.
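To make the shape of the expected object concrete, here is a small sketch using only `unittest.mock`. `FakeConnectionManager` is a hypothetical stand-in that models nothing but the dot-notation attribute access discussed above; it is not dbt's class.

```python
from unittest.mock import Mock


class FakeConnectionManager:
    """Hypothetical stand-in: reads the same two attributes via dot access."""

    def __init__(self, profile):
        self.profile = profile
        # Dot access fails on a plain dict, which has no such attributes.
        self.query_comment = profile.query_comment
        self.retries = profile.credentials.retries


# The profile carries both attributes; credentials is nested inside it.
credentials = Mock(retries=8)
profile = Mock(credentials=credentials, query_comment=None)
manager = FakeConnectionManager(profile)
```

Passing an empty dict to the same constructor raises `AttributeError`, which matches the breakage described earlier in the thread.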

drewbanin

Thanks for this really great first pass @kconvey!

Couple of things:

- I think retries like this are really well suited for a decorator. I don't have particularly informed thoughts about which retry lib would be good to use here, or if we should write this decorator ourselves, but I largely think it would be cleaner and clearer to decorate retryable methods than to pass around function closures like you've done here. We can do this because each of these methods is idempotent and, with the exception of actually effecting some change in the database, these methods don't have any side effects. More info on this approach here: https://www.calazan.com/retry-decorator-for-python-3/
- I want to remove the create_view, create_bigquery_table, and create_date_partitioned_table methods from the BigQuery plugin. These are vestiges of a time before BigQuery supported create table|view .. as () statements and column partitioning! I think we should revert the changes in these methods and explicitly not support retries. We can additionally add a deprecation warning to show that these methods should not be used anymore and will be removed in a future release (maybe in a separate PR). Let me know if you feel strongly that we should not do that.
- I don't think we actually want to enforce the query_timeout here -- the default is 300s, which will cause a lot of BigQuery projects to start failing for no reason. I don't really buy that per-model timeouts are a good idea, and I don't think I'd be in favor of implementing them for other databases that dbt supports. Instead, I think timeouts like these are better handled by orchestration tools at the level of a dbt invocation. If there is sufficient interest in supporting per-model timeouts, I'd rather support it via a model config and not a profile config. The timeout_seconds config as it exists today is pretty heavy-handed! So, I'd be in favor of retaining the previous behavior, inconsistent as it may be.

I just threw a lot at you here - let me know what you think about all of it :)
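For reference, the decorator approach suggested in the first point could look roughly like the sketch below. It is hand-written for illustration and is not taken from the linked article or from any retry library; `retry`, `calls`, and the decorated `create_dataset` function are hypothetical.

```python
import functools
import time


def retry(exceptions, tries=3, delay=0.0):
    """Retry the decorated function when it raises one of `exceptions`."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return fn(*args, **kwargs)
                except exceptions:
                    if attempt == tries:
                        raise  # out of attempts: propagate the last error
                    time.sleep(delay)
        return wrapper
    return decorator


calls = []


@retry(ConnectionError, tries=3)
def create_dataset(name):
    # Hypothetical idempotent operation that fails twice, then succeeds.
    calls.append(name)
    if len(calls) < 3:
        raise ConnectionError("transient")
    return f"created {name}"
```

The whole decorated method is re-run on each attempt, which is why the approach relies on the method being idempotent.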

plugins/bigquery/dbt/adapters/bigquery/connections.py

Co-Authored-By: Drew Banin <drew@fishtownanalytics.com>

kconvey · 2019-12-03T20:46:46Z

@drewbanin

I went ahead and implemented the small changes you suggested. I'm also comfortable deferring improving timeout for later. Happy to add a deprecation warning in a follow up PR (curious what form it would be best in: a simple comment, logger warning, or something else, but can sort that out later).

I did want to push back on the larger suggestion to refactor this as a decorator (for now), although I had initially been thinking along the same lines before the current proposed implementation. The problem(s) I see with doing this as a retry decorator are that:

- You're retrying the entire method, which isn't necessary since the errors you want to retry are just coming from the code that touches the bigquery client & polls for results. The proposed solution is more granular in what it retries, which wastes less time retrying unrelated code (acquiring connection, etc.), and makes it more clear what you are retrying. This granularity is already the status quo by only running part of these methods within the exception handler.
- You're retrying based on the exception that gets raised by the exception handler, not the bq client. This adds difficulty in determining whether the error raised was a retryable error before being filtered by the exception handler, while still having to raise the error-handled exception. In general this doesn't seem like an intuitive order in which to do retrying and error handling.

Exception handling seems like it should occur outside of retrying, when you're ready to handle the final exception after retrying. If retrying is done at a decorator level, I would think exception handling would then be another, outermost decorator to ensure it takes place after retrying. Doubling down on decorators seems less intelligible than the current function closure, and might require more complicated changes to exception handling.

To me, it makes more sense to maintain the granularity / status quo of the current exception handler (which has its advantages), and get this feature in, deferring some cleanup of both exception handling and retrying for later.

Curious what you think!
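The second problem above can be illustrated with a short, self-contained sketch. Everything here is hypothetical (`TransientError`, `HandledError`, `exception_handler`, `retry`, `predicate`, `execute`); it only shows which exception a method-level retry decorator gets to inspect when the exception handler lives inside the method.

```python
import contextlib


class TransientError(Exception):
    """Hypothetical retryable client error."""


class HandledError(Exception):
    """Hypothetical error type raised by the exception handler."""


@contextlib.contextmanager
def exception_handler():
    try:
        yield
    except Exception as exc:
        raise HandledError(str(exc)) from exc


def retry(predicate, tries):
    def decorator(fn):
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    if attempt == tries or not predicate(exc):
                        raise
        return wrapper
    return decorator


seen = []  # exception type names the predicate was asked about


def predicate(exc):
    seen.append(type(exc).__name__)
    return isinstance(exc, TransientError)


@retry(predicate, tries=3)
def execute():
    # The handler runs inside the method, so it translates the error
    # before the decorator ever sees it.
    with exception_handler():
        raise TransientError("rate limited")
```

Calling `execute()` raises `HandledError` after a single attempt: the predicate only ever sees the translated exception, so the transient error underneath is never retried unless the predicate digs into `__cause__`.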

kconvey · 2019-12-03T21:52:56Z

@beckjake I guess I'm wondering if it is possible to add a default None to query_comment somewhere like https://github.com/fishtown-analytics/dbt/blob/e51c942e91a94936f68f2965963d3b46f1257658/core/dbt/contracts/connection.py#L136

Based on my attempt to trace through this:
-Adapter gets init'd with a config
-That config gets passed to the ConnectionManager, where it becomes the profile field

-In test, what you're passing to adapter as config is config_from_parts_or_dicts()
-This config / profile has a credentials field and a query_comment field

Is there any reason the query_comment field can't default to None somewhere? It isn't clear to me where the contract for the config / profile specifies that it needs a query_comment field.

drewbanin · 2019-12-09T19:05:52Z

@kconvey I buy that! I think you'll want to rebase this one against dev/0.15.1 :)

Adding Jake to review and follow up on query headers

beckjake

@kconvey sorry, I never saw that question!

I suppose it's ok to add =None there, although it seems a little funky - I don't think you're supposed to mock out Protocols!

I have an alternative suggestion that only involves changing the unit tests, which I think resolves the issue more completely. Let me know what you think.

I also had a couple suggestions for the unit tests that I found while I was making sure my suggestion wasn't crazy!

Pylint also has a number of complaints about indentation and assigning lambdas to things. I know it's clunky, but can you just appease the beast?

test/unit/test_bigquery_adapter.py

plugins/bigquery/dbt/adapters/bigquery/connections.py

Clean up retries unit test's connection manager mocking Co-Authored-By: Jacob Beck <beckjake@users.noreply.github.com>

kconvey · 2019-12-09T23:48:49Z

Tried to get all of the formatting changes, but may have missed some because we're using different linters.

beckjake

I've kicked off tests again, and I also suggested the 3 changes that pylint is still failing over. We'll see how the bigquery tests go on azure, at least.

plugins/bigquery/dbt/adapters/bigquery/connections.py

beckjake · 2019-12-10T15:22:07Z

test/unit/test_bigquery_adapter.py

+#        with self.assertLogs(logger.name) as logs:
+        with self.assertRaises(DummyException):
+            self.connections._retry_and_handle(
+                 "some sql", {'credentials': {'retries': 8}},


This should be a mock credentials object now, instead of a dict. Probably something like Mock(credentials=Mock(retries=8))

Good catch.

beckjake · 2019-12-10T15:24:31Z

test/unit/test_bigquery_adapter.py

+#        self.assertIn(
+#            'WARNING:dbt:Retry attempt 1 of 8 after error: DummyException()',
+#            logs.output)
+


Can you remove this commented-out code? You can use pytest's stdout capture stuff if you can get it working in the tests instead, but otherwise I wouldn't bother too much about it.

Removed it.

Co-Authored-By: Jacob Beck <beckjake@users.noreply.github.com>

beckjake

Assuming this is the last reason tests fail, this will be good to go.

plugins/bigquery/dbt/adapters/bigquery/connections.py

Co-Authored-By: Jacob Beck <beckjake@users.noreply.github.com>

beckjake

It looks like the integration tests are still failing

kconvey · 2019-12-11T20:22:50Z

Needed to add return statements when using defs instead of lambdas. Oops. Hopefully this is passing now. Thanks for bearing with me!
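The pitfall mentioned here is easy to reproduce in isolation. In the sketch below, `run` is a hypothetical helper standing in for whatever calls the wrapped function; the point is only that a lambda returns its expression implicitly, while a `def` with the same body returns `None` unless it says `return`.

```python
def run(fn):
    """Hypothetical caller: hands back whatever fn returns."""
    return fn()


# A lambda returns the value of its expression implicitly.
as_lambda = lambda: 1 + 1


def missing_return():
    1 + 1  # evaluated and discarded, so the function returns None


def with_return():
    return 1 + 1
```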

beckjake · 2019-12-11T20:56:01Z

/azp run

beckjake · 2019-12-12T17:35:30Z

/azp run

beckjake

I'm not sure what's up with azure here, but if this last try doesn't fix it I'm just going to merge this anyway.
Thanks for your contribution @kconvey, I'm excited to have this in dbt finally!

Implement retries in BQ adapter

21da0ed

cla-bot bot added the cla:yes label Nov 27, 2019

kconvey added 11 commits November 27, 2019 08:33

fix imports

3b45659

blank line

86f0609

Fix

b602c9c

Fix test

b2c1727

fix test better

3b696ee

fix

ad0bd87

Add query_comment

3aabe2d

Mock

0347238

Mock

33e75f8

Mock

118344b

Remove fake credentials in unit test

c0e8540

drewbanin requested changes Dec 3, 2019

plugins/bigquery/dbt/adapters/bigquery/connections.py (outdated, resolved)

plugins/bigquery/dbt/adapters/bigquery/connections.py (outdated, resolved)

plugins/bigquery/dbt/adapters/bigquery/connections.py (outdated, resolved)

Update plugins/bigquery/dbt/adapters/bigquery/connections.py

9b9c1db

Co-Authored-By: Drew Banin <drew@fishtownanalytics.com>

kconvey added 2 commits December 3, 2019 15:19

Use client.create_dataset and client.delete_dataset

e03fd44

Merge branch 'feature/retries' of https://github.com/kconvey/dbt into…

b548375

… feature/retries

kconvey requested a review from drewbanin December 3, 2019 23:09

dbt-labs deleted a comment from Sherm4nLC Dec 4, 2019

drewbanin requested review from beckjake and removed request for drewbanin December 9, 2019 19:02

beckjake suggested changes Dec 9, 2019

kconvey changed the base branch from dev/louisa-may-alcott to dev/0.15.1 December 9, 2019 21:48

kconvey and others added 2 commits December 9, 2019 17:09

Apply suggestions from code review

40c9328

Clean up retries unit test's connection manager mocking Co-Authored-By: Jacob Beck <beckjake@users.noreply.github.com>

Still get table ref

0940309

Fix formatting

460d73f

kconvey requested a review from beckjake December 9, 2019 23:48

beckjake suggested changes Dec 10, 2019

plugins/bigquery/dbt/adapters/bigquery/connections.py (outdated, resolved)

plugins/bigquery/dbt/adapters/bigquery/connections.py (outdated, resolved)

plugins/bigquery/dbt/adapters/bigquery/connections.py (outdated, resolved)

beckjake reviewed Dec 10, 2019

kconvey and others added 4 commits December 10, 2019 12:55

Update plugins/bigquery/dbt/adapters/bigquery/connections.py

7c8a21d

Co-Authored-By: Jacob Beck <beckjake@users.noreply.github.com>

Update plugins/bigquery/dbt/adapters/bigquery/connections.py

e6b4a12

Co-Authored-By: Jacob Beck <beckjake@users.noreply.github.com>

Update plugins/bigquery/dbt/adapters/bigquery/connections.py

ce4c58a

Co-Authored-By: Jacob Beck <beckjake@users.noreply.github.com>

Clean up tests

5d181c3

kconvey requested a review from beckjake December 11, 2019 15:55

beckjake suggested changes Dec 11, 2019

plugins/bigquery/dbt/adapters/bigquery/connections.py (outdated, resolved)

Update plugins/bigquery/dbt/adapters/bigquery/connections.py

c5c7932

Co-Authored-By: Jacob Beck <beckjake@users.noreply.github.com>

kconvey requested a review from beckjake December 11, 2019 17:47

beckjake suggested changes Dec 11, 2019

kconvey added 3 commits December 11, 2019 12:57

Return function results

43959dc

Merge branch 'feature/retries' of github.com:kconvey/dbt into feature…

37a1288

…/retries

clean up merge conflict

0356a74

kconvey requested a review from beckjake December 11, 2019 20:22

beckjake approved these changes Dec 12, 2019

beckjake merged commit 9222f80 into dbt-labs:dev/0.15.1 Dec 12, 2019

kconvey deleted the feature/retries branch December 12, 2019 18:29

drewbanin added this to the 0.15.1 milestone Dec 18, 2019

This was referenced Jan 7, 2020

dbt seed should retry on bigquery when it tells us to #1579

Closed

Allowing for steps to retry #1630

Closed

kconvey mentioned this pull request Aug 11, 2020

Add retry of additional errors #2694

Merged

4 tasks

jtcohen6 mentioned this pull request Apr 14, 2022

[CT-491] [v1.1 regression] Remove default value for job_execution_timeout_seconds dbt-labs/dbt-bigquery#158

Closed
