Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch BigQueryClient to retry on network error #3088

Merged
merged 4 commits into from
Oct 13, 2021

Conversation

jamesmcm
Copy link
Contributor

Description

This patch retries by re-creating the client object if it fails (up to a retry limit), so that the client will not be timed out when calling complete() on the BigQueryTarget after a long job.

Motivation and Context

At the moment sometimes long jobs can cause the BigQueryClient to timeout since the connection is created at initialisation of the target.

Have you tested this? If so, how?

Yes, passing this patched client worked for jobs that would have timeout issues previously.

At the moment sometimes long jobs can cause the BigQueryClient to
timeout since the connection is created at initialisation of the target.

This patch retries by re-creating the client object if it fails (up to a
retry limit), so that the client will not be timed out when calling
`complete()` on the BigQueryTarget after a long job.
@jamesmcm jamesmcm requested review from dlstadther, Tarrasch and a team as code owners June 23, 2021 14:14
Comment on lines 161 to 166
except (TimeoutError, BrokenPipeError, IOError) as bq_connection_error:
if self.retry_count > self.retry_limit:
raise Exception(f"Exceeded max retries for BigQueryClient connection error: {bq_connection_error}") from bq_connection_error
self.retry_count += 1
self.__initialise_client()
self.dataset_exists(dataset)
Copy link
Contributor

@hirosassa hirosassa Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using retry decorator to simplify the codes?
The implementation below will be helpful.
https://github.com/spotify/luigi/blob/master/luigi/contrib/gcs.py#L72-L81

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll make that change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesmcm Thank you! Looks good to me, but flake8 notifies some formatting problems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I fixed that too.

jamesmcm added 2 commits June 24, 2021 13:52
Note the use of the after lambda function to re-create the client if the
connection fails
@hirosassa
Copy link
Contributor

This PR looks great!
@dlstadther @Tarrasch @DataEx plz take a look.

Copy link
Collaborator

@dlstadther dlstadther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@dlstadther dlstadther merged commit ad5ddc9 into spotify:master Oct 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants