Retry GitHub download failures #3729

Merged
merged 6 commits into from
Aug 24, 2021

Conversation

leahwicz
Contributor

resolves #3546

Description

dbt deps fails intermittently when it tries to reach GitHub. We already retry requests to the registry hub, so this PR applies the same retry logic to GitHub downloads to see whether it improves the success rate of dbt deps.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Aug 11, 2021
@leahwicz leahwicz requested a review from kwigley August 11, 2021 15:55
@leahwicz
Contributor Author

Does this even look in the ballpark? My Python is not good 😄

Contributor

@jtcohen6 jtcohen6 left a comment

Hooray!! Glad we're reusing + extending the existing decorator. The code looks pretty good to me, but I should not be the arbiter :)

Also interested to hear folks' thoughts about how best to test - perhaps mocking a package with a nonexistent tarball URL? I know test_registry_get_request_exception added some unneeded baggage to our unit testing.

Contributor

@nathaniel-may nathaniel-may left a comment

The Python checks out. However, on a personal note, I find this wrapping pattern to be particularly opaque.

Here is an example of how I might suggest refactoring this:

import time

import requests

# RegistryException is assumed to be importable from dbt.exceptions

# no juggling of indexes which is extremely error prone
def _exception_retry(lambda_fn, retries):
    if retries <= 0:
        raise RegistryException('Unable to connect to registry hub')
    else:
        try:
            return lambda_fn()
        except (requests.exceptions.ConnectionError, requests.exceptions.Timeout):
            time.sleep(1)
            # return the retried result; without `return`, a successful retry yields None
            return _exception_retry(lambda_fn, retries - 1)

# example call:
# accepting a lambda function as a parameter allows the outer
# function to run it regardless of what parameters it takes
_exception_retry(lambda: 1+2, 5)

(the original exception is lost here, but that can be put back. I didn't write that part because it obfuscated the point I'm trying to make in the review)
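
One way the original exception could be put back is to thread the last failure through the recursion and chain it with `raise ... from`. This is a minimal sketch, not the code that was merged: `RetryError` is a hypothetical stand-in for `RegistryException`, and the built-in `ConnectionError` and `TimeoutError` stand in for the `requests` exceptions so the example runs without dbt or requests installed. The `delay` and `last_exc` parameters are also assumptions.

```python
import time


class RetryError(Exception):
    """Hypothetical stand-in for dbt's RegistryException."""


# Built-in stand-ins for the requests exceptions caught in the snippet above.
RETRYABLE = (ConnectionError, TimeoutError)


def exception_retry(fn, retries, delay=1.0, last_exc=None):
    """Call fn(), making at most `retries` attempts on RETRYABLE errors."""
    if retries <= 0:
        # Chain the last failure so the traceback still shows the real cause.
        raise RetryError('Unable to connect') from last_exc
    try:
        return fn()
    except RETRYABLE as exc:
        time.sleep(delay)
        # Return the result of the retry and remember why this attempt failed.
        return exception_retry(fn, retries - 1, delay, exc)
```

With this shape, the caller can inspect `__cause__` on the raised `RetryError` to see the last connection or timeout error.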

In both ways of writing this, I would suggest the following tests:

  1. pass a function that always throws these exceptions and check that you get the right exception back
  2. pass a function that throws twice then returns correctly and check that you get the correct response
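
The two tests above could be sketched with `unittest` roughly as follows. The `retry` helper and `RetryError` are hypothetical stand-ins defined here only so the example is self-contained; they are not dbt's actual helper or exception, and the built-in `ConnectionError` and `TimeoutError` replace the `requests` exceptions.

```python
import unittest


class RetryError(Exception):
    """Hypothetical stand-in for the exception raised after the last attempt."""


def retry(fn, retries):
    """Minimal helper under test (hypothetical; no sleep, so tests run fast)."""
    for _ in range(retries):
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            continue
    raise RetryError('Unable to connect')


class TestRetry(unittest.TestCase):
    def test_always_failing_raises(self):
        # Test 1: a function that always throws yields the final exception.
        def always_fails():
            raise ConnectionError('unreachable')

        with self.assertRaises(RetryError):
            retry(always_fails, 5)

    def test_fails_twice_then_succeeds(self):
        # Test 2: two failures followed by a success returns the response.
        attempts = []

        def flaky():
            attempts.append(1)
            if len(attempts) <= 2:
                raise TimeoutError('slow')
            return 'payload'

        self.assertEqual(retry(flaky, 5), 'payload')
        self.assertEqual(len(attempts), 3)
```

Counting attempts in the second test also confirms that the helper stops retrying once a call succeeds.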

@leahwicz leahwicz requested a review from kwigley August 20, 2021 17:26
@leahwicz leahwicz merged commit 09ea989 into develop Aug 24, 2021
@leahwicz leahwicz deleted the leahwicz/retry_download branch August 24, 2021 17:35
leahwicz added a commit that referenced this pull request Aug 24, 2021
* Retry GitHub download failures

* Refactor and add tests

* Fixed linting and added comment

* Fixing unit test assertRaises

Co-authored-by: Kyle Wigley <kyle@fishtownanalytics.com>

* Fixing casing

Co-authored-by: Kyle Wigley <kyle@fishtownanalytics.com>

* Changing to use partial for function calls

Co-authored-by: Kyle Wigley <kyle@fishtownanalytics.com>
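
The last commit in this list switches to `partial` for function calls. As a rough illustration of that idea, `functools.partial` can bind a function's arguments up front so that a retry helper only needs to call a zero-argument callable. The names `call_with_retry` and `download`, the URL, and the path are hypothetical and are not taken from the merged code.

```python
import functools


def call_with_retry(fn, max_attempts=5):
    """Hypothetical retry helper: calls fn() with no arguments."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ConnectionError:
            # Re-raise once the final attempt has failed.
            if attempt == max_attempts:
                raise


def download(url, path, timeout=None):
    """Hypothetical download function; echoes its arguments for illustration."""
    return (url, path, timeout)


# Bind the arguments now; the helper later calls download_fn() unchanged.
download_fn = functools.partial(
    download, 'https://example.com/pkg.tar.gz', '/tmp/pkg.tar.gz', timeout=10
)
result = call_with_retry(download_fn)
```

Compared with a lambda, a `partial` object keeps the wrapped function and its bound arguments inspectable through `download_fn.func`, `download_fn.args`, and `download_fn.keywords`.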
leahwicz added a commit that referenced this pull request Aug 27, 2021
IS-Josh pushed a commit to IS-Josh/dbt that referenced this pull request Sep 4, 2021
TeddyCr pushed a commit to TeddyCr/dbt that referenced this pull request Sep 9, 2021
@jtcohen6 jtcohen6 mentioned this pull request Nov 8, 2021
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022

automatic commit by git-black, original commits:
  09ea989
Successfully merging this pull request may close these issues.

Detect GitHub network errors and provide a clean error that defines that a bad response came from GitHub