-
Notifications
You must be signed in to change notification settings - Fork 768
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add retryable transport for errors #1704
Conversation
In order to address the issue integrations#210 I have added 3 new parameters to the provider - retry_delay_ms - max_retries - retryable_errors In case max_retries is > 0 (defaults to zero) it will retry the request in case it is an error the retryable_errors defaults to [500, 502, 503, 504] It retries after the ms specified in retry_delay_ms (defaults to 1000)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your attention to testing! This will be a cool feature to get released.
@dcfranca what are your thoughts on handling the error to silence the lint warning? |
Done |
@kfcampbell |
@kfcampbell @ilmax Hey, any update on this? |
@kfcampbell @ilmax Hi, is there anything else I need to do here? |
Hey @dcfranca I have no power here 😅 I'm just an end user, @kfcampbell is the one you're looking for. |
|
||
for retry := 0; retry <= t.maxRetries; retry++ { | ||
// Reset the body | ||
// Code from httpretry (https://github.com/ybbus/httpretry/blob/master/roundtripper.go#L60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you talk to me a little bit about this code and why you chose it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How was this tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests I have added on transport_test.go
were failing due to the body not being reset after the first retry, then I checked the package I use on some of my projects to retry (httpretry) and saw that they have a code for this scenario, so I shameless copied it and the tests succeed
I didn't figure out how to test the specific scenario of using the function GetBody
, the original library also only tests without the GetBody, but if you have an idea on how to test with it I can update the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a manual workload you can test this on that will reliably trigger a 500? I'm hesitant to merge/release to a wide audience without a bit more description of the validation process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a Flux integration that quite often gives me a 500 (a few times a day), I can try linking my branch instead of the default provider there... however, I have never integrated an in development provider, so I will appreciate anything that can guide me through this
I'm also a bit afraid of doing that on this workload, it is not critical, but I would expect there are more changes here that are not tested/stable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can create the binary for this provider and point Terraform towards it by seeing our debugging instructions.
We do not have successful automated integration testing at the moment so we've been over-relying on manual testing. I don't feel super comfortable merging without manually testing this extensively first.
Co-authored-by: Keegan Campbell <me@kfcampbell.com>
Hello! When do you believe this will get merged? This behaviour is seen when refreshing/creating the resource
The only way to make it succeed is to retry 3-4-5 times. Having a way to specify a retry mechanism would help immensely. Thank you for the work done here! |
I'd love to know as well :D |
@gm-hyp @dcfranca we'd love to release a major version in the near future (next couple of months?) pending our team's capacity to do so. We have several items we're looking at merging into that version that we'll need to test together thoroughly. Thanks for your persistence here. |
I've changed the target branch of this PR to the |
See #2091. |
* Add retryable transport for errors (#1704) * Add retryable transport for errors In order to address the issue #210 I have added 3 new parameters to the provider - retry_delay_ms - max_retries - retryable_errors In case max_retries is > 0 (defaults to zero) it will retry the request in case it is an error the retryable_errors defaults to [500, 502, 503, 504] It retries after the ms specified in retry_delay_ms (defaults to 1000) * Update documentation with new parameters for retry * Change default of max_retries to 3 * Fix typo in naming * Update github/transport_test.go * Add error check for data seek * Consolidate the default retriable errors on a function * Fix typo on the comments Co-authored-by: Keegan Campbell <me@kfcampbell.com> * Update vendor * Fix merging conflicts * Update documentation with new parameters for retry * Change default of max_retries to 3 * Fix typo in naming * Add error check for data seek * Update github/transport_test.go * Consolidate the default retriable errors on a function * Fix typo on the comments Co-authored-by: Keegan Campbell <me@kfcampbell.com> * Don't run go mod tidy on release (#1788) * Don't run go mod tidy on release * Be more specific about releases * Fix lint with APIMeta deprecation --------- Co-authored-by: Keegan Campbell <me@kfcampbell.com> Co-authored-by: Nick Floyd <139819+nickfloyd@users.noreply.github.com> * fix: remove repository topic from state if it doesnt exist in GitHub anymore (#1918) * remove repository topic if they cannot be found in GitHub anymore * Fix build error by bumping package version in offending file --------- Co-authored-by: Keegan Campbell <me@kfcampbell.com> * Bump version to v6 (#2106) * Upgrade to Terraform Plugin SDK v2 (#1780) * go mod tidy -go=1.16 && go mod tidy -go=1.17 * Run go mod vendor * Attempt v2 upgrade * Plugin compiling * Fix some provider test errors * Fix test compilation error * ValidateFunc --> ValidateDiagFunc * Fix casing * Sprinkle toDiagFunc everywhere * More fixes for validation functions * State --> StateContext * %s --> %v when printing diags * ConfigureFunc --> ConfigureContextFunc * Checking results of d.Set, round one * Continue checking d.Set results * Check results of d.Set, round three * Checking d.Set results, round four * d.Set round five * In tests, export GITHUB_TEST_ORGANIZATION * Remove unnecessary MaxItems on computed value * Go build now works * Resolve linting errors * Apply diag.FromErr twice more * Pass key names into toDiagFunc helper * Construct cty.Path from strings * Tests now working * Update terraform-plugin-sdk version * Remove commented attribute setting in resource_github_team.go * Fix restrict pushes on github_branch_protection. Fix branch protection tests (#2045) * Update restrict pushes. Fix branch protection tests Change blocks_creations default to true. Fix broken build. * add state migration * rename push_restrictions to push_allowances * correct state migration issue * add docs clarification * update migration func args * fix test args * cleanup tests * Pass context.Background() in test * fix timestamp fields --------- Co-authored-by: Keegan Campbell <me@kfcampbell.com> * Set group_id correctly (#2133) * Run go get -u github.com/golangci/golangci-lint * Separate github_team_members import from github_team as create_default_maintainers is not defined for members resource (#2126) Co-authored-by: Keegan Campbell <me@kfcampbell.com> * Add missing variable definition for test case --------- Co-authored-by: Daniel França <github.t6297kgphp.dv@koderama.com> Co-authored-by: Nick Floyd <139819+nickfloyd@users.noreply.github.com> Co-authored-by: Felix Luthman <34520175+felixlut@users.noreply.github.com> Co-authored-by: georgekaz <1391828+georgekaz@users.noreply.github.com> Co-authored-by: Rich Young <richjyoung@users.noreply.github.com>
In order to address the issue #210
I have added 3 new parameters to the provider
In case max_retries is > 0 (defaults to zero)
it will retry the request in case it is an error
the retryable_errors defaults to [500, 502, 503, 504]
It retries after the ms specified in retry_delay_ms (defaults to 1000)
Resolves #210
Behavior
Before the change?
After the change?
Other information
Additional info
Pull request checklist
Does this introduce a breaking change?
The code should work as before in case no max_retries is set
Please see our docs on breaking changes to help!
Type: Breaking change
label)If
Yes
, what's the impact:Pull request type
Please add the corresponding label for change this PR introduces:
I don't think I have permission to add a label here
Type: Bug
Type: Feature
Type: Documentation
Type: Maintenance