fix: retry S3 on RequestError. Fixes #9914 #12191
Merged
Fixes #9914
Motivation
Artifact uploads are retried when they fail due to a transient error, but not when the error code is "RequestError", e.g. in the case of a TLS handshake timeout. The error code "RequestError" should be marked as transient.
Modifications
In s3TransientErrorCodes, I changed one of the duplicate "InternalError" array items to "RequestError". I suspect the duplication was a mistake. This also aligns with the source referenced in the preceding comment, i.e. retry.go from minio-go.
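To illustrate the effect of the change, here is a minimal sketch of a transient-code lookup. It is not the repository's actual code: the names transientCodes and isTransientCode are hypothetical, and the list is only an illustrative subset of the codes in s3TransientErrorCodes.

```go
package main

import "fmt"

// transientCodes is a hypothetical, shortened stand-in for
// s3TransientErrorCodes. "RequestError" takes the slot that was
// previously a second, duplicate "InternalError" entry.
var transientCodes = []string{
	"RequestError",
	"RequestTimeout",
	"Throttling",
	"InternalError",
	"SlowDown",
}

// isTransientCode reports whether code appears in transientCodes.
// The name is an assumption made for this sketch.
func isTransientCode(code string) bool {
	for _, c := range transientCodes {
		if c == code {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isTransientCode("RequestError")) // true: would now be retried
	fmt.Println(isTransientCode("AccessDenied")) // false: not retried
}
```

With a lookup of this shape, a duplicate entry has no effect on behavior, so replacing it with "RequestError" only adds a retryable code and removes nothing.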
Verification
The specific timeout issue is very hard to test because it is transient, but I added an assertion to TestIsTransientOSSErr to make sure that the error code is marked as transient.