fix: retry S3 on RequestError. Fixes #9914 #12191

Merged: 1 commit, Nov 13, 2023

tachylatus
Contributor

Fixes #9914

Motivation

Artifact uploads are retried when they fail with a transient error, but not when the error code is "RequestError", e.g. in the case of a TLS handshake timeout. The error code "RequestError" should be marked as transient.
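
To illustrate why the classification matters, here is a minimal sketch of a retry loop that only retries errors reported as transient. The names `retryTransient`, `errRequest` and `errDenied` are hypothetical and are not taken from the Argo Workflows code base; the real retry logic also applies a backoff, which is omitted here.

```go
package main

import "errors"

// Hypothetical sentinel errors used only for this illustration.
var (
	errRequest = errors.New("RequestError: TLS handshake timeout")
	errDenied  = errors.New("AccessDenied: permission denied")
)

// retryTransient calls op up to maxAttempts times (assumed to be >= 1).
// It stops early when op succeeds or when isTransient reports that the
// error is not worth retrying. It returns the number of attempts made
// and the last error.
func retryTransient(maxAttempts int, isTransient func(error) bool, op func() error) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if err == nil || !isTransient(err) {
			return attempt, err
		}
	}
	return maxAttempts, err
}
```

With a classifier that does not recognise `errRequest`, the loop above gives up after the first attempt, which is the behaviour this PR aims to change for "RequestError".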

Modifications

In s3TransientErrorCodes, I changed one of the duplicate "InternalError" array items to "RequestError". I suspect the duplication may have been made by mistake.

This also aligns with the source referenced in the preceding comment, i.e. retry.go from minio-go.
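
As a rough sketch of the idea (not the actual implementation), a list of transient error codes can be checked against the code carried by an error. The type `codedError`, the variable `transientCodes` and the function `isTransient` are hypothetical stand-ins; only the codes "RequestError" and "InternalError" come from this PR, and the real `s3TransientErrorCodes` list contains more entries.

```go
package main

import "errors"

// codedError is a hypothetical stand-in for an S3 client error
// that exposes an error code.
type codedError struct {
	Code    string
	Message string
}

func (e *codedError) Error() string { return e.Code + ": " + e.Message }

// transientCodes is an illustrative subset; the real list is longer.
// Before this change, "InternalError" appeared twice and
// "RequestError" was missing.
var transientCodes = []string{
	"RequestError",
	"InternalError",
}

// isTransient reports whether err (or an error it wraps) carries
// one of the codes listed in transientCodes.
func isTransient(err error) bool {
	var ce *codedError
	if !errors.As(err, &ce) {
		return false
	}
	for _, code := range transientCodes {
		if ce.Code == code {
			return true
		}
	}
	return false
}
```

A duplicate entry in such a list is harmless at runtime, which may be why the missing "RequestError" went unnoticed.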

Verification

The specific timeout issue is very hard to test, as it is transient, but I added an assertion to TestIsTransientOSSErr to make sure that the error code is marked as transient.

Signed-off-by: Helge Willum Thingvad <1019305+tachylatus@users.noreply.github.com>
Member

@Joibel Joibel left a comment

This looks reasonable.
The two ExpiredToken errors from minio's error list look unlikely to succeed if retried quickly.

@terrytangyuan terrytangyuan merged commit f95a5fc into argoproj:master Nov 13, 2023
27 checks passed
@tachylatus tachylatus deleted the patch-1 branch November 14, 2023 11:20
terrytangyuan pushed a commit that referenced this pull request Nov 27, 2023
Signed-off-by: Helge Willum Thingvad <1019305+tachylatus@users.noreply.github.com>
@agilgur5 agilgur5 added the area/artifacts S3/GCP/OSS/Git/HDFS etc label Nov 28, 2023
isubasinghe pushed a commit to isubasinghe/argo-workflows that referenced this pull request Mar 12, 2024
Signed-off-by: Helge Willum Thingvad <1019305+tachylatus@users.noreply.github.com>
isubasinghe pushed a commit to isubasinghe/argo-workflows that referenced this pull request May 6, 2024
Signed-off-by: Helge Willum Thingvad <1019305+tachylatus@users.noreply.github.com>
@tooptoop4 tooptoop4 mentioned this pull request Nov 3, 2024
3 tasks
Labels
area/artifacts S3/GCP/OSS/Git/HDFS etc
Development

Successfully merging this pull request may close these issues.

saving log artifact to s3 missing retries
5 participants