-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Maximum number of retries aws sdk attempts for recoverable exceptions. #3965
Maximum number of retries aws sdk attempts for recoverable exceptions. #3965
Conversation
The SDK default is 3. 30 seems like a lot... |
I'm actually now a bit confused by the purpose of this PR. The original issue was for Vault to attempt to retry failed requests. But the code you linked to for the client suggests that it's already using a client default of 3 retries. The fact that it's still failing probably means it is an error that actually should be propagated back. |
True, But there is an option in the SDK to increase the number. yeah, 30 is a lot, maybe 10 try? |
e1515be
to
63a7768
Compare
Changed it to 10 try |
If you turn this into something that can be configured in both the credential and secrets backend I will consider it, but I'm not going to merge something that sets the value to an arbitrary non-default static number, especially given that this can already be done via env vars. |
Thanks, I will look into that, |
I didn't refer to variables -- can you point out what you're asking about? |
" especially given that this can already be done via env vars." |
Sorry, brain fart. I think I mixed up the aws.UseServiceDefaultRetries with aws.UseEnvDefaultRetries. |
63a7768
to
146cc0f
Compare
Hi, 👋 Can you have a look? |
146cc0f
to
53c2362
Compare
The default for the parameter if not set by the user will be zero but this will actually disable retries. I think you want to use aws.UseServiceDefaultRetries as the default value for the field. |
53c2362
to
7980bac
Compare
On other note, Noticed kops (the kubernete tool for aws), Hardcoded the value for this to 13. Using the same aws sdk. |
Updated the PR with the default value. |
7980bac
to
fc92db1
Compare
Adding my own $0.02 here, I'd like to echo what @jefferai said about 30 being excessive -- you'll really do yourself a favor if you track down what's causing all the excessive API traffic. Many things will be having these issues, not just Vault (and I speak from personal experience). Also, having a retry count of 30 is probably counter-productive -- many people will just complain that Vault is "slow" under certain circumstances and either file impossible-to-track-down bugs or just stop using it. So, in most cases, it's probably better to actually fail out to let people know what's going on so they can correct it, rather than just constantly increase the retry count. 3 might be a bit low, but 30 is probably too high. |
I agree I was just testing the waters with 30 :) , 30 is a lot, I have already looked into other possible causes too, but practically speaking with 3 retries backtracking functionality in the SDK cannot do much. Regarding the sleep, AWS backend and AUTH are involved with AWS API, I think a reasonable person will agree that that matters in terms of responsiveness. Other than that, This is the first time I head about vault being slow. |
Please let me know if anything else is required to be done by me, $: make test TEST=./builtin/logical/aws
==> Checking that code complies with gofmt requirements...
==> Checking that build is using go version >= 1.9.1...
go generate
ok github.com/hashicorp/vault/builtin/logical/aws 0.022s
$ make test TEST=./builtin/credential/aws
==> Checking that code complies with gofmt requirements...
==> Checking that build is using go version >= 1.9.1...
go generate
ok github.com/hashicorp/vault/builtin/credential/aws 0.531s |
Thanks for the help, all the checks have passed. |
Thanks! |
When do you think we will have 0.9.4? |
Soon. |
Potentially this could be a configuration too, AWS defaultRetryer has a backoff logic for a set of recoverable exceptions.
Related issue: #3942
https://github.com/hashicorp/vault/blob/master/vendor/github.com/aws/aws-sdk-go/aws/client/client.go#L59