
Resign requests if they will expire within 1 min #822

Closed
wants to merge 1 commit into from

Conversation


@AMeng AMeng commented Aug 31, 2016

I use Terraform, which uses this library. From time to time while running a terraform plan (asks AWS for the status of all my resources, takes a long time) I get this error:

Error refreshing state: 4 error(s) occurred:

* aws_route53_record.XXXXXX: SignatureDoesNotMatch: Signature expired: 20160831T170848Z is now earlier than 20160831T170921Z (20160831T171421Z - 5 min.)
    status code: 403, request id: XXXXXXXXXXXXXXX

I believe there is a race condition: the signature is checked for expiration and verified as not expired, then it expires before the request is sent off. The expiration is set in this same file, here:

ctx.Query.Set("X-Amz-Expires", strconv.FormatInt(duration, 10))

I'm also curious why 10 minutes was chosen as the expiration. Would it be possible to increase this value? Why not, say, one hour?


jasdel commented Aug 31, 2016

Thanks for contacting us, @AMeng. The SDK should automatically attempt to retry requests that fail because of expired credentials.

It looks like the error code returned by the Route53 API call is different from what the SDK expects. I'll take a look at how the other SDKs handle expired requests for Route53. Naively, it looks like the error message will need to be parsed, because the SignatureDoesNotMatch error code is too generic.

I think if we can solve the request signature resigning on failed requests, that will address this. That said, I agree the signer should be updated to allow a custom request expiration to be set.

With that said, I don't see any issue with checking whether the request's signature will expire "soon" and proactively resigning it.


AMeng commented Sep 1, 2016

Yep, looking over my failed requests, they are all from Route53.

Let me know what you think the best path forward is, and if I can help in any way.


jasdel commented Sep 1, 2016

Thanks for the update @AMeng. This morning I realized in my previous comment I mixed up expired credentials and expired request signatures. With that said, there is no work for us to do on the retry requests I linked in the previous post.

In addition, is the request a pre-signed URL request that the Terraform library vends to your application, is Terraform making a request on your behalf, or is Terraform providing an AWS SDK for Go request.Request value back to your application?

I ask because the two use cases are fairly different. The 10 minutes only applies to resigning normal request signatures, not pre-signed URLs. The X-Amz-Expires value is only set when the signed request is a pre-signed URL, not a regular API call made with the SDK.

Is the SDK just taking a very long time to create the signature of the request?
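The distinction above can be seen in the request itself: only pre-signed URLs carry their lifetime in the X-Amz-Expires query parameter, while regular SDK calls sign via the Authorization header. A small stdlib-only sketch (the helper name and example URL are hypothetical) of checking for that parameter:

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// presignedExpiry extracts the X-Amz-Expires value (in seconds) from a
// pre-signed URL. Regular SDK API calls have no such query parameter;
// their signature travels in the Authorization header instead.
func presignedExpiry(raw string) (int64, bool) {
	u, err := url.Parse(raw)
	if err != nil {
		return 0, false
	}
	v := u.Query().Get("X-Amz-Expires")
	if v == "" {
		return 0, false
	}
	secs, err := strconv.ParseInt(v, 10, 64)
	if err != nil {
		return 0, false
	}
	return secs, true
}

func main() {
	presigned := "https://example-bucket.s3.amazonaws.com/key?X-Amz-Expires=600&X-Amz-Signature=abc"
	if secs, ok := presignedExpiry(presigned); ok {
		fmt.Printf("pre-signed URL valid for %d seconds\n", secs)
	}
}
```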


AMeng commented Sep 1, 2016

I'm not terribly familiar with the Terraform internals. Are you suggesting that they are not correctly handling the request failures? This error is from the terraform CLI version 0.7.2. I am not doing anything custom there, just running the plan command. I guess Terraform could handle this failure from their end, but it feels more like an issue with the SDK.

The failure seems to come about because AWS is rate limiting my account (I have over 1000 Route53 records), so Terraform waits for the SDK to finish the request (through retries). The actual plan command does seem to take more than 10 minutes when it fails, which is why I assumed the issue was with re-signing. Do you think the race condition I mentioned earlier is a possible issue?

One other thing I considered was clock skew, where my machine is off from Amazon's API servers just enough for the failure to be raised before the request gets re-signed. However, I was able to reproduce this on both EC2 machines and my local machine.


jasdel commented Sep 1, 2016

Thanks for the additional information @AMeng, this is very helpful and clarifies what may be going on. I don't think there is any issue in the way Terraform is using the API, and I don't think there is a need for them to handle the error on their end.

I'll dive deeper into how the retries are being done in this context. It's possible that a race condition is going on, but I'm not sure whether the change here is the best approach, or whether something more significant is needed.

@jasdel jasdel added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Sep 6, 2016

AMeng commented Sep 28, 2016

Any updates on this? I wouldn't mind helping out here, given guidance on what the appropriate fix is.


jasdel commented Sep 28, 2016

@AMeng, sorry for the delay, I'm looking into this issue.

The more I review this condition, the more I think it might be better for the SDK to not try to determine whether the request needs to be resigned at all, and just resign it. We would need to ensure the digest of the request's Body is persisted between retries, since computing it can be costly for large requests, but the digest will not change between retries.
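The caching idea can be sketched with the stdlib: the SigV4 payload hash (the value carried in the X-Amz-Content-Sha256 header) is a plain SHA-256 of the body, so computing it once and reusing it on every resign avoids rehashing a potentially large body on each retry. The helper name and loop here are illustrative, not SDK internals:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// bodyDigest computes the hex-encoded SHA-256 payload hash that SigV4
// places in the X-Amz-Content-Sha256 header. Caching it means each resign
// only redoes the cheap string-to-sign/HMAC steps, not a full pass over
// the body.
func bodyDigest(body []byte) string {
	sum := sha256.Sum256(body)
	return hex.EncodeToString(sum[:])
}

func main() {
	body := []byte("example payload")
	cached := bodyDigest(body) // computed once, before the first attempt

	// On each retry, reuse the cached digest instead of rehashing the body.
	for attempt := 1; attempt <= 3; attempt++ {
		fmt.Printf("attempt %d reuses digest %s...\n", attempt, cached[:12])
	}
}
```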


jasdel commented Sep 28, 2016

Syncing with the other AWS SDKs: neither the AWS SDK for Java nor the AWS SDK for JavaScript has logic to skip resigning the request. They do implement various levels of caching, though. The AWS SDK for Java caches the signing key, and the AWS SDK for JavaScript caches the body digest. Since the Go SDK already caches the body's digest through the X-Amz-Content-Sha256 header, I think we can remove the is-signed check.

With that said, we might want to consider adding the expiry check to the request's Send handlers, just before the HTTP request is sent, and resigning the request if needed. I have a feeling that the issue you're experiencing is because the request was created and signed, then sent at a later point in time. If the check we've been talking about were moved to this new request handler, it would help ensure requests are sent with a valid signature.

Something like the following could be added to the core handlers and default handlers:

func RefreshSignature(r *request.Request) {
    signingTime := r.Time
    if !r.LastSignedAt.IsZero() {
        signingTime = r.LastSignedAt
    }

    // 10 minutes to allow for some clock skew/delays in transmission. Would be improved
    // with aws/aws-sdk-go#423
    if time.Now().Before(signingTime.Add(10 * time.Minute)) {
        return
    }

    r.Sign()
}

jasdel added a commit to jasdel/aws-sdk-go that referenced this pull request Oct 6, 2016
Improves the reliability of the request signature by resigning the
request on each retry. In addition, logic was added to the Send request
handler chain to ensure the delay between Sign and Send will not produce
an expired signature. This also brings the SDK in line with the other AWS
SDKs in their behavior of resigning the request on each retry.

Related to aws#822
jasdel added a commit that referenced this pull request Oct 6, 2016
…ed (#876)

Improves the reliability of the request signature by resigning the
request on each retry. In addition, logic was added to the Send request
handler chain to ensure the delay between Sign and Send will not produce
an expired signature. This also brings the SDK in line with the other AWS
SDKs in their behavior of resigning the request on each retry.

Related to #822

jasdel commented Oct 6, 2016

Hi @AMeng, I merged in #876, which should resolve the expired signature errors you're seeing. Let us know if you're still having any issues, or have feedback.

@jasdel jasdel closed this Oct 6, 2016
@diehlaws diehlaws removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Nov 7, 2019
skotambkar pushed a commit to skotambkar/aws-sdk-go that referenced this pull request May 20, 2021