Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support context pattern #75

Closed
justonia opened this issue Jan 30, 2015 · 28 comments
Closed

Support context pattern #75

justonia opened this issue Jan 30, 2015 · 28 comments
Labels
feature-request A feature should be added or improved.

Comments

@justonia
Copy link

See this blog article:

https://blog.golang.org/context

We use this heavily in all aspects of our SOA, and it allows us to inject request scoped parameters to systems that need them to provide useful information -- such as logging or in-memory caches. Here is a drastically simplified example:

type key int
const LoggerKey key = 0
const RequestIdKey key = 1
const MethodNameKey key = 2

type Logger interface {
    Debug(ctx context.Context, message, args ...interface{})
}    

// alternatively the context interface could have the log methods on them to avoid this step
func LoggerFrom(ctx context.Context) Logger {
    if logger, ok := ctx.Value(LoggerKey).(Logger); ok {
        return logger
    } else {
        return nullLogger
    }
}

ctx := context.WithCancel(context.Background())
ctx = context.WithValue(ctx, RequestIdKey, uuid.NewV4())
ctx = context.WithValue(ctx, LoggerKey, myLogger)

someFrontendMethod(ctx, other, arguments)

...

func someFrontendMethod(ctx context.Context, other int, arguments string) {
    ctx = context.WithValue(ctx, MethodNameKey, "someFrontendMethod")
    result, err := aws.DynamoDb().Get(ctx, ...) 
    ...
}

// totally made up interface, not implying this should be the real AWS one
func (d *DynamoDb) Get(ctx context.Context, some int, args string) (*Result, error) {
    LoggerFrom(ctx).Debug(ctx, "dynamodb.get - starting request")
    ...
}

This would then output something like:

[DEBUG] [requestId=5f0712749191adda1079e71c2403ec24d65ebf32] someFrontendMethod: dynamodb.get - starting request

Additionally, in the first method, you would be able to timeout the call to someFrontendMethod by closing the Done channel on the context. Depending upon how much of the callstack is using the context, every goroutine underneath it would be notified of the cancelled context and abort as soon as possible. In the case of the AWS lib, this could be used to abort an exponential backoff algorithm that is currently sleeping without the caller of the lib having to cancel manually somehow.

@lsegal lsegal added the feature-request A feature should be added or improved. label Jan 30, 2015
@kidoman
Copy link

kidoman commented Mar 14, 2015

+1 to this! Go is standardizing around golang.org/x/net/context

gRPC (http://www.grpc.io/) recently open sourced is also heavily built around context package. It would be a shame for one of the most important libraries in Go (aws !) to not be inline with the trend.

@lsegal
Copy link
Contributor

lsegal commented Mar 15, 2015

@justonia @kidoman thanks for reporting and supporting this feature. Can you elaborate a little bit more on how the SDK itself would expose this context pattern? The SDK currently does not expose goroutines and we are leaving this to be handled by consumers of the library. Since we don't deal with concurrency primitives at the SDK level, I'm trying to understand how the context pattern would fit into our design.

It seems to me like the context pattern could be applied by being built on top of the SDK via composition (in the same way that goroutines themselves would be). Is there something the SDK needs to do explicitly to support this pattern being built out around the core API?

@justonia
Copy link
Author

The context pattern isn't itself intimately tied to goroutines. Cancellation of one is an obvious example they provide, but it can be useful with or without them. You will no doubt as part of this library need to support exponential backoff for various Amazon APIs that require it. If the code that implements the backoff has access to a context object, it would be able to prematurely exit the backoff loop if the context were cancelled. Here is an extremely rough example that requires no goroutines:

func awsBackoffRequest(ctx context.Context, .......) error {
    for i := 0; i < maxBackoffAttempts; i++ {
        select {
        case <-ctx.Done:
            return ctx.Err() // e.g. ErrCancelled
        default: // context not cancelled, continue with attempt
        }

        // make connection attempt
        ...
    }
    return nil
}

It is very important to understand the context object is meant to be a request-scoped object only. I may have one instance of an AWS service instantiated that I intend to share across many frontend requests, but each request itself would have its own context object that would be passed down all the way to any calls into the AWS lib. It would be a mis-use of the pattern to make the context object part of a Config struct passed once to a "New*" method of a service, since this would now prevent per-request cancellation of an API call.

Besides cancellation, internally we use the context object for dependency injection of request scoped parameters. For instance, our context object is a required argument for our logging interface. This allows us to inject parameters that the logging system wants to decorate messages with, like an HTTP request ID, without intermediate systems having to have this information or even know they are part of an HTTP request cycle. We also inject call stack information into the context object so that our log messages can provide useful debug information we can run through analytics.

@jasdel
Copy link
Contributor

jasdel commented May 4, 2015

Thank you very much for the examples and information. Context is a very interesting pattern, which deserves more investigation. We'd like to hold off updating the SDK wholesale to use Context for the moment, and watch to see where the Go community goes with this pattern.

One initial roadblock to integrating Context would be the need to refactor the API operations to be asynchronous instead of synchronous. The time and effort to refactor the SDK to be async is not a problem. Our primary concern arises that the refactor would impose an async/concurrency model on all developers using the SDK. With the SDK's current pattern of sync API operations developers are free to use any concurrency pattern they would like. This style seems to be very idiomatic in Go where function calls are only async if absolutely needed.

The SDK's current design allows nearly any concurrency model to be used on top of the API requests. This includes Context. Though I see having Context built in would cut down on boilerplate code wrapping the SDK. Also request specific cancellation handling isn't really supported fully at the moment. You could implement with a custom Send Handler, and a non-trivial amount of code to support cancellation to support per request cancellation.

Also on a side note you can use the API's Request functions to separate building a request from sending it.

output, req := s3.ListBucketsRequest(&s3.ListBucketsInput{})
// output will be populated when Send() returns.
err := req.Send()
...

The key/value data payload Context provides is interesting. Though we would lose some of our compile time type validation if the SDK were to use the values passed in. Alternatively having concrete types for the values would reduce the need or runtime type casting or checking.

@jasdel
Copy link
Contributor

jasdel commented Aug 5, 2015

Closing since the SDK should not require and specific concurrency model. Using wrapper functions around the SDK's Request operations the Context library can be used.

@jasdel jasdel closed this as completed Aug 5, 2015
@jharlap
Copy link

jharlap commented Aug 8, 2015

Unfortunate to see this getting closed. An example where context support would be really nice is when using SQS ReceiveMessage with non-zero WaitTimeSeconds. In this case, cancelling a context while the ReceiveMessage request is just waiting idly would be a very reasonable thing to do.

@justonia
Copy link
Author

justonia commented Aug 8, 2015

Yea I agree, the context pattern is not just for concurrency and this is a great example of a use case for it.

@jasdel
Copy link
Contributor

jasdel commented Aug 8, 2015

Thanks for the feedback. Context certainly is a powerful tool. With that said we would like to avoid requiring users of the SDK to use a specific upstream dependencies within their applications. Especially since Context is still marked as an experimental package, and is subject to change. Once Context is moved into the stdlib I think we should reevaluate that decision.

Would adding a Cancel method to aws.Request address the use case you are looking for? With this pattern you would still be able to proactively cancel outstanding requests.

req, result := sqsSvc.ReceiveMessageRequest(...)
err := req.Send()

// In some other  goroutine
err := req.Cancel()

// Request.Cancel() would preform something like the following
if t, ok : req.HTTPClient.Transport.(http.Transport); ok {
   t.CancelRequest(req.HTTPRequest)
}

@jasdel jasdel reopened this Aug 8, 2015
@lsegal
Copy link
Contributor

lsegal commented Aug 8, 2015

It's worth noting that the above code provided by @jasdel is already usable today in the public API if you do want to cancel early. Adding req.Cancel() would be a convenience method only.

@kidoman
Copy link

kidoman commented Aug 8, 2015

I always thought that "golang.org/x" was home for packages which the team does not want to add to the stdlib but is a critical part of the golang eco-system. While I cannot comment about the experimental nature of the other packages under "golang.org/x" but looking at the widespread usage of golang.org/x/net/context (https://godoc.org/golang.org/x/net/context?importers) I feel that betting on this package would be sound.

Writing code using contexts (particularly when controlling the deadline/timeout of a subtree of calls) is quite elegant. I would rather use it everywhere vs looking for API specific cancellation mechanisms.

For example, this things like this: https://godoc.org/google.golang.org/api/storage/v1#ObjectsInsertCall.ResumableMedia

Here, the cancellation can be dictated by the parent caller (since everyone is passing the parents ctx around, or a ctx dervied from the parents) quite nicely.

@lsegal
Copy link
Contributor

lsegal commented Aug 8, 2015

@kidoman

I always thought that "golang.org/x" was home for packages which the team does not want to add to the stdlib but is a critical part of the golang eco-system.

/x/ is explicitly unstable: https://github.com/golang/go/wiki/SubRepositories -- the widespread usage does not mean very much, since breaking changes can and have occurred (even through the short lifespan of this project when we had previously used some packages from /x/). We are attempting to have very strict backward compatibility guarantees on the first major version of our SDK, so we cannot base our entire architecture on a library that is explicitly not making those same guarantees. Instead, we are inverting are architecture such that the core codebase is stable but extensible enough to support these abstractions when built on top.

I would rather use it everywhere vs looking for API specific cancellation mechanisms.

The idea here is that it should be possible to build context patterns on top of the primitives that the SDK exposes. In other words, you should be able to wrap API specific cancellation mechanisms in a nice context package without affecting core SDK logic. After you write this abstraction once, you should not have to deal with the API directly, if that's your specific goal.

If someone is having a problem building out these abstractions, I think that's a better conversation to have at this time. One of the goals of the SDK is to provide a solid core foundation to compose other abstractions if they are not yet provided, so if it is not possible to throw context into the mix, that's something we should look at making more feasible.

@jasdel
Copy link
Contributor

jasdel commented Oct 8, 2015

Thank you very much for the discussion and feedback if the SDK should be using the Context library and pattern. Since we would like to ensure the SDK does not impose a specific concurrency pattern we would like to hold off integrating Context into the SDK. Since the SDK doesn't impose its own pattern you are still free to build your own concurrency pattern on top of it, including Context if that makes sense for your application. Again, thank you for your feedback.

@jasdel jasdel closed this as completed Oct 8, 2015
@kidoman
Copy link

kidoman commented Oct 17, 2015

@lsegal:

Please have a look at this: https://groups.google.com/forum/#!topic/golang-dev/cQs1z9LrJDU

TL;DR: net/context is moving into the stdlib.

@seiffert
Copy link
Contributor

seiffert commented May 22, 2016

I've spent some time on this topic and thought about different ways to solve this.
My problem is mainly with DynamoDB and throttled requests. When one misses to increase the provisioned capacity of a table, this leads to throttled requests, which in turn causes the SDK to retry its requests with exponential backoff. This causes long request times which cause timeouts in our system. By using context objects, we have more control over the time to fail in such situations.

First I thought that implementing a custom request.Retryer would be enough. Most API requests are really quick anyway and this way one could prevent new requests being started after the context was cancelled. The problem is, that a request.Retryer is really not capable of deciding whether to perform further retries. It has a method ShouldRetry(*request.Request) bool which I would expect to be consulted for this decision. However, it turns out that it is not called in most cases and that the decision is mostly done in a different place.

type Retryer interface {
    RetryRules(*Request) time.Duration
    ShouldRetry(*Request) bool
    MaxRetries() int
}           

The next thing I evaluated was performing all SDK operations asynchronously and stop waiting for the responses as soon as the surrounding context was cancelled.

var (
    errCh = make(chan error)
    outCh = make(chan *dynamodb.ScanOutput) 
)

go func() {
    out, err := client.Scan(&dynamodb.ScanInput{...})
    if err != nil {
        errCh <- err
        return
    }
    outCh <- out
}()

select {
    case <-ctx.Done():
        return ctx.Error()
    case err := <-errCh:
        return err
    case out := <-outCh:
        return doStuffWithOutput(out)
}

This turned out to be a bad idea pretty quickly. For safe operations (speak "read-only") this is ok, for operations that change the state of a resources however, one cannot be sure whether the operation actually was performed or not.

Digging deeper in the SDK, I read more about handlers. Apparently everything related to the request/response cycle is done in handlers. One signs the request, another one validates its parameters, one validates the response and so on. In order to modify something in the lifecycle of a request, one can add or modify a handler.
Normally when performing requests to a backing service via HTTP, I prefer to use the x/net/context/ctxhttp package which provides helper methods to perform cancellable HTTP requests. So why not use them in the AWS SDK's "send handler"? This is what I did and until now I'm quite confident about this approach.
By replacing the default corehandlers.SendHandler with a very similar implementation that uses ctxhttp.Do(...) instead of the default HTTP client's Do method, all API requests respect the lifetime of a context. It is important to note that one should not do this with the handlers configured in a SDK session object. Contexts are most of the times only valid during one request and therefore need to be configured on an SDK operation basis.

I created a really small library (ctxaws) that does exactly this. Does this approach make sense to you? I'd really like to get some feedback on this.

@lsegal
Copy link
Contributor

lsegal commented May 22, 2016

@seiffert

Just a preliminary note here to avoid confusion: the SDK team can provide a better / more up-to-date answer, but I just wanted to chime in with some extra info from when I was working on this codebase (I no longer do), so here goes:

Having separable context logic seems sensible, and ctxaws looks very cool. A couple things to note, though. Firstly, there are a number of ways to solve this problem, and many of them do not require async. Specifically, you probably don't actually want to be canceling HTTP requests anyway (and there are synchronous ways to do that with HTTP timeouts, as ctxaws seems to use).

One way to do it would be to lower your MaxRetries configuration parameter. If exponential backoff really is creating a slowdown, you might want to just not retry that many times and try later. You can even completely turn it off and use your own retry logic outside of the request context. That's a little uglier, but it works.

Also, contrary to your above statements, you can control precisely how and when a request retries. First, ShouldRetry is only ignored when the request already has an explicit answer about whether it should retry from an earlier part of the request cycle. This mostly only happens for clear scenarios like networking errors or checksum issues; basically, things that you should probably not be overriding anyway. In other words, it should be safe enough to implement Retryer. That said, if you really wanted full control of retryability, you can do so by hooking into the ValidateResponse step and (re-)setting r.Retryable yourself. Both of these allow you to cancel "atomically", i.e., stopping between HTTP requests rather than sending an HTTP request and canceling while it's been sent (a somewhat unsafe move if the server already received the request, as you noted).

Normally when performing requests to a backing service via HTTP, I prefer to use the x/net/context/ctxhttp package which provides helper methods to perform cancellable HTTP requests. So why not use them in the AWS SDK's "send handler"? This is what I did and until now I'm quite confident about this approach.

The issue here is that the "send handler" is not the entire request, and, in fact, only a small part of it. In the case of a low latency service like DynamoDB, the Send portion of the request is only about ~5-50ms of the request. The actual retry delay happens in further handlers, so wrapping just Send inside of a context would not make the request cancelable in your specific scenario, namely, when your CPU is waiting on exponential backoff. Send is also the least safe place to cancel at, because of the notes above. You should use regular HTTP timeouts for that.

If the SDK does ever end up using contexts, they would have to use it in the way you implemented it with ctxaws, wrapping the entire request handler chain inside of a context object, not just individual handlers. Basically it would have to be composed the other way around. As I mentioned before, a separable and optional context package like the one you created is actually a pretty solid idea and, at least from my perspective, how I envisioned context support being cleanly built on top of the SDK.

@seiffert
Copy link
Contributor

Thank you @lsegal for your detailed response!

What I extract from it are the following thoughts:

  • cancelling HTTP requests is usually not what we want (however I think that HTTP timeouts basically cause the same issues - in both cases you don't know whether the server got the message)
  • I really should go back to implementing a custom Retryer in order to stop retrying when the context has exceeded
  • In order to do this properly, one would need to adapt the whole handler chain. Here, I still don't know how to improve my solution. If I understand you correctly, you suggest to set sensible HTTP timeouts and rely on a custom Retryer otherwise, correct?

Thanks again, feedback like this was exactly what I was looking for!

@lsegal
Copy link
Contributor

lsegal commented May 23, 2016

@seiffert you're right that Timeout raises the same potential issues. If you are not having network slowdown issues, then Timeout isn't what you want anyway. An HTTP request that was timed out may actually trigger a retry in the SDK (with backoff, though I'm not 100% sure if it actually retries, currently), so it doesn't fundamentally solve the problem anyhow. You can completely omit the Timeout portion of this solution and still get the most robust version of what you are looking for.

And yes, implementing the Retryer or ValidateResponse step to check whether your context has exceeded would be the best way to do it. Doing this solves the problem of being able to effectively cancel a request at a safe stop point during the request's lifetime when it is wrapped in a controlling context object.

If I understand you correctly, you suggest to set sensible HTTP timeouts and rely on a custom Retryer otherwise, correct?

To reiterate, only the Retryer (or ValidateResponse) is the necessary part. Timeouts can be useful in some cases where the network is the bottleneck (long requests due to latency / packet loss, or large amounts of data being pulled out, and ideally in a read operation). You may choose to enable timeouts in those cases, but in the specific case of a low-latency DynamoDB connection, it's probably not needed (your mileage may vary).

@ryanfowler
Copy link
Contributor

For anyone still interested in using context with aws-sdk-go, the solution appears quite simple with Go 1.7 now that you can set the context in an HTTP request:

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

// SQS ReceiveMessage
params := &sqs.ReceiveMessageInput{ ... }
req, resp := s.ReceiveMessageRequest(params)
req.HTTPRequest = req.HTTPRequest.WithContext(ctx)
err := req.Send()

@jasdel
Copy link
Contributor

jasdel commented Sep 2, 2016

Hi @ryanfowler Thanks for posting this suggestion. If you'd like to create a PR it would be great to have a simple example like this added to the SDK's example folder. Probably under the folder, example/aws/request.

If you don't mind I'd also like to add your example to the SDKs wiki.

@ryanfowler
Copy link
Contributor

@jasdel Sounds good, I'll submit a PR in the next couple days.

Definitely a good idea to add to the SDKs wiki, too 👍

@jasdel
Copy link
Contributor

jasdel commented Sep 2, 2016

Thanks, I've added a section to the SDK's wiki on using context with SDK requests.

@muhqu
Copy link

muhqu commented Sep 16, 2016

@ryanfowler @jasdel It looks like the req.HTTPRequest = req.HTTPRequest.WithContext(ctx) only works for the first attempt, but not if the request is automatically retried (due to expired credentials).

I guess the SDK needs to be adjusted to propagate req.HTTPRequest.Context() in case of retries. WDYT?

@ryanfowler
Copy link
Contributor

ryanfowler commented Sep 16, 2016

@muhqu @jasdel You're right. I've just submit #839 to take a shallow copy of the HTTP request (before assigning the URL, body, and Headers) when retrying. This will carry the context field over in Go 1.7, yet still be compatible with Go 1.x.

@seiffert
Copy link
Contributor

@jasdel it seems like the section you've added to the documentation was not moved to https://github.com/awsdocs/aws-go-developer-guide. Could you please re-add it?

@seiffert
Copy link
Contributor

Do you have an idea how to use contexts best for operations requiring pagination?

@jasdel
Copy link
Contributor

jasdel commented Sep 28, 2016

Thanks for letting us know @seiffert I'm working to get the context example added to the dev guide.

@jasdel
Copy link
Contributor

jasdel commented Sep 28, 2016

@seiffert pagination is a little bit more complicated. At first glance I can see the option of treating all pages as a single request, but this easily breaks down for timeout contexts. What should really be done here is a way to provide a new Context for the next page's request to use. But its not immediately obvious how to do this because the request.Request value is not exposed for Pagination functions. It might be possible to address this with the SDK's request handlers, but I think that would require separate service client instances per Pagination operation the user is looking to perform, which isn't a great option.

I think this deserves a feature request issue if you'd like to create one. I don't think this issue can be solved just with a doc update, but will need some code change along with it.

@seiffert
Copy link
Contributor

seiffert commented Sep 29, 2016

Ok, I opened a feature request: #861.
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved.
Projects
None yet
Development

No branches or pull requests

8 participants