Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: context: enable first class citizenship of third party implementations #28728

Closed
gobwas opened this issue Nov 11, 2018 · 50 comments
Closed

Comments

@gobwas
Copy link

gobwas commented Nov 11, 2018

Hello there,

This proposal concerns a way to implement custom context.Context type, that is clever enough to cancel its children contexts during cancelation.

Suppose I have few goroutines which executes some tasks periodically. I need to provide two things to the tasks – first is the goroutine ID and second is a context-similar cancelation mechanism:

type WorkerContext struct {
    ID uint
    done chan struct{}
}

// WorkerContext implements context.Context.

func worker(id uint, tasks <-chan func(*WorkerContext)) {
    done := make(chan struct{})
    defer close(done)

    ctx := WorkerContext{
        ID: id,
        done: make(chan struct{}),
    }

    for task := range w.tasks {
        task(&ctx)
    }
}

go worker(1, tasks)
go worker(N, tasks)

Then, on the caller's side, the use of goroutines looks like this:

tasks <- func(wctx *WorkerContext) {
    // Do some worker related things with wctx.ID.

    ctx, cancel := context.WithTimeout(wctx, time.Second)
    defer cancel()

    doSomeWork(ctx)
}

Looking at the context.go code, if the given parent context is not of *cancelCtx type, it starts a separate goroutine to stick on parent.Done() channel and then propagates the cancelation.

The point is that for the performance sensitive applications starting of a separate goroutine could be an issue. And it could be avoided if WorkerContext could implement some interface to handle cancelation and track children properly. I mean something like this:

package context

type Canceler interface {
    Cancel(removeFromParent bool, err error)
}

type Tracker interface {
    AddChild(Canceler)
    RemoveChild(Canceler)
}

func propagateCancel(parent Context, child Canceler) {
	if parent.Done() == nil {
		return // parent is never canceled
	}
	if p, ok := parentCancelCtx(parent); ok { // p is now Tracker.
		if err := p.Err(); err != nil {
			// parent has already been canceled
			child.Cancel(false, err)
		} else {
			p.AddChild(child)
		}
	} else {
		go func() {
			select {
			case <-parent.Done():
				child.Cancel(false, parent.Err())
			case <-child.Done():
			}
		}()
	}
}

func parentCancelCtx(parent Context) (Tracker, bool) {
	for {
		switch c := parent.(type) {
                case Tracker:
			return c, true                
		case *timerCtx:
			return &c.cancelCtx, true
		case *valueCtx:
			parent = c.Context
		default:
			return nil, false
		}
	}
}

Also, with that changes we could even use custom contexts as children of created with context.WithCancel() and others:

type myCustonContext struct {} // Implements context.Canceler.

parent, cancel := context.WithCancel()
child := new(myCustomContext)
context.Bind(parent, child)

cancel() // Cancels child too.

Sergey.

@gopherbot gopherbot added this to the Proposal milestone Nov 11, 2018
@rsc
Copy link
Contributor

rsc commented Nov 28, 2018

/cc @Sajmani

I'm not convinced we want to expose the detail of the exact current Cancel signature.
In general waiting for cancellation requires a goroutine and that can be heavy (not just in this one case but in other uses too). We might want to think about that more general problem than this one short-circuit.

@Sajmani
Copy link
Contributor

Sajmani commented Dec 2, 2018

/cc @bcmills

Bryan has brought this up in the past, that we should make it possible for custom context implementations to avoid creating goroutines like the built-in implementation does.

In this example, it seems like the worker ID could be carried as a context value without a custom implementation. This would get you the efficiency you want at the cost of a runtime type assertion to extract the worker ID.

@rsc
Copy link
Contributor

rsc commented Dec 12, 2018

We know that there's no way to avoid new goroutines. Maybe that's important, maybe it's not. It's honestly unclear. Maybe things are fine as they are. The specific proposal here is probably not acceptable: we don't want to commit to exposing that specific API.

@gobwas
Copy link
Author

gobwas commented Dec 23, 2018

@Sajmani seems like not only runtime type assertion, but also additional allocation to fill up interface{} header field with pointer to the, say, int ID?

@Sajmani
Copy link
Contributor

Sajmani commented Dec 23, 2018 via email

@rsc rsc added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jan 9, 2019
@rsc
Copy link
Contributor

rsc commented Jan 23, 2019

ping @gobwas, do you have any data to answer @Sajmani's question from Dec 23?

@gobwas
Copy link
Author

gobwas commented Jan 24, 2019

Hello @rsc @Sajmani,

Sorry for silence.

It causing an overhead in general. Not any real world significant impact example yet. My point was to get rid of such overhead by doing a little in context’s API.

@rsc
Copy link
Contributor

rsc commented Jan 30, 2019

In general new API has a very high cost in constraining future implementations, what new users have to learn and so on. We don't add new API speculatively for overhead without real-world significant impact. There needs to be a real benefit to outweigh this real cost. Let's close this until we have a more pressing need.

@gobwas
Copy link
Author

gobwas commented Jan 30, 2019

Okay, not a problem. By the way this may be implemented as an optional interface with just a type assertion, without adding cost of learning but adding flexibility of use.

@posener
Copy link

posener commented Jun 1, 2019

Proposal: context: Enable 1st class citizenship of 3rd party implementations

Problem statement

Third party context implementations suffer from performance degradation when being used alongside the standard library implementation, as every call to context.WithCancel, context.WithTimeout or context.WithDeadline results in invocation of a goroutine.

Above, the proposed solution is to expose Canceler and Tracker interfaces in the context package. It was declined because of the need to expose new interfaces, which are not sure to be needed, and to constrain the current context standard library implementation.

Proposal

I propose a compromise, in which, the current standard library implementation will use an unexported interfaces that contain exported functions. This solves the problem and addresses its disadvantages. I also propose a slightly different interface than the one in #28728. See draft implementation in posener/go@2641c4b:

// canceler is an optional interface for a context implementation for propagating
// cancellation. If a context implementation want to manage context
// cancellation it needs to cancel all its children when cancelled.
// This interface provides a way for a child to register itself in a parent context.
// When a context implements this interface and is cancelled, it promises to
// cancel all the registered contexts which have yet to unregister.
type canceler interface {
	// Register a new cancelable child for a context.
	Register(cancelable)
	// Unregister existing cancelable child.
	Unregister(cancelable)
}

The cancelable interface is a modified version of the current canceler interface. I found it awkward that the cancel method has the removeFromParent argument, as this responsibility should not be on the caller. I renamed the interface since whoever implements it enables being canceled and not canceling other contexts. In all the places it is used in the code it makes more sense now (at least to me).

// cancelable is an optional interface for a context implementation that can be
// canceled directly.
type cancelable interface {
	Cancel(err error)
	Done() <-chan struct{}
}
// child is an optional interface for a context implementation in which it returns its
// parent context. It is useful for context to expose this interface to prevent
// an extra goroutine when being called with a cancellable context conversion.
type child interface {
	// Return the parent context of a context.
	Parent() Context
}

This change is fully backward compatible.

Draft Implementation

See draft implementation in posener/go@2641c4b.

Since this proposal highly influences the design of the current context implementation, here is a draft implementation which shows the usage of these interfaces in the standard library.

Details

A nice thing that came up from this design is that now cancelCtx implements the cancelable and canceler interface, and valueCtx implements the child interface. The implementation of parentCancelCtx is much clearer now (it was renamed to lookupCanceler):

func lookupCanceler(parent Context) canceler {
	for {
		switch c := parent.(type) {
		case canceler:
			return c
		case child:
			parent = c.Parent()
		default:
			return &cancelerCtx{Context: parent}
		}
	}
}

From the code above, the cancelerCtx is a wrapper around Context that adds implementation of the canceler interface, by invoking the a goroutine in the Register function (Replaces the invocation in propagateCancel):

type cancelerCtx struct {
	Context
}

func (c *cancelerCtx) Register(child cancelable) {
	go func() {
		select {
		case <-c.Done():
			child.Cancel(c.Err())
		case <-child.Done():
		}
	}()
}

func (r *cancelerCtx) Unregister(child cancelable) {}

The valueCtx implements the child interface:

func (c *valueCtx) Parent() Context {
	return c.Context
}

And the cancelCtx implements the canceler and cancelable interfaces (very simple refactor of the existing logic).

Disadvantages

  • There might be performance cost of adding interfaces usage.

@posener
Copy link

posener commented Jun 14, 2019

@rsc can you please revise according to #28728 (comment)

@rsc
Copy link
Contributor

rsc commented Jun 28, 2019

Reopening. Note that we do not reread closed issues, so if we miss the GitHub mail, we never see them. Having reopened, this will appear in the proposal milestone searches again.

@av86743
Copy link

av86743 commented Jul 21, 2019

How this proposal will be making further progress from this point on?

Also: Can a specific release be assigned before proposal is marked Proposal-Accepted?

Thank you.

@ianlancetaylor
Copy link
Member

The proposal review team needs to take a serious look at the suggested API above and see if it makes sense.

It would help a great deal if other people looked at it and commented.

Accepting a proposal just means that it's OK for someone to do the work. It doesn't mean that someone will actually do the work. Go is an open source project, and we can't control what contributors do and when. We assign release numbers to bug reports to try to make sure that bugs are addressed for specific releases. We use the release blocker label on some critical issues to say that we will not make a release until the issue is fixed in some way. We do not normally assign release numbers to new feature work, which is what this would be, and when we do assign release numbers they are aspirational rather than prescriptive.

@posener
Copy link

posener commented Aug 22, 2019

@ianlancetaylor , I'm not sure why this proposal is labeled with WaitingForInfo.
I would also like to change the name of the issue to "Proposal: context: Enable 1st class citizenship of 3rd party implementations"
Can it make it to the decision forum?
I am willing to own it if it will be accepted - I already proposed a draft implementation in posener/go@2641c4b.
Thanks

@ianlancetaylor ianlancetaylor changed the title proposal: context: expose canceler interface proposal: context: enable first class citizenship of third party implementations Aug 22, 2019
@ianlancetaylor ianlancetaylor removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Aug 22, 2019
@ianlancetaylor
Copy link
Member

I updated the title and removed the WaitingForInfo label.

@ianlancetaylor
Copy link
Member

CC @Sajmani

@posener
Copy link

posener commented Sep 17, 2019

@ianlancetaylor can I help in anything to promote this?

@av86743
Copy link

av86743 commented Sep 17, 2019

@posener Go release dashboard - Pending proposals (summary is on the top.)

@rsc
Copy link
Contributor

rsc commented Sep 18, 2019

To summarize the discussion so far, this issue concerns the implementation of context.WithCancel(parent) (or context.WithDeadline, but we can focus on WithCancel).

WithCancel has signature:

func WithCancel(parent Context) (ctx Context, cancel CancelFunc)

It creates a new context ctx and cancel func and then wires them to the parent in two ways:

  1. It arranges that when the parent is canceled the ctx context will be canceled too (cancel propagates down).
  2. It arranges that when ctx is finished, the wiring of step 1 is removed (subcontexts don't pile up).

Right now if the parent is a known implementation from inside the context package, these steps are done by manipulating data structures, specifically p.children. But otherwise for a general parent implementation, WithCancel starts a goroutine (in propagateCancel). That goroutine waits on both parent.Done and child.Done and implements either the second half of step 1 or step 2, depending on which happens.

The problem addressed by this issue is the creation of one goroutine per WithCancel call, which lasts until the created child is canceled. The goroutine is needed today because we need to watch for the parent to be done. The child is a known implementation; the parent is not.

One thing we could do with no API changes (pointed out by @bradfitz) is to redo the internals of context to have only one goroutine per parent with an unknown implementation, making that goroutine in charge of all the parent's children. For the case where you have one parent context with many direct children, this would reduce from many goroutines to one. Even if a parent had grandchildren, as long as they were created with standard contexts you'd still end up with one goroutine.

That is, we can do one goroutine per non-standard context implementation instead of one goroutine per operation on a non-standard context implementation.

For the specific case in the top comment by @gobwas, the program today uses one background goroutine per task; the change suggested by @bradfitz would change it to just one goroutine, shared by all the tasks.

@posener and @av86743, you've both commented but I think not shown a concrete example problem motivating your interest. Would the change mentioned above (many goroutines turning into one goroutine) fix the problems you are seeing as well?

It seems like a reasonable path forward would be to improve the internals of context without any public API changes (not even unexported optional interfaces) - that will help everyone without any client code updates - and then see what problems still remain in real-world programs.

What do people think?

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/196521 mentions this issue: context: use fewer goroutines in WithCancel/WithTimeout

@rsc
Copy link
Contributor

rsc commented Sep 19, 2019

In WithCancel/WithDeadline, the context package wants to map parent back to the innermost "cancelable" context - a *cancelCtx - and then attach the child. Yesterday we were talking about how to make the package allocate fewer goroutines when that mapping fails, because parent is not a known implementation.

But what if we make the mapping succeed even when parent is an unknown implementation? Sameer is trying to get at that with the optional Parent method, but again we don't really want to add new methods, even optional ones. New API, even optional new API, complicates usage for all users, and if we can avoid that, we usually prefer to avoid it.

It occurred to me that you get almost all the way to the *cancelCtx by calling parent.Done. That is, for a not-yet-canceled context, Done returns a channel that uniquely identifies the underlying *cancelCtx. If we simply maintain a map from done channel to *cancelCtx, we have everything we need, even for a custom implementation, without any new API. Call parent.Done, look up the channel in our map, edit the associated *cancelCtx.

I sent CL 196521 doing exactly this. For any custom context implementation that passes along a done channel from an implementation created by the context package, there's no longer any goroutine.

There will still be a goroutine if WithCancel/WithDeadline are passed a parent context implementation that allocates its own done channels. I don't think that's particularly common, but of course I'd love to hear otherwise.

We could optimize further by applying the suggestion from yesterday, allocating only one goroutine per unknown done channel instead of one per WithCancel/WithDeadline call using a context with an unknown done channel. But since at the moment I expect there are approximately zero context with unknown done channels floating around, we may not need to do that at all.

CL 196521 has no new API, so I think it can be submitted independent of the outcome in this issue. But after submitting it, this issue may no longer be an issue.

Especially helpful at this point would be real-world examples of problems that CL 196521 does not solve; in particular, real-world examples of context implementations that allocate their own done channels.

@Sajmani
Copy link
Contributor

Sajmani commented Sep 19, 2019 via email

@rsc
Copy link
Contributor

rsc commented Sep 25, 2019

CL 196521 looks likely to land soon (to be in Go 1.14). Assuming it does, are there any real-world examples of code that would still allocate goroutines in WithCancel etc?

@gobwas
Copy link
Author

gobwas commented Sep 25, 2019

Hi @rsc,

Agree with @Sajmani – clever observation.

Am I understand it right, that for the example from the top comment I still need to embed some *cancelCtx for my WorkerContext to prevent additional goroutine spawn? I mean:

type WorkerContext struct {
    ID int

    context.Context
    cancel context.CancelFunc
}

func NewWorkerContext(id int) *WorkerContext {
    ctx, cancel := context.WithCancel(context.Background())
    return &WorkerContext{
        ID: id,

        Context: ctx,
        cancel: cancel,
    }
}

@posener
Copy link

posener commented Sep 26, 2019 via email

@gobwas
Copy link
Author

gobwas commented Sep 26, 2019

@posener, looks like not only Value() but Done() too.

@rsc
Copy link
Contributor

rsc commented Sep 26, 2019

Every context that wraps another should pass along calls to Value with unknown keys to the wrapped context. That should not be a problem.

If you want to make a custom context with a custom done channel (that is, a custom result from Done) and then use that context with WithTimeout etc, then yes, that will still incur a goroutine. Do you have a reason to do that? All the arguments for custom context implementations that I have seen are about more efficient Value methods for large sets of key-value pairs, not a custom done channel.

gopherbot pushed a commit that referenced this issue Sep 26, 2019
If the parent context passed to WithCancel or WithTimeout
is a known context implementation (one created by this package),
we attach the child to the parent by editing data structures directly;
otherwise, for unknown parent implementations, we make a
goroutine that watches for the parent to finish and propagates
the cancellation.

A common problem with this scheme, before this CL, is that
users who write custom context implementations to manage
their value sets cause WithCancel/WithTimeout to start
goroutines that would have not been started before.

This CL changes the way we map a parent context back to the
underlying data structure. Instead of walking up through
known context implementations to reach the *cancelCtx,
we look up parent.Value(&cancelCtxKey) to return the
innermost *cancelCtx, which we use if it matches parent.Done().

This way, a custom context implementation wrapping a
*cancelCtx but not changing Done-ness (and not refusing
to return wrapped keys) will not require a goroutine anymore
in WithCancel/WithTimeout.

For #28728.

Change-Id: Idba2f435c81b19fe38d0dbf308458ca87c7381e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/196521
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@av86743
Copy link

av86743 commented Sep 26, 2019

What would be proper place to document examples of custom context patterns?

@rsc
Copy link
Contributor

rsc commented Sep 26, 2019

If there is a very compelling reason to make a custom context, once that we want all users to be aware of, then we could add an example. I am honestly unaware of such a reason. The only time I've heard of custom contexts is when people want to add lots of key-value pairs, and the fact that the current system doesn't scale for that is more a bug than anything else.

I have never heard of wanting a context implementation for providing a custom done channel. And even if you did have a reason, you could still use:

ctx, cancel := context.WithCancel(context.Background())
ch = ctx.Done()

to allocate the channel in a way that would avoid the goroutine.

@OneOfOne
Copy link
Contributor

@rsc not 100% sure this applies here, but how about allowing a custom cancel-able context with a custom error, which I've ran into multiple times.

Simply exporting the canceler interface and canceler.cancel func would make custom implementations much easier and efficient without actually effecting anything in the stdlib.

@bcmills
Copy link
Contributor

bcmills commented Sep 26, 2019

@OneOfOne, if you want a custom cancelable context with a custom error, you can still use context.WithCancel internally within its implementation. (You can forward the Done and Value methods and still substitute your own Err method.)

@gobwas
Copy link
Author

gobwas commented Sep 27, 2019

If you want to make a custom context with a custom done channel (that is, a custom result from Done) and then use that context with WithTimeout etc, then yes, that will still incur a goroutine. Do you have a reason to do that? All the arguments for custom context implementations that I have seen are about more efficient Value methods for large sets of key-value pairs, not a custom done channel.

@rsc No, currently I can not see the case when custom done channel is required. That is, changes from CL are fine for the case I mentioned on top. Thanks!

@OneOfOne
Copy link
Contributor

@bcmills the problem with that is, a lot of APIs (gce/aws for example), will wrap your context with another one when you pass it, and if you cancel your custom context, it won't work.

https://play.golang.org/p/7sW-DZy-7SZ

@bcmills
Copy link
Contributor

bcmills commented Sep 27, 2019

@OneOfOne, as long as the custom context is well-behaved — that is, as long as it forwards the Value methods for unknown keys and the Done channel unless it injects its own channel — then it will work.

The example you provided cancels synchronously using gotip:

$ gotip version
go version devel +ac1d440e Thu Sep 26 23:54:29 2019 +0000 linux/amd64

example.com$ gotip mod init example.com
go: creating new go.mod: module example.com

example.com$ cat > main.go
package main

import (
        "context"
        "fmt"
)

func main() {
        std()
        custom()
}

type C struct {
        context.Context
}

func std() {
        ctx0, cfn0 := context.WithCancel(context.Background())
        ctx1, cfn1 := context.WithCancel(ctx0)
        defer cfn1()

        cfn0()

        fmt.Println("ctx1 is done:", done(ctx1))
}

func custom() {
        ctx0, cfn0 := context.WithCancel(context.Background())
        ctx1, cfn1 := context.WithCancel(C{ctx0})
        defer cfn1()

        cfn0()

        fmt.Println("ctx1 is done:", done(ctx1))
}

func done(ctx context.Context) bool {
        select {
        case <-ctx.Done():
                return true
        default:
                return false
        }
}

example.com$ go1.13.1 run main.go
ctx1 is done: true
ctx1 is done: false

example.com$ gotip run main.go
ctx1 is done: true
ctx1 is done: true

example.com$

@bcmills
Copy link
Contributor

bcmills commented Sep 27, 2019

(And, of course, if you want to optimize an upstream API that is adding a Done channel, you can send them a PR to do that in terms of context.WithCancel.)

@rsc
Copy link
Contributor

rsc commented Oct 2, 2019

Given that the CL landed as 0ad3686 and there have been no reports of real-world use cases that are not optimized by that change, this proposal now seems like a likely decline (no changes needed anymore).

Leaving open for a week for final comments.

@Sajmani
Copy link
Contributor

Sajmani commented Oct 3, 2019

Future work: provide a way to identify which custom context implementations still trigger goroutine creations. This might be as simple as providing a const-guarded block to print parent.String() when we create the goroutine.

@DmitriyMV
Copy link
Contributor

I think should be good to document the new behavior. At least in release notes.

@rsc
Copy link
Contributor

rsc commented Oct 9, 2019

Added a relnote to the CL to remind us for the release notes.
No final comments otherwise since #28728 (comment).

Declined (performance issue resolved a different way)

@rsc rsc closed this as completed Oct 9, 2019
@navytux
Copy link
Contributor

navytux commented Dec 20, 2019

@rsc wrote at #28728 (comment):

There will still be a goroutine if WithCancel/WithDeadline are passed a parent context implementation that allocates its own done channels. I don't think that's particularly common, but of course I'd love to hear otherwise.

@rsc wrote at #28728 (comment):

are there any real-world examples of code that would still allocate goroutines in WithCancel etc?

I'm a bit late here, but below is example of real-world usage where custom context needs to create its own done channel: this need arises when one wants to merge two contexts to cancel the work when either the service is stopped, or when caller's context is canceled. Let me quote https://godoc.org/lab.nexedi.com/kirr/go123/xcontext#hdr-Merging_contexts to explain:

---- 8< ----

Merging contexts

Merge could be handy in situations where spawned job needs to be canceled whenever any of 2 contexts becomes done. This frequently arises with service methods that accept context as argument, and the service itself, on another control line, could be instructed to become non-operational. For example:

func (srv *Service) DoSomething(ctx context.Context) (err error) {
	defer xerr.Contextf(&err, "%s: do something", srv)

	// srv.serveCtx is context that becomes canceled when srv is
	// instructed to stop providing service.
	origCtx := ctx
	ctx, cancel := xcontext.Merge(ctx, srv.serveCtx)
	defer cancel()

	err = srv.doJob(ctx)
	if err != nil {
		if ctx.Err() != nil && origCtx.Err() == nil {
			// error due to service shutdown
			err = ErrServiceDown
		}
		return err
	}

	...
}

func Merge

func Merge(parent1, parent2 context.Context) (context.Context, context.CancelFunc)

Merge merges 2 contexts into 1.

The result context:

  • is done when parent1 or parent2 is done, or cancel called, whichever happens first,
  • has deadline = min(parent1.Deadline, parent2.Deadline),
  • has associated values merged from parent1 and parent2, with parent1 taking precedence.

Canceling this context releases resources associated with it, so code should call cancel as soon as the operations running in this Context complete.

---- 8< ----

To do the merging of ctx and srv.serveCtx done channels current implementation has to allocate its own done channel and spawn corresponding goroutine:

https://lab.nexedi.com/kirr/go123/blob/5667f43e/xcontext/xcontext.go#L90-118
https://lab.nexedi.com/kirr/go123/blob/5667f43e/xcontext/xcontext.go#L135-150

context.WithCancel, when called on resulting merged context, will have to spawn its own propagation goroutine too.

For the reference here is context.Merge implementation in Pygolang that does parents - child binding via just data:

https://lab.nexedi.com/kirr/pygolang/blob/64765688/golang/context.cpp#L74-76
https://lab.nexedi.com/kirr/pygolang/blob/64765688/golang/context.cpp#L347-352
https://lab.nexedi.com/kirr/pygolang/blob/64765688/golang/context.cpp#L247-251
https://lab.nexedi.com/kirr/pygolang/blob/64765688/golang/context.cpp#L196-226

Appologize, once again, for being late here, and thanks beforehand for taking my example into account.
Kirill

/cc @Sajmani

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests