Add LocalWaker support #191

Open
tvallotton opened this issue Mar 16, 2023 · 45 comments
Labels
A-async-await, api-change-proposal (a proposal to add or alter unstable APIs in the standard libraries), T-libs-api, WG-async

Comments

@tvallotton

tvallotton commented Mar 16, 2023

Proposal

Problem statement

The current implementation of Waker is Send + Sync, which means that it can be freely moved between threads. While this is necessary for work stealing, it also imposes a performance penalty on single threaded and thread per core (TPC) runtimes. This runtime cost is contrary to Rust's zero cost abstraction philosophy. Furthermore, this missing API has led some async runtimes to implement unsound wakers, both by neglect and intentionally.

Motivation, use-cases

Local wakers leverage Rust's compile-time thread safety to avoid unnecessary thread synchronization, offering a compile-time guarantee that a waker has not been moved to another thread. This allows async runtimes to specialize their waker implementations and skip unnecessary atomic operations, thereby improving performance.
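As a rough illustration of the cost being discussed, the sketch below contrasts a thread-safe wake flag built on the stable `std::task::Wake` trait with a single-threaded counterpart. `SharedFlag`, `LocalFlag` and the two helper functions are names made up for this example; `LocalFlag` is only a stand-in for what a local waker could do internally, not the proposed API.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

/// Thread-safe wake flag: `Waker` must be `Send + Sync`, so the flag is atomic
/// and the handle is an `Arc` (atomic reference counting).
struct SharedFlag(AtomicBool);

impl Wake for SharedFlag {
    fn wake(self: Arc<Self>) {
        self.0.store(true, Ordering::Release);
    }
}

/// Hypothetical single-threaded counterpart (not a std type): a plain `Cell`
/// behind an `Rc`, which the compiler refuses to send to another thread.
struct LocalFlag(Cell<bool>);

impl LocalFlag {
    fn wake(&self) {
        self.0.set(true);
    }
}

/// Wakes through a real `Waker` and reports whether the flag was set.
fn shared_wake_sets_flag() -> bool {
    let flag = Arc::new(SharedFlag(AtomicBool::new(false)));
    let waker = Waker::from(flag.clone());
    waker.wake_by_ref();
    flag.0.load(Ordering::Acquire)
}

/// Same observable behaviour, with no atomic operations involved.
fn local_wake_sets_flag() -> bool {
    let flag = Rc::new(LocalFlag(Cell::new(false)));
    let handle = Rc::clone(&flag); // non-atomic reference count increment
    handle.wake();
    flag.0.get()
}
```

Both helpers report that the flag was set; the difference is only in which operations (atomic or plain) were needed to get there.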

It is worth noting that while this is especially useful for TPC runtimes, work stealing runtimes may also benefit from performing this specialization, so this should not be considered a niche use case.

Solution sketches

Construction

Constructing a local waker is done in the same way as a shared waker. You just use the from_raw function, which takes a RawWaker.

use std::task::LocalWaker;
let waker = unsafe { LocalWaker::from_raw(raw_waker) };

Alternatively, the LocalWake trait may be used analogously to Wake.

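use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::rc::Rc;
use std::task::LocalWake;

thread_local! {
    /// A queue containing all woken tasks
    static WOKEN_TASKS: RefCell<VecDeque<Rc<Task>>> = RefCell::new(VecDeque::new());
}

struct Task(Box<dyn Future<Output = ()>>);

impl LocalWake for Task {
    fn wake(self: Rc<Self>) {
        // Waking a task just queues it again on the current thread.
        WOKEN_TASKS.with(|woken_tasks| {
            woken_tasks.borrow_mut().push_back(self)
        })
    }
}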

The safety requirements for constructing a LocalWaker from a RawWaker would be the same as a Waker, except for thread safety.

Usage

A local waker can be accessed with the local_waker method on Context and woken just like a regular waker. All contexts will return a valid local_waker, regardless of whether they are specialized for this case or not.

let local_waker: &LocalWaker = cx.local_waker(); 
local_waker.wake(); 

ContextBuilder

In order to construct a specialized Context, the ContextBuilder type must be used. This type may be extended in the future to allow for more functionality for Context.

use std::task::{Context, LocalWaker, Waker, ContextBuilder}; 

let waker: Waker = /* ... */; 
let local_waker: LocalWaker = /* ... */;

let context = ContextBuilder::new()
    .waker(&waker)
    .local_waker(&local_waker)
    .build()
    .expect("at least one waker must be set"); 

Then it can be accessed with the local_waker method on Context.

let local_waker: &LocalWaker = cx.local_waker(); 

If a LocalWaker is not set on a context, local_waker() would still return a valid LocalWaker, because a local waker can be trivially constructed from the context's waker.

If a runtime does not intend to support thread safe wakers, it should not provide a Waker to ContextBuilder, which will then construct a Context that panics in the call to waker().

Links and related work

What happens now?

This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.

@tvallotton tvallotton added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Mar 16, 2023
@withoutboats

withoutboats commented Mar 16, 2023

(NOT A CONTRIBUTION)

A local waker can be accessed with the local_waker method on Context and woken just like a regular waker. All contexts will return a valid local_waker, regardless of whether they are specialized for this case or not.

Should clarify what you mean by this: a context without a local waker set will just return its Waker as a LocalWaker, because every Waker is a valid LocalWaker (they are basically just local wakers that implement Send).

It'd be helpful probably to sketch out what the definition of Context would look like in this proposal to support all of these operations:

  1. Returning the Waker for .waker() if the waker is set, and panicking if not.
  2. Returning the LocalWaker for .local_waker() if the local waker is set, and returning the Waker if not.

Should be possible but it may be a bit tricky.


let context = ContextBuilder::new()
    .waker(&waker)
    .local_waker(&local_waker)
    .build()
    .expect("at least one waker must be set"); 

Maybe build should just panic if no waker is set, instead of returning a Result? Not clear what anyone would do except call expect on this result.

@tvallotton
Author

Yeah, currently Waker is transparent to RawWaker, so a transmute from a &Waker to a &LocalWaker should probably be fine, and it would require no clones (assuming LocalWaker would also be transparent to RawWaker).
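A minimal sketch of such a reference cast, under stated assumptions. `SketchLocalWaker` is a hypothetical stand-in, not the proposed std type: since code outside std cannot wrap `RawWaker` the way `Waker` does, it wraps std's `Waker` with `repr(transparent)` instead. `CountingWake` and `wake_through_cast` are likewise illustrative names.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

/// Hypothetical stand-in for the proposed `LocalWaker` (not the std type).
/// `repr(transparent)` guarantees the same layout as the wrapped `Waker`;
/// the zero-sized raw-pointer marker makes the wrapper `!Send` and `!Sync`.
#[repr(transparent)]
struct SketchLocalWaker {
    inner: Waker,
    _not_send: PhantomData<*const ()>,
}

impl SketchLocalWaker {
    /// Reinterprets a `&Waker` as a `&SketchLocalWaker` without cloning.
    fn from_waker_ref(waker: &Waker) -> &SketchLocalWaker {
        // SAFETY: `SketchLocalWaker` is `repr(transparent)` over `Waker`, so
        // both references point at identically laid-out data. Giving up the
        // thread-safety guarantees only restricts what the caller may do.
        unsafe { &*(waker as *const Waker as *const SketchLocalWaker) }
    }

    fn wake_by_ref(&self) {
        self.inner.wake_by_ref();
    }
}

/// Counts how many times it has been woken.
struct CountingWake(AtomicUsize);

impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Wakes a task through the casted reference and returns the wake count.
fn wake_through_cast() -> usize {
    let counter = Arc::new(CountingWake(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let local = SketchLocalWaker::from_waker_ref(&waker);
    local.wake_by_ref();
    counter.0.load(Ordering::SeqCst)
}
```

The cast only goes one way: nothing here turns a local waker back into a `Waker`, which is the direction that would be unsound.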

@withoutboats

(NOT A CONTRIBUTION)

possibly LocalWaker could also just be defined as struct LocalWaker(Waker) and you could store the waker as a LocalWaker in Context. Would be a bit counterintuitive but would avoid transmutes.

@pitaj

pitaj commented Mar 16, 2023

Agreed, please outline the exact function signatures for WakerBuilder and any added or changed functions and fields for Context. I am wondering if a builder is really warranted here. Do we plan on adding even more options to Context?

@pitaj

pitaj commented Mar 16, 2023

Do we want to allow constructing a Context with only a LocalWaker (no Waker)? In that case, calling .waker() would panic? Can we somehow make that a monomorphization error instead?

@tvallotton
Author

tvallotton commented Mar 16, 2023

Note I proposed a ContextBuilder type only, that for the moment will have:

  • local_waker: to set the local waker on context
  • waker: to set the thread safe waker on context.
  • build: to build the context.

The whole purpose of Context was to leave room for future extensibility, so I believe a builder is warranted.

@tvallotton
Author

Do we want to allow constructing a Context with only a LocalWaker (no Waker)? In that case, calling .waker() would panic? Can we somehow make that a monomorphization error instead?

I doubt it, contexts would be entirely dynamic with respect to their Waker support. Now, it would be possible to add this optional support later without it being a breaking change.

@pitaj

pitaj commented Mar 17, 2023

I'm a little confused.

I doubt it, contexts would be entirely dynamic with respect to their Waker support.

You're saying you doubt it would be possible to make this a monomorphization error?

Just to be clear, you're not proposing to make Waker optional here, just leaving it open for the future.

@pitaj

pitaj commented Mar 17, 2023

I think something like try_local_waker() -> Option<&LocalWaker> could be useful as well. I'm wondering if maybe local_waker() should just return an option. Then we don't have to worry about transmutes or anything. Let the user fall back to waker() themselves or unwrap to panic if they always expect a local waker to exist.

@withoutboats

(NOT A CONTRIBUTION)

You're saying you doubt it would be possible to make this a monomorphization error?

Just to be clear, you're not proposing to make Waker optional here, just leaving it open for the future.

Via context builder, @tvallotton's proposal does make it possible to construct a Context without a waker. Calling waker on that Context would panic.

The alternative is to not provide a way to construct a Context without a waker. If Rust does that, executors that don't support being awoken from other threads will just have to define a Waker that panics when you wake it. This seems strictly worse to me.

I think something like try_local_waker() -> Option<&LocalWaker> could be useful as well. I'm wondering if maybe local_waker() should just return an option. Then we don't have to worry about transmutes or anything. Let the user fall back to waker() themselves or unwrap to panic if they always expect a local waker to exist.

There's no reason for this. It's always safe to wake the task from the same thread via the waker mechanism.

The point of this API is that reactors that wake from the same thread that they were polled on can use the local_waker mechanism instead of waker, and they will always just work. Since these reactors often store the waker somewhere, it's important that they be able to get the same type (LocalWaker) regardless of if they are actually waking via a thread safe mechanism or not. Reactors written that way can be run on a single threaded executor or a multithreaded executor and will be totally agnostic to it.

@pitaj

pitaj commented Mar 17, 2023

I was just thinking that if someone is trying to enhance performance by using local wakers, knowing if you actually got a local waker or if it just fell back to the non-local waker would be useful.

@tvallotton
Author

I think it is safe to say that 100% of the people who ask for a LocalWaker will be content with a Waker, given that all wakers are local wakers. It's preferable to have std perform the transmute rather than the ecosystem.

@withoutboats

(NOT A CONTRIBUTION)

I was just thinking that if someone is trying to enhance performance by using local wakers, knowing if you actually got a local waker or if it just fell back to the non-local waker would be useful.

They're not the same parties. The person who wants the enhanced performance is the one who selects the executor, which isn't the same one who writes the reactor and deals with wakers, it's the one who calls spawn.

@jkarneges

jkarneges commented Oct 25, 2023

Speaking of builders, the ability to build off of an existing Context would be useful for the context reactor hook concept, whereby the executor should be able to make an interface available to leaf futures via Context.

I'm imagining something like this to produce the top level context:

let context = ContextBuilder::new()
    .top_level_poller(&mut poller)
    .waker(&waker)
    .local_waker(&local_waker)
    .build()
    .expect("at least one waker must be set"); 

And then any sub-future that makes its own context in order to provide its own waker (selectors and such) would inherit from the previous:

let context = ContextBuilder::from(cx)
    .waker(&waker)
    .local_waker(&local_waker)
    .build()
    .expect("at least one waker must be set"); 

Such that an eventual leaf future could do:

let poller: Option<&mut dyn TopLevelPoller> = cx.top_level_poller();

There may be other reasons to want to inherit config or interfaces from Context like this, for example I/O budgeting. Just something to keep in mind when changing the API of Context for LocalWaker.

@withoutboats

(NOT A CONTRIBUTION)

@rust-lang/libs-api What is needed to make progress on this proposal? It seems like it has been dropped. You made the breaking change to support it last year (the actually difficult bit), but then didn't implement the API to take advantage of the change.

@tvallotton
Author

tvallotton commented Nov 23, 2023

From what I understand, the libs team needs to review the proposal, but they review once a week and there are a lot of proposals in the queue. Not to say that I don't share your frustration.

@dtolnay dtolnay added the I-libs-api-nominated Indicates that an issue has been nominated for discussion during a team meeting. label Nov 23, 2023
@tvallotton
Author

Something that may be interesting to consider is whether the thread safe waker could be lazily initialized on the call to waker from a LocalWaker. I can picture a couple of ways this could be leveraged for some optimizations. I'm thinking of the following pseudocode:

thread_local! {
    // storage for wakers sent to other threads
    static SHARED_WAKERS: RefCell<HashMap<u32, Waker>> = RefCell::default();
}

fn create_thread_safe_waker(waker: &LocalWaker) -> Waker {
    let id = new_id();
    store_waker_in_thread_local(SHARED_WAKERS, id, waker);
    create_waker_by_id(id)
}

let context = ContextBuilder::new()
    .local_waker(local_waker)
    .on_waker(create_thread_safe_waker)
    .build()
    .unwrap();

This way the cost of upgrading a LocalWaker to a Waker doesn't need to be paid upfront.
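One way the pseudocode above could be fleshed out on stable Rust, as a sketch under stated assumptions rather than the proposed API: the thread-safe waker carries only an id plus a channel back to the owning thread, and the owner maps ids back to local task state. `LocalTask`, `RemoteWake`, `drain_inbox`, `lazy_upgrade_demo`, the caller-chosen id and the use of an `mpsc` channel as the inbox are all hypothetical choices for this example.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;
use std::rc::Rc;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};
use std::thread;

/// Hypothetical local task state; `Rc` + `Cell` keep it off the atomic path.
struct LocalTask {
    woken: Cell<bool>,
}

thread_local! {
    /// Tasks that have handed out a thread-safe waker, keyed by id.
    static SHARED_TASKS: RefCell<HashMap<u32, Rc<LocalTask>>> = RefCell::new(HashMap::new());
}

/// The thread-safe half: carries only an id and a channel back to the owner.
struct RemoteWake {
    id: u32,
    inbox: Mutex<Sender<u32>>,
}

impl Wake for RemoteWake {
    fn wake(self: Arc<Self>) {
        // Ignore the error if the owning thread has already shut down.
        let _ = self.inbox.lock().unwrap().send(self.id);
    }
}

/// Called lazily, the first time a `Send` waker is requested for `task`.
fn create_thread_safe_waker(id: u32, task: &Rc<LocalTask>, inbox: &Sender<u32>) -> Waker {
    SHARED_TASKS.with(|tasks| {
        tasks.borrow_mut().insert(id, Rc::clone(task));
    });
    Waker::from(Arc::new(RemoteWake {
        id,
        inbox: Mutex::new(inbox.clone()),
    }))
}

/// Run by the owning thread: turns queued ids back into local wake-ups.
fn drain_inbox(inbox: &Receiver<u32>) {
    for id in inbox.try_iter() {
        SHARED_TASKS.with(|tasks| {
            if let Some(task) = tasks.borrow().get(&id) {
                task.woken.set(true);
            }
        });
    }
}

/// Wakes a local task from another thread and reports whether it was marked woken.
fn lazy_upgrade_demo() -> bool {
    let (tx, rx) = channel();
    let task = Rc::new(LocalTask { woken: Cell::new(false) });
    let waker = create_thread_safe_waker(7, &task, &tx);
    thread::spawn(move || waker.wake()).join().unwrap();
    drain_inbox(&rx);
    task.woken.get()
}
```

Tasks that never ask for a thread-safe waker never touch the map or the channel, which is the point of deferring the upgrade.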

@m-ou-se
Member

m-ou-se commented Nov 28, 2023

I think it'd be useful to have a complete overview of the proposed API. (The LocalWaker type, the Context::local_waker method, the ContextBuilder, anything else?)

@m-ou-se
Member

m-ou-se commented Nov 28, 2023

cc @rust-lang/wg-async

@m-ou-se
Member

m-ou-se commented Nov 28, 2023

We discussed this in today's libs-api meeting. This seems like the logical next step after rust-lang/rust#95985, but we'd like to be sure that the async wg is aware of this before continuing. Thanks!

@Amanieu
Member

Amanieu commented Nov 28, 2023

I do have concerns about the builder API. I understand that Context is meant to be extensible, but a builder is a heavyweight solution in terms of API surface which should only be used if necessary. If there are only ever going to be 2 ways to construct a Context (with a Waker or with a LocalWaker) then it may be better to just have 2 explicit constructor functions on Context.

@tvallotton
Author

tvallotton commented Nov 28, 2023

I think it is quite possible we will end up moving some of the logic that is currently being handled with thread locals into context. For example, there is the scheduling budget proposal. I can also imagine some runtime agnostic way of accessing reactor utilities using context or something like that. And of course, some kind of reactor hook like the one that was mentioned by @jkarneges (which might work better if it used file descriptors instead of closures so reactors can share a thread, by the way).

It could also expose executor utilities like spawn. Basically, the way I see it, Context could help us slowly make futures less runtime specific.

Concerning the current proposal, we may want to offer both a Waker and LocalWaker for the same Context, which would require three constructors. A possible alternative would be to set these values directly on Context like so:

let mut context = Context::from_local_waker(waker); 
context.set_waker(waker); 

However, as it was pointed out in the IRLO thread:

If you use &mut self methods, a poll method (which gets context by &mut) could change the context in a way that will escape that method; with RawWaker not capturing lifetimes this is a bit of a footgun: imagine something like FuturesUnordered using this to set the waker to its newly constructed waker not considering the escape aspect. Probably &mut self methods are not a good idea here.

@dtolnay dtolnay removed the I-libs-api-nominated Indicates that an issue has been nominated for discussion during a team meeting. label Nov 28, 2023
@tvallotton
Author

tvallotton commented Nov 30, 2023

For what it's worth, I pushed this branch containing the changes in this proposal. You can check out the diff here. I am really interested in seeing this move forward, as I'm currently working on a project of my own that would benefit from the stabilization of this API.

@A248

A248 commented Dec 2, 2023

I'm not sure if this is the correct place to comment on the API that @tvallotton has sketched up. If there's a better place, let me know.


My main observation is that rather than duplicating the machinery behind Waker to create LocalWaker, there should be a clever way to re-use some of these structures. A LocalWaker is identical to a Waker in every way except for the static thread-safety guarantees.

We can perhaps leverage the fact that a runtime will only ever need 1 waker. Internally, a Context can store an enum of a LocalWaker and a Waker, with the variant depending on runtime choice. Since all Wakers are convertible to LocalWakers, a LocalWaker would always be available, but a Waker would require the appropriate variant.

enum WakerKind<'w> {
  Local(&'w LocalWaker),
  Sendable(&'w Waker)
}

impl<'w> WakerKind<'w> {
  fn local_waker(&self) -> &'w LocalWaker {
    match self {
      Self::Local(local) => local,
      Self::Sendable(sendable) => /* convert */
    }
  }
}

At least, I can't imagine a runtime implementing both a Waker and a LocalWaker at the same time -- this feels wrong. If a runtime provides a thread-safe Waker, that Waker should suffice to be a LocalWaker as well.


Additionally, there are some other ways I think the linked changes could be improved. These are relatively minor points by comparison, but still worth mentioning.

One is that Rust affords us the ability to create compile time-safe builders rather than panicking. It should be possible, via trait bounds, to make ContextBuilder::build only callable when the waker has been set. This is a major improvement over other languages because it prevents API usage errors -- and it is incorrect to assume that only async runtimes will construct Context. Besides, who wants to pay a runtime penalty because of an unnecessary panicking branch in ContextBuilder::build? Yes, I know that waker() will necessarily have an inelegant panicking branch, but that's an unfortunate cost which must be paid.

Also, it may be worth considering a blanket impl LocalWake for W where W: Wake. This accords with the idea that every Waker is a LocalWaker. Of course, one should always be cautious in introducing blanket implementations. They stay around forever.

If we had a redo on the API design, we would probably have that Wake: LocalWake + Send + Sync and LocalWake would define the relevant waking methods. However, that ship has probably sailed due to backwards compatibility concerns (e.g. it's possible to call Waker::wake() directly rather than using method syntax). I don't know of any way to circumvent this problem except by some kind of edition-related upgrade magic.

@tvallotton
Author

My main observation is that rather than duplicating the machinery behind Waker to create LocalWaker, there should be a clever way to re-use some of these structures. A LocalWaker is identical to a Waker in every way except for the static thread-safety guarantees.

Could you elaborate a little further on this thought? I'm not sure if you find that there is too much duplication in std or that there would be too much duplication on the applications.

If a runtime provides a thread-safe Waker, that Waker should suffice to be a LocalWaker as well.

This is the case in fact, as boats said earlier:

a context without a local waker set will just return its Waker as a LocalWaker, because every Waker is a valid LocalWaker (they are basically just local wakers that implement Send).

From the runtime's perspective, they will either store a bunch of wakers in their io-reactor, or a bunch of local wakers. They won't ever need to cast a waker to a local waker themselves.

At least, I can't imagine a runtime implementing both a Waker and a LocalWaker at the same time

I can imagine this. For example, tokio may want to add a local waker to access a Cell instead of a Mutex when waking a task. Or a single threaded runtime may want to support both kinds of wakers, even though the runtime itself only works with LocalWaker. I imagine there will be runtimes that can configure their support for Waker, either at the runtime level or at the task level.

One is that Rust affords us the ability to create compile time-safe builders rather than panicking. It should be possible, via trait bounds, to make ContextBuilder::build only callable when the waker has been set. This is a major improvement over other languages because it prevents API usage errors -- and it is incorrect to assume that only async runtimes will construct Context. Besides, who wants to pay a runtime penalty because of an unnecessary panicking branch in ContextBuilder::build?

I think I've seen this sort of thing in the ecosystem be done with deprecation warnings and #[deny(warnings)] to throw an error at compile time. I haven't seen it with trait bounds, so I'm not sure how it would look. Now, when implementing this I made everything const and inline hoping the assert! could be optimized out by the compiler. If this isn't the case, I think I would prefer having a try_build method, that if we really care about not unwinding, we can just unwrap_unchecked().
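For reference, a compile-time checked builder is usually written with the typestate pattern. The sketch below is only an illustration under assumptions: every type in it (`SketchWaker`, `SketchLocalWaker`, `SketchContext`, `SketchContextBuilder`, and the markers `Missing` and `Present`) is a hypothetical stand-in, and it uses a marker type parameter rather than an explicit trait bound.

```rust
use std::marker::PhantomData;

/// Marker types (hypothetical) tracking whether any waker has been supplied.
struct Missing;
struct Present;

/// Hypothetical stand-ins for the real waker and context types.
struct SketchWaker(&'static str);
struct SketchLocalWaker(&'static str);

struct SketchContext<'a> {
    waker: Option<&'a SketchWaker>,
    local_waker: Option<&'a SketchLocalWaker>,
}

struct SketchContextBuilder<'a, State> {
    waker: Option<&'a SketchWaker>,
    local_waker: Option<&'a SketchLocalWaker>,
    _state: PhantomData<State>,
}

impl<'a> SketchContextBuilder<'a, Missing> {
    /// A fresh builder starts in the `Missing` state.
    fn new() -> Self {
        SketchContextBuilder { waker: None, local_waker: None, _state: PhantomData }
    }
}

impl<'a, State> SketchContextBuilder<'a, State> {
    /// Setting either waker moves the builder into the `Present` state.
    fn waker(self, waker: &'a SketchWaker) -> SketchContextBuilder<'a, Present> {
        SketchContextBuilder {
            waker: Some(waker),
            local_waker: self.local_waker,
            _state: PhantomData,
        }
    }

    fn local_waker(self, local_waker: &'a SketchLocalWaker) -> SketchContextBuilder<'a, Present> {
        SketchContextBuilder {
            waker: self.waker,
            local_waker: Some(local_waker),
            _state: PhantomData,
        }
    }
}

impl<'a> SketchContextBuilder<'a, Present> {
    /// `build` exists only in the `Present` state, so it cannot fail.
    fn build(self) -> SketchContext<'a> {
        SketchContext { waker: self.waker, local_waker: self.local_waker }
    }
}

/// Builds a context with only a local waker and reports what it contains.
fn local_only_demo() -> (bool, Option<&'static str>) {
    let local = SketchLocalWaker("local");
    let cx = SketchContextBuilder::new().local_waker(&local).build();
    (cx.waker.is_some(), cx.local_waker.map(|w| w.0))
}
```

With this shape, `SketchContextBuilder::new().build()` is rejected by the compiler because `build` is not defined for the `Missing` state, so no runtime assertion is needed.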

If we had a redo on the API design, we would probably have that Wake: LocalWake + Send + Sync

I don't think this is all that bad. In fact, I think forcing users to implement LocalWake before Wake is rather annoying if they are not going to use it. And we don't really need the implementation to be able to cast a Waker to a LocalWaker.

@tvallotton
Author

On the note of try_build it might be useful to also have try_waker, so a future can inspect if the context supports Waker or if they need to make an alias to the &LocalWaker (probably with assistance from the runtime).

@traviscross

WG-async discussed this in our recent triage meeting, and we plan to discuss it further later this week.

@kpreid

kpreid commented Dec 4, 2023

Two ideas for possible API enhancements based on examining @tvallotton's branch:

  • The entire alloc::task module is currently conditional on cfg(target_has_atomic = "ptr"), but with the addition of LocalWaker, it has items that are in principle usable without atomics … except that Context unconditionally offers a Waker. This could be addressed as follows:
    • Remove that part of the cfg condition from the module.
    • Add cfg condition to Wake since it needs Arc.
    • Add cfg condition to Waker since it can't meaningfully be constructed without atomics.
    • Add cfg condition to Context::waker(), so that on non-atomic-ptr platforms, Context has only local_waker().
    • (In principle, Waker could be implemented in a hardware-specific way that isn't strictly involving atomic ptrs as rustc understands them, allowing further relaxing to have Waker but not Wake everywhere, but that's even more niche.)
  • Add impl<W: Wake> From<Arc<W>> for LocalWaker (corresponding to the existing impl for Waker). This will simplify the job of optionally-no_std code that wishes to offer either Wakers or LocalWakers and isn't looking for maximum performance, by not requiring it to meticulously use Rc everywhere and explicitly implement LocalWaker; instead it can use Arc<impl Wake> to produce either kind of waker. (This would still be possible without core assistance, but require an unsafe route through RawWaker.)

Both of these are additive, so they could be done later, but they'd be easy to do immediately.

@withoutboats

withoutboats commented Dec 6, 2023

(NOT A CONTRIBUTION)

I can imagine this. For example, tokio may want to add a local waker to access a Cell instead of a Mutex when waking a task. Or a single threaded runtime may want to support both kinds of wakers, even though the runtime itself only works with LocalWaker. I imagine there will be runtimes that can configure their support for Waker, either at the runtime level or at the task level.

I can affirm that this is desirable. It's plausible in a single threaded executor to have a waker that has special handling for being awoken from another thread, in case another thread triggers work for a task on that executor. The LocalWaker wouldn't need to include this check, but the Waker would. They could even use the same (thread safe) data type, but with different vtables.
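A rough sketch of that idea, with plain functions standing in for the two vtables. `TaskHeader`, `LOCAL_QUEUE`, `wake_local`, `wake_shared` and `two_paths_demo` are hypothetical names, and a `Mutex<Vec<u32>>` is just one possible inbox for cross-thread wake-ups.

```rust
use std::cell::RefCell;
use std::sync::{Arc, Mutex};
use std::thread::{self, ThreadId};

thread_local! {
    /// Run queue touched only by the executor thread: no locking needed.
    static LOCAL_QUEUE: RefCell<Vec<u32>> = RefCell::new(Vec::new());
}

/// Hypothetical task header shared by both wake paths.
struct TaskHeader {
    id: u32,
    owner: ThreadId,
    /// Inbox for wake-ups arriving from other threads.
    remote_queue: Mutex<Vec<u32>>,
}

/// What a local waker's wake entry could do: no thread check, no lock.
fn wake_local(task: &TaskHeader) {
    LOCAL_QUEUE.with(|queue| queue.borrow_mut().push(task.id));
}

/// What the thread-safe waker's wake entry could do: check the caller first.
fn wake_shared(task: &TaskHeader) {
    if thread::current().id() == task.owner {
        wake_local(task);
    } else {
        task.remote_queue.lock().unwrap().push(task.id);
    }
}

/// Returns the contents of the local queue and of the remote inbox.
fn two_paths_demo() -> (Vec<u32>, Vec<u32>) {
    let task = Arc::new(TaskHeader {
        id: 1,
        owner: thread::current().id(),
        remote_queue: Mutex::new(Vec::new()),
    });
    wake_local(&task);
    wake_shared(&task); // same thread: takes the local path

    let for_other_thread = Arc::clone(&task);
    thread::spawn(move || wake_shared(&for_other_thread))
        .join()
        .unwrap();

    // Take the queue contents so repeated calls start from an empty queue.
    let local = LOCAL_QUEUE.with(|queue| std::mem::take(&mut *queue.borrow_mut()));
    let remote = task.remote_queue.lock().unwrap().clone();
    (local, remote)
}
```

Both paths operate on the same thread-safe `TaskHeader`; only the thread-safe path pays for the thread check and the lock.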

One is that Rust affords us the ability to create compile time-safe builders rather than panicking. It should be possible, via trait bounds, to make ContextBuilder::build only callable when the waker has been set.

You don't need trait bounds for this, you can just make ContextBuilder not have the empty constructor, and instead have two constructors: one which passes a Waker and one which passes a LocalWaker. The Waker one could also set the LocalWaker (instead of doing it in build), and then that would not be an Option even in ContextBuilder. There would never need to be a null check of LocalWaker.

ContextBuilder would still have setters that overwrite these fields; just you're guaranteed LocalWaker is always set.
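A minimal sketch of the two-constructor shape described here, using hypothetical stand-in types rather than the real `Waker`, `LocalWaker` and `Context`. To model "every waker is also a valid local waker" without an unsafe cast, the stand-in `SketchWaker` simply embeds a `SketchLocalWaker`; that layout is an assumption of this example only. The method names `from_waker`, `from_local_waker` and `try_waker` follow the names used in this thread.

```rust
/// Hypothetical stand-ins. A `SketchWaker` embeds a `SketchLocalWaker` to model
/// "every waker is also a valid local waker" without any unsafe casts.
struct SketchLocalWaker {
    label: &'static str,
}

struct SketchWaker {
    as_local: SketchLocalWaker,
}

struct SketchContext<'a> {
    waker: Option<&'a SketchWaker>,
    local_waker: &'a SketchLocalWaker, // always set, never an `Option`
}

impl<'a> SketchContext<'a> {
    fn local_waker(&self) -> &'a SketchLocalWaker {
        self.local_waker
    }

    fn try_waker(&self) -> Option<&'a SketchWaker> {
        self.waker
    }

    /// Panics when the context was built without a thread-safe waker.
    fn waker(&self) -> &'a SketchWaker {
        self.waker
            .expect("this context does not support thread-safe wakers")
    }
}

struct SketchContextBuilder<'a> {
    waker: Option<&'a SketchWaker>,
    local_waker: &'a SketchLocalWaker,
}

impl<'a> SketchContextBuilder<'a> {
    /// Sets both fields: the waker doubles as the local waker.
    fn from_waker(waker: &'a SketchWaker) -> Self {
        Self { waker: Some(waker), local_waker: &waker.as_local }
    }

    fn from_local_waker(local_waker: &'a SketchLocalWaker) -> Self {
        Self { waker: None, local_waker }
    }

    /// Setters overwrite the fields chosen by the constructor.
    fn waker(mut self, waker: &'a SketchWaker) -> Self {
        self.waker = Some(waker);
        self
    }

    fn local_waker(mut self, local_waker: &'a SketchLocalWaker) -> Self {
        self.local_waker = local_waker;
        self
    }

    /// Infallible: there is no state in which the local waker is missing.
    fn build(self) -> SketchContext<'a> {
        SketchContext { waker: self.waker, local_waker: self.local_waker }
    }
}

fn constructors_demo() -> (&'static str, bool, &'static str, &'static str) {
    let shared = SketchWaker { as_local: SketchLocalWaker { label: "shared" } };
    let local = SketchLocalWaker { label: "local" };

    let from_shared = SketchContextBuilder::from_waker(&shared).build();
    let local_only = SketchContextBuilder::from_local_waker(&local).build();
    let both = SketchContextBuilder::from_local_waker(&local).waker(&shared).build();
    let overridden = SketchContextBuilder::from_waker(&shared).local_waker(&local).build();

    (
        from_shared.local_waker().label,
        local_only.try_waker().is_some(),
        both.waker().as_local.label,
        overridden.local_waker().label,
    )
}
```

Because neither constructor can leave the local waker unset, `build` has no failure case and `local_waker()` needs no null check; only `waker()` keeps a panicking branch.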

@tvallotton
Author

I think both of @kpreid's proposals seem reasonable. If no one has concerns about them, I will include them in the branch.

You don't need trait bounds for this, you can just make ContextBuilder not have the empty constructor, and instead have two constructors.

I hadn't thought of this. It seems like an overall improvement and I can't think of any downsides. I'll include it too.

@tvallotton
Author

impl<W: Wake> From<Arc<W>> for LocalWaker

Thinking about this a little further, I don't think this offers much beyond what Wake already offers by itself, that is, only setting the Waker and having Context transmute it to a LocalWaker. We could maybe have Waker implement AsRef<LocalWaker>, but I don't think that is very important at this point, and we can always add it later if we deem it useful.

Also I made two additions to the branch that weren't discussed very much, but I think would be useful. These are:

  1. Add Context::try_waker() as a fallible alternative to waker()
  2. Add From<&mut Context<'a>> for ContextBuilder to allow futures to extend their contexts in a self contained way.

A use case example for these two additions is the following function:

use std::task::{Waker, ContextBuilder};
use std::future::{poll_fn, Future};
use std::pin::pin;

async fn with_waker<F>(f: F, waker: &Waker) -> F::Output
where
    F: Future
{
    let mut f = pin!(f);
    poll_fn(move |cx| {
        let has_waker = cx.try_waker().is_some();
        if has_waker {
            return f.as_mut().poll(cx);
        }
        let mut cx = ContextBuilder::from(cx)
            .waker(&waker)
            .build();
        f.as_mut().poll(&mut cx)
    }).await
}

with_waker(async { /* ... */ }, Waker::noop()).await;

An example of a future that sets a Waker on Context if none was defined.
This can be used to await futures that require a Waker even if the runtime does not support Waker.

@compiler-errors
Member

WG-async discussed this in our recent triage meeting, and we plan to discuss it further later this week.

@rust-lang/wg-async is okay with experimentation here (also, it's my -- personal, in this case -- opinion that the bar for feature-gated experimentation should be low enough as to not stifle innovation), but we would like to review the final implementation if this were to move forward to stabilization. Please keep us in the loop!

@tvallotton
Author

tvallotton commented Dec 15, 2023

I created a tracking issue for this feature. Among the unresolved questions I included this:

Should Waker implement AsRef<LocalWaker>?

I now noticed this method exists in LocalWaker:

pub fn will_wake(&self, waker: &LocalWaker) -> bool;

However, this does not allow you to know if a local waker and a waker wake the same task. I believe implementing AsRef<LocalWaker> for Waker would be the best solution for this.

@Thomasdezeeuw

For the email users, the tracking issue @tvallotton created is rust-lang/rust#118959.

@tvallotton
Author

Ok, so I implemented a new API on this other branch called lazy_waker (here is a diff). The API allows a waker to be initialized lazily instead of eagerly. I think that with this API there would really be no excuse for a runtime to support LocalWaker and not Waker. There really would not be any performance cost to supporting both, and it would be fairly easy to do.

The example I placed in the documentation is roughly:

fn constructor(_: &LocalWaker) -> Waker {
    println!("constructor was called!");
    Waker::noop()
}

let mut storage = None;
let local_waker = LocalWaker::noop();

let mut cx = ContextBuilder::from_local_waker(&local_waker)
    .on_waker(&mut storage, constructor)
    .build();

This requires the caller to provide a storage argument for lifetime reasons.

Also, while implementing this, I noticed that some of the examples break if ContextBuilder<'a> is made to be invariant over 'a. Specifically the with_waker example I provided above breaks. This happens because attempting to override &Waker demands the new waker to have a 'static lifetime.

So I believe we must figure out if we will ever need to make ContextBuilder invariant over 'a before we add that From implementation. I do think that From implementation is very useful, but we should probably be cautious.

@withoutboats I'd be interested in what you have to say about this API. I can see some downsides, like waker() not being able to be const.

@Darksonn

You don't need specialized API support for lazily initializing the different types of wakers. When you poll the future, you can pass it a "fake" waker, and just upgrade to a real waker in the clone operation on the vtable. Note that clone can return a different vtable than the one being cloned. In Tokio, we use a similar trick to avoid incrementing the refcount when creating the waker we pass to poll. (We don't actually use a different vtable. Instead, we just don't run the destructor of the fake waker.)
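A sketch of the two-vtable variant of this trick on stable Rust. It is not Tokio's implementation (which, as noted, reuses one vtable and skips the destructor); `Task`, `OWNED`, `BORROWED`, `with_borrowed_waker` and `upgrade_demo` are hypothetical names. The borrowed waker owns no reference count, and cloning it returns a waker with a different vtable that does.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{RawWaker, RawWakerVTable, Waker};

/// Hypothetical task type; only the wake counter matters here.
struct Task {
    wakes: AtomicUsize,
}

/// Vtable for wakers that own one strong reference to the `Arc<Task>`.
static OWNED: RawWakerVTable =
    RawWakerVTable::new(clone_upgrade, wake_owned, wake_by_ref, drop_owned);

/// Vtable for the "fake" waker passed to `poll`: it borrows the task and owns nothing.
static BORROWED: RawWakerVTable =
    RawWakerVTable::new(clone_upgrade, wake_by_ref, wake_by_ref, drop_borrowed);

/// Cloning either kind pays for the reference count and returns an owned waker.
unsafe fn clone_upgrade(data: *const ()) -> RawWaker {
    unsafe { Arc::increment_strong_count(data as *const Task) };
    RawWaker::new(data, &OWNED)
}

unsafe fn wake_by_ref(data: *const ()) {
    let task = unsafe { &*(data as *const Task) };
    task.wakes.fetch_add(1, Ordering::SeqCst);
}

unsafe fn wake_owned(data: *const ()) {
    unsafe {
        wake_by_ref(data);
        drop_owned(data); // waking by value also releases the reference
    }
}

unsafe fn drop_owned(data: *const ()) {
    drop(unsafe { Arc::from_raw(data as *const Task) });
}

unsafe fn drop_borrowed(_data: *const ()) {}

/// Lends a borrowed waker for the duration of `f`, without touching the refcount.
fn with_borrowed_waker<R>(task: &Arc<Task>, f: impl FnOnce(&Waker) -> R) -> R {
    let data = Arc::as_ptr(task) as *const ();
    // SAFETY: the waker is only lent out by reference while `task` is alive,
    // `Task` is `Send + Sync`, and clones go through `clone_upgrade`, which
    // takes a real strong reference before handing out an owned waker.
    let waker = unsafe { Waker::from_raw(RawWaker::new(data, &BORROWED)) };
    f(&waker)
}

/// Returns (strong count before clone, after clone, after wake, wake count).
fn upgrade_demo() -> (usize, usize, usize, usize) {
    let task = Arc::new(Task { wakes: AtomicUsize::new(0) });
    let (before, after_clone) = with_borrowed_waker(&task, |waker| {
        let before = Arc::strong_count(&task);
        let owned = waker.clone(); // the upgrade happens here
        let after_clone = Arc::strong_count(&task);
        owned.wake(); // consumes the owned waker and releases its reference
        (before, after_clone)
    });
    (before, after_clone, Arc::strong_count(&task), task.wakes.load(Ordering::SeqCst))
}
```

A future that never clones the waker never causes a reference count update, which is the saving being described.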

@tvallotton
Author

Ok, after thinking about this for a while, I think you are correct. You could indeed use this to lazily initialize a waker on clone. There are a couple of edge cases, like if you send a &Waker to another thread without cloning it, but it is unlikely and almost certainly incorrect, so I'd say it is safe to abort in that case.

@traviscross

We discussed this in the WG-async triage meeting today. We were still of course in favor of experimentation. We also had some high level questions we wanted to share to perhaps prompt discussion.

We felt that LocalWaker might reasonably imply the need for e.g. LocalFuture and other LocalFoo variants. Since this PR doesn't do that, we were curious about whether there were any limitations (in terms of e.g. performance overhead, related e.g. to Arc) that were implied by this.

There was also interest in the question of how this might interact with FuturesUnordered or FutureGroup.

@A248

A248 commented Dec 18, 2023

@traviscross I'm not really sure what is meant by suggesting the need for "LocalFuture and other LocalFoo variants." Future is not required to be Send + Sync, which is why impl Future + Send + Sync is such a common pattern. By contrast, Waker is always Send and Sync, a restriction encoded in the type. Would you be able to elaborate?

@tvallotton
Author

I agree with @A248's comments here. Concerning this:

There was also interest in the question of how this might interact with FuturesUnordered or FutureGroup.

These futures are likely not going to be able to use LocalWaker internally because that would make them !Send. This is a result of the fact that they store the wakers themselves. If they stored the wakers on a reactor, as some other futures do, they would be able to leverage this API.

@A248

A248 commented Dec 19, 2023

Based on all of this discussion, however, I am starting to reconsider the current design. This discussion thread is filled with examples and patterns of how wakers, and the waking mechanism generally, are used in practice. I wonder if we might benefit by re-analyzing the motivations which led us to the current design and, following from this, whether there is a cleaner solution to implement wake-up notifications in async contexts.


To start with, it seems helpful to consider the futures API, not only from a runtime perspective, but also from a Future implementor's perspective. Consider what happens when a future is polled. There seem to be a few possibilities related to waking:

  1. Waking is triggered immediately, on the same thread, using waker().wake_by_ref(). This causes the future to be re-polled.
  2. A "waker" is cloned and stored elsewhere. It is potentially moved to a different thread. A different and potentially concurrent code path triggers a wake-up notification.
  3. A "waker" is cloned and stored, but kept on the same thread. A different code path on the same thread triggers a wake-up notification.

I've intentionally split apart 2.) and 3.), because that is the primary motivation for this issue. Also, importantly, I've decided to refer to the "waker" separately from the wake-up notification itself. The reason for this is that sending a wake-up is conceptually different from storing a waker. The waker is merely a means through which wake-up notifications are performed.
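As a rough illustration of cases 1 and 2 with today's stable API, here is a sketch. CountingWake, YieldOnce, WakeFromThread, and drive are hypothetical names, and the driver busy-polls only to keep the example short.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Hypothetical waker backend that only counts wake-up notifications.
pub struct CountingWake(AtomicUsize);

impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Case 1: wake immediately on the polling thread, without storing a waker.
pub struct YieldOnce(pub bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Case 2: clone the waker and move the clone to another thread.
pub struct WakeFromThread(pub Option<thread::JoinHandle<()>>);

impl Future for WakeFromThread {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        match self.0.take() {
            // Second poll: the helper thread has been started; wait for it.
            Some(handle) => {
                handle.join().unwrap();
                Poll::Ready(())
            }
            // First poll: hand an owned waker to the helper thread.
            None => {
                let waker = cx.waker().clone();
                self.0 = Some(thread::spawn(move || waker.wake()));
                Poll::Pending
            }
        }
    }
}

/// Polls `fut` to completion and returns how many wake-ups were delivered.
/// Busy-polls for brevity; a real executor would park until woken.
pub fn drive<F: Future + Unpin>(mut fut: F) -> usize {
    let counter = Arc::new(CountingWake(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    while Pin::new(&mut fut).poll(&mut cx).is_pending() {}
    counter.0.load(Ordering::SeqCst)
}
```

Calling drive(YieldOnce(false)) and drive(WakeFromThread(None)) each delivers one wake-up. Case 3 would look like case 2 today, with the clone stored on the polling thread (for example in a RefCell), which means it still goes through the thread-safe Waker.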

After all of these, the future returns Poll::Pending and will be re-polled at least once after the runtime receives the wake-up notification via the passed waker. Now, it becomes useful to also think about the runtime's implementation of the waker. Importantly, I will frame the aspects of implementing a waker in terms of the consumer-side usage, from above. Remember also that Tokio, and others, do not construct a "fully-fledged" Waker from the outset; rather, to minimize atomic increments, the "top-level" Waker becomes a thin shim which is really a factory for the "full" atomically reference counted waker. This implementation detail with Tokio, actually, tracks with how we should conceive of the waker API.

All together, a runtime provides the capabilities:

  1. To wake the current task. This corresponds to cx.waker().wake_by_ref(). However, as evidenced by Tokio, we don't really need or desire a "full" Waker here. To be precise, only the ability to wake the current task is pertinent here. The intermediary of the waker is merely a coincidence of design.
  2. To obtain a storable waker, which can then be cloned and moved around as needed. It seems almost elemental to the design of Waker that atomic reference counting be used here. This operation corresponds to cx.waker().clone() and usage of the resulting Waker. Again, however, only the capability of obtaining a storable waker is pertinent. A method like create_waker() on context would suffice -- I'm not suggesting such a method, merely pointing out that it fully conveys the intent and encapsulates the seemingly two-step operation implied by cx.waker().clone().
  3. To obtain a storable local waker. This is the same as 2.) except that the resulting waker is never moved to or referenced on a different thread.

Returning to this comment I made previously:

My main observation is that rather than duplicating the machinery behind Waker to create LocalWaker, there should be a clever way to re-use some of these structures. A LocalWaker is identical to a Waker in every way except for the static thread-safety guarantees.

The current approach taken in @tvallotton 's linked PR is to bifurcate the waking-related API into Waker and LocalWaker. The only difference between these is the Send + Sync thread-safety guarantee. However, I contend that this proposed design merely entrenches the conflation of multiple API use cases. Instead, delineating between thread-safe and non-thread-safe wakers might best be achieved by cleanly separating the API uses with respect to the operations I described above. These operations are waking up the current task, creating a thread-safe "waker," and creating a thread-confined "waker."

Let's suppose that we model our API as having a single source of the operations described above. We could call this the WakerFactory, a WakeupFactory, WakeupSink, or something similar. I've opted for WakeupContext since it is the part of the Context relating to wake-ups.

A mock-up:

impl Context<'_> {
    /// The new API that will support all wake up related operations.
    pub fn wakeup_context(&self) -> &dyn WakeupContext;
}

pub trait WakeupContext {
    /// Wakes up the current task
    fn wake_current_task(&self);

    /// Creates a thread safe storable waker.
    ///
    /// Panics if this operation is not supported by the runtime.
    fn create_waker(&self) -> Waker;

    // If fallibility is desired, then use
    // fn try_create_waker(&self) -> Option<Waker>;

    /// Creates a local storable waker.
    fn create_local_waker(&self) -> LocalWaker;
}

I think if the waker API were re-designed today with these considerations in mind, we'd end up with an API very similar to this trait.
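For anyone who wants to poke at the shape of this on stable Rust, here is a toy sketch. It is not the playground code. LocalWakerStandIn is a hypothetical stand-in for the proposed LocalWaker (which is nightly-only), and TaskContext, RemoteFlag, and take_woken are invented for the example.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

/// Hypothetical stand-in for the proposed `LocalWaker`.
/// Holding an `Rc` makes it `!Send + !Sync` and keeps clones non-atomic.
#[derive(Clone)]
pub struct LocalWakerStandIn(Rc<Cell<bool>>);

impl LocalWakerStandIn {
    pub fn wake(&self) {
        self.0.set(true);
    }
}

/// Same three operations as the mock-up, with the stand-in type swapped in.
pub trait WakeupContext {
    fn wake_current_task(&self);
    fn create_waker(&self) -> Waker;
    fn create_local_waker(&self) -> LocalWakerStandIn;
}

/// Thread-safe half of the task's wake-up state.
pub struct RemoteFlag(AtomicBool);

impl Wake for RemoteFlag {
    fn wake(self: Arc<Self>) {
        self.0.store(true, Ordering::Release);
    }
}

/// Hypothetical per-task state in a single-threaded runtime.
pub struct TaskContext {
    local: Rc<Cell<bool>>,
    remote: Arc<RemoteFlag>,
}

impl TaskContext {
    pub fn new() -> Self {
        TaskContext {
            local: Rc::new(Cell::new(false)),
            remote: Arc::new(RemoteFlag(AtomicBool::new(false))),
        }
    }

    /// Reports whether the task was woken since the last call, clearing both flags.
    pub fn take_woken(&self) -> bool {
        let local = self.local.replace(false);
        let remote = self.remote.0.swap(false, Ordering::Acquire);
        local || remote
    }
}

impl WakeupContext for TaskContext {
    fn wake_current_task(&self) {
        // Same-thread wake: no atomics and no reference counting.
        self.local.set(true);
    }

    fn create_waker(&self) -> Waker {
        // Thread-safe path: pays for an atomic reference count.
        Waker::from(self.remote.clone())
    }

    fn create_local_waker(&self) -> LocalWakerStandIn {
        // Thread-confined path: a plain `Rc` clone.
        LocalWakerStandIn(self.local.clone())
    }
}
```

In this sketch the runtime would call take_woken() between polls to decide whether to re-poll. The only path that touches atomics is the one that explicitly asks for a thread-safe waker.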

However, that doesn't mean we couldn't retrofit such a system onto the existing APIs. For compatibility, it is possible to construct a shim Waker from the above trait, whose clone() method corresponds to create_waker(), and wake_by_ref() corresponds to wake_current_task(). The shim Waker's wake method should be unreachable.
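A rough sketch of what such a shim could look like with RawWaker follows. This is an illustration under my own assumptions, not the playground code and not how any particular runtime does it. Task, with_shim, demo, and the vtable names are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{RawWaker, RawWakerVTable, Waker};

/// Hypothetical per-task state shared between the runtime and its wakers.
pub struct Task {
    pub wakeups: AtomicUsize,
}

// "Full" waker: owns one strong reference to the `Arc<Task>`.
unsafe fn full_clone(data: *const ()) -> RawWaker {
    // SAFETY: `data` came from an `Arc<Task>` that is still alive.
    unsafe { Arc::increment_strong_count(data as *const Task) };
    RawWaker::new(data, &FULL_VTABLE)
}
unsafe fn full_wake(data: *const ()) {
    // SAFETY: a full waker owns one reference, which is consumed here.
    let task = unsafe { Arc::from_raw(data as *const Task) };
    task.wakeups.fetch_add(1, Ordering::SeqCst);
}
unsafe fn full_wake_by_ref(data: *const ()) {
    // SAFETY: the task outlives every waker that points at it.
    let task = unsafe { &*(data as *const Task) };
    task.wakeups.fetch_add(1, Ordering::SeqCst);
}
unsafe fn full_drop(data: *const ()) {
    // SAFETY: releases the reference owned by this full waker.
    drop(unsafe { Arc::from_raw(data as *const Task) });
}
static FULL_VTABLE: RawWakerVTable =
    RawWakerVTable::new(full_clone, full_wake, full_wake_by_ref, full_drop);

// Shim waker: borrows the task and owns nothing.
unsafe fn shim_clone(data: *const ()) -> RawWaker {
    // Plays the role of `create_waker()`: hands out a full, owning waker.
    unsafe { full_clone(data) }
}
unsafe fn shim_wake(_data: *const ()) {
    unreachable!("the shim is only ever lent out by reference");
}
unsafe fn shim_wake_by_ref(data: *const ()) {
    // Plays the role of `wake_current_task()`.
    unsafe { full_wake_by_ref(data) }
}
unsafe fn shim_drop(_data: *const ()) {}
static SHIM_VTABLE: RawWakerVTable =
    RawWakerVTable::new(shim_clone, shim_wake, shim_wake_by_ref, shim_drop);

/// Lends a shim waker to `f` for as long as `task` is borrowed.
pub fn with_shim<R>(task: &Arc<Task>, f: impl FnOnce(&Waker) -> R) -> R {
    let raw = RawWaker::new(Arc::as_ptr(task) as *const (), &SHIM_VTABLE);
    // SAFETY: the shim cannot outlive this call, so the pointer stays valid.
    let shim = unsafe { Waker::from_raw(raw) };
    f(&shim)
}

/// Returns (wake-ups delivered, strong count while a clone was stored).
pub fn demo() -> (usize, usize) {
    let task = Arc::new(Task { wakeups: AtomicUsize::new(0) });
    let stored = with_shim(&task, |waker| {
        waker.wake_by_ref(); // no reference-count traffic
        waker.clone() // the first atomic increment happens here
    });
    let refs_while_stored = Arc::strong_count(&task);
    stored.wake(); // consumes the clone and releases its reference
    (task.wakeups.load(Ordering::SeqCst), refs_while_stored)
}
```

Because the closure only ever sees a &Waker, it cannot call wake(self) on the shim, which is why shim_wake is unreachable here. One thing to keep in mind with this approach: the shim and its clones use different vtables, so will_wake between them would most likely report false.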

To explore this more fully, I created a playground for it: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=f79ed94beef7fb0cf43dbf64b38ada99 . The new design based on WakeupContext seems to better encapsulate the set of available operations. As a downside, the internal machinery is slightly more complicated to follow.

Let me know if you have any feedback or if this is a workable idea.

@jkarneges

Let me know if you have any feedback or if this is a workable idea.

It's interesting to re-think things after real world experience. However, I do see one issue: there would need to be some equivalent for will_wake.

@tvallotton
Author

Hi @rust-lang/libs-team, @Mark-Simulacrum is asking for an explicit signoff for the currently proposed API. Do you approve the API proposed for nightly experimentation? You can view the PR here.

@tvallotton
Author

Btw, I made the waker type mandatory in the last commit. Several people have expressed concerns about wakers being optional. While I do think it is better for them to be optional, I don't consider that terribly important for this feature to be useful.

@traviscross

@rustbot labels -I-async-nominated

We discussed this today in WG-async triage. We notice that the tracking issue has a checkbox for ensuring that WG-async approves the API prior to stabilization (thanks for adding that), so that will ensure that this crosses our radar again, and therefore we can unnominate this since there's probably nothing immediate for us to discuss here.

Please of course nominate any questions that come up for us.
