Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust Wakers need to be Send + Sync #194

Closed
Matthias247 opened this issue Nov 30, 2020 · 18 comments
Closed

Rust Wakers need to be Send + Sync #194

Matthias247 opened this issue Nov 30, 2020 · 18 comments

Comments

@Matthias247
Copy link

It is currently not possible in Rust to safely define non thread-safe Wakers. They are defined to be Send and Sync, and thereby the RawWakers must fulfill those guarantees.

This commit, as well as the general Waker implementation for glommio tasks violate those properties.

Non thread-safe Wakers had been part of the earlier async/await design in form a so called LocalWaker. However they had been removed from the design in order to simplify it, which unfortunately makes life a bit harder (or slower) for projects like this.

In addition to this Context is also Sync, which is something that wasn't initially planned for and probably still should be fixed.

You might want to consider a different wakeup mechansism than directly referencing the tasks via non-thread safe pointers. The original RFC mentioned an alternative implementation using thread-local storage and task IDs which could work. However there is also a drawback with this design: It would never allow for non-local wakeups, since the executor on the other thread would not be found (which might however be less of an issue for this project).

@andrii0lomakin
Copy link
Collaborator

Hi @Matthias247 , thank you for such useful information both for this and the previous issue. I also was worried by this misconception but did not have time yet to fully investigate this topic. I will definitely check it. That is not a top priority as for now, but IMHO has to be resolved.

@andrii0lomakin
Copy link
Collaborator

andrii0lomakin commented Nov 30, 2020

@Matthias247 I have checked the RFC. So what is proposed by the RFC team is to wake task using thread-local queue and check the ID of the executor by assertion. As for now we already use a thread-local queue, so the addition of the assertion check will lead to functionally the same behaviour. I understand that the state of the RawWaker still will be shared, but the addition of assertion guard, which inevitably needs to be done to ensure correct functionality of the Waker prevents non-thread-safe state changes into the Waker. I will change code shortly to satisfy this requirement. Also, I understand that ID is used to perform all the changes inside the Executor during wake up instead of doing that into the foreign thread. But assertion guard will prevent such non thread-safe changes anyway, so taking this into account, passing of the reference instead of ID does not really change the situation.

@glommer
Copy link
Collaborator

glommer commented Nov 30, 2020

Thanks for the information @Matthias247.

I would like to see this resolved before we reach the point that I consider glommio production ready. My goal is definitely to play well with the ecosystem at large.

@glommer
Copy link
Collaborator

glommer commented Nov 30, 2020

@Laa we should be probably looking at doing that to the Executor in multitask.rs, not the main executor, if I understand correctly. The multitask Executor is equivalent to an execution queue for us. It is actually confusing that both are called Executor, but I didn't know better back then =) The one in multitask is probably better off called a task queue.

Futures don't really belong to an Executor for us, but to a task queue (because they can't move to a different task queue). And if we wake up a task queue that was probably inactive at that point we make it active.

@glommer
Copy link
Collaborator

glommer commented Nov 30, 2020

@Matthias247 I am a bit confused by this. The Future trait methods still expect a Waker, and if that doesn't change it has to be Sync + Send.

Doesn't that imply that the RawWaker has to be Send + Sync too ?

@andrii0lomakin
Copy link
Collaborator

andrii0lomakin commented Nov 30, 2020

Hi @glommer the point is the RawTask also does reference counting and making RawWaker Send+Sync equals to making of RawTask Send+Sync. But in reality, if I understand logic underneath correctly we merely need to add assertion guard to the logic. async-task does the same. Its RawTasks for a local executor are Sync+Send but because they accept !Send+!Sync Futures they put assertion guard to satisfy this contract. The same also is recommended in before-mentioned RFC. - "tracking an executor ID or thread ID to perform a runtime assertion that a Waker wasn't sent across threads. While this solution gives inferior error messages to the LocalWaker approach (since it can't panic until wake occurs on the wrong thread, rather than panicking when LocalWaker is transformed into a Waker) ..." But I am curious about the feedback of @Matthias247 .

@andrii0lomakin
Copy link
Collaborator

andrii0lomakin commented Nov 30, 2020

It makes RawTask to be Send+Sync in case of !Send+!Sync future at the sense that it does not generate UB, I suppose the same idea is underneath of assertion guard for the Waker implementation.

@andrii0lomakin
Copy link
Collaborator

andrii0lomakin commented Nov 30, 2020

@Laa we should be probably looking at doing that to the Executor in multitask.rs, not the main executor

@glommer sure, we can try to implement delayed execution of Wake, but we can not put the task into the wrong executor. Which means that we need to do nothing if we call wake from an incorrect thread with exception of panic of course, Which makes all the rest of changes not needed because of contract violation indicated by panic.

@andrii0lomakin
Copy link
Collaborator

Found additional confirmation in the blogpost of the "withoutboats" one of the active contributors of Rust - "The second use case is more interesting. There is one thing that indeed cannot be supported without the API distinction: using an Rc, non-atomically reference-counted task. There are still other ways to implement a single-threaded event loop, however: Using the Task IDs technique instead of the reference counting technique. Panic when you wake from another thread, instead of when you move to it. This strategy works completely fine." From https://boats.gitlab.io/blog/post/wakers-ii/ .

@Matthias247
Copy link
Author

Doesn't that imply that the RawWaker has to be Send + Sync too ?

Yes and no. The struct isn't Send/Sync, but the methods it implements need to be. Creating a Waker from a RawWaker with [Waker::from_raw](https://doc.rust-lang.org/std/task/struct.Waker.html#method.from_raw) is unsafe exactly for the reason that it is only safe if the RawWaker` implements the contract correctly.

Using the Task IDs technique instead of the reference counting technique. Panic when you wake from another thread, instead of when you move to it. This strategy works completely fine."

Yeah, this can be done. Probably by storing the current thread ID (or any other kind of globally unique executor ID) in each task/Waker. If wake is called check it against the current thread-local Waker. If it matches enqueue task for running by ID. If it doesn't match, panic.

Definitely not as efficient as things ideally could be. But maybe checking a few thread-locals per wake isn't too bad either. If you implement it I'm curious about benchmarks which show the impact.

@andrii0lomakin
Copy link
Collaborator

andrii0lomakin commented Dec 2, 2020

I suppose we can even protect ourselves by thread id, not by executor id. The reasoning for that is very simple, we use an implicit reference to the executor to put a task into the queue, so even if wake is called outside of the thread-local scope of executor it will not make any damage. I will try both approaches, will check what is faster and will use it.

@andrii0lomakin
Copy link
Collaborator

I have tried to use executor id in assertion guard and the numbers of benchmarks (all of them, but I was mostly interested in semaphore and channel benchmarks) are slightly slower and lie in the range of statistical mistake. Sometimes numbers for the implementation with executor-id guard even better. I will send PR soon.

@andrii0lomakin
Copy link
Collaborator

andrii0lomakin commented Dec 3, 2020

I have experienced a random crash on a single run of bench task. Probably not related to the change, but I am running Valgrind to be sure that we do not have memory leaks or unsafe deallocations.
Meanwhile results of benchmarking:

before change:

cost to run task no task queue:mean - 97.6 ns. 
cost to run task in task queue: mean - 265.8 ns.
cost of sending contended local channel: mean - 92.2 ns
cost of sending uncontended local channel: mean - 11 ns
cost of checking for need_preempt: mean - 1.8 ns
cost of acquiring uncontended semaphore: mean - 11.6 ns
cost of acquiring contended semaphore: mean - 202.4 ns

after change:

cost to run task no task queue: mean - 101.5 ns, +4%
cost to run task in task queue: mean - 272.5 ns, +3%
cost of sending contended local channel: mean - 101 ns, + 10%
cost of sending uncontended local channel: mean - 11 ns, 0%
cost of checking for need_preempt: mean - 2 ns,  +10%
cost of acquiring uncontended semaphore: mean - 14 ns, +21%
cost of acquiring contended semaphore: 209.5 ns - +4%

I have been surprised by the results of "cost of acquiring uncontended semaphore" because this test does not call wake or clone of the waker. Probably that is a variation of execution inside of my server. I will check that though before final commit.

@andrii0lomakin
Copy link
Collaborator

About the current benches. Thear are bit too simple. Only single run, no warm up cycle. No stats as result. The best performance tool I have seen is the one embedded in latest versions of JDK. I would try to find similar one in Rust ecosystem and propose the change.

@glommer
Copy link
Collaborator

glommer commented Dec 3, 2020

I agree the benchmarks are too simple, but I was not very happy with any of the other rust tools for that and I was not about to write my own =)

In particular I looked at criterion. I found it not really that good for asynchronous operations, and because the operations we are measuring are too short (in the order of magnitude of a clock measurement) I found the results to be wildly different between runs.

What we have now is at least useful as a polestar to make sure things are not regressing massively and they are enough to see big deviations, which are the ones that are more interesting at this phase of the project anyway.

If you do find a tool that works better than criterion for what we need I would be happy to look at it.

glommer pushed a commit that referenced this issue Dec 8, 2020
Fix of issue #194. All changes in task status are guarded by the check of thread_id.
@glommer
Copy link
Collaborator

glommer commented Dec 10, 2020

@Laa is this fully fixed by #210 or is there anything else to do ?

it seems to me that yes, but github didn't like the way you wrote the fixed string so didn't close this

@glommer
Copy link
Collaborator

glommer commented Dec 10, 2020

I am closing now manually because it seems to me this was your intent. If anything is still missing let me know

@glommer glommer closed this as completed Dec 10, 2020
@andrii0lomakin
Copy link
Collaborator

Hi @glommer, yes, that is already closed. I have GMT +2 timezone so can not answer shortly sometimes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants