Alternative Mutex implementation #11509

Closed

bill-myers wants to merge 3 commits

Conversation

bill-myers
Contributor

Pull #11462 implements a Mutex using a lockless algorithm that has to use thread_yield.

This is an alternative approach that removes all yields by introducing an internal lock, while still keeping the principle that in the native case the mutex must reduce to the native mutex.

Furthermore, this implementation should be fair even when native and green tasks are mixed on the same mutex (assuming that the OS mutex is fair), can always detect recursive mutex locking, and allows support for green threads to be disabled by changing a constant to improve performance.

There is no longer a need for the MPSC intrusive queue since we now protect the queue with the internal queue_lock.
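
Roughly, the shape being described is something like this (a sketch using present-day std types for brevity; the field names are the ones used in this discussion, everything else is a placeholder rather than the code in this pull request):

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, AtomicUsize};
use std::sync::Mutex as OsMutex;
use std::thread::Thread;

struct GreenAwareMutex {
    // The native mutex that lock()/unlock() reduce to when only native
    // threads are involved.
    os_lock: OsMutex<()>,
    // The internal lock; it protects a plain FIFO of blocked tasks, which is
    // why no intrusive MPSC queue is needed. (Written here with the queue
    // living inside the mutex that guards it.)
    queue_lock: OsMutex<VecDeque<Thread>>,
    // Fast-path hint checked before taking queue_lock.
    queue_nonempty: AtomicBool,
    // Id of the current owner, used to detect recursive locking.
    owner: AtomicUsize,
}
```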

It compiles and passes tests, but of course would need review.

Contention on the internal lock with green tasks shouldn't be a problem, since such contention would happen anyway if the tasks were native, and the MPSC queue still involves atomics on a shared cache line, which also generates contention.

The Linux kernel also uses a spinlock to protect its own sleepable mutexes in the contended case (note that with Linux kernel mutexes all threads effectively behave as green tasks, since there is no OS mutex to fall back to, so the requirements are a bit different).

Someone with access to a large server could perhaps benchmark both approaches if desired.

This does not provide the "as soon as unlock() returns you can obliterate the mutex" guarantee, but this is unnecessary for Rust mutexes as discussed in pull #11462.

Much of the justification can be found in the large implementation comment in the module itself.
This removes all usage of `try_send_deferred` and all related functionality.
Primarily, this builds all extra::sync primitives on top of `sync::Mutex`
instead of `unstable::sync::Mutex`.
@alexcrichton
Member

I greatly appreciate you taking the time to write up this implementation, it will be very useful in benchmarking and comparisons, so thank you!

I firmly believe that the guarantee you are so easily batting away is vital to have. I would encourage you to read http://lwn.net/Articles/575460/ (which a quick googling about the issue brought up). I personally have run into this race. I do not believe that "just use an arc" is a solution. In my mind, this is equivalent to saying "just use a gc" if you're having trouble freeing owned pointers.
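
The race described in that article is roughly of the following shape when sketched in Rust (placeholder names and types, not code from either pull request):

```rust
// A placeholder lock with the lock()/unlock() shape being argued about.
struct SomeMutex;
impl SomeMutex {
    fn lock(&self) { /* acquire */ }
    fn unlock(&self) { /* release */ }
}

struct Shared {
    lock: SomeMutex,
    refs: usize, // reference count protected by `lock`, as in the LWN example
    // ... protected data ...
}

// Each task calls this when it is done with `p`; the last one frees it.
unsafe fn put_ref(p: *mut Shared) {
    (*p).lock.lock();
    (*p).refs -= 1;
    let last = (*p).refs == 0;
    (*p).lock.unlock();
    if last {
        // The "obliterate the mutex as soon as unlock() returns" guarantee is
        // what makes this free safe: without it, a previous holder's unlock()
        // may still be touching (*p).lock on its slow path when the memory is
        // released here.
        drop(Box::from_raw(p));
    }
}
```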

Regardless, we need to discuss whether that guarantee is necessary for this mutex implementation or not. I would be uncomfortable merging this if we decided to discard the guarantee for a few reasons:

  • Lock-free code is incredibly hard to get right, and there's very little documentation of what's going on inside this mutex. I attempted to closely document everything that was going on, and you seem to have just deleted all of it without replacement.
  • This implementation looks like it has a number of premature optimizations which only serve to make the code more impenetrable (support_nonblocking_tasks will never be false, unsafe borrows, etc.).
  • There's no mention of how this implementation's performance looks (I'd run green threads with RUST_THREADS=1 for now until we're able to fix scheduling to stop fanning out rust tasks to other cores so much)
  • I'm a little concerned about the size of the mutex, and it seems to me that if you already have a concurrent queue you might as well use it to make the structure smaller. I'm not sure whether this is a premature optimization or how well it plays with this implementation.

Finally, I would recommend discussing these issues on an original pull request before opening up a new one. Our code of conduct states:

Respect that people have differences of opinion and that every design or implementation choice carries a trade-off and numerous costs. There is seldom a right answer.

I personally feel in this case that you have read my opinions and immediately discounted them all as wrong. I would encourage you to work out your differences of opinion with me before brazenly deleting all my code.

@thestinger
Contributor

using a lockless algorithm that has to use thread_yield

So, it's just a low-priority spin lock and doesn't let the scheduler sleep? Or does it sleep in the scheduler until it's unlocked?

@thestinger
Contributor

I firmly believe that the guarantee you are so easily batting away is vital to have. I would encourage you to read http://lwn.net/Articles/575460/ (which a quick googling about the issue brought up). I personally have run into this race. I do not believe that "just use an arc" is a solution. In my mind, this is equivalent to saying "just use a gc" if you're having trouble freeing owned pointers.

I don't really understand why the reference counting should be the responsibility of a mutex. The OS can context-switch after you unlock but before you free it, so what does it really guarantee? The primary users of locking are going to be concurrently accessed data structures that are either static or use atomic reference counting.

@bill-myers
Contributor Author

I firmly believe that the guarantee you are so easily batting away is vital to have.

I read http://lwn.net/Articles/575460/, and that race can happen mostly because C uses unsafe manual memory management (i.e. the kfree call in the code there).

In Rust, safe code cannot free memory arbitrarily and cannot be affected by this issue.

Unsafe code would need to somewhat break the language's rules to deallocate a mutex early (since doing so invalidates the pointer the other task holds), and I can't come up with a scenario where one needs to do that.

For instance, your unsafe code in Once doesn't have this issue because you properly use an external reference count in the natural way.
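
In safe-Rust terms, that "external reference count in the natural way" is the familiar Arc pattern; a minimal sketch with today's std types (not the Once or runtime code being discussed):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // The allocation holding the mutex is owned by the reference count, not
    // freed from inside a critical section, so it only goes away after the
    // last clone is dropped, i.e. after each thread's lock()/unlock() calls
    // have fully returned.
    let shared = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                *shared.lock().unwrap() += 1;
                // the lock is released here, before this thread's clone of
                // the reference count is dropped
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(*shared.lock().unwrap(), 4);
}
```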

support_nonblocking_tasks will never be false

In the current form, yes, but it's intended to either stay as documentation or be converted to a static mut or compile-time option, in which case a program using only native tasks could set it to false.
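
A hypothetical sketch of the compile-time-option variant, using a cfg feature as the mechanism (the constant name comes from this thread; everything else is an assumption):

```rust
// Whether the mutex has to handle green (non-blocking) tasks at all. A build
// that only ever uses native threads could turn this off so that
// lock()/unlock() go straight to the OS mutex, skipping the queue bookkeeping.
#[cfg(feature = "green-tasks")]
pub const SUPPORT_NONBLOCKING_TASKS: bool = true;
#[cfg(not(feature = "green-tasks"))]
pub const SUPPORT_NONBLOCKING_TASKS: bool = false;
```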

I'm a little concerned about the size of the mutex

Possible fixes:

  1. Have an OS mutex variant that doesn't include a condition variable
  2. Conflate queue_nonempty with either queue or queue_tail
  3. Remove owner and just deadlock on recursive locking

Obviously removing the second lock would be ideal, but I'm not sure if that's possible.

Finally, I would recommend discussing these issues on an original pull request before opening up a new one

Since the proposed alternative implementation is very different from the original one, a new pull request seemed like the only way to avoid polluting the original; also, I don't think GitHub lets you post new code in response to a pull request in a way that makes it easy to examine.

I'm not totally sure that this is the best approach, as it does have the drawbacks you mention; an approach that both doesn't yield and doesn't use an extra lock would be best, but I'm not sure if that's possible.

@alexcrichton
Member

So, it's just a low-priority spin lock and doesn't let the scheduler sleep? Or does it sleep in the scheduler until it's unlocked?

I would recommend reading the pull request for full details; it sounds like you may be conflating ideas specific to kernel locks with user-space locks (I'm unaware of any such spin lock in user space).

In Rust, safe code cannot free memory arbitrarily and cannot be affected by this issue.

You do realize that safe code in rust has no use for a mutex, right? Using a mutex implies that you're using unsafe code to share memory among more than one consumer.

Obviously removing the second lock would be ideal, but I'm not sure if that's possible.

I had an implementation which did this; as I have stated before, I did not choose this option.

@bill-myers
Contributor Author

Oops, this is in fact not globally fair in the initial version, since green tasks will have precedence because we don't always drop the lock.

Uploaded a new version that is now fair and simpler, by dropping that optimization.

@bill-myers
Contributor Author

You do realize that safe code in rust has no use for a mutex, right?

That's not correct: mutexes are useful for general serialization, and sharing memory safely is only one possible use of that.

For example, if you are writing a parallel web crawler and don't want to make more than one simultaneous connection to any given web server (so as not to overload it), you can use a HashMap mapping IP addresses to mutexes, and use each mutex to serialize connections to that specific IP address.
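
A minimal sketch of that crawler pattern (today's std types; `download` is a hypothetical stand-in for the actual fetch):

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::sync::{Arc, Mutex};

// One mutex per server IP. The mutex guards no data at all; it is used purely
// for serialization, so at most one connection is made to a given server at a time.
struct Crawler {
    per_server: Mutex<HashMap<IpAddr, Arc<Mutex<()>>>>,
}

impl Crawler {
    fn fetch(&self, ip: IpAddr, url: &str) {
        // Look up (or lazily create) the per-IP lock; the map lock is dropped
        // at the end of this statement.
        let server_lock = self
            .per_server
            .lock()
            .unwrap()
            .entry(ip)
            .or_insert_with(|| Arc::new(Mutex::new(())))
            .clone();
        // Hold the per-IP lock for the duration of the request.
        let _guard = server_lock.lock().unwrap();
        download(url);
    }
}

// Hypothetical stand-in for the actual HTTP fetch.
fn download(_url: &str) {}
```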

@bill-myers
Contributor Author

BTW, I might have found a lockless algorithm that works, which would obsolete the code in this pull request.

See #11462 for the details.

@bill-myers
Contributor Author

Obsolete; now proposing pull #11520 instead, which uses a lockless algorithm and also avoids all the disadvantages mentioned in the comments here.

@bill-myers bill-myers closed this Jan 13, 2014