
Implement Mutex::try_lock #71

Merged (2 commits) on Jun 28, 2022
Conversation

@bkragl (Contributor) commented on May 25, 2022

Without `try_lock` it was easier to justify context switches, because
acquire was a right mover (we only needed a context switch before) and
release was a left mover (we only needed a context switch after).
However, with `try_lock` that is not the case anymore. This commit
argues why we need a context switch at the end of `lock` and `try_lock`
(in both the success and failure cases), and why we do not need a context
switch at the beginning of `try_lock` and `MutexGuard::drop`.
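
To make the mover argument concrete, here is a minimal, self-contained sketch (placeholder names only, not Shuttle's actual implementation) of why `try_lock` needs a context switch at the end in both outcomes: unlike a blocking `lock`, a failed `try_lock` is an observable event that later acquirers can causally depend on.

```rust
// Illustrative sketch only; `ThreadId`, `ModelMutexState`, and
// `yield_to_scheduler` are placeholders, not Shuttle's real types.
type ThreadId = usize;

struct ModelMutexState {
    holder: Option<ThreadId>,
}

/// Stand-in for the model checker's context-switch point.
fn yield_to_scheduler() {}

fn try_lock_model(state: &mut ModelMutexState, me: ThreadId) -> bool {
    // No context switch is needed at the beginning: any thread that could
    // have raced with us here was already runnable at the previous switch.
    let acquired = if state.holder.is_none() {
        state.holder = Some(me);
        true
    } else {
        // Even on failure we observed the lock state, so this attempt is
        // neither a pure right mover nor a pure left mover.
        false
    };
    // Context switch at the end, in both the success and the failure case,
    // so other threads get a chance to observe and react to the outcome.
    yield_to_scheduler();
    acquired
}

fn main() {
    let mut state = ModelMutexState { holder: None };
    assert!(try_lock_model(&mut state, 0));
    assert!(!try_lock_model(&mut state, 1));
}
```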


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jamesbornholt (Member) left a comment
I think all the broken tests are because we have an extra yield point now, so existing schedules don't replay (in which case we should bump the minor version on the next release) and some of the tests that count context switches are now wrong.

This extra yield point also makes the clock_condvar_notify_all_dfs test extremely slow; wonder if we can do something about that.

@bkragl (Contributor, Author) commented on Jun 2, 2022

> I think all the broken tests are because we have an extra yield point now, so existing schedules don't replay (in which case we should bump the minor version on the next release) and some of the tests that count context switches are now wrong.

I fixed all tests.

> This extra yield point also makes the clock_condvar_notify_all_dfs test extremely slow; wonder if we can do something about that.

I just did an experiment on my local machine. On the current main branch, clock_condvar_notify_all_dfs takes 60 seconds. On this PR (including the revision that does a context switch before lock only when we need to block) it takes "only" 40 seconds.


// Update the vector clock stored in the Mutex, because future threads that manage to
// acquire the lock have a causal dependency on this failed `try_lock`.
ExecutionState::with(|s| state.clock.update(s.get_clock(me)));

Member:
I just realized that I had made an implicit optimization with vector clock updates. When a thread successfully acquires a lock (using lock()), it does not update the Mutex's clock with its own clock at the time of acquisition. This was fine in the absence of try_lock, as all other threads would be blocked anyway. But with try_lock, the Mutex's clock has to be updated with the thread's clock in both the else branch below (line 142) and in the lock() call above (line 103).

Contributor (author):
In try_lock that update was already done in the successful case, so I factored it out from the if branch to be done in both cases.
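
A minimal sketch of what the factored-out update might look like (hypothetical `VectorClock` and state types, not Shuttle's actual code): the Mutex's clock absorbs the caller's clock on every `try_lock` attempt, successful or not, because later acquirers causally depend on the attempt either way.

```rust
// Hypothetical types for illustration only.
#[derive(Default)]
struct VectorClock(Vec<u64>);

impl VectorClock {
    /// Pointwise maximum with another clock.
    fn update(&mut self, other: &VectorClock) {
        if self.0.len() < other.0.len() {
            self.0.resize(other.0.len(), 0);
        }
        for (mine, theirs) in self.0.iter_mut().zip(other.0.iter()) {
            *mine = (*mine).max(*theirs);
        }
    }
}

struct LockClockState {
    clock: VectorClock,
    locked: bool,
}

fn try_lock_clock_update(state: &mut LockClockState, my_clock: &VectorClock) -> bool {
    // Factored out of the success branch: update the Mutex's clock
    // unconditionally, in both the success and the failure case.
    state.clock.update(my_clock);
    if !state.locked {
        state.locked = true;
        true
    } else {
        false
    }
}

fn main() {
    let mut state = LockClockState { clock: VectorClock::default(), locked: false };
    let thread_clock = VectorClock(vec![1, 0, 2]);
    assert!(try_lock_clock_update(&mut state, &thread_clock));
    assert!(!try_lock_clock_update(&mut state, &thread_clock));
    assert_eq!(state.clock.0, vec![1, 0, 2]);
}
```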

/// T0 blocks). The following computation tree illustrates all interleavings.
///
/// ```
/// digraph G {

Member:
Nice!

Comment on lines 66 to 71
// Note that we only need a context switch when we are blocked, but not if the lock is
// available. Consider that there is another thread `t` that also wants to acquire the
// lock. At the last context switch (where we were chosen), `t` must have been already
// runnable and could have been chosen by the scheduler instead. Also, if we want to
// re-acquiring the lock immediately after having it released, we know that the release
// had a context switch that allowed other threads to acquire in between.

Member:
With this change I think we can move self.waiters.insert(me) above into the branch, and that probably lends itself to a cleaner way to handle the state here.

Also, is it true that if we don't enter this branch, then state.waiters == {me} (or {} if you move that insert)? Maybe we should assert that invariant.
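
A rough sketch of the suggested refactoring (placeholder names, not Shuttle's real API; the asserted invariant is the one questioned above, not something the PR establishes): insert into the waiter set only on the contended path, and assert that the set is empty on the uncontended one.

```rust
use std::collections::HashSet;

type ThreadId = usize;

struct LockState {
    holder: Option<ThreadId>,
    waiters: HashSet<ThreadId>,
}

/// Stand-ins for the scheduler interaction; not Shuttle's real API.
fn yield_to_scheduler() {}
fn block_until_unlocked(_state: &LockState) {}

fn lock_model(state: &mut LockState, me: ThreadId) {
    if state.holder.is_some() {
        // Contended path: register as a waiter and give the scheduler a
        // chance to run the current holder before we block.
        state.waiters.insert(me);
        yield_to_scheduler();
        block_until_unlocked(state);
        state.waiters.remove(&me);
    } else {
        // Uncontended path: the invariant asked about in the review, namely
        // that no one is recorded as waiting if we did not have to block.
        debug_assert!(state.waiters.is_empty());
    }
    state.holder = Some(me);
}

fn main() {
    let mut state = LockState { holder: None, waiters: HashSet::new() };
    lock_model(&mut state, 0);
    assert_eq!(state.holder, Some(0));
}
```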

Comment on lines 69 to 70
// runnable and could have been chosen by the scheduler instead. Also, if we want to
// re-acquiring the lock immediately after having it released, we know that the release

Member:
Suggested change:
- // runnable and could have been chosen by the scheduler instead. Also, if we want to
- // re-acquiring the lock immediately after having it released, we know that the release
+ // runnable and could have been chosen by the scheduler instead. Also, if we want to
+ // re-acquire the lock immediately after releasing it, we know that the release


#[test]
#[should_panic(expected = "nothing to get")]
fn replay_roducer_consumer_broken1() {

Member:
Suggested change:
- fn replay_roducer_consumer_broken1() {
+ fn replay_producer_consumer_broken1() {


#[test]
#[should_panic(expected = "deadlock")]
fn replay_roducer_consumer_broken2() {

Member:
Suggested change:
- fn replay_roducer_consumer_broken2() {
+ fn replay_producer_consumer_broken2() {

add_thread.join().unwrap();
mul_thread.join().unwrap();

let value = *lock.try_lock().unwrap();

Member:
might be slightly more idiomatic/explicit to do

let value = Arc::try_unwrap(lock).unwrap().into_inner().unwrap();

(and same thing in the other tests) -- we're not trying to test anything about try_lock here
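
For illustration, here is the suggested idiom with std's `Arc` and `Mutex`; the test itself would use Shuttle's types, assuming they expose the same `into_inner` API.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let lock = Arc::new(Mutex::new(0u32));

    let adder = {
        let lock = Arc::clone(&lock);
        thread::spawn(move || *lock.lock().unwrap() += 2)
    };
    adder.join().unwrap();

    // After joining, this is the only remaining Arc, so we can unwrap it and
    // consume the Mutex instead of going through lock/try_lock again.
    let value = Arc::try_unwrap(lock).unwrap().into_inner().unwrap();
    assert_eq!(value, 2);
}
```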

#[test]
#[should_panic(expected = "tried to acquire a Mutex it already holds")]
fn double_lock() {
check(|| {

Member:
trying to get rid of the round-robin scheduler:

Suggested change:
- check(|| {
+ check_dfs(|| {


#[test]
fn double_try_lock() {
check(|| {

Member:
Suggested change:
- check(|| {
+ check_dfs(|| {

@jamesbornholt merged commit 8bd99ef into awslabs:main on Jun 28, 2022.
@bkragl deleted the try_lock branch on January 25, 2023.