
Lint for holding locks across await points #5439

Merged: 5 commits into rust-lang:master, Apr 22, 2020
Conversation

rokob (Contributor) commented Apr 8, 2020

Fixes #4226

This introduces the lint await_holding_lock. For async functions, we iterate
over all types in generator_interior_types and look for types named MutexGuard,
RwLockReadGuard, or RwLockWriteGuard. If we find one then we emit a lint.

changelog: introduce the await_holding_lock lint
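For illustration (not part of the PR text), this is the shape of code the lint targets; the `do_work` function is a hypothetical stand-in:

```rust
use std::sync::Mutex;

async fn do_work() {}

async fn flagged(m: &Mutex<u32>) {
    // The guard returned by lock() is still alive at the .await below,
    // so its type ends up in generator_interior_types and gets linted.
    let guard = m.lock().unwrap();
    do_work().await; // await_holding_lock fires on code like this
    println!("{}", *guard);
}
```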

@flip1995 flip1995 requested a review from yaahc April 8, 2020 20:25
@flip1995 flip1995 added the S-waiting-on-review label Apr 8, 2020
yaahc (Member) left a comment

Looks great!

LL | |
LL | | let second = baz().await;
LL | |
... |
Member:

The `...` here, from it skipping all but one of the awaits this guard is held through, is particularly hilarious.

@yaahc yaahc requested a review from Manishearth April 8, 2020 23:44
"Inside an async function, holding a MutexGuard while calling await"
}

const MUTEX_GUARD_TYPES: [&str; 3] = ["MutexGuard", "RwLockReadGuard", "RwLockWriteGuard"];
Member:

These should be full paths in util/paths.rs

Member:

Then we can just use match_def_path

rokob (Contributor, Author):

I went this route to capture:

std::sync::mutex::MutexGuard
futures_locks::MutexGuard
parking_lot::MutexGuard

with one check. Similarly for the others.

I will change to using full paths, as that certainly reduces false positives, though at the cost of false negatives for custom guard types.
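For reference, the full-path approach ends up looking roughly like this (a sketch of Clippy's internals at the time; the constant names and the match_def_path signature here are assumptions and may not match the merged code exactly):

```rust
// util/paths.rs (sketch): one constant per fully-qualified guard type.
pub const MUTEX_GUARD: [&str; 4] = ["std", "sync", "mutex", "MutexGuard"];
pub const RWLOCK_READ_GUARD: [&str; 4] = ["std", "sync", "rwlock", "RwLockReadGuard"];
pub const RWLOCK_WRITE_GUARD: [&str; 4] = ["std", "sync", "rwlock", "RwLockWriteGuard"];
// parking_lot's guard types are actually defined in the lock_api crate.
pub const PARKING_LOT_MUTEX_GUARD: [&str; 3] = ["lock_api", "mutex", "MutexGuard"];

// In the lint (sketch): compare the resolved DefId against each path.
fn is_mutex_guard(cx: &LateContext<'_, '_>, def_id: DefId) -> bool {
    match_def_path(cx, def_id, &paths::MUTEX_GUARD)
        || match_def_path(cx, def_id, &paths::RWLOCK_READ_GUARD)
        || match_def_path(cx, def_id, &paths::RWLOCK_WRITE_GUARD)
        || match_def_path(cx, def_id, &paths::PARKING_LOT_MUTEX_GUARD)
}
```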

Member:

Do we want this in the first place for custom guard types from async executors?

rokob (Contributor, Author):

I initially thought yes, you wanted this for async-aware locks too, but that seems to have been an incorrect assumption on my part. We can limit it to std::sync and parking_lot, as there does not appear to be any disagreement that holding those guards across an await is wrong.

return;
}

for ty_clause in &cx.tables.generator_interior_types {
Member:

how does this work?

rokob (Contributor, Author):

My understanding based on the way check_fn gets called is that cx.tables gets set up here: https://github.com/rust-lang/rust/blob/58dd1ce8383aaebcad9b6027b89a316fd868b35c/src/librustc_lint/late.rs#L184

I am pretty sure these are the same tables as:

let def_id = cx.tcx.hir().local_def_id(hir_id);
let tables = cx.tcx.typeck_tables_of(def_id);

where hir_id is passed into check_fn and cx is the LateContext.

Okay, so that explains where the TypeckTables come from. Then generator_interior_types is a vector of GeneratorInteriorTypeCause. The docs for that type say:

Whenever a value may be live across a generator yield, the type of that value winds up in the GeneratorInteriorTypeCause struct.

So if a type ends up inside one of these TypeCause structs inside said vector then it is being held across a yield (which I am assuming is coming from an await).

Reading this now I realize I used the variable name "ty_clause" when I probably should have used "ty_cause".
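Putting those pieces together, the core of the check is roughly the following (a sketch against the rustc/Clippy APIs of that era, reusing the is_mutex_guard helper sketched above; these internals have since been renamed):

```rust
// Sketch: walk the types held live across yield points and lint on guards.
fn check_interior_types(cx: &LateContext<'_, '_>, ty_causes: &[GeneratorInteriorTypeCause<'_>]) {
    for ty_cause in ty_causes {
        // Only ADTs (structs like MutexGuard) are interesting here.
        if let ty::Adt(adt, _) = ty_cause.ty.kind {
            if is_mutex_guard(cx, adt.did) {
                span_lint(
                    cx,
                    AWAIT_HOLDING_LOCK,
                    ty_cause.span,
                    "this MutexGuard is held across an await point",
                );
            }
        }
    }
}
```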

Member:

Ah, TIL cx.tables gets updated per function; it used to work differently.

jimblandy commented Apr 12, 2020

The original post in #4226 talks about the reactor running other code while the mutex is locked, and while this is true, it's misleading about why that's a problem.

  • In an ordinary multi-threaded program, other code runs while locks are held, too. Nothing else can lock the mutex, so the data it owns is safe. So this isn't a concern.

  • If your task is spawned to another thread, or on a thread pool, then the guard could be dropped on a different thread than it was acquired on, which is UB for most mutex implementations. But MutexGuard isn't Send, so you'd get an error trying to spawn the future on such an executor anyway. So this isn't a concern.

  • Even if your task always stays on the same thread, it might race the lock-holding future against another future that tries to lock the same mutex. std::sync::Mutex panics or deadlocks if the thread that owns it tries to lock it again. This is a legitimate concern.

However, async-std and tokio both provide their own versions of Mutex that should be fine to use in asynchronous code. Their lock operations block asynchronously, they work properly when the same thread attempts to re-lock, and their guards are Send, so futures of code that holds them across awaits can be spawned onto other threads. The lint should recommend using these instead of std::sync::Mutex.

This is another reason for matching the full path of the type: just because it's named Mutex doesn't mean you shouldn't hold it across an await.
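To make the third point concrete, here is a sketch, assuming both futures are polled by a single-threaded executor (`task` and `some_io` are hypothetical names):

```rust
use std::sync::{Arc, Mutex};

async fn some_io() {}

async fn task(m: Arc<Mutex<u32>>) {
    let guard = m.lock().unwrap(); // this task takes the lock...
    some_io().await;               // ...then yields while still holding it.
    // If the executor now polls a second task(m.clone()), its lock() call
    // blocks the whole thread, and this task can never resume to drop the
    // guard: the program deadlocks (or panics, if re-entry is detected).
    drop(guard);
}
```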

rokob (Contributor, Author) commented Apr 14, 2020

@jimblandy Thank you for that extra detail. Everything you say is correct, but I want to add the additional issue as I understand it based on #4893. Most often one does not want to hold a lock across an await even if it is "safe" to do so.

My initial impression was that, usually, one should neither use std::sync::Mutex in async code nor hold a MutexGuard across an await point. The former is a technical issue which can be solved with a different mutex, but the latter is a logic bug that could lead to deadlock. Maybe that is not the intention here.

For my own understanding, do we want to lint against holding a guard across an await point because it is similar to holding a lock and calling out to arbitrary code in non-async code? Or do we want to lint against it because the lock implementation itself has reentrancy issues?

At the very least we should say you don't want to use std::sync::Mutex in this context and put something like you suggested in rust-lang/rust#71072:

help: If you need to hold a mutex guard while you're awaiting, you must use an async-aware version of the Mutex type.
help: Many asynchronous foundation crates provide such a Mutex type.

But should we make the stronger statement that holding any MutexGuard is a sign of a potential logic issue?

jimblandy commented Apr 14, 2020

For my own understanding, do we want to lint against holding a guard across an await point because it is similar to holding a lock and calling out to arbitrary code in non-async code? Or do we want to lint against it because the lock implementation itself has reentrancy issues?

This is a good question - Clippy is meant to give advice, so it makes sense for Clippy to report usage that is usually ill-advised, even if not necessarily wrong.

One of the things users of mutexes are supposed to know is that they are intended for low-contention use. Mutexes are great for short, get-in-and-get-out critical sections. Threads waiting on a mutex aren't woken up in any particular order, so a highly-contended mutex can cause starvation, or unreasonable latency, at least. I think there are other reasons, too.

Awaiting is often/usually waiting for I/O or other external activities to complete. If you're holding a mutex across such an await, that lock may be held for a long time, and you are likely to end up with a highly contended mutex and experience avoidable sadnesses.

So there are several different potential topics for lints here:

  1. Using a std::sync::Mutex in async code:

    a) If the MutexGuard is held across an await, this is certainly a bug, because std::sync::Mutex isn't prepared for other tasks on the same thread to try to lock it. Definite lint.

    b) If the MutexGuard is not held across an await, this could be fine. If the Mutex is used in the classic get-in-and-get-out pattern, this won't even violate the principle that async tasks should not run for a long time. Unclear whether a lint is helpful or noise.

  2. Using an async-aware Mutex, like async_std::sync::Mutex or tokio::sync::Mutex:

    a) If it's not held across an await, it's not clear what the point of using this type is, instead of the standard library's. But seems barely worth linting - maybe they just want to not get in trouble if they add an await in the future.

    b) If it is held across an await, then that's not incorrect, but it suggests that you might end up contending more than you'd like. Maybe lint, pointing out that Mutexes are best for low-contention use?

(Maybe the async-aware Mutexes have additional features to mitigate the consequences of contention? Fair wakeups? If so, I don't see it in the docs.)

But should we make the stronger statement that holding any MutexGuard is a sign of a potential logic issue?

If it's an async-aware MutexGuard, I don't think it is a logic issue. At least, I don't see one. No other task (or thread) is going to get access to the data it owns until it's unlocked, and nobody's going to deadlock/panic trying to take the lock.
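For contrast, a sketch using an async-aware mutex (tokio 0.2's API assumed here: lock() is async and the guard is Send):

```rust
use std::sync::Arc;
use tokio::sync::Mutex;

async fn some_io() {}

async fn task(m: Arc<Mutex<u32>>) {
    // lock() awaits instead of blocking the thread, so other tasks on
    // this thread keep running (and can release the lock) while we wait.
    let mut guard = m.lock().await;
    *guard += 1;
    some_io().await; // holding the guard here is not a correctness bug,
                     // though, as noted above, it can increase contention.
}
```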


Manishearth (Member):

a) If it's not held across an await, it's not clear what the point of using this type is, instead of the standard library's

Backpressure, for one. Plus, the runtime's scheduler can be smarter about it.

tmandry (Member) commented Apr 14, 2020

I agree we should limit the lint to non-async-aware Mutex types (and sorry for leading you down the wrong path before; somehow it didn't occur to me that these had the same name as their non-async counterparts!). I think there are plenty of legitimate use cases for these in async code and am worried about introducing a lint which is too noisy.

More experience may prove me wrong, but I personally don't have data to support the idea that we should lint against using async-aware mutexes in async code.

@rokob rokob force-pushed the lock-await branch 7 times, most recently from 4bbecf4 to ba18dde Compare April 17, 2020 06:53
rokob (Contributor, Author) commented Apr 17, 2020

I updated the language here and changed the method for finding what to lint, so this also captures blocks and closures. While trying to make the diagnostics more specific I came across rust-lang/rust#71137, and rust-lang/rust#71203 is open to fix that. Having that extra span would make the diagnostics here nicer, but it's okay without it too.

@flip1995 flip1995 requested a review from Manishearth April 17, 2020 14:16
Manishearth (Member):

@bors r+

thanks!

bors (Contributor) commented Apr 22, 2020

📌 Commit ba18dde has been approved by Manishearth

bors (Contributor) commented Apr 22, 2020

⌛ Testing commit ba18dde with merge 2861c78...

bors added a commit that referenced this pull request Apr 22, 2020
Lint for holding locks across await points

bors (Contributor) commented Apr 22, 2020

💔 Test failed - checks-action_test

rokob added 3 commits April 21, 2020 21:07
flip1995 (Member):

@bors r=Manishearth

bors (Contributor) commented Apr 22, 2020

📌 Commit 8b052d3 has been approved by Manishearth

bors (Contributor) commented Apr 22, 2020

⌛ Testing commit 8b052d3 with merge 1d4dd3d...

bors (Contributor) commented Apr 22, 2020

☀️ Test successful - checks-action_dev_test, checks-action_remark_test, checks-action_test
Approved by: Manishearth
Pushing 1d4dd3d to master...

@bors bors merged commit 1d4dd3d into rust-lang:master Apr 22, 2020
bors added a commit that referenced this pull request Oct 25, 2020
Add lint for holding RefCell Ref across an await

Fixes #6008

This introduces the lint await_holding_refcell_ref. For async functions, we iterate
over all types in generator_interior_types and look for `core::cell::Ref` or `core::cell::RefMut`. If we find one then we emit a lint.

Heavily cribs from: #5439

changelog: introduce the await_holding_refcell_ref lint
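By analogy with the examples above, the shape of code that lint flags would be roughly (an illustration, not taken from that PR):

```rust
use std::cell::RefCell;

async fn some_io() {}

async fn flagged(cell: &RefCell<u32>) {
    // The RefMut is live across the .await, so any borrow attempted by
    // another task on this thread in the meantime panics at runtime.
    let mut b = cell.borrow_mut();
    some_io().await; // await_holding_refcell_ref fires on code like this
    *b += 1;
}
```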