
Implement reentrance detection for std::sync::Once #72311

Closed
wants to merge 3 commits

Conversation

nbdd0121
Contributor

No description provided.

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 18, 2020
@nbdd0121 nbdd0121 changed the title Implement reentrance detection to std::sync::Once Implement reentrance detection for std::sync::Once May 18, 2020
@ecstatic-morse
Contributor

ecstatic-morse commented May 18, 2020

Spinning unconditionally is risky on operating systems with pre-emptive schedulers even if the critical section is short. Admittedly Once only needs to lock once during its lifetime, so the risk is somewhat mitigated in this case. Is it possible to implement reentrance detection while remaining lock-free?

@nbdd0121
Contributor Author

nbdd0121 commented May 18, 2020

> Spinning unconditionally is risky on operating systems with pre-emptive schedulers even if the critical section is short. Admittedly Once only needs to lock once during its lifetime, so the risk is somewhat mitigated in this case. Is it possible to implement reentrance detection while remaining lock-free?

I would like it to remain lock-free as well but I don't think it's possible. The main hurdle is that the ThreadId must be stored somewhere.

  • We definitely do not want it to be kept with Once, because that would add at least a usize of overhead to every Once variable, even though it is only used during initialization.
  • It cannot be a global static, because a thread is allowed to recursively initialize different Onces.

Therefore it must be kept on the stack of the initializing thread (or on the heap, but the initializing thread would own it anyway, so the lifetime is the same). So state_and_queue would need to point somewhere on this thread's stack (in my PR this structure is called WaiterQueue).

Now we need to prevent WaiterQueue from being freed while other threads might still reference it. If we want to remain lock-free, a thread would need to atomically read the thread ID, compare it, and insert itself into the linked list if there is no reentrancy. That requires reading at least 2 usizes atomically, which is not possible. The only sensible approach is therefore to use a lock.

I chose a spin-lock because it happens to be encodable in the usize bits, and I think the risk is acceptable because the critical region is short and this is the cold path (as I mentioned in the commit message).
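To make "encodable in the usize bits" concrete, here is a hedged sketch of the general trick; the constants, function names, and the 0x1000 address are illustrative, not the PR's actual encoding. An aligned pointer always has its low bits clear, so those bits can hold state, and one of them can serve as a tiny spin-lock bit:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Queue nodes are assumed to be at least 4-byte aligned, so the two low
// bits of a node pointer are always zero and free to carry state.
const STATE_MASK: usize = 0b11;
const LOCKED: usize = 0b01;

fn pack(ptr: usize, state: usize) -> usize {
    assert_eq!(ptr & STATE_MASK, 0, "pointer must be 4-byte aligned");
    ptr | state
}

fn unpack(word: usize) -> (usize, usize) {
    (word & !STATE_MASK, word & STATE_MASK)
}

// Spin until the LOCKED bit can be set; return the previous word value.
fn spin_lock(word: &AtomicUsize) -> usize {
    loop {
        let cur = word.load(Ordering::Relaxed);
        if cur & LOCKED == 0
            && word
                .compare_exchange_weak(cur, cur | LOCKED, Ordering::Acquire, Ordering::Relaxed)
                .is_ok()
        {
            return cur;
        }
        std::hint::spin_loop();
    }
}

fn spin_unlock(word: &AtomicUsize) {
    word.fetch_and(!LOCKED, Ordering::Release);
}

fn main() {
    let word = AtomicUsize::new(pack(0x1000, 0));
    let prev = spin_lock(&word);
    assert_eq!(unpack(prev), (0x1000, 0));
    assert_eq!(unpack(word.load(Ordering::Relaxed)), (0x1000, LOCKED));
    spin_unlock(&word);
    assert_eq!(unpack(word.load(Ordering::Relaxed)), (0x1000, 0));
    println!("ok");
}
```

The same word can therefore hold a waiter-queue pointer and the lock/state bits at once, which is why no extra storage is needed for the lock.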

An alternative way is to use a global mutex to protect it.

BTW, some background on how most C++ runtimes implement lazily initialized function-local static variables: the "once guard" only holds the state, and a global reentrant mutex ensures that only one thread may run initializers at a time (even for different variables). This approach is very simple and probably requires no unsafe code to implement in Rust. It disallows multiple threads from initializing different lazy variables concurrently, but it can thus detect all possible deadlocks.
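A hedged sketch of that C++-style scheme in safe Rust (INIT_OWNER and with_init_lock are hypothetical names; this models only the global lock and its reentrancy check, not the per-variable guard state): one global slot records which thread is currently running an initializer, so re-entry by the owning thread can be reported instead of self-deadlocking.

```rust
use std::sync::Mutex;
use std::thread::{self, ThreadId};

// Global "initializer lock": Some(id) means thread `id` is running an
// initializer; None means the lock is free. Shared by ALL lazy variables.
static INIT_OWNER: Mutex<Option<ThreadId>> = Mutex::new(None);

// Run `f` as a one-time initializer body, erroring on reentrancy.
fn with_init_lock(f: impl FnOnce()) -> Result<(), &'static str> {
    let me = thread::current().id();
    loop {
        let mut owner = INIT_OWNER.lock().unwrap();
        match *owner {
            // Same thread re-entered: a plain mutex would deadlock here,
            // which is exactly what the reentrancy check turns into an error.
            Some(id) if id == me => return Err("recursive initialization"),
            // Another thread is initializing: wait our turn.
            Some(_) => {
                drop(owner);
                thread::yield_now();
            }
            None => {
                *owner = Some(me);
                break;
            }
        }
    }
    f();
    *INIT_OWNER.lock().unwrap() = None;
    Ok(())
}

fn main() {
    // Re-entering from inside an initializer is detected, not deadlocked.
    assert!(with_init_lock(|| {
        assert!(with_init_lock(|| {}).is_err());
    })
    .is_ok());
    // After the initializer finishes, the lock is free again.
    assert!(with_init_lock(|| {}).is_ok());
    println!("ok");
}
```

Because the lock is global, two threads initializing unrelated variables serialize on it, which is the trade-off mentioned above.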

@ecstatic-morse
Contributor

ecstatic-morse commented May 18, 2020

> I would like it to remain lock-free as well but I don't think it's possible. The main hurdle is that the ThreadId must be stored somewhere.

I also don't have a good solution for this.

Personally, I would not choose to sacrifice lock-freedom, especially when the new lock is a spinlock, to be able to detect reentrant call_once invocations. OTOH, it seems like a good idea to detect potential misuse in the standard library, even if it may plausibly impact performance, since the average user probably cares more about finding bugs. Those that do want the lock-free implementation can always switch to a crate.

@nbdd0121
Contributor Author

TBH I wouldn't be too worried about performance of having a lock:

  • As mentioned, it is on a cold path and executed at most once per thread for each Once variable.
  • The only two places where this lock is acquired are wait and WaiterQueueGuard's Drop impl: in the first case the thread will park afterwards, which is implemented as a condvar wait, so it needs to hold a lock anyway; in the second case, if it races with other threads, there are other threads sleeping, so it needs to unpark them, which requires acquiring a mutex as well!

So the only thing to worry about is the spinlock. It can always be switched to a global mutex if that's the consensus.

@ecstatic-morse
Contributor

ecstatic-morse commented May 18, 2020

For passers-by, here's a blog post that explains the problems with user-space spin-locks. That one is Rust-specific, but many other sources come to the same conclusion. Of course, triggering the pathological behavior is more difficult in this case since it can only happen during initialization.

parking_lot::Mutex would be perfect here, since it is only a single byte. Unfortunately we can't use it in the standard library yet.

(Review thread on src/libstd/sync/once.rs — outdated, resolved)
@Diggsey
Contributor

Diggsey commented May 20, 2020

There's another strategy which does not require any additional locking:

  • Store a separate "chain" in a scoped thread-local variable.
  • Each link in the thread-local chain stores a pointer to a Once value.
  • Before executing the initialization function, push a pointer to the current Once onto the thread-local chain.
  • Before parking the current thread, iterate the chain to detect if the current Once value is already present.

Effectively we are walking up the call-stack to check if the same stack frame already exists further up. This only affects the cold path.
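A minimal, safe sketch of the steps above (call_once_checked and IN_PROGRESS are illustrative names; a thread-local Vec of addresses stands in for the intrusive chain of Once pointers):

```rust
use std::cell::RefCell;

thread_local! {
    // Addresses of the Once values this thread is currently initializing;
    // each entry corresponds to one in-progress call_once frame on the stack.
    static IN_PROGRESS: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

fn call_once_checked(addr: usize, f: impl FnOnce()) -> Result<(), &'static str> {
    // Walk the chain: is this Once already being initialized further up
    // this thread's call stack?
    if IN_PROGRESS.with(|c| c.borrow().contains(&addr)) {
        return Err("reentrant call_once detected");
    }
    IN_PROGRESS.with(|c| c.borrow_mut().push(addr));
    f(); // NOTE: a real implementation would pop via a drop guard so that a
         // panicking initializer still unwinds the chain correctly.
    IN_PROGRESS.with(|c| {
        c.borrow_mut().pop();
    });
    Ok(())
}

fn main() {
    // Recursively initializing *different* values stays legal...
    assert!(call_once_checked(1, || {
        assert!(call_once_checked(2, || {}).is_ok());
    })
    .is_ok());
    // ...while re-entering the same one is caught instead of deadlocking.
    assert!(call_once_checked(3, || {
        assert!(call_once_checked(3, || {}).is_err());
    })
    .is_ok());
    println!("ok");
}
```

No cross-thread synchronization is added: the chain is purely thread-local, and other threads still synchronize through the existing state word.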

@nbdd0121
Contributor Author

> There's another strategy which does not require any additional locking:
>
>   • Store a separate "chain" in a scoped thread-local variable.
>   • Each link in the thread-local chain stores a pointer to a Once value.
>   • Before executing the initialization function, push a pointer to the current Once onto the thread-local chain.
>   • Before parking the current thread, iterate the chain to detect if the current Once value is already present.
>
> Effectively we are walking up the call-stack to check if the same stack frame already exists further up. This only affects the cold path.

Indeed a very interesting approach! The one caveat is that if we have a chain of n nested call_onces, the performance is O(n²) rather than O(n). Checking the ThreadId, on the other hand, is constant-time, so it wouldn't make the complexity worse.

Given that parking/unparking would need to take locks anyway, I think we could tolerate the use of a lock.

@nbdd0121
Contributor Author

cc @matklad, because the implementation is relevant to OnceCell as well.

@dtolnay dtolnay added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 28, 2020
@Elinvynia Elinvynia added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 3, 2020
@Dylan-DPC-zz

r? @dtolnay

@rust-highfive rust-highfive assigned dtolnay and unassigned sfackler Jun 24, 2020
@dtolnay
Member

dtolnay commented Jun 25, 2020

I am not immediately sold on this change, but let me check if others feel we should do this.

@rfcbot poll libs Make std::sync::Once panic instead of deadlock on re-entrant initialization at the expense of complicating the implementation

use std::sync::Once;

static ONCE: Once = Once::new();

fn main() {
    ONCE.call_once(|| ONCE.call_once(|| {}));
}
thread 'main' panicked at 'Once instance cannot be recursively initialized', src/libstd/sync/once.rs:417:39

@rfcbot

rfcbot commented Jun 25, 2020

Team member @dtolnay has asked teams: T-libs, for consensus on:

Make std::sync::Once panic instead of deadlock on re-entrant initialization at the expense of complicating the implementation

@withoutboats
Contributor

I think it's a net good to turn programmer errors from deadlocks into panics, but I'd want to see concretely the other side of this trade-off. It seems like the thread hasn't landed on a consensus for the best implementation. Once it's clear what the trade-off is, I think we should consider whether this change is worth it.

@Muirrum Muirrum added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 17, 2020
@bors
Contributor

bors commented Jul 18, 2020

☔ The latest upstream changes (presumably #74468) made this pull request unmergeable. Please resolve the merge conflicts.

@spastorino
Member

spastorino commented Jul 22, 2020

This is right now tagged as waiting on team libs-impl but I think it's really waiting on T-libs team consensus. Going to add T-libs label but please feel free to correct me if I'm wrong.

@spastorino spastorino added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jul 22, 2020
@nbdd0121
Contributor Author

Rebased to solve conflicts from #72414. This PR now automatically detects recursive initialization of SyncLazy as well.

#![feature(once_cell)]
use std::lazy::SyncLazy;
static A: SyncLazy<u32> = SyncLazy::new(|| *A);
fn main() {
    println!("{}", *A);
}

gives

thread 'main' panicked at 'Once instance cannot be recursively initialized', src/libstd/sync/once.rs:418:39
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

@bors
Contributor

bors commented Jul 28, 2020

☔ The latest upstream changes (presumably #73265) made this pull request unmergeable. Please resolve the merge conflicts.

The old implementation uses a lock-free linked list. This new
implementation uses a spin-lock, which allows extra auxiliary
information to be passed from the running thread to other waiting
threads.

Using a spin-lock may sound horrible, but it is totally fine in this
scenario: the critical region it protects is very short (just adding
a node to the linked list), and it is only executed on the cold path
(only if the Once isn't already completed).
Let the fast path (COMPLETE) use the constant 0.
This may only reduce code size by 1 byte on x86/x86-64, but it can
save an instruction on RISC ISAs.
@crlf0710 crlf0710 added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels Aug 14, 2020
@crlf0710 crlf0710 added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels Sep 4, 2020
@Dylan-DPC-zz

Created a thread on Zulip to discuss this further.

@Amanieu
Member

Amanieu commented Sep 13, 2020

I think that we definitely don't want to introduce spin-locking in the standard library. Have you considered trying to implement re-entrance detection on parking_lot::Once? We could try reviving the effort to replace the standard library primitives with parking_lot.

@Dylan-DPC-zz Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels Sep 15, 2020
@camelid camelid added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 7, 2020
@Dylan-DPC-zz

@nbdd0121 any updates on this?

@Dylan-DPC-zz

@nbdd0121 thanks for taking the time to contribute. I have to close this due to inactivity. If you wish and you have the time you can open a new PR with these changes and we'll take it from there. Thanks

@Dylan-DPC-zz Dylan-DPC-zz added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 16, 2020