
Remove lazy_static dependency #52

Merged: 6 commits merged into rust-random:master on Jul 4, 2019
Conversation

@josephlr (Member) commented Jul 1, 2019

EDIT1: We make the following changes:

  • Replace wasm32-stdweb's use of lazy_static with static mut and std::sync::Once
  • Add a sync_init method to LazyUsize complementing unsync_init.
  • Add a LazyFd abstraction around LazyUsize to work with Option<i32> values, where we either have None or Some(x) with x non-negative.
  • Use the LazyFd in use_file, eliminating our last use of lazy_static.

EDIT2: Note that this does not try to remove all std dependencies from use_file.rs. That is left for a follow-up change.

@Centril commented Jul 2, 2019

cc @RalfJung, @eddyb -- can you please have a look at this in terms of static mut? (rust-lang/rust#62082 wants to use getrandom in libstd)

@Centril commented Jul 2, 2019

also cc @alexcrichton

[Review thread on src/use_file.rs (outdated, resolved)]
@josephlr (Member, Author) commented Jul 2, 2019

Here's my basic idea for adding a one-time synchronization method to LazyUsize.

// Runs the init() function exactly once.
pub fn sync_init(&self, init: impl FnOnce() -> usize, mut wait: impl FnMut()) -> usize {
    loop {
        match self.0.compare_and_swap(Self::UNINIT, Self::ACTIVE, Relaxed) {
            Self::UNINIT => {
                let val = init();
                self.0.store(val, Relaxed);
                return val;
            }
            Self::ACTIVE => wait(),
            val => return val,
        }
    }
}

Where wait would be sleep_ms(1) or something, and:

// The initialization is not completed.
pub const UNINIT: usize = usize::max_value();
// The initialization is currently running.
pub const ACTIVE: usize = usize::max_value() - 1;

Then we can just use the fact that non-negative RawFds will fit in the range [0, usize::max_value() - 2].

@newpavlov (Member):
Looks good! Small nitpicks:

  • I think we can use impl Fn() for wait and init functions.
  • I am not sure about Relaxed ordering. Especially considering that you will use it together with LazyBool/LazyUsize on some targets. @HadrienG2 what do you think?
  • Zero is a valid value for a file descriptor. It does not matter for the provided snippet, but you've used "non-zero RawFd's" in your comment.

@HadrienG2 commented Jul 2, 2019

@newpavlov Relaxed is correct for synchronizing atomic accesses to a single memory location, which I understand is what you are doing here.

If you want to synchronize accesses to multiple memory locations (e.g. non-atomically writing data to a shared UnsafeCell and using a separate shared AtomicBool flag to tell other threads when the write is done), then you need stronger memory orderings like Acquire/Release pairs.

@josephlr (Member, Author) commented Jul 2, 2019

  • I think we can use impl Fn() for wait and init functions.

Isn’t FnMut a weaker trait bound than Fn?

  • I am not sure about Relaxed ordering. Especially considering that you will use it together with LazyBool/LazyUsize on some targets. @HadrienG2 what do you think?

Does this matter if we’re only using a single atomic value?

  • Zero is a valid value for file descriptor. It does not matter for the provided snippet, but you've used "non-zero RawFd's" in your comment.

Typo, now fixed.

@newpavlov (Member):
Isn’t FnMut a weaker trait bound than Fn?

Yes, but our callbacks will not contain any state, so the stronger bound will work as well. It's just my preference; you can leave it as-is.

Does this matter if we’re only using a single atomic value?

On some targets you use it together with LazyBool/LazyUsize, which also use relaxed load/store under the hood. But I guess that since the use of the second atomic sits behind a branch whose execution explicitly depends on a certain non-default value of the first atomic, it should not create any problems.

@dhardy (Member) commented Jul 2, 2019

Good point. I think on Linux we could just use a single AtomicUsize if we allow another special value to imply "use getrandom syscall", but on Solaris I believe we will need either multiple atomic vars or one "gate" atomic plus one or more data statics.

@HadrienG2 commented Jul 2, 2019

Does this matter if we’re only using a single atomic value?

On some targets you use it together with LazyBool/LazyUsize, which also use relaxed load/store under the hood. But I guess that since the use of the second atomic sits behind a branch whose execution explicitly depends on a certain non-default value of the first atomic, it should not create any problems.

I'm not sure if this is what you are talking about, but loads depending on relaxed loads can actually still cause problems:

if a.load(Ordering::Relaxed) == SOMETHING {
    return b.load(Ordering::Relaxed);
}

Because compiler and CPU trickery can effectively transform the program into this:

let tmp = b.load(Ordering::Relaxed);
if a.load(Ordering::Relaxed) == SOMETHING {
    return tmp;
}

(Not to mention that on the producer side, writes can also be reordered)

If that kind of transformation is undesirable, as is the case when using an atomic as a "ready flag" to tell that other shared memory regions were written to, then you need Acquire when loading from a on the consumer side and Release when writing to it on the producer side.

@dhardy (Member) commented Jul 2, 2019

Thanks @HadrienG2 — I believe this example posted earlier is correct however.

@HadrienG2 commented Jul 2, 2019

@dhardy Yup, it uses stronger Acquire/Release memory orderings to avoid the aforementioned reordering effects.

As pointed out by @RalfJung on that example, using Relaxed ordering on the CAS instead of AcqRel ordering would actually be fine as long as init_data() does not read shared data that is synchronized as a side-effect of using an Acquire load on the flag.

@newpavlov (Member):
@HadrienG2
Our code can be represented as:

// `a` is initialized with C1, and `b` with `C2`
let val = match a.load(Relaxed) {
    C1 => {
        let val = init1();
        a.store(val, Relaxed);
        val
    }
    val => val,
};

if val == 1 {
    do_stuff1();
} else {
    loop {
        let val = match b.compare_and_swap(C2, C3, Relaxed) {
            C2 => {
                let val = init2();
                b.store(val, Relaxed);
                val
            }
            C3 => continue, // another thread is initializing; spin
            val => val,
        };
        do_stuff2(val);
        break;
    }
}

At first glance it shouldn't have any problems, since compare_and_swap is a side-effect operation and cannot be executed before the value of the first atomic is checked.

@newpavlov (Member):
BTW compare_and_swap compiles down to a lock instruction, while a relaxed load compiles to a simple mov. Maybe it's worth adding code like this before the loop?

let val = self.0.load(Relaxed);
if val != Self::UNINIT && val != Self::ACTIVE {
    return val;
}
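That fast-path check can be sketched in isolation (hypothetical code; the UNINIT/ACTIVE sentinels and the fast_path helper are illustrative, assuming the constants discussed earlier in the thread):

```rust
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

const UNINIT: usize = usize::MAX;
const ACTIVE: usize = usize::MAX - 1;

// A plain Relaxed load compiles to a simple mov on x86, so an already
// initialized value is returned without touching the lock'd CAS at all.
fn fast_path(cell: &AtomicUsize) -> Option<usize> {
    let val = cell.load(Relaxed);
    if val != UNINIT && val != ACTIVE {
        Some(val) // initialized: skip the CAS loop entirely
    } else {
        None // uninitialized or in progress: fall through to the CAS loop
    }
}

fn main() {
    let pending = AtomicUsize::new(UNINIT);
    let done = AtomicUsize::new(7);
    assert_eq!(fast_path(&pending), None);
    assert_eq!(fast_path(&done), Some(7));
}
```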

@dhardy (Member) commented Jul 2, 2019

@newpavlov IIUC x86 is strongly ordered, implying most acquire/release boundaries are unnecessary anyway. You should therefore check the generated code for another target like ARM.

In your example, I think it doesn't actually matter if b is loaded before a is stored anyway — they are independent. (But if not, the read of a must be acquire and the store to a must be release, I believe.)

@HadrienG2 commented Jul 3, 2019

@dhardy As a minor aside, I would advise caution against using "what the hardware does" as a reference for atomic ordering correctness, because the compiler can introduce additional reorderings on top of those that hardware performs, in a manner that can change in future releases even on a given architecture.


@newpavlov Here's my understanding/analysis of the code summary that you posted:

// `a` is initialized with C1
let val = match a.load(Relaxed) {
    C1 => {
        let val = init1();
        a.store(val, Relaxed);
        val
    }
    val => val,
};

First, a number of threads will read C1 from a, concurrently execute init1(), and race to write the output of that function in a. From my understanding, the correctness conditions at this point are:

  • It must be correct for multiple executions of init1() to occur in parallel
  • It must be correct for only one output of these executions to be remembered and the others silently discarded (but still upheld by the threads that produced them).
  • From my understanding of the code's logic, init1() must probably also be prevented from outputting the value C1, and I will assume that this is true in the following.
if val == 1 {
    do_stuff1();
} else {
    /* ... */
}

After this first initialization stage, any thread which reads 1 from a, or produces that value and writes it to a, will then execute do_stuff1() without synchronizing with the init1() call that preceded the initialization of a. This is my understanding of the correctness conditions here:

  • Either do_stuff1() must not depend on any side-effect/write to shared memory from init1(), or the two functions must implement their own synchronization protocol. If neither of these holds, then you must provide the synchronization yourself by using a.load(Ordering::Acquire) and a.store(val, Ordering::Release) in the previous code snippet.
  • It must be correct for do_stuff1() and init1() to run in parallel.
  • If multiple calls to init1() can return different values, then it must be correct for do_stuff1() and the entire loop from the else branch to run in parallel.
if val == 1 {
    /* ... */
} else {
    // `b` is initialized with `C2`
    loop {
        let val = match b.compare_and_swap(C2, C3, Relaxed) {
            C2 => {
                let val = init2();
                b.store(val, Relaxed);
                val
            }
            C3 => continue, // another thread is initializing; spin
            val => val,
        };
        do_stuff2(val);
        break;
    }
}

If a does not end up holding the value 1 but another value (which again I assume to be different from C1), then the threads move to a more classic lazy initialization logic where they race to replace the initial value of b, called C2, with another special value called C3. The winner of the race gets the privilege of running init2() and writing its output to b, while the other threads wait in a spin loop. Assuming that the output of init2() is neither C2 nor C3, every thread involved will then proceed to call do_stuff2() with that output as its parameter, without synchronizing with the side-effects of init2(). Thus, my understanding is that...

  • It must be correct for init1() and either of init2() or do_stuff2() to run in parallel.
  • If init1() is allowed to sometimes return 1 and sometimes return something else, then it must additionally be correct for do_stuff1() and either of init2() or do_stuff2() to run in parallel.
  • Either do_stuff2() must not depend on any side-effect of init2(), or these side effects must be taken care of by the two functions. If that's not the case, then you must also switch to an Acquire/Release synchronization protocol here (Acquire on the CAS, Release on the store).
  • It is almost certainly incorrect for init2() to return either C2 or C3.

Hope this attempt at summarizing the underlying assumptions helps.


On an unrelated note, I agree that you should probably check the value of b before doing a compare and swap. From my personal microbenchmarks, atomic read-modify-write instructions like CAS are ~10x more expensive than loads and stores on x86 in serial code, and here you're additionally running them in a loop on multiple threads which will additionally cause cache invalidation and contention...

@josephlr josephlr force-pushed the static branch 3 times, most recently from 1548e7f to 3615137 Compare July 4, 2019 00:43
@josephlr (Member, Author) commented Jul 4, 2019

Hey everyone, I've updated this PR and the description to incorporate your recommendations, and I split up the commits to make review easier.

@HadrienG2 thanks so much for the rundown. I've tried to address all of your points by explicitly putting your concerns in the comments. BTW 8566c39 contains the changes to LazyUsize and sync_init(), which are the tricky parts of this PR. The other commits are mostly straightforward wrappers.

  • It must be correct for multiple executions of init1() to occur in parallel
  • It must be correct for do_stuff1() and init1() to run in parallel.
  • It must be correct for init1() and either of init2() or do_stuff2() to run in parallel.
  • If multiple calls to init1() can return different values, then it must be correct for do_stuff1() and the entire loop from the else branch to run in parallel.

For our implementation, these parallelism assumptions are correct. Also, in the documentation on unsync_init, I note that the init() function could run in parallel with other callers.

  • If init1() is allowed to sometimes return 1 and sometimes return something else, then it must additionally be correct for do_stuff1() and either of init2() or do_stuff2() to run in parallel.

In the documentation on unsync_init, I note that the init() function should always return the same value, if it succeeds. This is the case for all of our current use cases.

  • It must be correct for only one output of these executions to be remembered and the others silently discarded (but still upheld by the threads that produced them).
  • Either do_stuff1() must not depend on any side-effect/write to shared memory from init1(), or the two functions must implement their own synchronization protocol to synchronize with each other. If both of these sentences are incorrect, then you must provide your own synchronization by using a.load(Ordering::Acquire) and a.store(Ordering::Release) in the previous code snippet.
  • Either do_stuff2() must not depend on any side_effect of init2() or these side effects must be taken care of by the two functions. If that's not the case, then you must also switch to an Acquire/Release synchronization protocol here (Acquire on the CAS, Release on the store).

I note in the description of LazyUsize that we should only depend on the value returned by init() (along with including a code example). This is the case for all of our current use cases. We only depend on the value of init(), and code in init() that runs before we generate the value.

  • From my understanding of the code's logic, init1() must probably also be prevented from outputting the value C1, and I will assume that this is true in the following.
  • It is almost certainly incorrect for init2() to return either C2 or C3.

Agreed on init2() returning C2, which I now mention in the comments for sync_init. However, I think it should be fine for init() to return UNINIT. This will just return UNINIT to the caller, and future calls to sync_init and unsync_init will simply run init() again.

My comments on LazyUsize explain this "retrying" behavior, and the assumptions necessary to make it work. PTAL
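That retry-on-UNINIT behavior falls out naturally, since a "failed" init() stores the same sentinel the cell started with. A hypothetical sketch, assuming an unsync_init along the lines of the reviewed code (names and values illustrative):

```rust
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

const UNINIT: usize = usize::MAX;

struct LazyUsize(AtomicUsize);

impl LazyUsize {
    // If init() returns UNINIT, the stored value equals the sentinel,
    // so nothing is effectively cached and the next call retries.
    fn unsync_init(&self, init: impl FnOnce() -> usize) -> usize {
        let mut val = self.0.load(Relaxed);
        if val == UNINIT {
            val = init();
            self.0.store(val, Relaxed);
        }
        val
    }
}

fn main() {
    let lazy = LazyUsize(AtomicUsize::new(UNINIT));
    // A failing init() returns UNINIT, which is indistinguishable from
    // "never initialized", so the next call runs init() again.
    assert_eq!(lazy.unsync_init(|| UNINIT), UNINIT);
    assert_eq!(lazy.unsync_init(|| 9), 9);
    // Now the value is cached; init() is not run again.
    assert_eq!(lazy.unsync_init(|| unreachable!()), 9);
}
```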

On an unrelated note, I agree that you should probably check the value of b before doing a compare and swap.

Done, and good idea.

@HadrienG2:
LGTM :)

[Review thread on src/use_file.rs (outdated, resolved)]
@newpavlov newpavlov merged commit 0c72017 into rust-random:master Jul 4, 2019
@newpavlov (Member):

Regarding CI tests for Redox we can bring them back after rust-lang/libc#1420 gets fixed.

@josephlr josephlr deleted the static branch July 4, 2019 22:24
// multiple times. If init() returns UNINIT, future calls to unsync_init()
// will always retry. This makes UNINIT ideal for representing failure
// of init(). Multiple callers can run their init() functions in parallel.
// init() should always return the same value, if it succeeds.
pub fn unsync_init(&self, init: impl FnOnce() -> usize) -> usize {
    // Relaxed ordering is fine, as we only have a single atomic variable.


This assumes that a file descriptor consists only of that integer and no other user-space data. I suppose that is accurate on Unixes?

Member Author:

Yup, this is correct: libc::open always returns a non-negative int on success (the link is to the Linux docs, but the function is a POSIX standard).

So a file descriptor will always fit in a non-negative int on Unix.

match self.0.compare_and_swap(Self::UNINIT, Self::ACTIVE, Relaxed) {
    Self::UNINIT => {
        let val = init();
        self.0.store(val, Relaxed);


IMO it would be a good idea to assert! here that init did not return UNINIT or ACTIVE. This is not hot code, so always asserting seems fine.


Oh, seems like the function is actually allowed to return UNINIT to indicate "just try again"? I guess this is implied by the doc comment not ruling out that case, but might be worth calling out explicitly.

ACTIVE can still be asserted.

Member Author:

Oh, seems like the function is actually allowed to return UNINIT to indicate "just try again"? I guess this is implied by the doc comment not ruling out that case, but might be worth calling out explicitly.

This is called out, just in the docs for LazyUsize.

// Both methods support init() "failing". If the init() method returns UNINIT,
// that value will be returned as normal, but will not be cached.

ACTIVE can still be asserted.

Good idea. Thanks @RalfJung for taking a look at this stuff.

Member:

I think it may be better to write something like this:

let mut val = init();
if val == Self::ACTIVE { val = Self::UNINIT }
self.0.store(val, Relaxed);

And describe this behavior in docs.

None => LazyUsize::UNINIT,
},
|| unsafe {
libc::usleep(1000);


Wow, that's a long sleep, isn't it? A comment always seems warranted with such a magic number.

Member Author:

Yeah, this sleep should probably be shorter; any recommendations? My guess is that we would want it to be longer than the normal latency of an open(2) syscall.

@newpavlov (Member) commented Jul 11, 2019

On my laptop libc::open takes ~7µs and on the playground ~8µs, so I think 10µs would be a good value. (It's a very coarse measurement, since it also includes the time to retrieve the current time.)


I'm sorry this is outside my range of experience.^^ 1ms just seemed like a long time. /dev/random doesn't even do "real" I/O, so it shouldn't have huge latency.

Member:

It can be a huge latency if the system entropy pool has not been initialized; IIRC up to several minutes.


Sure, but that's not the common case.

Ideally wait would do something like double the wait time each time around the loop, starting at e.g. 1µs and maxing out at 1s or so. But that seems like overkill.^^
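That doubling-wait idea could be sketched as follows (hypothetical; the Backoff struct and the numbers are illustrative, not something the PR implements):

```rust
use std::thread;
use std::time::Duration;

// Exponential backoff: double the sleep on each call, starting at 1µs
// and capping at 1s as suggested above.
struct Backoff {
    micros: u64,
}

impl Backoff {
    fn new() -> Self {
        Backoff { micros: 1 }
    }

    fn wait(&mut self) {
        thread::sleep(Duration::from_micros(self.micros));
        self.micros = (self.micros * 2).min(1_000_000); // cap at 1s
    }
}

fn main() {
    let mut b = Backoff::new();
    // In real code this would be the `wait` closure passed into the
    // sync_init() spin loop.
    for _ in 0..10 {
        b.wait();
    }
    // After 10 calls the delay has doubled from 1µs to 1024µs.
    assert_eq!(b.micros, 1024);
}
```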

@josephlr (Member, Author):

@newpavlov @RalfJung this is a good discussion, but might be better had on #60 instead of this closed PR.
