Figure out a story for non-determinism and external communication #653

gnzlbg · 2019-03-07T18:01:04Z

extern crate rand;

pub fn foo() {
    let _ = rand::thread_rng();
}

fn main() {
    foo();
}

yet many tests use a random number generator. We might want to intercept the FFI calls involved in this, and do something with them (probably forward them? no idea).

RalfJung · 2019-03-08T09:31:44Z

Yeah -- this is blocked on having some flag and infrastructure for non-determinism.

Let's re-purpose this issue to discuss that.

Related issues: #224 #641

RalfJung · 2019-03-08T09:44:17Z

So, here are my thoughts on that: Miri will have two modes. One is deterministic, that's the current mode. Then there is the non-deterministic mode. In non-deterministic mode, there is a seed that is used as the source for all non-determinism in Miri -- IOW, Miri has an RNG that it seeds and then uses whenever the program makes a non-deterministic choice.

In non-deterministic mode, when the NR_GETRANDOM syscall is made, we generate a new random number from Miri's RNG and return that. We also (eventually) use this same RNG to pick base a address for allocations, so that we can support the full range of pointer arithmetic. If/when Miri supports running multiple threads, this same RNG will control the behavior of the scheduler.

Ideally, we'd have the following invariants:

Miri with a fixed seed is still deterministic, meaning its behavior is fully reproducible (across platforms). However we do not guarantee that different versions of Miri will make the same program run the same way with a fixed seed! More non-deterministic choices might be added, so this guarantee of reproducibility only applies to a fixed version of Miri.
When deterministic Miri runs a program without error, then the program will behave the same under non-deterministic Miri.
Even stronger: the only case where deterministic and non-deterministic Miri differ in behavior is when deterministic Miri shows an Error in the "unsupported" category (see Improve error message: Unsupported operation vs. undefined behavior #417).

Some problems / open questions:

Can't call foreign function: clock_gettime #641 is about supporting clock_gettime, that would be a kind of non-determinism not governed by the RNG and hence would violate invariant (1). Should such operations be subject to yet another flag? We'd have three "levels" of Miri: deterministic, seeded non-determinism, and the last level would also allow interaction with the outside world and entirely give up on reproducibility.
Our current pointer equality test violates invariant (2). Maybe we should just declare pointer equality unsupported in almost all cases when running in deterministic mode?
What is the default mode? At least for cargo miri, probably the mode that supports most things should be default. However, note that non-determinism (and even more so external interaction) weakens the checks Miri does: just because it didn't find UB with one seed, doesn't mean there is no UB with other seeds.

(I should add that I personally thing test suites should always use RNGs with a fixed seed, to avoid spurious failures. But (a) clearly many people have a different opinion, and (b) there are other sources of non-determinism that we definitely want to support, so we might as well support this one too.)

oli-obk · 2019-03-08T09:51:59Z

is about supporting clock_gettime, that would be a kind of non-determinism not governed by the RNG and hence would violate invariant (1). Should such operations be subject to yet another flag?

I think we could just seed the time with the random number generator and just increment by a random nonzero value every time clock_gettime is called.

Maybe we should just declare pointer equality unsupported in almost all cases when running in deterministic mode?

Seems reasonable

However, note that non-determinism (and even more so external interaction) weakens the checks Miri does: just because it didn't find UB with one seed, doesn't mean there is no UB with other seeds.

We can just always choose a random seed if none is supplied, and emit a message before running the code that dumps the seed and informs the user how to rerun with said seed and a disclaimer mentioning that it's now nondeterministic and might accidentally skip over certain kinds of UB.

RalfJung · 2019-03-08T09:55:03Z

I think we could just seed the time with the random number generator and just increment by a random nonzero value every time clock_gettime is called.

That seems extremely confusing. And people will likely ask for file system or network access eventually. I feel like accessing the clock falls into the same category.

gnzlbg · 2019-03-08T10:54:40Z

Both for the clock, the random number generator, etc. I think miri needs to provide two things.

First, the ability to initialize these in such a way that the results are "deterministic". Second, the ability to choose a non-deterministic "source" that miri uses to initialize these.

E.g. by default miri's clock can always start at zero, but maybe an user wants to initialize the clock from the system random device to use miri as a "fuzzing" tool (independently of whether that's a good or bad idea). If miri finds an issue, it would be helpful for the user to input whatever seed miri obtained from the random device - that is, miri needs to both output this information on error, and allow the user to somehow input it.

The seeds for the different sources of non-determinism are not stable API-wise. E.g. different PRNGs have differently-sized seeds, and miri should not be stuck on a particular choice of PRNG forever.

RalfJung · 2019-03-08T10:59:27Z

Both for the clock, the random number generator, etc. I think miri needs to provide two things.

First, the ability to initialize these in such a way that the results are "deterministic". Second, the ability to choose a non-deterministic "source" that miri uses to initialize these.

I disagree for the clock. The clock falls into the category "communication with the external world", which is different from "pure" non-determinism. I think we should not conflate the two.

The seeds for the different sources of non-determinism are not stable API-wise. E.g. different PRNGs have differently-sized seeds, and miri should not be stuck on a particular choice of PRNG forever.

I don't understand the point of this comment. Nothing I said sticks Miri to any particular choice of PRNG. And what do seed sizes have to do with anything?
Miri does not care about what PRNG your program uses. It will intercept FFI calls to, e.g., libc::syscall(NR_GETRANDOM, ..) and follow the contract of that syscall by filling in some bytes non-deterministically. It will use its own PRNG for that purpose, but that has nothing to do with which PRNG the interpreted program uses.

gnzlbg · 2019-03-08T11:02:45Z

I don't understand the point of this comment.

If I get an error and want to reproduce, I need to be able to pass the seed used by miri for its own internal PRNG, e.g., [1, 2, 3, 4]. I want to test for this error, so I add cargo miri test --with-prng-seed=1,2,3,4 to my CI and call it a day.

Later, miri gets updated to use a different PRNG, which requires a 16 element seed. This breaks my CI because I'm not passing --with-prng-seed enough arguments.

RalfJung · 2019-03-08T12:09:27Z

All RNGs can be seeded from a byte slice, right? I was just going to suggest -Zmiri-seed=<hex string>. Or we might just make it a 64bit hex number and use seed_from_u64.

gnzlbg · 2019-03-08T13:15:09Z

All RNGs can be seeded from a byte slice, right?

Yes.

Or we might just make it a 64bit hex number and use seed_from_u64.

Maybe use -Zmiri-seed-u64= for that ? rand recommends at least a 100-bit wide seed, but this often depends on the PRNG underneath.

RalfJung · 2019-03-08T13:27:39Z

Actually this brings me to another concern I forgot to mention: if people run crypto code in Miri and expect that to be secure, we have a problem. Given that we want to start everything from a fixed seed, the randomness from NR_GETRANDOM will not be suited for crypto.

If we accept that Miri shouldn't run crypto, it's really not a big deal to initialize with 64bits instead of more.

I think -Zmiri-seed-u64 exposes implementation details unnecessarily. The seed is going to be a hex string, most likely; we just have to decide whether we interpret this hex string as u64 (meaning it has a max length, but seed_from_u64 makes sure that the RNG will be well-seeded even if you use, say, 1) or whether we interpret it as [u8] and feed the entire thing to the seed directly (meaning you can properly seed arbitrarily-sized RNGs, but if you use 1 as the seed the randomness is going to be pretty bad).

Part of rust-lang#653 This allows us to properly implement getrandom(), which unlocks the default HashMap type (e.g. HashMap<K, V>) with RandomState) This commit adds a new '-Zmiri-seed=<seed>' option. When present, this option takes a 64-bit hex value, which is used as the seed to an internal PRNG. This PRNG is used to implement the 'getrandom()' syscall. When '-Zmiri-seed' is not passed, 'getrandom()' will be disabled. To allow using 'rand' properly, I bumped the patch versions of a few dependencies, which brings them up to the current version of the 'rand' crate

Part of rust-lang#653 This allows us to properly implement getrandom(), which unlocks the default HashMap type (e.g. HashMap<K, V>) with RandomState) This commit adds a new '-Zmiri-seed=<seed>' option. When present, this option takes a 64-bit hex value, which is used as the seed to an internal PRNG. This PRNG is used to implement the 'getrandom()' syscall. When '-Zmiri-seed' is not passed, 'getrandom()' will be disabled.

RalfJung · 2019-04-26T06:57:57Z

Seeding RNGs with system entropy now works in Miri. However, the question remains what our story shall be for other kinds of interaction with the OS, like accessing the file system or querying the current time.

gnzlbg · 2019-04-26T07:24:56Z

I've opened an issue in the UCGs to track up to which level floating-point arithmetic should be deterministic: rust-lang/unsafe-code-guidelines#123

E.g. always non-deterministic vs some degree of determinisms going from "always deterministic" to "for the same inputs, host, target, compiler options, and toolchain", or "for the same input, host, target, and compiler options (only - independent of toolchain)" or "for the same input, host, target, and default compiler options (independent of toolchain - some compiler options might introduce non-determinism, excluding optimization levels)" or "same as before, but where optimization level is allowed to introduce non-determinism", etc.

I think it would be interesting to hear what the limits would be with respect to what would be feasible to implement in miri.

CryZe · 2019-04-26T07:26:17Z

What's the story in regard to pointer math, considering that is also non-deterministic? I've been playing around with cargo miri test and noticed that some dependencies use pointer math to align pointers for some speedups (in particular memchr), which is where miri refuses to continue. Maybe there already is a solution to that, but at least the way it is written there, breaks miri (https://github.com/BurntSushi/rust-memchr/blob/master/src/fallback.rs#L64)

bjorn3 · 2019-04-26T07:45:43Z

@CryZe there is https://doc.rust-lang.org/stable/std/primitive.pointer.html#method.align_offset.

CryZe · 2019-04-26T13:23:42Z

That only somewhat solves the issue. While miri does accept this, it returns usize::max_value() indicating that it failed aligning the pointer, which probably doesn't solve the problem.

RalfJung · 2019-04-26T13:27:41Z

What's the story in regard to pointer math, considering that is also non-deterministic?

That would be #224.

I think it would be interesting to hear what the limits would be with respect to what would be feasible to implement in miri.

Well, basically anything someone is willing to actually implement. ;) The moment things get symbolic, though, everything becomes very painful. We did that for allocations because we have to anyways for CTFE; not sure if this is worth the cost for floating point.

gnzlbg · 2019-04-29T09:15:08Z

Well, basically anything someone is willing to actually implement.

So, does that mean that it would be "ok" (just hard to implement), for the results of floating-point math to start depending on LLVM's inlining heuristics? That would mean that a "good" miri implementation would need to replicate the behavior of LLVM's optimization pipeline, optimizing the MIR accordingly, and evaluating the MIR after LLVM optimizations, including backend optimizations, and would probably need to do so for the different LLVM versions that Rust supports today (from LLVM 6 to LLVM 9). I suppose this would need a MIR -> LLVM-IR -> opt (with backend opts) -> LLVM-IR -> MIR pass in miri or similar (preserving types, lifetimes, etc.).

RalfJung · 2019-04-29T10:59:26Z

So, does that mean that it would be "ok" (just hard to implement), for the results of floating-point math to start depending on LLVM's inlining heuristics?

I don't think that would be a reasonable language specification. But discussing language spec is outside the scope of a Miri issue. that would be a UCG discussion.

I think proper handling of floating point needs some foundational work to have a spec before we can implement anything reasonable in Miri. Incidentally, some of my colleagues at MPI are working on exactly that. ;)

RalfJung · 2019-06-29T11:09:03Z

Closing in favor of #800: the discussion here mixed up the non-determinism story (which we are close to resolving, with intptrcast becoming the default mode soon) and external communication. Also, RNGs (which the issue here was originally about) have worked for quite some time already now.

RalfJung changed the title ~~rand::thread_rng() fails under miri~~ Figure out a story for non-determinism Mar 8, 2019

RalfJung mentioned this issue Mar 8, 2019

implement intptrcast model #224

Closed

10 tasks

RalfJung added C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement A-interpreter Area: affects the core interpreter labels Mar 8, 2019

Aaron1011 mentioned this issue Apr 7, 2019

Implement non-deterministc mode #683

Merged

RalfJung mentioned this issue Apr 9, 2019

Can't call foreign function: clock_gettime #641

Closed

RalfJung changed the title ~~Figure out a story for non-determinism~~ Figure out a story for non-determinism and external communication Apr 26, 2019

RalfJung mentioned this issue May 25, 2019

Allow passing through ioctls #743

Open

This was referenced Jun 24, 2019

dlsym not implemented for: __pthread_get_minstack #789

Closed

(When) Should we do intptrcast (and other opt-in features) per default? #785

Closed

Plan for external communication / isolation #800

Closed

RalfJung closed this as completed Jun 29, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Figure out a story for non-determinism and external communication #653

Figure out a story for non-determinism and external communication #653

gnzlbg commented Mar 7, 2019

RalfJung commented Mar 8, 2019 •

edited

Loading

RalfJung commented Mar 8, 2019 •

edited

Loading

oli-obk commented Mar 8, 2019

RalfJung commented Mar 8, 2019

gnzlbg commented Mar 8, 2019

RalfJung commented Mar 8, 2019 •

edited

Loading

gnzlbg commented Mar 8, 2019 •

edited

Loading

RalfJung commented Mar 8, 2019 •

edited

Loading

gnzlbg commented Mar 8, 2019

RalfJung commented Mar 8, 2019

RalfJung commented Apr 26, 2019

gnzlbg commented Apr 26, 2019

CryZe commented Apr 26, 2019

bjorn3 commented Apr 26, 2019

CryZe commented Apr 26, 2019

RalfJung commented Apr 26, 2019

gnzlbg commented Apr 29, 2019 •

edited

Loading

RalfJung commented Apr 29, 2019

RalfJung commented Jun 29, 2019

Figure out a story for non-determinism and external communication #653

Figure out a story for non-determinism and external communication #653

Comments

gnzlbg commented Mar 7, 2019

RalfJung commented Mar 8, 2019 • edited Loading

RalfJung commented Mar 8, 2019 • edited Loading

oli-obk commented Mar 8, 2019

RalfJung commented Mar 8, 2019

gnzlbg commented Mar 8, 2019

RalfJung commented Mar 8, 2019 • edited Loading

gnzlbg commented Mar 8, 2019 • edited Loading

RalfJung commented Mar 8, 2019 • edited Loading

gnzlbg commented Mar 8, 2019

RalfJung commented Mar 8, 2019

RalfJung commented Apr 26, 2019

gnzlbg commented Apr 26, 2019

CryZe commented Apr 26, 2019

bjorn3 commented Apr 26, 2019

CryZe commented Apr 26, 2019

RalfJung commented Apr 26, 2019

gnzlbg commented Apr 29, 2019 • edited Loading

RalfJung commented Apr 29, 2019

RalfJung commented Jun 29, 2019

RalfJung commented Mar 8, 2019 •

edited

Loading

RalfJung commented Mar 8, 2019 •

edited

Loading

RalfJung commented Mar 8, 2019 •

edited

Loading

gnzlbg commented Mar 8, 2019 •

edited

Loading

RalfJung commented Mar 8, 2019 •

edited

Loading

gnzlbg commented Apr 29, 2019 •

edited

Loading