Crate refactor #161

dhardy · 2017-07-28T17:12:07Z

This is a rough preview of what these RFCs may result in:

This is a major break-anything-I-like refactor of rand, in an attempt to fix a bunch of issues I and others have found. I don't expect it to be merged as-is, but hopefully a view of what rand could be will help answer questions about how it should be.

There are a few bits unfinished (see TODO and FIXME); also some changes from other PRs should be integrated. Possibly there are a couple of things which could be cleaned up further. But mostly this is ready for review now.

Comments welcome. Each commit should be fairly clean with a low number of changes, but put together there are a lot of changes.

ghost · 2017-07-29T14:07:52Z

README.md

+use rand::thread_rng;
+
+let x: u32 = thread_rng().next_u32();
+println!("{}", u32)


println!("{}", x)

ghost · 2017-07-29T14:10:03Z

src/dist/default.rs

+
+use Rng;
+use dist::Distribution;
+use dist::uniform::{SampleUniform, /*SampleUniform01,*/ codepoint};


what's the purpose of this comment?

To avoid an unused item warning (see FIXME further down)

Lokathor · 2017-07-29T20:04:43Z

Why is Rand gone. That was the best part about things.

bluss · 2017-07-29T21:21:32Z

src/sequence.rs

+    /// println!("{:?}", choices[..].choose(&mut rng));
+    /// assert_eq!(choices[..0].choose(&mut rng), None);
+    /// ```
+    fn choose<R: Rng>(self, rng: &mut R) -> Option<T>;


About dynamic dispatch: The new trait looks a lot better prepared for trait objects already. Wide use of <R: ?Sized + Rng> through the crate should make it possible to use dynamic dispatch, it's just a major breaking change of Rand and other trait implementors, but if that's not an issue, we can do it.

(There's two ways to go about it, using R: ?Sized + Rng with a &mut R parameter or using R: Rng and an R parameter, both work the same way if you have a blanket implementation of Rng for &mut R where R: Rng; I think the &mut R parameter is just as good a choice, but the second form could make it easier if you have Rng "adaptors" or whatever..)

Either way, better usability of Rng from trait objects is something I've wished for, but it's not relevant for all use cases.

withoutboats · 2017-07-30T00:41:07Z

As someone not deeply invested in the differences between different random number generators & distributions and so forth, the way I want to use the rand crate is this:

#[derive(Rand)]
rand::random()

This seems ot have removed both of these features, and its not clear how they were replaced. How do I prepare a random value of my type with this API? What was wrong with these two features?

It would be helpful, with a huge change like this, to host the rustdoc output so users can easily see the API change.

EDIT: To be a bit clearer, the main way I use rand is to create values when writing unit tests, for which I want a "reasonable default" source of randomness & to avoid thinking about it.

bluss · 2017-07-30T09:17:49Z

@withoutboats The Rand trait and its derive seems like an independent part, that can easily be its own crate IMO

Lokathor · 2017-07-30T17:53:39Z

It would be extremely silly for the rand crate to not include the trait for making random values of arbitrary types. Having some sort of sub-crate for just the generators might be fine, but the main crate "rand" should be the one-stop place for all of the common randomness needs.

portal-chan · 2017-07-30T18:40:30Z

I'd also much rather prefer for the Rand trait to stay as opposed to being arbitrarily split off into it's own crate. I don't think it would be much of an improvement to start having to rely on 2 (or 3, if you use rand-derive) crates for pretty basic functionality.

withoutboats · 2017-07-30T18:50:17Z

Rand and Rng are a handshake protocol, analogous to Serialize and Serializer or Hash and Hasher. Rand should be in rand not only for convenience but to establish the protocol as normative.

bluss · 2017-07-30T19:45:09Z

Sure, it's silly to toss out Rand from the rand crate, clearly, but it's an add-on concept that stands separately. The rng is the core. The current Rand trait has not been useful in numerical applications, it's not the part of the crate that gets used IME.

(The Rand in the proposal on internals could be better.)

Edit: See, no Rand trait in the rand-core rfc

burdges · 2017-07-30T22:24:16Z

Internals threads are a mess to read due to internals screwing up the browser's search functionality, but that proposal gives Rand a type parameter for the range, yes? You'd presumably change the gen method like :

pub trait Rand<Dist> : Sized {
    fn rand<R: Rng>(rng: &mut R, dist: Dist) -> Result<Self, RngError>;
}

pub trait Rng {
    type RngError : Sized;
    fn fill(&mut self, dest: &mut [u8]) -> Result<(), RngError>;
    ...
    fn next_u32(&mut self) -> Result<u32, RngError> { ... }
    fn next_u64(&mut self) -> Result<u64, RngError> { ... }
    ...
    fn gen<D,T>(&mut self,d: D) -> Result<T,RngError> where T: Rand<D> { Rand::rand(self,d) }
    ...
}

Lokathor · 2017-07-31T01:16:11Z

I think it's unfortunate that we need to worry about RNGError in all cases. Some RNGs simply will never error, and it's a shame that they're having to take on that weight. Can we arrange things into RNGs that will sometimes error (I guess certain crypto RNGs?) and RNGs that will never error?

nagisa · 2017-07-31T04:48:50Z

Result<T, !>. Although it does need a bit of work to be a "solution".

…

On Jul 31, 2017 4:16 AM, "Lokathor" ***@***.***> wrote: I think it's unfortunate that we need to worry about RNGError in all cases. Some RNGs simply will never error, and it's a shame that they're having to take on that weight. Can we arrange things into RNGs that will sometimes error (I guess certain crypto RNGs?) and RNGs that will never error? — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#161 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AApc0obZskdmb7_7BsKo_m7keY8TGnTJks5sTSrcgaJpZM4Om2-3> .

dhardy · 2017-07-31T06:53:59Z

Lots of comments, good...

Thanks @bluss , I will look at dynamic dispatch.

DefaultDist is supposed to be the replacement for Rand (currently it doesn't support f32/f64 because of a compiler bug). If so many of you want type-specific generation, maybe Rand deserves to stay as a first class citizen; my beef with it is that it has no good way of documenting what the distribution is for simple types (e.g. char in the crate master generates any code point, even unprinting ones — is this actually what you want?).

It seems better from my POV to write a little more code to specify the distribution of each value when interacting with the generator.

rand::random() went because it depended on gen() which I removed. But it could probably be implemented using DefaultDist — however, as above, should this implement an arbitrary distribution for char, Option, etc?

dhardy · 2017-07-31T07:14:22Z

It would be helpful, with a huge change like this, to host the rustdoc output so users can easily see the API change.

@withoutboats yeah, I wanted to. Any suggestion how?

withoutboats · 2017-07-31T07:35:22Z

I'd recommend just hosting it in a Github Pages repo https://pages.github.com/

dhardy · 2017-07-31T09:43:08Z

Done: https://dhardy.github.io/rand/rand/index.html

burdges · 2017-07-31T09:56:27Z

Another issue : Is an associated RngError type sufficient to support an Rng that operates as a future?

Afaik, there are no Rgns that never error @Lokathor : OsRng sets RngError to io::Error. If you've a single task, like say running some KDF to make a packets along with a Ziggurat algorithm to give a delay, then you could seed a ChaChaRng with output from OsRng, but technically ChaChaRng could still error if you exceed the ChaCha stream length. You can unwrap() if you know you'll never do this. In fact, this should always be the case since since ChaChaRng should only be used for a "single task", but maybe it's safer to make the developer read instructions on ChaChaRng that warn them not to use it too broadly.

Also, if I understand, we must keep the next_u32, next_u64, etc. because these help us build Rngs with different output types. We cannot use fancier tricks involving specialization on both the Rng and the T: Rand because doing so means some trait for the tuple (Rng,Rand) which likely breaks object safety. Yes?

Lokathor · 2017-07-31T11:19:13Z

@burdges I am not an RNG expert, but I would expect most seedable RNGs that are not for cryptography to just never error out once they've been properly seeded. The seeding process might have a problem somehow, but once that works you've just got some state and some step function and that's it. What errors are left? I mean your number stream will loop, but that's not an error.

I guess what I'm trying to get at is that the rand crate needs to support the idea that you get some tiny amount of entropy from the OS (say, one or two u64 values) and then you make your own generator and you run it as often as you want without ever talking to the OS about it again, and you just accept the fact that it'll loop eventually. You want this to program video games with. I think we all understand that "you can program video games with it" is one of the biggest drivers of a programming language's growth.

burdges · 2017-07-31T11:38:47Z

I'd say even non-cryptographic random number generators should error rather than loop, provided their design permits detecting the loop, although not all designs permit this.

We do want Rng to be usable for cryptography regardless, so that requires a Result<T,RngError>. Anyone else can just unwrap() or maybe figure out some scheme by which Result<T,!> could actually be T.

We could just add an InfallibleRng that one impls like a marker trait, or even automagically impls for RngError = !, but actually includes a bunch of default methods that do the unwrap()s.

burdges · 2017-07-31T11:49:40Z

I've said this elsewhere but Rng includes some methods that involve comparisons, like gen_range. These are not ideal for cryptography unless those comparisons are constant time, and many applications do not need that, so Rng cannot be ideal for cryptography throughout.

I think the right solutions might be : Rng's raw-ish methods like next_u64, fill, etc. should be as suitable for cryptography as the underlying impl permits, which should be good enough in the case of OsRng. Above that, there should be a crate, maybe called subtle-rand, that provides a wrapper struct CTRng<R>(R) where R: Rng that fixes all the higher level default methods. Any crates that provide an Rng suitable for cryptography should either fix these default methods themselves, check that CTRng<TheirRng> does, or use specialization to fix CTRng<TheirRng>.

dhardy · 2017-07-31T13:02:01Z

Wait, I didn't notice the error part before?

fn next_u32(&mut self) -> Result<u32, RngError>

Most PRNGs are pretty simple functions that won't error. Detecting a cycle isn't as simple as detecting the same output value because internal state is bigger than that; the only trivial way to do it would be to copy the entire internal state. But for games and simulators you wouldn't want it to error, because most of the time it wouldn't matter: the game/sim state would be quite different by this point, and likely the repeated values would get used for different things, so it wouldn't be noticeable. For scientific simulations it might be useful to get a warning when an RNG repeats itself, but that's about it.

For me, making generators return a result is a complete no-go. If crypto people really want this, maybe there should be separate crypto and numeric generators — but that would be a huge pain with duplicate code (some generators maybe, some distributions), and randomised algorithms could probably work with either — except in a few cases predictable numbers in randomised algs might allow DoS attacks.

Maybe what could work instead is a trait SecureRng: Rng with a method fn error_state -> Option<Error>, and a wrapper function fn try_fill(..) -> Result<(), Error>.

dhardy · 2017-07-31T13:39:07Z

@bluss: There's two ways to go about it, using R: ?Sized + Rng with a &mut R parameter or using R: Rng and an R parameter ...

I don't think they are equivalent: an R parameter isn't Copy and so can't be passed into a function expecting R while also being retained. On the other hand, the compiler has no problems with &mut R being passed into a function and retained even though it's not Copy.

On the other hand the +?Sized everywhere approach seems to work.

Lokathor · 2017-07-31T13:41:14Z

This is reminding me of the Out Of Memory discussions happening elsewhere in the rust world lately. The "normal" case not wanting to be burdened with all the result checking that would come from being absolutely technically correct.

I think that splitting things up probably is the best route

Generator provides next_u32(&mut self) -> u32 and friends
FallibleGenerator provides try_next_u32(&mut self) -> Result<u32,RNGError> and friends
CryptoGenerator is a marker trait that you add to your generator when it is believed that an implementation is cryptographically secure (if you read the manual and seed it right and all that). Probably only applies to various kinds of FallibleGenerator, but I'll let the crypto folks make the call on that one.

Then the Rand trait would (probably?) just be written for Generator. If all you have on hand is a FallibleGenerator you can simply run it however many times to make a Generator and use that to execute your Rand building function. We could probably even provide a helper function or default method somewhere to do this. I suspect that most people willing to put up with using a FallibleGenerator just want the raw numbers though.

I don't recall it getting much attention, but right now there's also a SeedableRng, but I'm not entirely sure on what the point of that trait is compared to just making a new Rng from scratch. Unless it's just providing a uniform way to make generators instead of each one having their own new function? We don't have to throw it out, but maybe it could get a little docs love to explain what it's doing and/or signifying. The StdRng is seedable, and the ThreadRng is not seedable. Why is it that way? That should maybe go in the docs.

EDIT: Spelling I guess, whoops.

burdges · 2017-07-31T14:26:21Z

I just copied RngError from the internals thread, but made it an associated type. I'm not familiar with the OsRng implementation for different systems, but some systems provide random number generation options that can fail or block.

As an aside, I doubt even all CSPRNGs error eventually, maybe even SHAKE fails that. We'll obviously want an Rng for SHAKE eventually, so it's not like even all cryptographic Rngs should be responsible for detecting exhaustion. I only meant that Rngs that use counters like ChaCha do error eventually, but even if they never error you still cannot use them indefinitely. Looping is bad.

Anyways..

You cannot create an infallable PRNG from a failable one @Lokathor except by panicing because the infallable PRNG cannot know how many times some probabilistic algorithm, like the Ziggurat used in every distribution, will call the failable PRNG. You'd damage the entropy pool by sucking out megabytes.

In principle, you could've method that asks if the PRNG can provide enough output. In that way, a developer can give an upper bound on the entropy they require without consuming anything until needed. I suppose the trait might look like

pub trait Rng {
    fn available_entropy(&self, usize)  -> bool { true }
    fn fill(&mut self, dest: &mut [u8]);
    ...
    fn next_u32(&mut self) -> u32 { ... }
    fn next_u64(&mut self) -> u64 { ... }
    ...
    fn gen<D,T>(&mut self,d: D) -> T where T: Rand<D> { Rand::rand(self,d) }
    ...
}

so the available_entropy function can simply default to true, but an OsRng or ChaChaRng can replace it. I donno if all implementations of OsRng can actually do this check, but at worse they can return true, and warn about being used in startup, or even check the system uptime. I removed the RngError entirely because any Rng that needs it can provide it via an inherent method.

About the only issue I forsee here would be if developers start writing rng.available_entropy(1000000000) everywhere, but some OsRng is too honest and says not enough entropy remains. Imho, this works well for temporary SCPRNGs like ChaChaRng because if you even get close to exhausting one of those then you need to redesign your protocol.

dhardy · 2017-07-31T14:37:17Z

@burdges please read this post on the internals thread about why user-space CSPRNGs are a bad idea and the side notes about how "available entropy" is at best a guess.

By the way, fallible is spelt with an i...

... but I still prefer the design I mentioned earlier:

Rng (or Generator) never fails
SecureRng (or CryptoGenerator) has a try_next_.. or try_fill method which can fail, but also implements the Rng trait (probably panicking on error)

burdges · 2017-07-31T16:02:26Z

That's an unrelated conversation. Yes, user-space CSPRNGs used for many or unrelated tasks suck, but that irrelevant. CSPRNGs play an essential role in any protocol with commitments.

Actually virtually every cryptographic protocol incorporates a CSPRNGs, namely its KDFs, usually it merely needs to convert non-uniform randomness into uniform randomness, but sometimes it needs much more. And sometimes it needs to sample non-uniform distributions, like anything lattice-base or anything that must generate an agreed upon delay (anonymity protocols).

Anyway, the reason properly used CSPRNGs need to be recreated isn't to limit the quantity of output, but because they only serve a specific role where some protocol needs determinism.

Also, if OsRng really needs to block, then folks may need futures to manage it eventually.

burdges · 2017-07-31T16:48:38Z

Anyways, if I understand, the worries over CSPRNGs mean OsRng must be present, and ideally the default, but OsRng needs error handling on some platforms, and arguably should on all platforms. In principle, these constraints rule out the fundamental Rng trait being infallible, no? The question should be how to do the error handling, no?

Now backwards compatibility may require InfallibleRng be named Rng, while all PRNGs actually implement another trait, say RawRng. A SecureRng marker trait doesn't work great if the rand crate uses the infallible trait everywhere internally though. Also, you should not assert an Rng is secure just because it wants to return errors.

dhardy · 2017-07-31T16:57:53Z

Yeah, I get that from the crypto perspective, RNGs should be able to return errors by default.

But: can that be implemented without forcing numerical users using infallible RNGs to unwrap everywhere, and without significant overhead?

Otherwise it seems very difficult to find a design without significant compromise from one point of view.

Backwards compatibility isn't the issue; compatibility between these two types of RNG is. My proposal was to make fallible RNGs compatible with the (infallible) Rng trait by moving error handling to a separate method, but I know that's not ideal either.

Anyway: I've implemented the new Rand and restored random(). This is close to the API @aturon suggested aside from error handling now. API doc updated (see opening message).

Lokathor · 2017-07-31T18:42:37Z

@burdges I think you misunderstood what I meant when I suggested that a FallibleGenerator be able to just make a Generator any time it needs to run Rand code. I wasn't suggesting some sort of bridge that calls the FallibleGenerator internally. I'm suggesting that, for example, a ChaChaRng can make a XorShiftRng using 4x u32 as the seed, and (assuming that the ChaChaRng has enough entropy for that) from there it can create the Rand value based on the as many calls to the XorShiftRng as the type needs. At the end of it you just throw out the XorShiftRng and return the Rand value as if it only used up 16 bytes of random output generation.

@dhardy your updated docs show random as a top level function of the crate, but don't show the Rand trait as a top level trait. Might want to put a pub on the use that brings Rand into scope. That's a minor thing though.

I'm not fully up on how the thread local storage works, but could thread_rng() and things that depend on it be made generic, so that programs can pick the generator type that they want to have be used? It looks like "no" but I thought I'd ask. It would be nice to be able to pick the RNG used for that, if possible.

That is kind of the goal here :-)

Rename `AsciiWordChar` and optimize `Codepoint`

Add HC-128 RNG

And a little cleanup around the init functions

Replace `convert_slice_{32,64}` with `read_u{32,64}_into`

Restrict the seed type to a few more array sizes

Make u128 range use widening multiply

* Move the check if it is time to reseed out of the `try_reseed_if_necessary` and make sure that function does not get inlined. * Invert the counter direction. This way we can compare against 0 instead of `self.threshold` * Doing the reseed check after generating a value turns out to be a bit faster.`

Also minor doc fix

Reseeding perf

pitdicker · 2018-01-08T12:41:00Z

This PR is not the way you want to merge the work from your branch, right? Do you want to keep it open? I get a mail for every commit that happens in https://github.com/dhardy/rand because of it 😄

dhardy · 2018-01-08T12:50:34Z

Sorry, didn't realise I was spamming everyone via this PR! You're right, lets close this.

ghost reviewed Jul 29, 2017

View reviewed changes

bluss reviewed Jul 29, 2017

View reviewed changes

pitdicker and others added 27 commits December 14, 2017 19:46

Add HC-128 RNG

e18d489

impl CryptoRng for Hc128Rng

0ac7b86

That is kind of the goal here :-)

Add test for fill_bytes

b9f7123

Rename AsciiWordChar to AlphanumericChar

5741e16

Optimize Codepoint distribution

a0deb78

Merge pull request #75 from pitdicker/char_distrs

e3c8ca1

Rename `AsciiWordChar` and optimize `Codepoint`

Merge branch 'master' into dh-master

4058522

Merge: reorder use/mod declarations

f58b2f1

Fix comment

4e088f1

Remove log dependency; make i128_support transitive

729644c

Merge pull request #74 from pitdicker/hc-128

eaeee11

Add HC-128 RNG

Restrict Seed type

4f1deb7

Replace convert_slice_{32,64} with read_u{32,64}_into

92831f3

And a little cleanup around the init functions

Fold from_rng into SeedableRng

cc6da7a

Merge pull request #77 from pitdicker/blockrng

5d9acdc

Replace `convert_slice_{32,64}` with `read_u{32,64}_into`

Fix sealed implementation of SeedRestriction to prevent extension

e9359ac

Make u128 range use widening multiply

945e8ff

Restrict the seed type to a few more array sizes

4ea098b

Merge pull request #80 from pitdicker/seed_sizes

45ace88

Restrict the seed type to a few more array sizes

Merge pull request #79 from pitdicker/range_128

1f9ce3a

Make u128 range use widening multiply

Add benchmarks for ReseedingRng

402e673

Simplify reseeding erro logic

07717bc

Add rand_core::le::test_read unit test

8990da2

Also minor doc fix

Benchmark reseeding HC-128 instead of Xorshift

14f02a2

Merge pull request #76 from pitdicker/reseeding_perf

d5d9c75

Reseeding perf

Merge in upstream

a4ff5d7

dhardy closed this Jan 8, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Crate refactor #161

Crate refactor #161

dhardy commented Jul 28, 2017 •

edited

Loading

ghost Jul 29, 2017

ghost Jul 29, 2017

dhardy Jul 29, 2017

Lokathor commented Jul 29, 2017 •

edited

Loading

bluss Jul 29, 2017

withoutboats commented Jul 30, 2017 •

edited

Loading

bluss commented Jul 30, 2017

Lokathor commented Jul 30, 2017

portal-chan commented Jul 30, 2017

withoutboats commented Jul 30, 2017 •

edited

Loading

bluss commented Jul 30, 2017 •

edited

Loading

burdges commented Jul 30, 2017

Lokathor commented Jul 31, 2017

nagisa commented Jul 31, 2017 via email

dhardy commented Jul 31, 2017

dhardy commented Jul 31, 2017

withoutboats commented Jul 31, 2017

dhardy commented Jul 31, 2017

burdges commented Jul 31, 2017

Lokathor commented Jul 31, 2017

burdges commented Jul 31, 2017

burdges commented Jul 31, 2017

dhardy commented Jul 31, 2017 •

edited

Loading

dhardy commented Jul 31, 2017

Lokathor commented Jul 31, 2017 •

edited

Loading

burdges commented Jul 31, 2017

dhardy commented Jul 31, 2017

burdges commented Jul 31, 2017

burdges commented Jul 31, 2017

dhardy commented Jul 31, 2017

Lokathor commented Jul 31, 2017

pitdicker commented Jan 8, 2018

dhardy commented Jan 8, 2018

Crate refactor #161

Crate refactor #161

Conversation

dhardy commented Jul 28, 2017 • edited Loading

ghost Jul 29, 2017

Choose a reason for hiding this comment

ghost Jul 29, 2017

Choose a reason for hiding this comment

dhardy Jul 29, 2017

Choose a reason for hiding this comment

Lokathor commented Jul 29, 2017 • edited Loading

bluss Jul 29, 2017

Choose a reason for hiding this comment

withoutboats commented Jul 30, 2017 • edited Loading

bluss commented Jul 30, 2017

Lokathor commented Jul 30, 2017

portal-chan commented Jul 30, 2017

withoutboats commented Jul 30, 2017 • edited Loading

bluss commented Jul 30, 2017 • edited Loading

burdges commented Jul 30, 2017

Lokathor commented Jul 31, 2017

nagisa commented Jul 31, 2017 via email

dhardy commented Jul 31, 2017

dhardy commented Jul 31, 2017

withoutboats commented Jul 31, 2017

dhardy commented Jul 31, 2017

burdges commented Jul 31, 2017

Lokathor commented Jul 31, 2017

burdges commented Jul 31, 2017

burdges commented Jul 31, 2017

dhardy commented Jul 31, 2017 • edited Loading

dhardy commented Jul 31, 2017

Lokathor commented Jul 31, 2017 • edited Loading

burdges commented Jul 31, 2017

dhardy commented Jul 31, 2017

burdges commented Jul 31, 2017

burdges commented Jul 31, 2017

dhardy commented Jul 31, 2017

Lokathor commented Jul 31, 2017

pitdicker commented Jan 8, 2018

dhardy commented Jan 8, 2018

dhardy commented Jul 28, 2017 •

edited

Loading

Lokathor commented Jul 29, 2017 •

edited

Loading

withoutboats commented Jul 30, 2017 •

edited

Loading

withoutboats commented Jul 30, 2017 •

edited

Loading

bluss commented Jul 30, 2017 •

edited

Loading

dhardy commented Jul 31, 2017 •

edited

Loading

Lokathor commented Jul 31, 2017 •

edited

Loading