Revise `Rng` methods #293

dhardy · 2018-03-10T12:56:36Z

Summary:

deprecate gen_iter and gen_ascii_chars
possibly move shuffle, choose and choose_mut: topic of Sequence sampling: seq, WeightedChoice dhardy/rand#82
deprecate gen_weighted_bool
add gen_bool
no other changes (@vks disagrees on this, but seems to be alone)

Original post:

Rng has:

fill(dest), try_fill(dest) — fill slices/arrays; mostly added to avoid the need to use RngCore for its fill_bytes
sample(distribution) — alternative to distribution.sample(rng); possibly not very useful
gen() — sample anything supporting the Uniform distribution; @clarcharr suggested removing this but I don't think that is likely
gen_iter(), gen_ascii_chars() — deprecated
gen_range(low, high) — shortcut for Range::sample_single(low, high, rng); probably worth keeping
gen_weighted_bool(n) — simple extension of gen_range(); likely not used massively but simple and clear, so I don't see a motive to remove
choose(slice), choose_mut(slice) — get a ref to a random element; simple and somewhat useful
shuffle(slice) — shuffling algorithm; since this is an actual algorithm and not so directly related to RNGs it may be better removing from Rng and adding some shuffle trait instead (allowing slice.shuffle(rng))

Possible additions:

p(p) / chance(p) / gen_bernoulli(p) — sample a boolean with given chance (i.e. roughly rng.gen() < p, but we might implement a distribution with more accurate sampling for small p)
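A minimal sketch of what such a boolean-sampling method could look like, assuming a toy generator. `ToyRng` and the free function `gen_bool` are hypothetical names used for illustration and are not taken from rand. The idea shown is to scale `p` to a 64-bit integer threshold and compare it against 64 random bits, which resolves small `p` more finely than comparing against a float with a 24- or 53-bit mantissa.

```rust
/// Toy generator (SplitMix64), a stand-in for a real RNG.
/// `ToyRng` is a hypothetical name, not a type from rand.
pub struct ToyRng(pub u64);

impl ToyRng {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Returns `true` with probability `p` (hypothetical free-function form).
pub fn gen_bool(rng: &mut ToyRng, p: f64) -> bool {
    assert!(p >= 0.0 && p <= 1.0, "p must be in [0, 1]");
    if p == 1.0 {
        // 2^64 does not fit in a u64, so certainty is handled separately.
        return true;
    }
    // Scale p to an integer threshold in [0, 2^64); the literal is exactly 2^64.
    let threshold = (p * 18_446_744_073_709_551_616.0) as u64;
    rng.next_u64() < threshold
}
```

With this shape, `gen_bool(&mut rng, 0.0)` never returns `true` and `gen_bool(&mut rng, 1.0)` always does, which the naive `rng.gen() < p` form only guarantees if the float sampling is defined carefully.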

Summary (suggested changes):

gen_iter and gen_ascii_chars are being removed
shuffle may be removed
sample and gen could be removed, but less likely
a method to sample booleans may be added


pitdicker · 2018-03-10T15:26:38Z

I always thought of gen_weighted_bool(n) as a strange function. It will only work for small fractions like 1/2, 1/3, 1/4, 1/5. It can't be used to generate bools with something like 40, 60 or 80% chance. And I don't imagine it ever getting used with a large n, because the difference between 1/99 and 1/100 is not often meaningful.

For such small integers the implementation of gen_weighted_bool(n) to use gen_range is maybe not even the best choice. For gen_weighted_bool(5) it can be much faster to just compare a generated value against u32::MAX / 5 (instead of using gen_range), with just about unmeasurable bias.

Or gen_weighted_bool() could be changed to use a floating point fraction between 0.0 and 1.0. That would also be faster and more flexible than what we have now.

Because using gen_weighted_bool is not all that usable at the moment beyond chances between 1/2 and 1/6, and because writing things manually is just as short and easy, I would suggest removing it.

Examples of what to replace rng.gen_weighted_bool(3) with (from fastest to slowest):

rng.gen() < std::u32::MAX / 3
rng.gen() < (1.0f32 / 3.0)
rng.gen_range(0, 3) < 1
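The three replacements listed above can be sketched without rand, assuming a toy generator. `ToyRng` and the three `one_in_n_*` function names are hypothetical. The sketch only illustrates the trade-off: the threshold comparison is a single integer compare with a bias of at most 2^-32, while the range-based version is unbiased but needs a rejection loop.

```rust
/// Toy generator (SplitMix64); a hypothetical stand-in for a real RNG.
pub struct ToyRng(pub u64);

impl ToyRng {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    pub fn next_u32(&mut self) -> u32 {
        (self.next_u64() >> 32) as u32
    }
}

/// Integer threshold: `true` with probability floor((2^32 - 1) / n) / 2^32.
pub fn one_in_n_threshold(rng: &mut ToyRng, n: u32) -> bool {
    rng.next_u32() < u32::MAX / n
}

/// Float comparison: builds a uniform f32 in [0, 1) from the top 24 bits.
pub fn one_in_n_float(rng: &mut ToyRng, n: u32) -> bool {
    let unit = (rng.next_u32() >> 8) as f32 / (1u32 << 24) as f32;
    unit < 1.0 / n as f32
}

/// Range sampling: rejection-samples an integer in 0..n, then tests for zero.
pub fn one_in_n_range(rng: &mut ToyRng, n: u32) -> bool {
    // Largest value such that `zone + 1` is a multiple of n.
    let zone = u32::MAX - (u32::MAX - n + 1) % n;
    loop {
        let v = rng.next_u32();
        if v <= zone {
            return v % n < 1;
        }
    }
}
```

All three panic for `n == 0` (division by zero), which a real API would need to document or guard against.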

pitdicker · 2018-03-10T15:46:09Z

Forgot to say, good overview!

sample(distribution) — alternative to distribution.sample(rng); possibly not very useful

I think it is very useful, for the reason that it makes it easy to use distributions. And I see it as the counterargument against removing gen() because it makes using other kinds of distributions as easy as Uniform.

shuffle(slice) — shuffling algorithm; since this is an actual algorithm and not so directly related to RNGs it may be better removing from Rng and adding some shuffle trait instead (allowing slice.shuffle(rng))

I don't think the reason "this is an actual algorithm and not so directly related to RNGs" is very strong. Especially because I don't see even the possibility of a different algorithm (although we could make it faster by making much better use of the bits generated by the RNG). But making it possible to write slice.shuffle(rng) seems like a win.

I would definitely like to keep gen() and gen_range(low, high), and choose(slice) and choose_mut(slice) also seem like a good idea to keep.

If we add some other method to sample from booleans, gen_weighted_bool(n) definitely has to go. I wouldn't pick names like p(p) or gen_bernoulli(p). If you have a mathematical background these names are fine, but I would pick a name that speaks to more people. Maybe just gen_bool(p). But I am happy to have none.

dhardy · 2018-03-11T08:23:36Z

Fair point about gen_weighted_bool; I'm happy to lose that.

I don't think we should support rng.shuffle(slice) and slice.shuffle(rng). If we prefer the latter, we have to add another trait that users have to import.

I guess gen_bool is as good a name as any. We don't have to add this, but the alternative is that users tend to write rng.gen() < p a lot, which is not quite so clear and doesn't let us try to improve precision near 0.

TheIronBorn · 2018-03-11T09:04:05Z

One more reason to keep `shuffle` is to discourage people from writing their own, incorrect, implementation. I see far too many bad ones. Giving users a convenient, fast, correct option makes that less likely.
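For reference, a sketch of a correct Fisher-Yates shuffle, assuming a toy generator. `ToyRng`, `below` and the free function `shuffle` are hypothetical names and this is not rand's implementation. Two details are easy to get wrong in hand-written versions: the swap index must be drawn from `0..=i` rather than from the whole slice, and the index itself should be sampled without modulo bias.

```rust
/// Toy generator (SplitMix64); a hypothetical stand-in for a real RNG.
pub struct ToyRng(pub u64);

impl ToyRng {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Uniform integer in 0..bound (bound > 0), using rejection to avoid modulo bias.
fn below(rng: &mut ToyRng, bound: u64) -> u64 {
    // Largest value such that `zone + 1` is a multiple of `bound`.
    let zone = u64::MAX - (u64::MAX - bound + 1) % bound;
    loop {
        let v = rng.next_u64();
        if v <= zone {
            return v % bound;
        }
    }
}

/// Fisher-Yates shuffle: every permutation is equally likely.
pub fn shuffle<T>(slice: &mut [T], rng: &mut ToyRng) {
    for i in (1..slice.len()).rev() {
        // j is drawn from 0..=i, not from the whole slice.
        let j = below(rng, i as u64 + 1) as usize;
        slice.swap(i, j);
    }
}
```

Drawing `j` from the full range `0..len` on every step is a common mistake: it produces `len^len` equally likely swap sequences, which cannot map evenly onto `len!` permutations.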



vks · 2018-03-12T15:56:58Z

We could move shuffle and choose to rand::algorithms or similar, under a different trait or as free functions. This would be less ergonomic however. On the other hand, slice.shuffle(rng) is closer to how sort and friends work.

Generating weighted booleans seems very niche and could be supported more cleanly by distributions, so I think it makes sense to remove gen_weighted_bool.

sample(distribution) — alternative to distribution.sample(rng); possibly not very useful

Note that it is also possible to use Distribution::sample(&dist, &mut rng) and Rng::sample(&mut rng, dist). They are not equivalent, because the first borrows the distribution while the second consumes it. Currently, this does not make a difference in rand, because all distributions seem to be zero-sized types or implement Copy.
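The borrow-versus-consume difference can be sketched with simplified stand-ins. The trait and method shapes below are assumptions modelled loosely on the discussion, and `ToyRng` and `PickFrom` are hypothetical names; the real traits are generic over the RNG type. The sketch uses a distribution that owns a `Vec`, so it is neither zero-sized nor `Copy`, and adds an impl for references so that `&dist` can still be passed by value.

```rust
/// Toy generator (SplitMix64); a hypothetical stand-in for a real RNG.
pub struct ToyRng(pub u64);

impl ToyRng {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Same shape as the `Rng::sample` under discussion: takes the distribution by value.
    pub fn sample<T, D: Distribution<T>>(&mut self, distr: D) -> T {
        distr.sample(self)
    }
}

/// Simplified distribution trait: sampling only borrows the distribution.
pub trait Distribution<T> {
    fn sample(&self, rng: &mut ToyRng) -> T;
}

/// References to distributions are distributions too.
impl<'a, T, D: Distribution<T>> Distribution<T> for &'a D {
    fn sample(&self, rng: &mut ToyRng) -> T {
        (**self).sample(rng)
    }
}

/// A distribution that owns heap data, so it is neither zero-sized nor `Copy`.
pub struct PickFrom {
    pub items: Vec<u32>,
}

impl Distribution<u32> for PickFrom {
    fn sample(&self, rng: &mut ToyRng) -> u32 {
        // Modulo reduction is slightly biased; acceptable for illustration only.
        let i = (rng.next_u64() % self.items.len() as u64) as usize;
        self.items[i]
    }
}
```

With these definitions, `Distribution::sample(&d, &mut rng)` and `rng.sample(&d)` leave `d` usable afterwards, whereas `rng.sample(d)` moves it and any later use of `d` fails to compile.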

Personally, I prefer Distribution::sample, because this is where the more important documentation is (the RNGs are interchangeable, the distributions not so much). Also, this is where I would expect the other statistical functions/properties of the distribution that might be added in the future (see #290). So I think we should remove Rng::sample as well as Rng::gen and Rng::gen_range. (FWIW, this is similar to the C++ standard library.)

pitdicker · 2018-03-12T16:44:01Z

To put one more idea in the mix: how would we describe the trait?

I would describe Rng as an ergonomic interface to various common uses of randomness. And RngCore the trait with basic functionality RNGs should implement.

Would it then make sense to rename Rng to Random, and RngCore to Rng?
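The split described in this comment can be sketched as a core trait plus an extension trait with only provided methods. This is a simplified illustration under assumed method names (`gen_unit`, `count_heads` and `ToyRng` are hypothetical); it is not a copy of the rand source.

```rust
/// Core trait: the minimum a generator has to implement.
pub trait RngCore {
    fn next_u64(&mut self) -> u64;
}

/// Ergonomic layer: provided methods only, built on top of `RngCore`.
pub trait Rng: RngCore {
    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn gen_unit(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }

    /// `true` with probability roughly `p`.
    fn gen_bool(&mut self, p: f64) -> bool {
        self.gen_unit() < p
    }
}

/// Every `RngCore` gets the ergonomic methods for free, including trait objects.
impl<R: RngCore + ?Sized> Rng for R {}

/// Toy generator (SplitMix64); a hypothetical stand-in for a real RNG.
pub struct ToyRng(pub u64);

impl RngCore for ToyRng {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// User code names only the ergonomic trait in its bounds.
pub fn count_heads<R: Rng + ?Sized>(rng: &mut R, flips: usize) -> usize {
    (0..flips).filter(|_| rng.gen_bool(0.5)).count()
}
```

Generator authors implement only `RngCore`, while end users write bounds against `Rng`. Renaming either trait would therefore touch every such bound in user code, not just the imports.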

pitdicker · 2018-03-12T19:01:10Z

What about making a trait and module for use with slices? Then you can do both: slice.shuffle(&mut rng), and rng.shuffle(&mut slice). And the implementation could live in the specialized module, not in the Rng trait.
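One way this could look, as a sketch only: an extension trait on slices holds the implementation, and the generator keeps a thin forwarding method. `SliceExt`, `ToyRng` and `below` are hypothetical names chosen for this illustration.

```rust
/// Toy generator (SplitMix64); a hypothetical stand-in for a real RNG.
pub struct ToyRng(pub u64);

impl ToyRng {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform index in 0..bound (bound > 0), without modulo bias.
    fn below(&mut self, bound: usize) -> usize {
        let bound = bound as u64;
        let zone = u64::MAX - (u64::MAX - bound + 1) % bound;
        loop {
            let v = self.next_u64();
            if v <= zone {
                return (v % bound) as usize;
            }
        }
    }

    /// Forwarding method, so the `rng.shuffle(&mut slice)` call style keeps working.
    pub fn shuffle<T>(&mut self, slice: &mut [T]) {
        slice.shuffle(self)
    }
}

/// Hypothetical extension trait; the algorithms live here, not on the RNG.
pub trait SliceExt {
    type Item;
    fn shuffle(&mut self, rng: &mut ToyRng);
    fn choose(&self, rng: &mut ToyRng) -> Option<&Self::Item>;
}

impl<T> SliceExt for [T] {
    type Item = T;

    fn shuffle(&mut self, rng: &mut ToyRng) {
        for i in (1..self.len()).rev() {
            let j = rng.below(i + 1);
            self.swap(i, j);
        }
    }

    fn choose(&self, rng: &mut ToyRng) -> Option<&T> {
        if self.is_empty() {
            None
        } else {
            Some(&self[rng.below(self.len())])
        }
    }
}
```

The cost of the `slice.shuffle(&mut rng)` style is that callers must import the extension trait, whereas the forwarding method needs no extra import.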

dhardy · 2018-03-12T19:45:07Z

Yes, putting this all in one place makes sense, though I don't know that we need it in Rng too. See: dhardy#82 (comment)

vks · 2018-03-13T09:35:45Z

If we do that and move the sampling to distributions, we could get rid of the Rng/RngCore distinction.

dhardy · 2018-03-13T10:21:57Z

In theory yes, but I'm not inclined to remove gen or fill.

dhardy · 2018-03-17T18:58:03Z

There are two minor issues with renaming Rng → Random:

Breakage — the current renaming was carefully structured so that a lot of existing code using the old Rng will continue to work without modification
Random as a name less obviously describes the purpose — e.g. foo<R: Random>(rand: &mut R) -> u32 { rand.gen() } is less clear than with the current names IMO

The idea has some merit, but needs more discussion than as a little question in a vaguely-related issue. If you want to push the idea further create a new issue and try to catch more attention (labels, maybe a reddit post, examples of user code). I kind of like the current approach though because RngCore is obviously more of a details name, so instantly says the type is less likely of interest to end users than Rng.

pitdicker · 2018-03-17T19:42:30Z

You are right. I only thought about the imports that would have to change, but this also impacts things like trait bounds. And there is something to be said for a clear 'details' trait name like RngCore.

pitdicker · 2018-03-25T13:36:37Z

I think this issue is solved?


dhardy added E-question Participation: opinions wanted B-API Breakage: API labels Mar 10, 2018

dhardy mentioned this issue Mar 11, 2018

Tracker: planned changes for 0.5 #232

Closed

33 tasks

vks added a commit to vks/rand that referenced this issue Mar 12, 2018

Remove random()

0396bc5

See rust-random#293.

dhardy mentioned this issue Mar 12, 2018

Sequence sampling: seq, WeightedChoice dhardy/rand#82

Closed

5 tasks

vks mentioned this issue Mar 13, 2018

Deprecate random and weak_rng, add SmallRng #296

Merged

dhardy mentioned this issue Mar 14, 2018

Add Bernoulli distribution #300

Closed

dhardy added F-new-int Functionality: new, within Rand P-medium and removed E-question Participation: opinions wanted labels Mar 14, 2018

pitdicker mentioned this issue Mar 17, 2018

Deprecate Rng::gen_weighted_bool #308

Merged

dhardy closed this as completed Mar 26, 2018

pitdicker pushed a commit that referenced this issue Apr 4, 2018

Remove random()

d606208

See #293.

vks mentioned this issue Apr 9, 2018

Should gen_bool be implemented as a distribution? #380

Closed
