
Deprecate Rng::gen_weighted_bool #308

Merged
4 commits merged into rust-random:master from deprecate_weighted_bool on Mar 21, 2018

Conversation

pitdicker
Contributor

As discussed in #293.

@dhardy (Member) commented Mar 17, 2018

I'd prefer to have gen_bool aka Bernoulli before deprecating this.

@pitdicker
Contributor Author

Implemented gen_bool.

I went the easiest route here.
I copied over the latest iteration of the code to generate a high-precision float, but have not adapted it to use more than 32/64 bits per float. Also, I have not made a separate Bernoulli distribution (#300); that just feels like 'too much' to me for such a simple function.
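
For context, a simplified sketch of the general shape of this approach: compare a uniform float in [0, 1) against p. This is an illustration only, not the PR code; the PR itself samples the HighPrecision01 distribution, visible in the diff further down.

use rand::{thread_rng, Rng};

// Simplified stand-in for the float-comparison approach: draw a uniform
// f64 in [0, 1) and return whether it falls below p.
fn gen_bool_via_float<R: Rng>(rng: &mut R, p: f64) -> bool {
    assert!(p >= 0.0 && p <= 1.0);
    rng.gen::<f64>() < p
}

fn main() {
    let mut rng = thread_rng();
    println!("{}", gen_bool_via_float(&mut rng, 1.0 / 3.0));
}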

Thanks for the suggestion to use my higher-precision code here, that gives an excuse to add it 😄. But I'll hold off a little while longer on the range code; we would need to have some discussion first about how to expose it...

@dhardy (Member) left a comment

Some little doc issues but the code looks good.

src/lib.rs Outdated
fn gen_weighted_bool(&mut self, n: u32) -> bool {
// Short-circuit after `n <= 1` to avoid panic in `gen_range`
n <= 1 || self.gen_range(0, n) == 0
}

/// Return a bool with a `p` probability of being true.
dhardy (Member):

Swap p and probability


impl Distribution<$ty> for HighPrecision01 {
/// Generate a floating point number in the open interval `(0, 1)`
/// (not including either endpoint) with a uniform distribution.
dhardy (Member):

I believe this is actually half-open (your old open version used rejection sampling; this code will yield 0 when the sampled u32 is zero). For sample < p this is what we want anyway, so it's just the comment (and possibly the distribution name, I'm not sure on this yet) to change.

/// Generate a floating point number in the open interval `(0, 1)`
/// (not including either endpoint) with a uniform distribution.
///
/// This is different from `Uniform` in that it it uses all 32 bits
dhardy (Member):

double 'it'

/// use rand::distributions::HighPrecision01;
///
/// let val: f32 = SmallRng::new().sample(HighPrecision01);
/// println!("f32 from (0,1): {}", val);
dhardy (Member):

[0,1)

@dhardy (Member) commented Mar 18, 2018

Maybe also document that the smallest non-zero value which can be generated is 0.00000000023283064 = 2.3e-10 (f32) or 0.00000000000000000005421010862427522 = 5.4e-20 (f64), as well as that precision is reduced below 0.00195 (f32) or 0.000244 (f64). This is significant for both HighPrecision01 and gen_bool (though of course the method used by the latter may change).

I see the distribution HighPrecision01 is publicly visible but not documented; I think at this point it would probably be best to document it but with a clear warning that it may be adjusted or removed in the future (or we could add an 'experimental' feature and feature-gate it... but we have too many feature gates already really).

@dhardy added the X-enhancement, B-API (Breakage: API), F-new-int (Functionality: new, within Rand), P-high (Priority: high) and D-review (Do: needs review) labels on Mar 18, 2018
@pitdicker
Contributor Author

Thank you for the close read!

as well as that precision is reduced below 0.00195 (f32) or 0.000244 (f64).

This is not really true, as the precision reduces as the number gets higher, and it changes with every (negative) power of two. But I have tried to write some documentation with similar intent.

@dhardy (Member) commented Mar 18, 2018

Okay, I'm happy for this to be merged. I'd prefer you wait 2-3 days from first opening, however, in case anyone else has a concern.

@pitdicker
Contributor Author

Added a commit to use gen_bool in the Binomial distribution.

@pitdicker (Contributor Author) commented Mar 20, 2018

@dhardy I am having second thoughts about this implementation of gen_bool. I tried to figure out exactly how much the extra accuracy buys us, and whether it is possible to make it faster.

What are the limits on gen_bool's accuracy?

  • The accuracy of p. p depends on the 'sliding accuracy scale' (just made up the term 😄) of the floating-point format. In the worst case it is accurate to 1 in 2^53 for f64, and 1 in 2^24 for f32. 1 in 2^24 means a bias could be introduced once every 16.7 million samples. That seems like an amount that might just influence the correctness of some result, so that rules out f32 for p.
  • The accuracy of the generated float. This accuracy only matters directly around the value of p: if the generated value is clearly lower or higher, the accuracy doesn't matter. HighPrecision01 will only improve the result in the cases where the value is within EPSILON (2^-52) of p.
  • The accuracy of the integer generated by the RNG. We can go with a u32, a u64, or even use multiple integers.

When does too little accuracy become a problem?
Suppose we have an accuracy of 1 in 2^52 (the accuracy of gen::<u64>()). Then only once in every 2^52 (~4.5e15) runs of gen_bool could the result introduce a bias. So this will only matter to an algorithm that runs for much more than 2^52 iterations, and whose result depends for its correctness on the accuracy of gen_bool.

I would say that is something very rare. And if you have an algorithm where the result depends on the accuracy of a function you are going to run more than 2^52 times, I think it is your responsibility to glance over that function and determine whether it fits your use.

So an accuracy of 2^24 is sometimes too little, and 2^52 is more than enough. Could 2^32 be reasonable? That would allow us to use a single u32 from the RNG, potentially making it about 2× faster. It would give enough accuracy that it only starts to matter well after 2^32 (about 4 billion) rounds. That seems very reasonable to me for a common method like gen_bool.
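
For intuition, a hypothetical back-of-the-envelope check (an illustration, not from the PR) of how large the error from quantizing p to a u32 threshold actually is:

fn main() {
    // Quantizing p to a u32 threshold changes the effective probability
    // by at most about 1 / 2^32.
    let p = 0.3_f64;
    let p_int = (p * u32::MAX as f64) as u32;
    let p_effective = p_int as f64 / u32::MAX as f64;
    // Prints an error on the order of 1e-10, i.e. roughly one extra or
    // missing `true` per 2^32 calls.
    println!("quantization error = {:e}", (p - p_effective).abs());
}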

Now I think the following implementation is interesting, because it turns into a comparison against a constant if p is constant. Instead of converting the integer from the RNG to a float, we convert p to an integer:

fn gen_bool(&mut self, p: f64) -> bool {
    assert!(p >= 0.0 && p <= 1.0);
    let p_int = (p * core::u32::MAX as f64) as u32;
    self.gen() < p_int
}
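
A hypothetical usage sketch (an illustration, assuming the trait method lands roughly as proposed above); with a constant p the threshold is a compile-time constant, so each call is one RNG draw plus one integer comparison:

use rand::{thread_rng, Rng};

fn main() {
    let mut rng = thread_rng();
    // Returns true with probability 0.5; the 0.5 * u32::MAX threshold can
    // be folded into a constant by the compiler.
    if rng.gen_bool(0.5) {
        println!("heads");
    } else {
        println!("tails");
    }
}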

@dhardy (Member) commented Mar 20, 2018

That sounds reasonable (though it should be self.gen() <= p_int if you consider p=1.0).

1 in 2^32 bias is probably okay. I have been involved in experiments which may have used around 2^32 Bernoulli samples, but I doubt a single sample error would have had much effect on the results.

I was wondering if the equivalent using u64 would even be possible; I'm not sure: u64::MAX is not representable exactly, hence conversion to f64, [multiplication by 1] and conversion back to u64 might result in overflow.
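
For illustration, a small standalone check (an addition, not from the PR) of the representability issue: u64::MAX rounds up to 2^64 as an f64, so the round-trip with p = 1.0 lands outside the u64 range.

fn main() {
    // u64::MAX (2^64 - 1) rounds up to exactly 2^64 when converted to f64.
    let max_as_float = u64::MAX as f64;
    println!("{}", max_as_float == 18446744073709551616.0); // true

    // With p = 1.0 the product is 2^64, which does not fit in a u64.
    // Modern Rust saturates such casts to u64::MAX; at the time of this PR
    // an out-of-range float-to-int cast was undefined behaviour, which is
    // exactly the overflow concern raised here.
    let p = 1.0_f64;
    let p_int = (p * max_as_float) as u64;
    println!("{}", p_int);
}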

@pitdicker
Contributor Author

though it should be self.gen() <= p_int

Thank you, meant to write that... 😄

The equivalent with u64 will not give 64 bits of precision. We would certainly have rounding issues going from float to integer, and I remember them being unpredictable (i.e. it is possible to change the rounding mode of the CPU). I would not use this method for u64.

@pitdicker (Contributor Author) commented Mar 20, 2018

Changed gen_bool as discussed.

When I tried to benchmark it (does it really perform as hoped?), I had some trouble with SmallRng, so I added a benchmark for that one too.

test misc_gen_weighted_bool          ... bench:       4,176 ns/iter (+/- 406) (deprecated)
test misc_gen_bool                   ... bench:       4,043 ns/iter (+/- 172) (before)

test gen_u32_xorshift                ... bench:       1,101 ns/iter (+/- 115) = 3633 MB/s
test misc_gen_bool                   ... bench:       1,557 ns/iter (+/- 157)
test misc_gen_bool_var               ... bench:       1,594 ns/iter (+/- 62)

src/lib.rs Outdated
@@ -551,7 +551,8 @@ pub trait Rng: RngCore {
/// ```
fn gen_bool(&mut self, p: f64) -> bool {
assert!(p >= 0.0 && p <= 1.0);
self.sample::<f64, _>(distributions::HighPrecision01) < p
let p_int = (p * core::u32::MAX as f64) as u32;
p_int > self.gen()
dhardy (Member):

Why swap left-right sides and comparator now?

pitdicker (Contributor Author):

Otherwise type inference couldn't figure it out... the alternative was self.gen::<u32>() <= p_int.

dhardy (Member):

Oh, really? But you need to use >= then.

@@ -864,7 +865,7 @@ impl SeedableRng for StdRng {
}

/// An RNG recommended when small state, cheap initialization and good
/// performance are required. The PRNG algorithm in `SmallRng` is choosen to be
/// performance are required. The PRNG algorithm in `SmallRng` is chosen to be
/// efficient on the current platform, **without consideration for cryptography
dhardy (Member):

Note that Xorshift has good next_u32 performance but not as good next_u64 performance as several other generators. Wait, something's wrong:

test gen_u32_xorshift      ... bench:       4,635 ns/iter (+/- 198) = 862 MB/s
test gen_u64_xorshift      ... bench:       2,840 ns/iter (+/- 93) = 2816 MB/s

Impossible that next_u64 is faster than next_u32. Anyway, be careful comparing benchmarks for gen_bool: I would imagine it most useful in heavy numerical simulators which would likely either use native 64-bit generators or buffered generators, i.e. a u32 may not be half the price of a u64.

@pitdicker (Contributor Author), Mar 20, 2018:

True, sometimes it will be about half the price, sometimes it just makes no difference.

And you are getting bitten again (I think) by the Rust bug with multiple codegen units and the benchmark harness. Can you retry with export RUSTFLAGS="-C codegen-units=1"?

dhardy (Member):

Interesting; didn't affect most tests (including xorshift bytes with around 900MB/s) but:

test gen_u32_xorshift      ... bench:       1,045 ns/iter (+/- 59) = 3827 MB/s

dhardy (Member):

Actually, that was with another change calling black_box less frequently. Without that I get approx 3000 MB/s. The u64 results only change by about 50MB/s however.

@dhardy (Member) left a comment

It would also be nice if you cleaned this up and pulled the HighPrecision01 stuff into a separate PR, since it's not directly connected any more.

/// let mut rng = thread_rng();
/// println!("{}", rng.gen_bool(1.0 / 3.0));
/// ```
fn gen_bool(&mut self, p: f64) -> bool {
dhardy (Member):

Please add a unit test, at the very least testing that gen_bool(1.0) doesn't panic, or better testing that both 1.0 and 0.0 produce the expected outputs a few times over.
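
For illustration, a minimal sketch of the kind of test being requested (the name and exact assertions are an assumption, not the test that actually landed in the PR):

#[test]
fn test_gen_bool_extremes() {
    use rand::{thread_rng, Rng};

    let mut rng = thread_rng();
    for _ in 0..5 {
        // gen_bool(1.0) must not panic and must always return true;
        // gen_bool(0.0) must always return false.
        assert!(rng.gen_bool(1.0));
        assert!(!rng.gen_bool(0.0));
    }
}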

@pitdicker
Contributor Author

It would also be nice if you cleaned this up and pulled the HighPrecision01 stuff into a separate PR, since it's not directly connected any more.

Can't I sneak in anything quietly? 😄

@pitdicker
Contributor Author

Rebased, removed the addition of HighPrecision01, and combined a few commits, now that they only changed a single line etc.

I have added a test for gen_bool, and changed the benchmarks to also use black_box less.

@pitdicker
Contributor Author

The benchmark results have changed a bit, but are finally realistic:

test misc_gen_bool                   ... bench:       1,184 ns/iter (+/- 3)
test misc_gen_bool_var               ... bench:       4,129 ns/iter (+/- 41)

A floating-point multiply is quite expensive compared to a couple of shifts and XORs, as it should be. Still, when p is variable, performance is similar to the old gen_weighted_bool and to the method converting to floats.

@dhardy (Member) commented Mar 21, 2018

Looks good, apart from using p_int >= self.gen() as already suggested.

@pitdicker
Contributor Author

Ah, the comment was collapsed.

self.gen::<u32>() <= p_int is the same as p_int > self.gen() (not an >=). But I can change the line if you like.

@dhardy (Member) commented Mar 21, 2018

No it's not.

@pitdicker
Contributor Author

You are right. What was I confusing it with??? (updated)

@pitdicker
Contributor Author

Ready to merge?

@dhardy dhardy merged commit 4acef1b into rust-random:master Mar 21, 2018
@pitdicker pitdicker deleted the deprecate_weighted_bool branch March 21, 2018 18:34
pitdicker pushed a commit that referenced this pull request Apr 4, 2018
@vks vks mentioned this pull request Apr 11, 2018