UniformInt::sample_single leading zeros approximation is biased #661

TheIronBorn · 2018-12-09T21:43:48Z

For every possible input, each possible output should be equally likely:

const RANGE: u8 = 3;
let zone = RANGE << RANGE.leading_zeros();
let mut map = [0; RANGE as usize];
for r in 0..=u8::max_value() {
    let (h, l) = r.wmul(RANGE);
    if l <= zone { map[h as usize] += 1; }
}
// must be uniform for every value, produces [65, 64, 64]
assert_eq!(map, [map[0]; RANGE as usize]);

(playground link for verification: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=485be3f36a790b767a1d7366dd1de364)

I think all that's required is changing the comparison to l < zone.
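A self-contained sketch of the check above may help with reproducing it outside the crate. It assumes that `wmul` is a widening multiply returning the (high, low) halves of the product. The helper names `wmul_u8` and `histogram` are hypothetical and exist only for this illustration.

```rust
/// Widening multiply for `u8`: returns the (high, low) bytes of `a * b`.
/// Hypothetical stand-in for the `wmul` helper used in the snippet above.
fn wmul_u8(a: u8, b: u8) -> (u8, u8) {
    let wide = u16::from(a) * u16::from(b);
    ((wide >> 8) as u8, wide as u8)
}

/// Feed every possible `u8` through the multiply-and-reject step and count
/// how many accepted inputs land on each output in `0..range`.
fn histogram(range: u8, accept: impl Fn(u8) -> bool) -> Vec<u32> {
    let mut counts = vec![0u32; usize::from(range)];
    for r in 0..=u8::MAX {
        let (hi, lo) = wmul_u8(r, range);
        if accept(lo) {
            counts[usize::from(hi)] += 1;
        }
    }
    counts
}

fn main() {
    const RANGE: u8 = 3;
    let zone = RANGE << RANGE.leading_zeros(); // 3 << 6 == 192

    // Inclusive comparison, as in the report: output 0 is over-represented.
    println!("l <= zone: {:?}", histogram(RANGE, |lo| lo <= zone));
    // Exclusive comparison, as proposed: every output is equally likely.
    println!("l <  zone: {:?}", histogram(RANGE, |lo| lo < zone));
}
```

For `RANGE = 3` the inclusive comparison yields `[65, 64, 64]` and the exclusive one `[64, 64, 64]`, because the single input whose low byte equals `zone` falls into output 0.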


TheIronBorn · 2018-12-09T21:47:44Z

modulo for comparison: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=880d7924cf998b68427ba1f88d5d81f6

dhardy · 2018-12-10T09:38:23Z

Well spotted. We could do with tests for this kind of thing but unfortunately I don't think it's practical.

I think the error is here:

                let zone =
                    if ::core::$unsigned::MAX <= ::core::u16::MAX as $unsigned {
                        // Using a modulus is faster than the approximation for
                        // i8 and i16. I suppose we trade the cost of one
                        // modulus for near-perfect branch prediction.
                        let unsigned_max: $u_large = ::core::$u_large::MAX;
                        let ints_to_reject = (unsigned_max - range + 1) % range;
                        unsigned_max - ints_to_reject
                    } else {
                        // conservative but fast approximation
                        range << range.leading_zeros()
                    };

Here ints_to_reject is the number of integers to reject from the range 0..(2.pow(BIT_SIZE)). With an exclusive upper bound, the first case would accept the range 0..(MAX - ints_to_reject + 1), but since ints_to_reject may be zero, that bound can overflow and we can't write it (unless we allow wrapping and skip rejection when zone == 0).

Instead we treat the upper bound as inclusive, which avoids that problem. However, this isn't compatible with the second case: there we should subtract 1 if the upper bound is inclusive. But what if range == 0? We want MAX, which is 0 - 1 with wrapping arithmetic, so this works if we use:

(range << range.leading_zeros()).wrapping_sub(1)

I think this is the best way of doing things since it avoids having to trap for special cases. Alternatively we could drop the range << range.leading_zeros() optimisation and always use modulus; we should benchmark before doing that.
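To make the two inclusive bounds concrete, here is a minimal sketch that narrows both branches to `u8` (the real macro computes the modulus branch in `$u_large`, so this is a simplification, not the crate's code). The function names are hypothetical, and `range` is assumed to be non-zero, since shifting a `u8` by 8 bits would panic in debug builds.

```rust
/// Widening multiply for `u8`: returns the (high, low) bytes of `a * b`.
fn wmul_u8(a: u8, b: u8) -> (u8, u8) {
    let wide = u16::from(a) * u16::from(b);
    ((wide >> 8) as u8, wide as u8)
}

/// Inclusive zone from the modulus branch, narrowed to `u8`.
/// Assumes `range != 0`.
fn zone_modulus(range: u8) -> u8 {
    let ints_to_reject = (u8::MAX - range + 1) % range;
    u8::MAX - ints_to_reject
}

/// Inclusive zone from the shift branch with the suggested `wrapping_sub(1)`.
/// Assumes `range != 0`.
fn zone_shift(range: u8) -> u8 {
    (range << range.leading_zeros()).wrapping_sub(1)
}

/// Count accepted inputs per output when accepting `lo <= zone`.
fn histogram(range: u8, zone: u8) -> Vec<u32> {
    let mut counts = vec![0u32; usize::from(range)];
    for r in 0..=u8::MAX {
        let (hi, lo) = wmul_u8(r, range);
        if lo <= zone {
            counts[usize::from(hi)] += 1;
        }
    }
    counts
}

fn main() {
    let range = 3;
    for (name, zone) in [("modulus", zone_modulus(range)), ("shift", zone_shift(range))] {
        println!("{}: zone = {}, counts = {:?}", name, zone, histogram(range, zone));
    }
}
```

For `range = 3` the modulus zone is 254 and accepts 85 inputs per output, while the corrected shift zone is 191 and accepts 64 per output. Both are uniform; the shift variant simply rejects more often, which is the trade-off the benchmark would have to settle.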

It looks like this only affects UniformInt::sample_single.

Want to make a PR?

TheIronBorn · 2018-12-10T23:01:34Z

Note that the bitmasking method has the same rejection probabilities as our leading zeros method:
https://docs.google.com/spreadsheets/d/1qjBfi4w0IvIdz2MECLyC0w4H1mMPqI8K5r7YmyZJ0XU/edit?usp=sharing
Bitmasking doesn't use multiplication, so it's probably worth benchmarking as well.
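The comparison can be illustrated with a small sketch that counts, over all 256 `u8` inputs, how many each method accepts. It assumes one particular bitmask variant, where the mask is `u8::MAX >> range.leading_zeros()`; the comment above does not say how the mask is derived, and a mask derived from `range - 1` would reject less when `range` is a power of two. The function names are hypothetical, and `range` is assumed to be non-zero.

```rust
/// Inputs (out of 256) accepted by the leading-zeros zone, using the
/// exclusive comparison `lo < zone`. Assumes `range != 0`.
fn accepted_leading_zeros(range: u8) -> u32 {
    let zone = range << range.leading_zeros();
    (0..=u8::MAX)
        .filter(|&r| {
            // Low byte of the widening product r * range.
            let lo = (u16::from(r) * u16::from(range)) as u8;
            lo < zone
        })
        .count() as u32
}

/// Inputs (out of 256) accepted by bitmask-with-rejection.
/// Assumption: the mask keeps as many low bits as `range` itself occupies.
fn accepted_bitmask(range: u8) -> u32 {
    let mask = u8::MAX >> range.leading_zeros();
    (0..=u8::MAX).filter(|&r| (r & mask) < range).count() as u32
}

fn main() {
    for range in [3u8, 5, 100, 200] {
        println!(
            "range {}: leading zeros accepts {}, bitmask accepts {}",
            range,
            accepted_leading_zeros(range),
            accepted_bitmask(range)
        );
    }
}
```

Under that assumption both methods accept `range << range.leading_zeros()` of the 256 inputs, so their rejection probabilities coincide for every non-zero `u8` range.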

TheIronBorn · 2018-12-11T00:46:52Z

> We could do with tests for this kind of thing but unfortunately I don't think it's practical.

We could do what I did here and write exhaustive tests for smaller integers. The logic is unlikely to differ for larger integers.
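One possible shape for such a test, as a sketch rather than the test that was actually added: sweep every non-zero `u8` range, run all 256 inputs through the multiply-and-reject step, and check that the accepted counts are equal. The names `is_uniform` and `biased_ranges` are hypothetical.

```rust
/// Widening multiply for `u8`: returns the (high, low) bytes of `a * b`.
fn wmul_u8(a: u8, b: u8) -> (u8, u8) {
    let wide = u16::from(a) * u16::from(b);
    ((wide >> 8) as u8, wide as u8)
}

/// True if accepting `lo <= zone` gives every output the same count.
fn is_uniform(range: u8, zone: u8) -> bool {
    let mut counts = vec![0u32; usize::from(range)];
    for r in 0..=u8::MAX {
        let (hi, lo) = wmul_u8(r, range);
        if lo <= zone {
            counts[usize::from(hi)] += 1;
        }
    }
    counts.iter().all(|&c| c == counts[0])
}

/// Ranges in `1..=255` whose output is biased. `corrected` selects the
/// zone with `wrapping_sub(1)` applied; otherwise the original zone is used.
fn biased_ranges(corrected: bool) -> Vec<u8> {
    (1..=u8::MAX)
        .filter(|&range| {
            let shifted = range << range.leading_zeros();
            let zone = if corrected { shifted.wrapping_sub(1) } else { shifted };
            !is_uniform(range, zone)
        })
        .collect()
}

fn main() {
    println!("biased without correction: {} ranges", biased_ranges(false).len());
    println!("biased with correction:    {} ranges", biased_ranges(true).len());
}
```

In this sketch the corrected zone is uniform for every range, while the uncorrected zone is biased for every range that is not a power of two. The whole sweep is only 255 × 256 iterations, which is what makes the small-integer approach practical.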

fix #661

dhardy added X-bug Type: bug report T-distributions P-high Priority: high labels Dec 10, 2018

TheIronBorn added a commit to TheIronBorn/rand that referenced this issue Dec 11, 2018

fix rust-random#661

aa4181c

TheIronBorn mentioned this issue Dec 11, 2018

fix https://github.com/rust-random/rand/issues/661 #662

Merged

dhardy closed this as completed in #662 Dec 14, 2018

dhardy added a commit that referenced this issue Dec 14, 2018

Merge pull request #662 from TheIronBorn/patch-10

033c253

fix #661

jongiddy mentioned this issue Mar 20, 2020

gen_range long-running loop #951

Closed
