Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: implement a biased rand(a:b) (fix #20582) #28987

Closed
wants to merge 1 commit into from
Closed

Conversation

rfourquet
Copy link
Member

See #20582 for details. This basically uses a+floor(Int,n*rand()) to get a random number in a:a+n-1. This can be significantly faster than regular rand(a:b), but introduces a bias. I don't have a strong opininion about this, but thought that having a PR could move the discussion forward.

This also implements rand(fast(a)) where a is an array, but for some reason this allocates, which trumps the benefits. I have no idea why this is so, as the implementation is simpler for rand(fast(a)) than for rand(a)... any help to sort this out would be appreciated!

This uses a faster method than in rand(a:b), which can be biased,
depending on the length of a:b.
@rfourquet rfourquet added performance Must go faster randomness Random number generation and the Random stdlib labels Aug 31, 2018
@Keno
Copy link
Member

Keno commented Sep 3, 2018

Have we verified that we can't make the algorithm in #29004 faster than this? I'm a bit surprised that generating a random Float would be faster than that.

@rfourquet
Copy link
Member Author

Have we verified that we can't make the algorithm in #29004 faster than this

I think that in the linked discourse thread it was shown to be slighly slower than the biased algo, but faster if the input is converted to Int32 first.
I will investigate a bit more. Maybe the algorithm in #29004 should replace our current default implementation.

I'm a bit surprised that generating a random Float would be faster than that.

I guess it depends on the used RNG. For MersenneTwister, Float64 is what is generated natively, and generating a UInt64 integer requires more random bits than for a Float64.

@affans
Copy link
Contributor

affans commented Sep 4, 2018

I think the algorithm in #29004 should be used. It's worth it to have a miniscule reduction in speed if it means removing the bias. Some applications/scientific simulations may need to generate millions of random numbers, and I don't think it's a good idea if rand() will introduce bias.

If we want to go with the biased implementation then it needs to documented.

@StefanKarpinski
Copy link
Member

I have to say I don't really like this approach. Sure, I get that this just makes the tradeoff of fast and biased versus slower and unbiased explicit, but I'd much rather have my cake and eat it too, by using something like #29004 / https://discourse.julialang.org/t/rand-1-10-vs-int-round-10-rand/14339 to do faster unbiased sampling. It seems premature to declare failure and force the user to chose between fast and correct here.

@affans
Copy link
Contributor

affans commented Sep 17, 2018

@rfourquet Would you be able to implement the new algorithm? I am new to contributing but I think I could probably give this a try. Is there a way I can push to my commits to this PR or do I have to make my own?

@rfourquet
Copy link
Member Author

Welcome @affans ! Seeing activity in the discourse thread in the last two days prompted me to implement this, so I will open another PR soon.

@affans
Copy link
Contributor

affans commented Sep 17, 2018

@rfourquet Thanks! would you remember to tag me so that I can follow along. I feel like this is an intermediate PR I can follow and learn the internals of Julia.

@rfourquet rfourquet changed the title RFC: implement rand(fast(a:b)) (fix #20582) RFC: implement a biased rand(a:b) (fix #20582) May 17, 2019
@rfourquet
Copy link
Member Author

I think this should be in a package at best.

@rfourquet rfourquet closed this May 17, 2019
@rfourquet rfourquet deleted the rf/rand/fast branch May 17, 2019 10:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster randomness Random number generation and the Random stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants