Make rand work with AbstractArray instead of only with Range #8309
@@ -160,24 +160,24 @@ maxmultiple(k::Uint128) = div(typemax(Uint128), k + (k == 0))*k - 1
 maxmultiplemix(k::Uint64) = convert(Uint64, div((k >> 32 != 0)*0x0000000000000000FFFFFFFF00000000 + 0x0000000100000000, k + (k == 0))*k - 1)

 immutable RandIntGen{T<:Integer, U<:Unsigned}
     a::T   # first element of the range
-    k::U   # range length or zero for full range
+    k::U   # range length (or zero for full range)
     u::U   # rejection threshold
Do we still get any benefit from having both T and U as type parameters?
The random generation logic is implemented for only 3 types (parameter U): Uint32, Uint64, and Uint128, and the generation for all types (parameter T) is implemented in terms of those. An instance r of RandIntGen needs to know what type of number to produce in a call like rand(r); this is given by T. It's probably possible to have only one type parameter, but I felt that was beyond the scope of this PR.
That explains it. Thanks!
This change looks good to me. Any objections?
Why are you removing the |
The |
I updated the title of this PR, as it had been set automatically from the not very descriptive branch name.
Cc: @andreasnoack
Actually, I also wanted to restrict the possibility for the field |
The I think that our |
Updated PR. Yes |
I would like to switch the argument order for |
+1 for @simonster's suggestion. That will mean that we can't extend the signature for |
I don't think it's likely to cause problems if we simultaneously deprecate |
@simonster I prefer your solution to do it now! However, it's then more of a breaking change, forcing users to change their code all at once instead of having a deprecation warning. So should I revert this changed |
My point is that you can still have the deprecation warning. There is no valid signature for |
OK, I missed that there would both be |
d7d89cb to e859850
I removed |
78a1f14 to 91b58f7
Rebased.
RandIntGen{T<:Unsigned}(r::UnitRange{T}) = isempty(r) ? error("range must be non-empty") : RandIntGen(first(r), convert(T,last(r) - first(r) + 1))
# generator API
# randintgen(k) returns an object generating random integers in the range 1:k
randintgen{T<:Unsigned}(k::T) = k<1 ? error("range must be non-empty") : RandIntGen(T, k)
The error message should better describe the current situation, where the range is gone.
Should the invariant (k>0) be checked in an inner constructor?
@ivarne: you are right that the message is not ideal, but I didn't know the best way to change it. Checking the invariant in an inner constructor seems like a good idea. From the user's POV, the range is gone not because there is no longer an a field, but because she can use things other than ranges (anything array-like). The message could be "the collection passed to rand/rand! must not be empty" ("collection" is used in the docs).
That's why I did not suggest what the message should be. RandIntGen doesn't know there is a collection, so that feels wrong too. How about?
error("No integers in the range [1, $k]. Did you try to pick an element from a empty collection?")
I like this solution.
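A minimal sketch of the inner-constructor idea discussed above, written in current Julia syntax rather than the syntax used in this PR. The type name PositiveRangeGen is hypothetical and stands in for RandIntGen; it only illustrates how the k > 0 invariant could be enforced at construction time, and it is not the code that was merged.

```julia
# Hypothetical stand-in for RandIntGen (assumption: only the length k is stored).
struct PositiveRangeGen{U<:Unsigned}
    k::U  # number of integers in the range 1:k

    # Inner constructor: no instance can exist with k == 0.
    function PositiveRangeGen{U}(k::U) where {U<:Unsigned}
        k > 0 || error("No integers in the range [1, $k]. Did you try to pick an element from a empty collection?")
        return new{U}(k)
    end
end

# Outer convenience constructor that infers U from the argument.
PositiveRangeGen(k::U) where {U<:Unsigned} = PositiveRangeGen{U}(k)
```

With the check in the inner constructor, every outer constructor and every caller goes through the same validation, so the error message only has to be written once.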
Bump! Any objections to me merging this in a few days?
I think this looks good to merge. Perhaps give a day for @lindahua @andreasnoack to take a look?
91b58f7 to 1adbe3e
This is not ready for merging. I rebased to accommodate recent changes by @JeffBezanson, so please review if I didn't mess up with The tests don't pass now: the problem is that |
Once the problem of length is fixed for "small" integers, my preferred solution would be to simply remove failing tests: for example the range |
This is based on @ivarne's idea (cf. JuliaLang#8255 (comment)), and continues commit 48f27bc in "avoiding duplicating the getindex logic". The API to get a RandIntGen object is now to call randintgen(n). This allows any Integer type to implement this function (e.g. BigInt). Previously, a call like rand(big(1:10)) caused a stack overflow; it is now a "no method matching" error, until randintgen(::BigInt) is implemented, possibly using a new type similar to RandIntGen.
Sorry I missed the discussion, but I think
is not the right perspective. The full range isn't "not really a range"; in fact, a lot of work was done to make full ranges work like other ranges. Also #5550.
I don't think I said that a full range is "not really a range", and I want |
I was talking about the sentiment "full ranges are not really a range", you are quoted with ">". Anyway, thanks for planning to fix this, it took me some work. The old version - you saw it - used a representation corresponding to a function |
So if I understand, this commit makes some calls to I'm all for generality, but removing functionality in the meantime doesn't help. This also caused a performance problem (#8563). It really looks like this should be reverted.
OK, I reverted this in 6117328
The other aspects of this change are fine, if they can be done in a way |
@ivarne no I'm sorry, it's all on me!
Am I missing something or can't we have just both?
@mschauer yes of course! I'm now working on a proposition, which could also handle more ranges, like |
Unfortunately, we can't just weigh functionality gained against functionality lost – they are asymmetrical since people may be depending on existing functionality. Thus we can't break things that currently work without serious consideration.
This is a rewrite of 6d329ce, 9d0282e and d9814ff (from JuliaLang#8309, which was reverted).
This is another approach to the (non-merged) refactoring part of #8309 and #8649. The goal is to reduce to a minimum the different code paths taken by UnitRange and AbstractArray (UnitRange ranges are handled differently so that those with overflowing length still work). In particular, two rand! methods are merged into one. Previously, RangeGenerator objects could create (scalar or array of) random values in a range a:b, by first creating a random value in 0:b-a and then adding a. Choosing a random value in an AbstractArray A was then done with A[rand(1:length(A))]. Here, RangeGenerator is changed to only handle the creation of random values in a range 0:k; determining the right value of k (length(A)-1 or b-a) and picking the right element using the random value v (A[1+v] or a+v) is left to separate (and minimal) methods. Hence Range and AbstractArray are handled as uniformly as possible, given that we still want to support ranges like typemin(Int):typemax(Int) for which length overflows.
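The split described in this commit message can be sketched as follows, in current Julia syntax. All names here (ZeroBasedGen, draw, pick) are hypothetical and not part of Base; the sketch also delegates the actual 0:k sampling to the standard library's rand, whereas the real RangeGenerator implements its own rejection sampling.

```julia
using Random

# Hypothetical generator that only knows how to draw v uniformly from 0:k.
struct ZeroBasedGen{U<:Unsigned}
    k::U
end

# Assumption: reuse the standard library's range sampling for 0:k.
draw(rng::AbstractRNG, g::ZeroBasedGen{U}) where {U} = rand(rng, zero(U):g.k)

# Ranges: k = b - a, computed with wrapping unsigned arithmetic so that
# typemin(Int):typemax(Int) works even though its length overflows Int.
function pick(rng::AbstractRNG, r::UnitRange{Int})
    isempty(r) && error("range must be non-empty")
    a = first(r) % UInt
    k = (last(r) % UInt) - a            # wraps modulo 2^64
    v = draw(rng, ZeroBasedGen(k))
    return (a + v) % Int                # a + v, reinterpreted as Int
end

# Arrays: k = length(A) - 1, and the element is A[1 + v]
# (written with firstindex so non-1-based vectors also work).
function pick(rng::AbstractRNG, A::AbstractVector)
    isempty(A) && error("collection must be non-empty")
    v = draw(rng, ZeroBasedGen(UInt(length(A) - 1)))
    return A[firstindex(A) + Int(v)]
end
```

Both pick methods share the single draw path; only the computation of k and the mapping from v back to a result differ, which is the uniformity the commit message aims for.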
This is essentially two changes:
1. [...] 1:n is via a call to randintgen(n). This is to be able in particular to extend the mechanism to BigInt, for which the existing type RandIntGen doesn't fit well (cf. fix infinite loop in rand(::Range(BigInt)) #8255). In the process, RandIntGen was simplified by making the first element of the range equal to 1: this was only used with this value AFAICS, and this was duplicating indexing logic as noted by @ivarne (who found that this change should not break packages in METADATA.jl). It seems this was first implemented in commit ab97911 by @lindahua: was there a use for the first element field .a that I overlooked?
2. [...] rand(r) to work for any indexable r, not only with ranges (e.g. rand(["head", "tail"])). It is really the continuation of commit 48f27bc, and overlaps a bit with rand vs. sample #6003. This change is so small that I think it is harmless, whatever is done wrt rand vs. sample #6003.

ps: I think the random.jl tests are not very safe as the seed is set the same at each run. Could the content of the __init__ function from base/random.jl be made a public function named e.g. srand(), which could be used in the tests?