-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rand of AbstractArray #8649
rand of AbstractArray #8649
Conversation
@@ -374,6 +375,8 @@ imag{T<:Real}(x::AbstractArray{T}) = zero(x) | |||
|
|||
getindex(t::AbstractArray, i::Real) = error("indexing not defined for ", typeof(t)) | |||
|
|||
cget(t::AbstractArray, i::Real) = @inbounds return t[i+1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll not get this use of @inbounds
past Jeff. #8227 is still open about being able to propagate @inbounds
declarations trough functions in a meaningful way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, but then I'm not sure what to do: rename this method to cget_unsafe
instead? (in my mind, the "c" in "cget" was already a warning, and this is meant for internal use only). Or remove @inbounds
and hope that there is no/little overhead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is such a simple definition, I'm not sure what benefit there is to even having it if its strictly for internal use. Similarly, cendof
just seems like a pointless private definition mixed in with a bunch of things that are public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functions cendof
and cget
are needed only for their specializations for ranges (where the definition is slightly less trivial). In order for this PR's extended rand
functionality to work with ranges like rand(typemin(Int):typemax(Int))
, there needs to be a way to index into every position of such a range (which is not particularly specific to rand
). Implementing cget
/cendof
was the simplest solution I found, but I humbly request for help to improve the design of these functions or to find alternatives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stefan has a good point. In fact, cget
is only used in two places, so it looks like it would be just as easy to specialize those cases for Ranges rather than introduce cget and cendof. At the very least I would move all the code for cget
to random.jl, since I certainly hope we won't see increasing use/need of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to start in the Random
module in random.jl and move to Base
in abstractarray.jl if we need it in more places. range.jl
is also a possible place, as the problem is really that we want UnitRange
to support a length of typemax(Uint) + 1
, and other uses of UnitRange
might also need to support full ranges.
How about the names length_0
and getindex_0
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think length_0
fits, as whatever the first indice is, the length remains the same, unlike the last indince; I could rename to endof_0
and getindex_0
(I first thougth of length_minus_1
or even length_1
, taking "_" as meaning "minus", but "end of" describes better what is needed).
I can certainly move the code for cget/cendof
in random.jl. I didn't do that at first as rand has nothing to do with these implementations (rand
needs it only not to break previously working functionality), and these could be used elsewhere also (e.g. sum(1:typemax(Uint))
works but not sum(0:typemax(Uint))
which should give the same answer, but I admit it's not a very convincing example). However I'd rather not have rand
have a specialization for ranges.
Updated PR: |
It would be nice to have rand on long ranges as a separate PR. Given all the work done recently on random, I think a fresh PR would be a good idea, and perhaps the Closing this one for now. |
@ViralBShah I'm not sure to understand what you mean: the fact that rand on long ranges was not supported anymore is the reason why #8309 was reverted, and the |
Actually, I need to fully understand what's going on here, and I did not originally understand the PR. I'd love to see what some others have to say on this. |
My motivation in this PR was that |
@StefanKarpinski What is a good solution to the long range issue here? Introducing special versions of
|
As long as nobody wants to use |
@ivarne : do you mean that it would be OK that |
As long as We have seen a specific request for |
I agree with @ivarne that making It is good if it works. If it is difficult to make it work or that making it work requires an approach that compromises performance, then I would say just leave it as it is. |
Most functions that work with I don't see why |
This is another approach to the (non-merged) refactoring part of #8309 and #8649. The goal is to reduce to the minimum the different code paths taken by UnitRange and AbstractArray (UnitRange are handled differently so that those with overflowing length still work. In particular two rand! method are merged into one.
This is another approach to the (non-merged) refactoring part of #8309 and #8649. The goal is to reduce to the minimum the different code paths taken by UnitRange and AbstractArray (UnitRange are handled differently so that those with overflowing length still work. In particular two rand! method are merged into one.
This is another approach to the (non-merged) refactoring part of #8309 and #8649. The goal is to reduce to the minimum the different code paths taken by UnitRange and AbstractArray (UnitRange ranges are handled differently so that those with overflowing length still work). In particular two rand! method are merged into one.
This is another approach to the (non-merged) refactoring part of #8309 and #8649. The goal is to reduce to the minimum the different code paths taken by UnitRange and AbstractArray (UnitRange ranges are handled differently so that those with overflowing length still work). In particular two rand! method are merged into one.
This is another approach to the (non-merged) refactoring part of #8309 and #8649. The goal is to reduce to the minimum the different code paths taken by UnitRange and AbstractArray (UnitRange ranges are handled differently so that those with overflowing length still work). In particular two rand! method are merged into one. Previously, RangeGenerator objects could create (scalar of array of) random values in a range a:b, taking care of creating first a random value in 0:b-a and then adding a. Choosing a random value in an AbstractArray A was then using A[rand(1:length(A))]. Here, RangeGenerator is changed to only handle the creation of random values in a range 0:k, and determining the right value of k (length(A)-1 or b-a) and picking the right element using the random value v (A[1+v] or a+v) is left to separate (and minimal) methods. Hence Range and AbstractArray are handled as uniformly as possible, given that we still want to support ranges like typemin(Int):typemax(Int) for which length overflows.
This undoes revert of #8309, and re-enables
rand
of long ranges, which was impossible after #8309 (hence the revert).In this new PR,
rand(typemin(Int):typemax(Int))
remains valid, but there are new possibilities, like e.g.rand(typemin(Int):1:typemax(Int))
.It also seems that rand is now slightly faster in some cases (see for example small benchmark in #8563, i.e.
@time for i = 1:10000000; rand(1:4); end
).The reason I didn't implement this in #8309 is that I don't think I'm the qualified person to introduce the needed functionalities, which is here provided by
cendof
andcget
: these work likeendof
andgetindex
respectively but with 0-based index (like in C, hence the name; I found the namecgetindex
too ugly as it has to be used in code, unlikegetindex
which has a nice sugar syntax).The idea is that currently a range like
r=typemin(Int):typemax(Int)
is such thatr[endof(r)]
raises anOverflowError
. So we need a (lower-level) way to request the element at any position of such ranges. With the new functions,cget(r, 0) == typemin(Int)
andcget(r, cendof(r)) == typemax(Int)
withcendof(r) == typemax(Uint)
.I expect this PR to be rejected but hope that in this case someone will implement alternative functions.
cc @ivarne (BTW, I removed the
T
type paramater, cf. your comment #8309 (comment)).