WIP: Use single call to dsfmt_gv_genrand_uint32() for small ranges #5578

Closed
wants to merge 8 commits into from
23 changes: 22 additions & 1 deletion base/random.jl
@@ -161,10 +161,13 @@ immutable RandIntGen{T<:Integer, U<:Unsigned}
u::U # maximum multiple of k within the domain of U

RandIntGen(a::T, k::U) = new(a, k, div(typemax(U),k)*k)

RandIntGen(a::Uint64, k::Uint64) = new(a::T, k::U, div((k >> 32 != 0)*0xFFFFFFFF00000000 + 0x00000000FFFFFFFF, k)*k)
RandIntGen(a::Int64, k::Uint64) = new(a::T, k::U, div((k >> 32 != 0)*0xFFFFFFFF00000000 + 0x00000000FFFFFFFF, k)*k)

end

RandIntGen{T<:Unsigned}(r::Range1{T}) = RandIntGen{T,T}(first(r), convert(T, length(r)))

# specialized versions
for (T, U) in [(Uint8, Uint32), (Uint16, Uint32), (Int8, Uint32), (Int16, Uint32),
(Int32, Uint32), (Int64, Uint64), (Int128, Uint128),
@@ -173,6 +176,24 @@ for (T, U) in [(Uint8, Uint32), (Uint16, Uint32), (Int8, Uint32), (Int16, Uint32
@eval RandIntGen(r::Range1{$T}) = RandIntGen{$T, $U}(first(r), convert($U, length(r)))
end

# this function uses 32 bit entropy for small ranges of length <= typemax(Uint32)
# the constructor of RandIntGen is responsible for providing the right value of k
function rand{T<:Union(Uint64, Int64)}(g::RandIntGen{T,Uint64})
local x::Uint64
if g.k >> 32 == 0
Contributor

The indentation of this function seems a little messy. Would you please clean it up following Julia's four-space convention?

Contributor Author

Sorry, I have removed some hidden tabs.

x = rand(Uint32)
while x >= g.u
x = rand(Uint32)
end
else
x = rand(Uint64)
while x >= g.u
x = rand(Uint64)
end
end
return convert(T, g.a + rem(x, g.k))
end
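
As a rough illustration of the idea in this hunk (not the PR's code), the following Python sketch draws a 32-bit word when the range length `k` fits in 32 bits and a 64-bit word otherwise, rejecting values at or above `u`, the largest multiple of `k` in the chosen word. The names `make_gen` and `rand_in_range` are hypothetical, and Python's unbounded integers stand in for `Uint64` arithmetic.

```python
import random

MAX32 = 0xFFFFFFFF
MAX64 = 0xFFFFFFFFFFFFFFFF


def make_gen(a, k):
    """Hypothetical stand-in for RandIntGen: returns (a, k, u).

    u is the largest multiple of k that fits in the word actually drawn:
    32 bits when k fits in 32 bits, 64 bits otherwise.
    """
    if k <= 0:
        raise ValueError("range length k must be positive")
    word_max = MAX32 if k >> 32 == 0 else MAX64
    return a, k, (word_max // k) * k


def rand_in_range(gen, rng):
    """Draw uniformly from a .. a+k-1 by rejection sampling."""
    a, k, u = gen
    bits = 32 if k >> 32 == 0 else 64
    x = rng.getrandbits(bits)
    while x >= u:  # reject the tail that would bias x % k
        x = rng.getrandbits(bits)
    return a + x % k


rng = random.Random(1234)  # arbitrary seed, for reproducibility only
gen = make_gen(97, 26)  # the range 97:122 used in test/random.jl
samples = [rand_in_range(gen, rng) for _ in range(1000)]
```

Because a small range consumes a single 32-bit draw regardless of the integer type of its endpoints, the same seed can yield the same value for 32-bit and 64-bit ranges, which is what the new tests in `test/random.jl` check.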

function rand{T<:Integer,U<:Unsigned}(g::RandIntGen{T,U})
x = rand(U)
while x >= g.u
2 changes: 1 addition & 1 deletion test/linalg.jl
@@ -383,7 +383,7 @@ C = randn(2,2)

for elty in (Float32, Float64, Complex64, Complex128, Int)
if elty == Int
srand(61516384)
srand(61516300)
Contributor

Why do you need to change this number?

Contributor Author

Because with this change, the random matrices in this test on 64-bit systems are the same as the random matrices on 32-bit systems before the change. This would introduce #5472 on 64-bit systems and break the tests.

Sponsor Member

It would be nice to add a comment in the code for this kind of magic number. This discussion will be hard to find later.

Contributor Author

I think this ceases to be a magic number once #5472 gets fixed, presumably by changing

    @test_approx_eq W*v F*v
    iFv = F\v
    @test_approx_eq W\v iFv
    @test_approx_eq det(W) det(F)

to use a hand-picked epsilon.

Contributor

Having a proper testing criterion that is unlikely to fail in general is better than relying on a magic random seed.

cc: @jiahao @andreasnoackjensen

Contributor

It might seem easier to just rerun the tests until we find a random seed with which they do not fail. However, this may cause problems in the long run. Suppose that at some point in the future we update the RNG (upgrade the version or choose a better library): if the threshold is set too small, many of the tests may fail and we will end up re-choosing a lot of the seeds.

Contributor

I suspect that we could simply multiply the threshold by 1e3 and drastically reduce the risk of test failures, and the threshold would still be small enough to ensure correctness.
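
To make the effect of such a factor concrete, here is a minimal Python sketch of a tolerance-based comparison. The helper `approx_eq`, its `fuzz` parameter, and the way the tolerance scales with length and magnitude are assumptions for illustration only; this is not how `@test_approx_eq` computes its threshold.

```python
import sys

EPS = sys.float_info.epsilon  # machine epsilon for 64-bit floats


def approx_eq(xs, ys, fuzz=1.0):
    """Hypothetical check: elementwise |x - y| <= fuzz * n * EPS * scale."""
    xs, ys = list(xs), list(ys)
    if len(xs) != len(ys):
        return False
    # Scale by the largest magnitude seen, but never below 1.0.
    scale = max([1.0] + [abs(v) for v in xs] + [abs(v) for v in ys])
    tol = fuzz * len(xs) * EPS * scale
    return all(abs(x - y) <= tol for x, y in zip(xs, ys))


exact = [1.0, 2.0]
noisy = [1.0 + 50 * EPS, 2.0]  # off by 50 units of machine epsilon

strict = approx_eq(exact, noisy)            # tight threshold
relaxed = approx_eq(exact, noisy, fuzz=1e3)  # threshold multiplied by 1e3
```

In this sketch the perturbed vector fails the strict check but passes the relaxed one, while the relaxed tolerance is still many orders of magnitude below the values being compared.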

Member

Yes, I believe the current random seed was picked as a "good enough" first choice, but as we have continued to add linalg tests, tests further down in the test suite are picking up new matrices that come later in the RNG stream.

For now we could just up the fuzz factor on failing tests, but I do hope to put in proper error bounds at some point.

Contributor

@jiahao I agree there are cases (e.g. matrices that are nearly singular) where this can become really tricky. For such cases, setting a magic seed is a reasonable stopgap.

However, I think it is still worth considering better ways of setting up those tests in the long run.

Member

Let's continue the discussion in #5605

d = rand(1:100, n)
dl = -rand(0:10, n-1)
du = -rand(0:10, n-1)
19 changes: 19 additions & 0 deletions test/random.jl
@@ -21,4 +21,23 @@ if sizeof(Int32) < sizeof(Int)
r = rand(int32(-1):typemax(Int32))
@test typeof(r) == Int32
@test -1 <= r <= typemax(Int32)
@test all([div(typemax(Uint64),k)*k == Base.Random.RandIntGen(uint64(1:k)).u for k in 13 .+ int64(2).^(32:62)])
@test all([div(typemax(Uint64),k)*k == Base.Random.RandIntGen(int64(1:k)).u for k in 13 .+ int64(2).^(32:61)])

end

# same random numbers for small ranges on all systems

seed = rand(Uint) #leave state nondeterministic as above
srand(seed)
r = int64(rand(int32(97:122)))
srand(seed)
@test r == rand(int64(97:122))

srand(seed)
r = uint64(rand(uint32(97:122)))
srand(seed)
@test r == rand(uint64(97:122))

@test all([div(typemax(Uint32),k)*k == Base.Random.RandIntGen(uint64(1:k)).u for k in 13 .+ int64(2).^(1:30)])
@test all([div(typemax(Uint32),k)*k == Base.Random.RandIntGen(int64(1:k)).u for k in 13 .+ int64(2).^(1:30)])
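
The checks on `u` can be mirrored outside Julia. The Python sketch below evaluates the branch-free expression from the new `RandIntGen` constructors and compares the result against the 32-bit and 64-bit maxima for the same values of `k` that the tests use. The helper names `word_max` and `largest_multiple` are hypothetical, and Python's unbounded integers are assumed to stand in for `Uint64`.

```python
MAX32 = 0xFFFFFFFF
MAX64 = 0xFFFFFFFFFFFFFFFF


def word_max(k):
    """Branch-free choice of the word maximum, as in the constructor expression.

    In Python a bool is an int, so (k >> 32 != 0) is 0 or 1: the high mask is
    added only when k does not fit in 32 bits.
    """
    return (k >> 32 != 0) * 0xFFFFFFFF00000000 + 0x00000000FFFFFFFF


def largest_multiple(k):
    """Largest multiple of k not exceeding the selected word maximum."""
    return (word_max(k) // k) * k


small = [13 + 2**e for e in range(1, 31)]   # k fits in 32 bits
large = [13 + 2**e for e in range(32, 63)]  # k needs more than 32 bits

small_ok = all(largest_multiple(k) == (MAX32 // k) * k for k in small)
large_ok = all(largest_multiple(k) == (MAX64 // k) * k for k in large)
```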