-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make ntuple(f, n) and ntuple(f, Val{n}) throw ArgumentError when n < 0 #21697
Conversation
I would personally prefer an error to an empty tuple. |
This seems like a good idea. It looks like the branch gets elided by the compiler anyway, so there probably isn't any cost to checking this. |
This could be an error, but for consistency it should also be an error for |
We could avoid backporting the error for that method to 0.5 and 0.6, tolerating an inconsistency for those releases. |
I had the same thought so tracked the non-Val version to see how long it's been that way. Apparently always 97b2e50. |
So the consensus is to make it a |
Changed it to throwing |
base/tuple.jl
Outdated
@@ -105,7 +105,8 @@ julia> ntuple(i -> 2*i, 4) | |||
``` | |||
""" | |||
function ntuple(f::F, n::Integer) where F | |||
t = n <= 0 ? () : | |||
(n >= 0) || throw(DomainError()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe ArgumentError
? A quick search through base suggests DomainError
is mostly used when the value is out of domain for mathematical functions (e.g. sqrt
, log
etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change it to ArgumentError
if there would be more thumbs up than for DomainError
(yours is counted) :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave the check over _ntuple
so the branch gets delayed after the small cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DomainError
changed into ArgumentError
.
I'd leave the n
check where it is, because it's much cleaner than burying it inside another routine. Anyway, in performance-sensitive context one should use Val{N}
alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seconding @pabloferz's recommendation. Using the Val{N}
methods is not always possible, and ntuple
's performance is important. Best!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be surprised if LLVM did not just hoist the branch to the end, but this change can't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've benchmarked the early and the late n
check variants with
@benchmark ntuple(identity, n) setup=(n = rand(0:10))
The late one (inside _ntuple()
) was 1ns (~10%) faster, so I've updated the PR.
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
Probably the CI has to be rerun. Travis got stuck (no output for 10 minutes) in one configuration, AppVeyor failed to install the package in one configuration. |
Travis failed for xcode8 in 1 test of "spawn", but it doesn't look like PR-related. All other test configurations were ok. |
The CI tests pass from the 3rd attempt. 🎉 |
@nanosoldier |
@@ -142,6 +146,7 @@ ntuple(f, ::Type{Val{15}}) = (@_inline_meta; (f(1), f(2), f(3), f(4), f(5), f(6) | |||
|
|||
function ntuple(f::F, ::Type{Val{N}}) where {F,N} | |||
Core.typeassert(N, Int) | |||
(N >= 0) || throw(ArgumentError(string("tuple length should be ≥0, got ", N))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine this branch is handled at compile time where N
is known at compile time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the theory. That's also what I see with @code_llvm ntuple(identity, Val{-1})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also I hope with e.g. @code_llvm ntuple(identity, Val{1})
? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Actually, for each 0<=N<=15
there's explicit ntuple(identity, Val{N})
method. The check for N<0
is in a generic method.
In fact, ntuple(f, Val{N})
is substantially slower for N>=16
. On the 11 days old master (b838f2e)
julia> @benchmark ntuple(identity, Val{15})
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 3.365 ns (0.00% GC)
median time: 3.379 ns (0.00% GC)
mean time: 3.400 ns (0.00% GC)
maximum time: 16.001 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark ntuple(identity, Val{16})
BenchmarkTools.Trial:
memory estimate: 288 bytes
allocs estimate: 2
--------------
minimum time: 418.864 ns (0.00% GC)
median time: 423.166 ns (0.00% GC)
mean time: 448.653 ns (2.32% GC)
maximum time: 6.745 μs (87.57% GC)
--------------
samples: 10000
evals/sample: 199
Few nanoseconds could be explained by Base.@_inline_meta
absence, but not hundreds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Actually, for each 0<=N<=15 there's explicit ntuple(identity, Val{N}) method. The check for N<0 is in a generic method.
Ah, right! Wonseok's recent change. Thanks for reminding me. For @code_llvm ntuple(identity, Val{16})
then :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up --- the branch gets handled at compile time for e.g. @code_llvm ntuple(identity, Val{16})
? Assuming that holds (and perhaps even if it doesn't, if the marginal cost of the branch is low) the present incarnation lgtm! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, apparently, LLVM understands that the check is a constant expression and eliminates it.
The nanosoldier regressions look like noise to me given their high variation between the runs.
Hmm, I was using the following benchmark to estimate the overhead of checking _ntuple_before(f, n) = (Base.@_noinline_meta; ([f(i) for i = 1:n]...))
function ntuple_before(f::F, n::Integer) where F
(n >= 0) || throw(ArgumentError(string("tuple length should be ≥0, got ", n)))
t = n == 0 ? () :
n == 1 ? (f(1),) :
n == 2 ? (f(1), f(2)) :
n == 3 ? (f(1), f(2), f(3)) :
n == 4 ? (f(1), f(2), f(3), f(4)) :
n == 5 ? (f(1), f(2), f(3), f(4), f(5)) :
n == 6 ? (f(1), f(2), f(3), f(4), f(5), f(6)) :
n == 7 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7)) :
n == 8 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8)) :
n == 9 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8), f(9)) :
n == 10 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8), f(9), f(10)) :
_ntuple_before(f, n)
return t
end
function _ntuple_after(f, n)
Base.@_noinline_meta
(n >= 0) || throw(ArgumentError(string("tuple length should be ≥0, got ", n)))
([f(i) for i = 1:n]...)
end
function ntuple_after(f::F, n::Integer) where F
t = n == 0 ? () :
n == 1 ? (f(1),) :
n == 2 ? (f(1), f(2)) :
n == 3 ? (f(1), f(2), f(3)) :
n == 4 ? (f(1), f(2), f(3), f(4)) :
n == 5 ? (f(1), f(2), f(3), f(4), f(5)) :
n == 6 ? (f(1), f(2), f(3), f(4), f(5), f(6)) :
n == 7 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7)) :
n == 8 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8)) :
n == 9 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8), f(9)) :
n == 10 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8), f(9), f(10)) :
_ntuple_after(f, n)
return t
end
using BenchmarkTools
@benchmark ntuple_before(identity, n) setup=(n = rand(0:10))
@benchmark ntuple_after(identity, n) setup=(n = rand(0:10)) But the results behave reproducibly different in different Julia sessions.
Sometimes the late one:
|
Perhaps benchmarking without dependence on random variables would be worthwhile, or at least worth excluding as a potential source of inconsistency? |
Thanks for going the distance by the way! :) Performance tuning of |
You're probably hitting Line 27 in 25f241c
Any16 fallback like we use for map , just to prevent compile time from exploding. (Try ntuple(identity, Val{1000}) to see what I mean.) That will, of course, slow down the runtime, but it prevents catastrophic behavior.
|
@timholy Yes, from |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
It could be merged without squashing the commits. c6f458e ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The present incarnation lgtm! :)
Perhaps @pabloferz could have another look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM too
The individual commits are all ok, so I'd say no squashing is necessary, and without squashing, we at least keep the option of an easy backport. |
otherwise Julia would stuck in an endless type inference loop
Rebased to resolve conflicts with the updated |
@Sacha0 should be ready to go :) |
Following |
ntuple(i -> i, -1)
returns an emptyTuple{}
, butntuple(i -> i, Val{-1})
sends Julia (both 0.5 and 0.6) into an endless type inference(?) loop consuming CPU and memory (I didn't dare to wait until the end).The fix is to return an empty tuple immediately.
Probably should be backported to 0.5 and 0.6.