Remove bounds checks in iterate(::Tuple) #28847
Conversation
Seems like a bugfix, cf.
Well, if it's considered a bug fix we could merge this, right? :) For reference, beyond the discussion linked above:

```julia
julia> g() = 9000 in ntuple(identity, Val(512))

julia> @btime g() # before
  324.355 ns (0 allocations: 0 bytes)
false

julia> @btime g() # after this PR
  235.632 ns (0 allocations: 0 bytes)
false
```
Could you elaborate on how that comment is related?
Oh nice, I missed that discussion. This is great for GPU support, too 🙂
Yes, this seems worth doing.
Oh that's a fun bootstrapping puzzle. Is it possible that the iteration required would only be on two-tuples or less? Would just a few definitions for small tuples be enough to get us over the hump? E.g., something like:

```julia
iterate(::Tuple{}, state::Nothing=nothing) = nothing
iterate(t::Tuple{Any}) = (@inbounds t[1], nothing)
iterate(t::Tuple{Any}, ::Nothing) = nothing
iterate(t::Tuple{Any,Any}) = (@inbounds t[1], true)
iterate(t::Tuple{Any,Any}, s::Bool) = s ? (@inbounds t[2], false) : nothing
```
Not really, iteration is over all bit integer types + Bool. Another solution which seems tempting is to avoid indexing, but then the state is type-unstable, and it does not work well :(. At any rate, the implementation of this PR seems good enough to me.
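The indexing-free snippet itself isn't shown in the thread; as a rough illustration (my own sketch under that assumption, not necessarily what was tried), one natural formulation threads the remaining tuple through as the iteration state, which is exactly what makes the state's type change on every step:

```julia
# Hypothetical sketch of an "avoid indexing" iterate for tuples (the name
# myiterate is made up; this is not the PR's approach).  The state is the
# not-yet-visited tail of the tuple, so its type shrinks each iteration:
# Tuple{Int,Int,Int} -> Tuple{Int,Int} -> Tuple{Int} -> Tuple{}.
myiterate(t::Tuple) = myiterate(t, t)
myiterate(::Tuple, ::Tuple{}) = nothing
myiterate(::Tuple, rest::Tuple) = (first(rest), Base.tail(rest))

myiterate((1, 2, 3))        # (1, (2, 3))
myiterate((1, 2, 3), (3,))  # (3, ())
myiterate((1, 2, 3), ())    # nothing
```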
I know Nanosoldier is down at the moment, but it'd be great to even just run the Tuple benchmarks locally.
You don't need
Right. To get it compiling I used
Turns out this performs exactly the same, so I'll stick to the current changes in this PR.
@mbauman: iteration of tuples is not benchmarked in BaseBenchmarks.jl, it seems -- all loops over tuples use indexing. Also, if I run the benchmarks locally they're a bit flaky. Let me just summarize results from my benchmark of a linear scan:

```
# Linear scan through Array of 32 / 256 / 1024 elements (reference)
26.789 ns, 159.156 ns, 583.630 ns
# Linear scan through Tuple of 32 / 256 / 1024 elements
29.265 ns, 142.273 ns, 583.619 ns  # this branch
31.882 ns, 198.612 ns, 768.126 ns  # julia 1.0
```
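The exact benchmark script isn't included above; a minimal sketch of what such a linear scan might look like (the `linear_scan` name and the container sizes are my own choices) is:

```julia
using BenchmarkTools

# Sum the elements by iterating; a for loop lowers to iterate(),
# so no explicit indexing is involved.
function linear_scan(xs)
    s = zero(eltype(xs))
    for x in xs
        s += x
    end
    return s
end

t = ntuple(identity, 256)   # Tuple of 256 Ints
a = collect(t)              # equivalent Array, as a reference point

@btime linear_scan($t)
@btime linear_scan($a)
```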
(cherry picked from commit 35c67e5)
This caused a regression for #28764 (comment). MWE:

```julia
perf_setindex!(A, val, inds) = setindex!(A, val, inds...)

using BenchmarkTools
s = 2
A = rand(Float64, ntuple(one, s)...)
y = one(eltype(A))
i = length(A)
@btime perf_setindex!($(fill!(A, y)), $y, $i)
```
Thanks for pointing that out! I'll see if I can understand this.
Fixed by #29133.
wow, that was impressively fast 😅
Good change, but it probably shouldn't have been marked for backporting. We need to re-open this as a new PR now and look into #29147 in conjunction.
I think this should be marked for backporting since otherwise
This fix seems to help address #28844 partially, to get compile-time optimizations.

Before:

After this PR:

It also adds `@nospecialize`, like in the rest of the function definitions in tuple.jl. Potentially there's a breaking change here, since `iterate((1,2), -1) === nothing` now, whereas it used to throw a `BoundsError`.
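For illustration, a rough sketch of the shape such a definition could take after this change (a stand-in name, not the exact code in base/tuple.jl): out-of-range states fall through to `nothing` instead of throwing, so the element load itself can be marked `@inbounds`.

```julia
# Sketch only; see base/tuple.jl for the real Base.iterate(::Tuple, ::Int).
# `tuple_iterate` is a stand-in name for this illustration.
function tuple_iterate(@nospecialize(t::Tuple), i::Int = 1)
    # Any out-of-range state (including a negative one) now simply ends
    # the iteration instead of raising a BoundsError.
    1 <= i <= length(t) || return nothing
    return (@inbounds(t[i]), i + 1)
end

tuple_iterate((1, 2))      # (1, 2): first element plus the next state
tuple_iterate((1, 2), -1)  # nothing (the analogous Base call used to throw)
```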