sum(function, tuple) is slow #30465

cossio · 2018-12-20T16:25:25Z

julia> using BenchmarkTools
julia> hsum(x...) = sum(abs2.(float.(x)));
julia> gsum(x...) = sum(y -> abs2(float(y)), x);

julia> @btime gsum(1.,2.,3.,4.,5.,6.,7.,8.)
  10.907 ns (0 allocations: 0 bytes)
204.0
julia> @btime hsum(1.,2.,3.,4.,5.,6.,7.,8.)
  2.436 ns (0 allocations: 0 bytes)
204.0

gsum should not be so slow compared to hsum, right?

KristofferC · 2018-12-20T16:38:02Z

See #30421; the two calls use different summation implementations.

stevengj · 2018-12-20T16:50:28Z

Maybe we should have a specialized sum(function, tuple) method similar to our specialized sum(tuple)?

cossio · 2018-12-20T16:51:47Z

@KristofferC Here I don't have a sum of a generator, but of a Tuple. I'm not sure it's the same as the other issue.

stevengj · 2018-12-20T17:44:54Z

I think that when you do sum(f, tuple) it actually constructs a generator?

nalimilan · 2018-12-20T17:55:45Z

> I think that when you do sum(f, tuple) it actually constructs a generator?

No, it calls mapfoldl(y -> abs2(float(y)), Base.add_sum, x). The difference is that we have a special sum method for tuples:

sum(x::Tuple{Any, Vararg{Any}}) = +(x...)
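To make the two code paths concrete, here is a minimal sketch that evaluates both by hand on a small tuple. It assumes Base.add_sum is reachable as an unexported Base function, as the comment above implies; the variable names are only for illustration.

```julia
x = (1.0, 2.0, 3.0)
f = y -> abs2(float(y))

# Path described above for sum(f, x): a left fold with Base.add_sum.
via_fold = mapfoldl(f, Base.add_sum, x)

# What the specialized sum(::Tuple) method does: splat into a single + call.
via_splat = +(x...)

# The same splatting idea applied after mapping f over the tuple.
via_mapped_splat = +(f.(x)...)

println((via_fold, via_splat, via_mapped_splat))  # (14.0, 6.0, 14.0)
```

Both mapped variants give the same value here; the difference under discussion is only in how the work is compiled.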

cossio · 2018-12-20T17:58:37Z

What about defining:

sum(f, x::Tuple{Any, Vararg{Any}}) = +(f.(x)...)

cossio · 2018-12-20T20:05:20Z

This allocates for some reason. Any ideas?

julia> sum1(f, x::Tuple) = +(f.(x)...)
julia> @btime sum1(sin, (1.,2.,3.,4.,5.))
  562.674 ns (10 allocations: 288 bytes)
0.1761616497223787
julia> @btime sum(sin, (1.,2.,3.,4.,5.))
  56.020 ns (0 allocations: 0 bytes)
0.1761616497223787

stevengj · 2018-12-20T20:17:55Z

Probably sufficiently large tuples get heap-allocated?

cossio · 2018-12-20T20:24:37Z

> Probably sufficiently large tuples get heap-allocated?

julia> @btime sum1(sin, (1.,2.))
  20.082 ns (0 allocations: 0 bytes)
1.7507684116335782

julia> @btime sum1(sin, (1.,2.,3.))
  30.955 ns (0 allocations: 0 bytes)
1.8918884196934453

julia> @btime sum1(sin, (1.,2.,3.,4.))
  40.153 ns (0 allocations: 0 bytes)
1.1350859243855171

julia> @btime sum1(sin, (1.,2.,3.,4.,5.))
  474.749 ns (10 allocations: 288 bytes)
0.1761616497223787

I don't understand this.

stevengj · 2018-12-20T20:26:41Z

I don't know why the threshold is a 5-tuple, but suppose you have a tuple of 1000 elements — where would the result of f.(x) be stored? Clearly there must be some threshold beyond which the temporary tuple is heap-allocated, no?

cossio · 2018-12-20T20:47:56Z

I think there should be no temporary tuple at all. The operation can be done in-place.
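As a rough sketch of what "no temporary tuple" could look like, the accumulation can be written as an explicit loop that never builds f.(x). The name sum_loop is hypothetical, and nothing here is a claim about what the compiler does with it or how it benchmarks; it only shows the shape of the computation.

```julia
# Hypothetical helper: accumulate f over a non-empty tuple without
# materializing the mapped tuple f.(x).
function sum_loop(f::F, x::Tuple{Any, Vararg{Any}}) where {F}
    acc = f(x[1])                # start from the first mapped element
    for i in 2:length(x)
        acc += f(x[i])           # fold in the remaining elements left to right
    end
    return acc
end

t = (1.0, 2.0, 3.0, 4.0, 5.0)
println(sum_loop(sin, t) ≈ sum(sin, t))
```

For tuples with mixed element types the loop may not be type-stable, so this is an illustration rather than a proposed Base method.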

stevengj · 2018-12-20T20:50:54Z

It seems too optimistic to hope that the compiler should always be able to figure out that it can eliminate the tuple construction.

The following is based on the Base.afoldl code used in Base.+(a,b,c,d...), and seems to always be fast without allocating:

mapafoldl(F,op,a) = F(a)
mapafoldl(F,op,a,b) = op(F(a),F(b))
mapafoldl(F,op,a,b,c...) = op(op(F(a),F(b)), mapafoldl(F, op, c...))
function mapafoldl(F,op,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,qs...)
    y = op(op(op(op(op(op(op(op(op(op(op(op(op(op(op(F(a),F(b)),F(c)),F(d)),F(e)),F(f)),F(g)),F(h)),F(i)),F(j)),F(k)),F(l)),F(m)),F(n)),F(o)),F(p))
    for x in qs; y = op(y,F(x)); end
    y
end
mysum(f, x::Tuple) = mapafoldl(f, +, x...)

stevengj · 2018-12-20T20:53:22Z

(Indeed, it seems like we could use the above to define an efficient mapfoldl for tuples, which would give us a fast mapreduce, sum, etcetera.)

stevengj · 2018-12-20T21:20:56Z

(Note that the above is not quite right for a general mapfoldl because it is not left-associative. I'll put together a PR with a proper version.)
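To illustrate the associativity issue, here is a small sketch comparing the grouping used above with a strict left fold. The names leftfold, mapfoldl_tuple and pairwise are hypothetical, and this is not the code from the PR; it is just one way a left-associative recursion over a tuple might be written.

```julia
# Hypothetical strict left fold over splatted elements: op(op(op(acc, F(a)), F(b)), ...)
leftfold(F, op, acc) = acc
leftfold(F, op, acc, x, rest...) = leftfold(F, op, op(acc, F(x)), rest...)

# Hypothetical entry point for non-empty tuples.
mapfoldl_tuple(f::F, op::O, x::Tuple{Any, Vararg{Any}}) where {F, O} =
    leftfold(f, op, f(x[1]), Base.tail(x)...)

# The grouping from the comment above, copied here for comparison.
pairwise(F, op, a) = F(a)
pairwise(F, op, a, b) = op(F(a), F(b))
pairwise(F, op, a, b, c...) = op(op(F(a), F(b)), pairwise(F, op, c...))

left = mapfoldl_tuple(identity, -, (1, 2, 3, 4))  # ((1 - 2) - 3) - 4
grouped = pairwise(identity, -, 1, 2, 3, 4)       # (1 - 2) - (3 - 4)
println((left, grouped))                          # (-8, 0)
```

With a non-associative op such as subtraction the two groupings disagree, and only the left fold matches mapfoldl(identity, -, (1, 2, 3, 4)).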

Nosferican · 2018-12-26T19:04:26Z

sum(f, x::Tuple{Any, Vararg{Any}}) = +(f.(x)...)

should probably be

sum(f, x::Tuple{Any, Vararg{Any}}) = sum(f, x for x in x)

to avoid the unnecessary materialization.

nalimilan · 2018-12-26T19:05:57Z

I don't think it matters for tuples, no allocation happens anyway.

chethega · 2018-12-26T21:06:52Z

> This allocates for some reason. Any ideas?

You have an argument of type <:Function that does not appear in head position (that is, it is never called directly) in the body of the function definition. Therefore the method does not get specialized on it. Note that @code_native sum1(...) shows you fully specialized code rather than the code that actually runs.

julia> using BenchmarkTools
julia> sum1(f, x::Tuple) = +(f.(x)...)
julia> sum2(f::FF, x::Tuple) where FF = +(f.(x)...)

julia> @btime sum1(sin, (1.,2.,3.,4.,5.))
  597.669 ns (10 allocations: 288 bytes)
0.1761616497223787

julia> @btime sum2(sin, (1.,2.,3.,4.,5.))
  62.396 ns (0 allocations: 0 bytes)
0.1761616497223787

cossio · 2019-01-03T16:34:40Z

@chethega I don't understand the difference. I thought ::F where F was equivalent to ::Any.

KristofferC · 2019-01-03T17:56:10Z

It works the same w.r.t. dispatch, but it forces specialization on the argument.

cossio mentioned this issue Dec 20, 2018

fix hypot with more than two arguments #30301

Closed

stevengj added the performance Must go faster label Dec 20, 2018

cossio changed the title ~~sum(y -> f(g(x)), x) slow~~ sum(y -> f(g(x)), x) slow Dec 20, 2018

stevengj mentioned this issue Dec 20, 2018

faster mapfoldl for tuples #30471

Merged

stevengj changed the title ~~sum(y -> f(g(x)), x) slow~~ sum(function, tuple) is slow Dec 21, 2018

ararslan closed this as completed in #30471 Dec 29, 2018
