tmap(f, arr) is slower than tmap(f, OutputEltype, arr) or tmap!(f, output, arr) #131

Open
alfaromartino opened this issue Dec 2, 2024 · 4 comments

Comments

@alfaromartino

It seems that tmap is substantially slower than creating an output variable and updating it in place with tmap!. Is this expected behavior or a bug?

using OhMyThreads: tmap, tmap!   # imports needed for the examples below
using BenchmarkTools: @btime

function foo(x)
    output = similar(x)
    tmap!(log, output, x)
end

@btime map(log, $x)
@btime tmap(log, $x)
@btime foo($x)

on my machine gives

# x = rand(1_000_000)
  3.185 ms (2 allocations: 7.629 MiB)
  3.031 ms (568 allocations: 16.958 MiB)
  339.091 μs (152 allocations: 7.642 MiB)

# x = rand(10_000_000)
  47.702 ms (2 allocations: 76.294 MiB)
  53.858 ms (569 allocations: 222.665 MiB)
  4.796 ms (152 allocations: 76.307 MiB)
@MasonProtter (Member)

Yes, this is expected, although the gap appears to be quite a bit larger on your machine than on mine:

julia> let x = rand(1_000_000)
           @btime map(log, $x)
           @btime tmap(log, $x)
           @btime foo($x)
       end;
  4.012 ms (3 allocations: 7.63 MiB)
  1.809 ms (223 allocations: 28.47 MiB)
  683.822 μs (58 allocations: 7.63 MiB)

julia> let x = rand(10_000_000)
           @btime map(log, $x)
           @btime tmap(log, $x)
           @btime foo($x)
       end;
  45.237 ms (3 allocations: 76.29 MiB)
  27.756 ms (222 allocations: 354.39 MiB)
  9.755 ms (58 allocations: 76.30 MiB)

What's going on here is that tmap(f, x) doesn't know ahead of time what the element type of the output array will be, so instead it runs a separate map call on each task and then append!!s those per-task arrays together to build the final result. This means that for an f that takes very little time, like log, a large fraction of the runtime is spent just allocating those intermediate arrays.
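For intuition, here's a minimal sketch of that eltype-agnostic strategy (my own simplified model, not OhMyThreads' actual internals): each task runs a plain map over its chunk, and the per-task outputs are glued together with BangBang.append!!, which is where the extra allocations come from.

using BangBang: append!!   # append!! returns a widened array when eltypes differ

function tmap_sketch(f, x; ntasks = Threads.nthreads())
    chunk_len = cld(length(x), ntasks)
    tasks = map(Iterators.partition(eachindex(x), chunk_len)) do idxs
        Threads.@spawn map(f, view(x, idxs))   # each task allocates its own output array
    end
    return reduce(append!!, fetch.(tasks))     # the extra copies/allocations happen here
end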

The easiest way around this would be to just provide the OutputEltype argument to tmap:

julia> let x = rand(1_000_000)
           @btime tmap(log, Float64, $x)
       end;
  701.749 μs (59 allocations: 7.63 MiB)

julia> let x = rand(10_000_000)
           @btime tmap(log, Float64, $x)
       end;
  10.808 ms (59 allocations: 76.30 MiB)

which basically does the same thing as your foo (i.e. pre-allocates an array of the appropriate size and eltype and then populates it with tmap!).

This is mentioned in the docstring of tmap where we say:

The optional argument OutputElementType will select a specific element type for the returned container, and will generally incur fewer allocations than the version where OutputElementType is not specified.

but maybe this could have been more explicit.

@MasonProtter (Member)

That said, I do think there's room for us to make tmap(f, x) faster in situations like this by using type inference to check whether f(::eltype(x)) is known to produce a concretely typed result (a Float64 in this example). I'm going to keep this issue open for now, but change the title to something more useful.

@MasonProtter MasonProtter changed the title Bug or expected behavior of tmap? tmap(f, arr) is slower than tmap(f, OutputEltype, arr) or tmap!(f, output, arr) Dec 2, 2024
@MasonProtter (Member)

The idea is just that tmap(f, x) needs to check

inferred_output_eltype = Core.Compiler.return_type(f, Tuple{eltype(x)})
if isconcretetype(inferred_output_eltype)
    tmap(f, inferred_output_eltype, x)
else
    # do what we currently do
end

which is a pretty easy change. The main thing is that we have to go carefully through the internal call chain and make sure that we're not going to encounter any jumps in world age that could make the invocation of return_type invalid.
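In the meantime, the same idea can be applied at the call site with a small helper (hypothetical, not part of OhMyThreads) that uses the inferred eltype when it's concrete and otherwise falls back to the current behavior:

using OhMyThreads: tmap

# Hypothetical user-side helper: pass the inferred eltype to tmap when it's
# concrete, otherwise fall back to the eltype-agnostic tmap(f, x).
function tmap_inferred(f, x)
    T = Core.Compiler.return_type(f, Tuple{eltype(x)})
    return isconcretetype(T) ? tmap(f, T, x) : tmap(f, x)
end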

@alfaromartino (Author)

Got it. Many thanks!! Great package btw.
