Surprising performance for ^(::Float64, ::Int) vs ^(::Complex128, ::Int) #23804

Closed
saschatimme opened this issue Sep 21, 2017 · 4 comments

saschatimme commented Sep 21, 2017

With Julia 0.6 I get the following results:

julia> y = rand();
julia> @benchmark ^($y, $3)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     22.666 ns (0.00% GC)
  median time:      25.019 ns (0.00% GC)
  mean time:        27.440 ns (0.00% GC)
  maximum time:     665.613 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     995
julia> x = rand(Complex128);
julia> @benchmark ^($x, $3)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     7.788 ns (0.00% GC)
  median time:      8.363 ns (0.00% GC)
  mean time:        9.346 ns (0.00% GC)
  maximum time:     90.948 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999

So the complex case is faster than the real one! Going by #2741, I assume that the LLVM powi intrinsic isn't used by default, but it seems to be used in the complex case (I couldn't find the implementation :/), which is quite odd.
Update: Found the complex implementation.

KristofferC commented Sep 21, 2017

Some cross refs: #19872, #23751.

Also, note:

julia> @btime ^($y, 3);
  1.301 ns (0 allocations: 0 bytes)

julia> @btime ^($y, $3);
  47.925 ns (0 allocations: 0 bytes)
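The gap between the two timings above is consistent with how Julia treats literal integer exponents: since 0.6, `x^p` with a literal `p` is lowered to `Base.literal_pow(^, x, Val(p))`, whereas an interpolated `$3` most likely reaches the call as an ordinary runtime value. The following is a minimal sketch of that difference; the function names `cube_literal` and `pow_runtime` are made up for illustration.

```julia
# Literal exponent: the parser sees the 3, so this call is lowered to
# Base.literal_pow(^, x, Val(3)) and can be specialized on the value 3.
cube_literal(x) = x^3

# Runtime exponent: n is just an Int argument, so the generic
# ^(::Float64, ::Int) method is called instead.
pow_runtime(x, n) = x^n

y = 1.5
println(cube_literal(y))                   # literal path
println(Base.literal_pow(^, y, Val(3)))    # what the literal path lowers to
println(pow_runtime(y, 3))                 # runtime path
```

Both paths should agree numerically for small exponents like this one; only the amount of work the compiler can do ahead of time differs.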

saschatimme commented

The PR in #19890 says:

The powi intrinsic optimization over calling powf is that it is inaccurate.
We don't need that.

When it is equally accurate (e.g. tiny constant powers),
LLVM will already recognize and optimize any call to a function named powf,
and produce the same speedup.

So I assume the constant powers have to be known at compile time. We might still want the powi optimization for the cases where it is precise enough (which seems to be integer exponents between -1 and 43, according to a comment from @simonbyrne in #19872).
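To make the accuracy trade-off concrete, here is a rough sketch of integer powers via repeated squaring, the general kind of strategy a powi-style routine uses. It is not LLVM's or Julia's actual implementation; `pow_by_squaring` and `rel_error` are hypothetical helpers. Every multiplication rounds, so the error can grow with the exponent, which is why such a shortcut is only safe for a limited range of exponents.

```julia
# Integer power by repeated squaring (illustrative helper, not Base code).
# Uses O(log n) multiplications, each of which introduces a rounding error.
# Note: n == typemin(Int) is not handled, since -n overflows.
function pow_by_squaring(x::Float64, n::Int)
    n < 0 && return inv(pow_by_squaring(x, -n))
    result = 1.0
    base = x
    while n > 0
        isodd(n) && (result *= base)   # fold in the current bit of n
        base *= base                   # square for the next bit
        n >>= 1
    end
    return result
end

# Relative error against a BigFloat reference rounded back to Float64.
function rel_error(x::Float64, n::Int)
    ref = Float64(big(x)^n)
    return abs(pow_by_squaring(x, n) - ref) / abs(ref)
end

println(pow_by_squaring(2.0, 10))
println(rel_error(1.1, 40))
```

Running `rel_error` over a range of exponents gives a quick feel for where the shortcut stops matching a correctly rounded result.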

KristofferC commented Sep 21, 2017

For exponents of 2 and 3 this is done (by a lowering step):

julia> f(x) = x^3
f (generic function with 1 method)

julia> @code_llvm f(2.0)

define double @julia_f_61496(double) #0 !dbg !5 {
top:
  %1 = fmul double %0, %0
  %2 = fmul double %1, %0
  ret double %2
}

See also #20637.

laborg commented Feb 21, 2022

This has been fixed. See also #24240

laborg closed this as completed Feb 21, 2022