unexpected code generation for multivariate monomials #23118

Closed
hofmannmartin opened this issue Aug 3, 2017 · 2 comments
Labels
compiler:codegen Generation of LLVM IR and native code


@hofmannmartin

I wanted to write functions of the form x^m*y^n for integer m and n. For m and n less than 4, the generated code seems reasonable:

f1(x,y) = x^3*y^3
@code_llvm f1(1.0,2.0)

define double @julia_foo_60944(double, double) #0 !dbg !5 {
top:
%2 = fmul double %0, %0
%3 = fmul double %2, %0
%4 = fmul double %1, %1
%5 = fmul double %4, %1
%6 = fmul double %3, %5
ret double %6
}

As soon as either m or n is 4 or higher, function evaluation gets orders of magnitude slower and the generated code looks wrong to me, though I am no LLVM expert:

f2(x,y) = x^4*y^3
@code_llvm f2(1.0,2.0)

define double @julia_foo_60943(double, double) #0 !dbg !5 {
top:
%2 = call double @llvm.pow.f64(double %0, double 4.000000e+00)
%3 = fadd double %0, 4.000000e+00
%notlhs = fcmp ord double %2, 0.000000e+00
%notrhs = fcmp uno double %3, 0.000000e+00
%4 = or i1 %notrhs, %notlhs
br i1 %4, label %L12, label %if

if: ; preds = %top
call void @jl_throw(i8** inttoptr (i64 140367321804096 to i8**))
unreachable

L12: ; preds = %top
%5 = fmul double %1, %1
%6 = fmul double %5, %1
%7 = fmul double %6, %2
ret double %7
}

To me this looks like a bug, since I would expect f2 to produce code similar to f3:

f3(x,y) = x*x*x*x*y*y*y
@code_llvm f3(1.0,2.0)

define double @julia_foo_60945(double, double) #0 !dbg !5 {
top:
%2 = fmul double %0, %0
%3 = fmul double %2, %0
%4 = fmul double %3, %0
%5 = fmul double %4, %1
%6 = fmul double %5, %1
%7 = fmul double %6, %1
ret double %7
}
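If multiplication-only code is what you want for exponents of 4 and above, one option is to spell out the multiplications yourself. The following is a minimal sketch, assuming Julia 1.x; `pow_by_squaring` and `monomial` are hypothetical helpers written for illustration, not functions from Base. Note that the results can differ from `x^m` in the last bits.

```julia
# Hypothetical helper (not part of Base): integer power of a Float64
# using only multiplications (repeated squaring).
function pow_by_squaring(x::Float64, n::Int)
    n >= 0 || throw(DomainError(n, "exponent must be non-negative"))
    result = 1.0
    base = x
    while n > 0
        if isodd(n)
            result *= base      # fold the current power of x into the result
        end
        n >>= 1
        if n > 0
            base *= base        # square only while exponent bits remain
        end
    end
    return result
end

# Hypothetical helper: the monomial x^m * y^n built from multiplications only.
monomial(x::Float64, y::Float64, m::Int, n::Int) =
    pow_by_squaring(x, m) * pow_by_squaring(y, n)

println(monomial(2.0, 3.0, 4, 3))
```

This trades the accuracy of the `pow` call for speed, so it is only appropriate when a slightly larger rounding error is acceptable.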

This behaviour has been produced using the following setup:

Julia Version 0.6.0
Commit 903644385b (2017-06-19 13:05 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, skylake)
@kshyatt kshyatt added the compiler:codegen Generation of LLVM IR and native code label Aug 3, 2017
@StefanKarpinski

StefanKarpinski commented Aug 3, 2017

x-ref: #20637. A couple of observations: writing pow(x, 4) in C does not result in three multiplications; it results in a call to the libm pow function, just like in Julia. The expression x*x*x*x is numerically distinguishable from, and less accurate than, calling pow(x, 4):

julia> f1(x) = x^4
f1 (generic function with 1 method)

julia> f2(x) = x*x*x*x
f2 (generic function with 1 method)

julia> f1(0.1)
0.00010000000000000002

julia> f2(0.1)
0.00010000000000000003

Note that this is also distinct from (x*x)*(x*x) since floating-point is not associative:

julia> f3(x) = (x*x)*(x*x)
f3 (generic function with 1 method)

julia> f3(0.1)
0.00010000000000000005

This last f3 version is even more efficient (only two multiplies), but even less accurate than f2:

; Function f2
; Location: REPL[71]
define double @julia_f2_63066(double) #0 {
top:
; Location: REPL[71]:1
  %1 = fmul double %0, %0
  %2 = fmul double %1, %0
  %3 = fmul double %2, %0
  ret double %3
}

; Function f3
; Location: REPL[79]
define double @julia_f3_63070(double) #0 {
top:
; Location: REPL[79]:1
  %1 = fmul double %0, %0
  %2 = fmul double %1, %1
  ret double %2
}
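The accuracy differences above can be quantified by comparing each variant against a higher-precision reference. The sketch below assumes Julia 1.x and uses `BigFloat` as the reference; the names `reference`, `pow_literal`, `pow_chain`, `pow_square`, and `ulp_error` are made up for this illustration. How `x^4` is lowered depends on the Julia version, so the figure printed for `pow_literal` may vary between releases.

```julia
# Reference value computed in extended precision (BigFloat).
reference(x::Float64) = big(x)^4

pow_literal(x) = x^4            # lowering depends on the Julia version
pow_chain(x)   = x*x*x*x        # three multiplies, evaluated left to right
pow_square(x)  = (x*x)*(x*x)    # two multiplies

# Absolute error of f(x) against the reference, in units in the last place.
function ulp_error(f, x::Float64)
    ref = reference(x)
    return Float64(abs(big(f(x)) - ref) / eps(Float64(ref)))
end

for f in (pow_literal, pow_chain, pow_square)
    println(rpad(string(f), 12), " f(0.1) = ", f(0.1),
            "  error ≈ ", ulp_error(f, 0.1), " ulp")
end
```

For inputs such as 2.0, where every intermediate product is exactly representable, all three variants agree; the differences only show up when intermediate results have to be rounded.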

@StefanKarpinski

I'm going to close this as "not a bug" since this behavior is correct and intentional, and leave #20637 as the issue that tracks making literal power expressions generate more efficient (and ideally no less accurate) code.
