Use powi from x^p #43

tkf · 2019-01-06T00:20:36Z

There was a problem in the dispatch of ^: Vec{4,Float64}(1)^2 was evaluated as Vec{4,Float64}(1)^Vec{4,Float64}(2), so the integer exponent was promoted to a floating-point vector. This PR fixes it. (Originally found in #41)

Before this PR (uses float pow):

julia> @code_llvm Vec{4,Float64}(1)^2

;  @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:1059 within `^'
define void @"julia_^_12563"({ <4 x double> }* noalias nocapture sret, { <4 x double> } addrspace(11)* nocapture nonnull readonly dereferenceable(32), i64) {
top:
; ┌ @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:95 within `Type'
; │┌ @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:97 within `macro expansion'
; ││┌ @ float.jl:60 within `Type'
     %3 = sitofp i64 %2 to double
; ││└
    %4 = insertelement <4 x double> undef, double %3, i32 0
    %5 = shufflevector <4 x double> %4, <4 x double> undef, <4 x i32> zeroinitializer
; └└
;  @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:1059 within `^' @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:985
; ┌ @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:538 within `llvmwrap' @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:538
; │┌ @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:557 within `macro expansion'
; ││┌ @ sysimg.jl:18 within `getproperty'
     %6 = getelementptr inbounds { <4 x double> }, { <4 x double> } addrspace(11)* %1, i64 0, i32 0
; ││└
    %7 = load <4 x double>, <4 x double> addrspace(11)* %6, align 16
    %res.i = call <4 x double> @llvm.pow.v4f64(<4 x double> %7, <4 x double> %5)
; └└
;  @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:1059 within `^'
  %.sroa.0.0..sroa_idx = getelementptr inbounds { <4 x double> }, { <4 x double> }* %0, i64 0, i32 0
  store <4 x double> %res.i, <4 x double>* %.sroa.0.0..sroa_idx, align 32
  ret void
}

After this PR (uses int powi):

julia> @code_llvm Vec{4,Float64}(1)^2

;  @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:1018 within `^'
define void @"julia_^_13198"({ <4 x double> }* noalias nocapture sret, { <4 x double> } addrspace(11)* nocapture nonnull readonly dereferenceable(32), i64) {
top:
; ┌ @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:566 within `llvmwrap' @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:566
; │┌ @ /home/takafumi/.julia/dev/SIMD/src/SIMD.jl:584 within `macro expansion'
; ││┌ @ sysimg.jl:18 within `getproperty'
     %3 = getelementptr inbounds { <4 x double> }, { <4 x double> } addrspace(11)* %1, i64 0, i32 0
; ││└
    %4 = load <4 x double>, <4 x double> addrspace(11)* %3, align 16
    %res.i = call <4 x double> @llvm.powi.v4f64(<4 x double> %4, i64 %2)
; └└
  %.sroa.0.0..sroa_idx = getelementptr inbounds { <4 x double> }, { <4 x double> }* %0, i64 0, i32 0
  store <4 x double> %res.i, <4 x double>* %.sroa.0.0..sroa_idx, align 32
  ret void
}
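
The dispatch issue can be illustrated without SIMD.jl. The following is a minimal sketch, assuming a toy `ToyVec` type and a `pow_path` function (both hypothetical names, not part of the package) that only reports which method gets selected. It is not the code in this PR, just the method-specificity idea behind it:

```julia
# ToyVec and pow_path are hypothetical stand-ins, not SIMD.jl API.
struct ToyVec{N,T}
    data::NTuple{N,T}
end

# Broadcast a scalar into every lane.
ToyVec{N,T}(x::Number) where {N,T} = ToyVec{N,T}(ntuple(_ -> T(x), N))

# Generic fallback: any scalar exponent would be promoted to a vector
# (this corresponds to the llvm.pow path shown in the "before" IR).
pow_path(::ToyVec{N,T}, ::Number) where {N,T} = :vector_pow

# More specific method: an integer exponent on a floating-point vector
# stays a scalar (this corresponds to the llvm.powi path in the "after" IR).
pow_path(::ToyVec{N,T}, ::Integer) where {N,T<:AbstractFloat} = :scalar_powi

v = ToyVec{4,Float64}(1)
println(pow_path(v, 2))    # integer exponent selects the specific method
println(pow_path(v, 2.0))  # float exponent still uses the fallback
```

Without the second method, every call would land on the generic fallback, which is the behaviour described at the top of this PR.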

codecov-io · 2019-01-06T00:56:34Z

Codecov Report

Merging #43 into master will increase coverage by 0.15%.
The diff coverage is 89.47%.

@@            Coverage Diff             @@
##           master      #43      +/-   ##
==========================================
+ Coverage   82.96%   83.12%   +0.15%     
==========================================
  Files           1        1              
  Lines         763      782      +19     
==========================================
+ Hits          633      650      +17     
- Misses        130      132       +2

Impacted Files    Coverage Δ
src/SIMD.jl       83.12% <89.47%> (+0.15%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0e4d17c...e2fec95.

tkf · 2019-01-06T01:32:10Z

test/runtests.jl

@@ -235,7 +235,7 @@ using Test, InteractiveUtils
                ==, !=, <, <=, >, >=,
                +, -, *, /, ^, copysign, flipsign, max, min, rem)
            @test op(42, V4F64(v4f64)) === op(V4F64(42), V4F64(v4f64))
-            @test op(V4F64(v4f64), 42) === op(V4F64(v4f64), V4F64(42))


It looks like x^42 was too large for 32-bit machines? https://ci.appveyor.com/project/eschnett/simd-jl/builds/21409891

This test fails for 64-bit machines and succeeds for 32-bit machines.

Oops, I misread the table. But still, 4^42 (note: v4f64[end] == -4) is much larger than maxintfloat() so I suppose decreasing the exponent makes sense?

But actually, I don't understand why it works on 64-bit Linux and not on 64-bit Windows. Is it something we should worry about? Is there any explanation? Maybe Julia/LLVM emits different machine code just because the builds run on different machines (the Travis log says ivybridge and the Appveyor log says haswell)?

eschnett

The expression 4^42 is not evaluated. The base is Float64, so the expression is 4.0^42, which does not overflow at all.

I don't know why this would fail on 64-bit Windows.
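
The difference between integer and floating-point evaluation mentioned above can be checked directly. This is a small sketch using only Base functions; the variable names are illustrative:

```julia
# Int64 arithmetic wraps around: 4^42 == 2^84, whose low 64 bits are all zero.
int_result = Int64(4)^42

# With a Float64 base the result is 2.0^84, far below floatmax(Float64).
float_result = 4.0^42

# maxintfloat is the largest consecutive integer Float64 can represent (2.0^53);
# exceeding it loses integer precision but does not overflow.
limit = maxintfloat(Float64)

println(int_result)                # wrapped integer result
println(isfinite(float_result))    # true: no floating-point overflow
println(float_result > limit)      # true: larger than maxintfloat
```

So the Float64 result is finite and, being a power of two, exactly representable, even though it exceeds `maxintfloat()`.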

eschnett · 2019-01-06T03:21:27Z

test/runtests.jl

@@ -235,7 +235,7 @@ using Test, InteractiveUtils
                ==, !=, <, <=, >, >=,
                +, -, *, /, ^, copysign, flipsign, max, min, rem)


The operator ^ should not be in this test, since this test tests type promotion, and type promotion should not happen for ^. There should be a separate test for ^ testing integer exponents. The original test can then remain unchanged.

The new test for ^ could compare the result with a result obtained by repeated multiplication.
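
A test along those lines might look like the sketch below. It uses plain Float64 values so that it runs without SIMD.jl; `pow_by_mul` is a hypothetical helper, and the actual test added in this PR may be structured differently:

```julia
using Test

# Hypothetical reference implementation: x^p for p >= 0 by repeated multiplication.
function pow_by_mul(x, p::Integer)
    acc = one(x)
    for _ in 1:p
        acc *= x
    end
    return acc
end

@testset "integer exponents" begin
    xs = (1.0, 2.0, 3.0, -4.0)
    for p in 0:5
        # Compare with ≈ because powi and repeated multiplication
        # are not guaranteed to round identically.
        @test all(map(x -> x^p ≈ pow_by_mul(x, p), xs))
    end
end
```

For a `Vec`, the same comparison would be applied lane by lane.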

eschnett · 2019-01-06T03:23:17Z

src/SIMD.jl

+# `^(::ScalarTypes, v2::Vec)`.
+@inline Base.:^(v1::Vec{N,T}, x2::IntegerTypes) where {N,T<:FloatingTypes} =
+    llvmwrap(Val{:powi}, v1, Int(x2))
+@inline Base.:^(v1::Vec{N,T}, x2::Integer) where {N,T<:FloatingTypes} =


Why are you using Base.:^ instead of Base. ^ here?

I just thought it's better since Base.:^ is more explicit. FYI, it looks like the Julia code base prefers Base.:^:

$ git grep 'Base\. ^'
$ git grep 'Base\.:^'
base/compiler/ssair/show.jl:^(s::String, i::Int) = Base.:^(s, i)
base/mathconstants.jl: Base.:^(::Irrational{:ℯ}, x::T) = exp(x)
doc/src/base/math.md:Base.:^(::Number, ::Number)
doc/src/base/strings.md:Base.:^(::AbstractString, ::Integer)
stdlib/LinearAlgebra/docs/src/index.md:Base.:^(::AbstractMatrix, ::Number)
stdlib/LinearAlgebra/docs/src/index.md:Base.:^(::Number, ::AbstractMatrix)
stdlib/LinearAlgebra/src/dense.jl:Base.:^(b::Number, A::AbstractMatrix) = exp!(log(b)*A)
stdlib/LinearAlgebra/src/dense.jl:Base.:^(::Irrational{:ℯ}, A::AbstractMatrix) = exp(A)
test/math.jl:Base.:^(x::Number, y::Float22716) = x^(y.x)

But I don't mind using Base. ^. Let me know if I need to switch to it.

I haven't seen this syntax before. If it works then it's fine. I'm mostly worried about consistency of style in the source code; people shouldn't wonder why there is ^ in one place and :^ in others. If you prefer :^, then I'd prefer a pull request that changes this everywhere.

It looks like this is the only place Base.:^ is used but there are other similar cases like Base. %; I replaced them all.
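
For readers unfamiliar with the `Base.:op` form: it is the quoted-symbol way of naming an operator inside a module. A minimal sketch, assuming a toy `Wrapped` type (hypothetical, not part of SIMD.jl):

```julia
# Wrapped is a hypothetical type used only to demonstrate the syntax.
struct Wrapped
    value::Float64
end

# Extend Base operators using the quoted-symbol form.
Base.:+(a::Wrapped, b::Wrapped) = Wrapped(a.value + b.value)
Base.:^(a::Wrapped, p::Integer) = Wrapped(a.value^p)

# Base.:^ names the same function object as the bare operator.
println(Base.:^ === ^)
println(getfield(Base, :^) === ^)

println((Wrapped(1.0) + Wrapped(2.0)).value)
println((Wrapped(2.0)^3).value)
```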

See: eschnett#43 (comment)

tkf · 2019-01-06T21:06:45Z

Thanks for the review. I updated the test (and also rebased).

eschnett · 2019-01-06T21:49:47Z

test/runtests.jl

+        # Make sure our dispatching rule does not select floating point `pow`.
+        # See: https://github.com/eschnett/SIMD.jl/pull/43
+        ir = llvm_ir(^, (V4F64(v4f64), 2))
+        @test occursin("@llvm.powi.v4f64", ir)
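
The definition of `llvm_ir` is not shown in this hunk. One way such a helper could be written is sketched below, assuming it simply captures the output of `code_llvm` as a string; `ir_of` and `mul` are hypothetical names, and the helper in the PR may differ:

```julia
using InteractiveUtils  # provides code_llvm

# Hypothetical helper: return the LLVM IR of f applied to args as a String.
ir_of(f, args) = sprint(io -> code_llvm(io, f, Base.typesof(args...)))

# Demonstrate on a plain scalar function so the sketch runs without SIMD.jl.
mul(x, y) = x * y
ir = ir_of(mul, (2.0, 3.0))

# A Float64 multiplication is emitted as an `fmul` instruction.
println(occursin("fmul", ir))
```

Checking for an instruction or intrinsic name in the IR ties the test to code generation rather than to numerical results, which is what the comment in the hunk is after.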


tkf · 2019-01-06T22:20:51Z

src/SIMD.jl

+        Base.:~(b::$Boolsz) = $Boolsz(~b.int)
+        Base.:!(b::$Boolsz) = ~b
+        Base.:&(b1::$Boolsz, b2::$Boolsz) = $Boolsz(b1.int & b2.int)
+        Base.:|(b1::$Boolsz, b2::$Boolsz) = $Boolsz(b1.int | b2.int)


These are inside the comment so I wasn't sure if it's better to change them or leave them as-is. Let me know if I need to revert them (I'll remove this from the patch and then force-push).

vchuravy · 2019-01-17T15:56:31Z

src/SIMD.jl

@@ -560,6 +560,33 @@ end
    end
 end

+# Functions taking two arguments, second argument is a scalar
+@generated function llvmwrap(::Type{Val{Op}}, v1::Vec{N,T1},


You should be able to just use ccall("@llvm.powi.v4f64", llvmcall, RT, (AT...,), args...)

Is it specific to @llvm.powi? Can all other llvmwrap methods be implemented with ccall? Maybe only if there is only one instruction other than declare?
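
For reference, the `ccall` form of calling an LLVM intrinsic looks roughly like the sketch below. It uses the scalar `llvm.sqrt.f64` intrinsic as a stand-in, on the assumption that it behaves the same way as the vector `powi` case discussed here; `llvm_sqrt` is a hypothetical name. Note that the intrinsic name is given without the leading `@` that appears in printed IR:

```julia
# Hypothetical wrapper around a scalar LLVM intrinsic via the llvmcall
# calling convention. A vector intrinsic such as powi would need
# NTuple{N,VecElement{T}} argument types instead of Float64.
llvm_sqrt(x::Float64) = ccall("llvm.sqrt.f64", llvmcall, Float64, (Float64,), x)

println(llvm_sqrt(9.0))
```

Whether every existing `llvmwrap` method can be expressed this way is the open question raised above; this only shows the call shape for a single intrinsic.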

tkf added 3 commits January 6, 2019 12:36

Use powi from x^p

6160a5e

Remove ^ from promotion rule test

8c9d844

See: eschnett#43 (comment)

More tests for ^

f970f28

tkf force-pushed the powi branch from 646810e to f970f28 Compare January 6, 2019 20:47

Replace: "Base. op" -> "Base.:op"

e2fec95

eschnett merged commit 89ca5b4 into eschnett:master Jan 7, 2019
