Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2x performance regression when comparing types #11425

Closed
dhoegh opened this issue May 24, 2015 · 23 comments
Closed

2x performance regression when comparing types #11425

dhoegh opened this issue May 24, 2015 · 23 comments
Assignees
Labels
performance Must go faster regression Regression in behavior compared to a previous version

Comments

@dhoegh
Copy link
Contributor

dhoegh commented May 24, 2015

Between 8fc5b4e and 195bd01 a 2x performance regression happened on this test case:

function test_regression(n,T)
    for i = 1:n
        T != Vector{UInt16}
    end
end
@time test_regression(10000000, Array{Float64,1})

Output from 195bd01:

julia> @time test_regression(10000000, Array{Float64,1})
elapsed time: 0.591076726 seconds (80 bytes allocated)

Output from 8fc5b4e:

julia> @time test_regresion(10000000, Array{Float64,1})
elapsed time: 0.281341196 seconds (160 bytes allocated)

The test case are a reduction from this bench test: https://gist.github.com/ScottPJones/f3fb082ac30d337d91bc, discussed in #11004 and https://groups.google.com/forum/m/?fromgroups#!topic/julia-dev/eI3x5Een1uI.

I think this is the single reason for the regression observed by @ScottPJones.

@nalimilan
Copy link
Member

AFAICT @ScottPJones's version is different from yours as it gets specialized on T, so it should be much faster:

function check_string{T<:Union(Vector{UInt16},Vector{UInt32},Vector{Char},AbstractString)}(str::T, pos::Int, len::Int, options::Integer=0)

@dhoegh
Copy link
Contributor Author

dhoegh commented May 24, 2015

But if you profile @ScottPJones function you will see the profiler takes large hits on all lines where the type comparison occurs.

@dhoegh
Copy link
Contributor Author

dhoegh commented May 24, 2015

And if you execute his bench on commit 8fc5b4e there are no significant regression compared to 0.3.8

@nalimilan
Copy link
Member

Sure, I'm not saying it doesn't exist -- just that your example might be a bit different from his.

@dhoegh
Copy link
Contributor Author

dhoegh commented May 24, 2015

If the method gets specialized on T as:

 function test_regresion{T}(n,a::T)
     for i = 1:n
         T != Vector{Uint16}
     end
 end

The performance difference gets even worse.
For 8fc5b4e:

julia> @time test_regresion(10000000, [1.])
elapsed time: 0.051467362 seconds (240 bytes allocated)

for 195bd01:

julia> @time test_regresion(10000000, [1.])
elapsed time: 0.462643854 seconds (160 bytes allocated)

@nalimilan
Copy link
Member

Ah, I told you it was different! :-p

Looks like the equality test is no longer inlined. EDIT: silly me, it's just the output which improved in 0.4. The call is there on both versions.

On 0.4 master

@code_native test_regression(10000000, [1.])
    .text
Filename: none
Source line: 3
    pushq   %rbp
    movq    %rsp, %rbp
    pushq   %r14
    pushq   %rbx
    movq    %rdi, %rbx
    testq   %rbx, %rbx
    jle L61
Source line: 3
    movabsq $"==", %r14
L29:    movabsq $139878791540752, %rdi  # imm = 0x7F3811AE7010
    movabsq $139878791788240, %rsi  # imm = 0x7F3811B236D0
    callq   *%r14
    decq    %rbx
    jne L29
L61:    popq    %rbx
    popq    %r14
    popq    %rbp
    ret

On 0.3.8:

@code_native test_regression(10000000, [1.])
    .text
Filename: none
Source line: 0
    push    rbp
    mov rbp, rsp
    push    r14
    push    rbx
    mov rbx, rdi
Source line: 2
    test    rbx, rbx
    jle 35
Source line: 3
    movabs  r14, 140453940809840
    nop dword ptr [rax]
    mov edi, 23434032
    mov esi, 37650384
    call    r14
    dec rbx
    jne -18
    pop rbx
    pop r14
    pop rbp
    ret

@JeffBezanson JeffBezanson added performance Must go faster regression Regression in behavior compared to a previous version labels May 24, 2015
@JeffBezanson
Copy link
Member

Fixed pretty handily by the following stylish patch:

diff --git a/base/operators.jl b/base/operators.jl
index fc3e8ef..edb218f 100644
--- a/base/operators.jl
+++ b/base/operators.jl
@@ -19,7 +19,8 @@ isless(x::FloatingPoint, y::FloatingPoint) = (!isnan(x) & isna
 isless(x::Real,          y::FloatingPoint) = (!isnan(x) & isnan(y)) | (signbit(
 isless(x::FloatingPoint, y::Real         ) = (!isnan(x) & isnan(y)) | (signbit(

-==(T::Type, S::Type) = typeseq(T, S)
+=={T}(::Type{T}, ::Type{T}) = true
+==(T::Type, S::Type)        = false

@JeffBezanson
Copy link
Member

My theory about why this regression happened is that #10380 removed the extra definition ==(T::(Type...), S::(Type...)), which led to less specialization of the == function on types.

@Jutho
Copy link
Contributor

Jutho commented Jun 28, 2015

The fix in 66fb53a has some strange side effects. Consider the code here (the reason for this code will be a different issue): https://gist.github.com/Jutho/373f2c65ec129790426e

The three if's on line 39, 42 and 48 should not happen, except maybe when Union types are involved. With the new line from 66fb53a commented out, running analyzeconvert(); just prints one output, indeed involving a union type. But on current master, with the diagonal dispatch =={T}(::Type{T},::Type{T}) = true of 66fb53a enabled, a whole lot of warnings are printed and (tuple) types are considered equal which are not equal at all.

@JeffBezanson
Copy link
Member

Could you extract an example of two types where == gives the wrong answer?

@yuyichao
Copy link
Contributor

@JeffBezanson

Maybe this?

julia> dict = Dict{DataType, Vector{Any}}()
dDict{DataType,Array{Any,1}} with 0 entries

julia> dict[Tuple{TypeVar(:T, Signed, true), Base.GMP.BigInt}] = [1]
1-element Array{Int64,1}:
 1

julia> dict
Dict{DataType,Array{Any,1}} with 1 entry:
  Tuple{T<:Signed,Base.GMP => Any[1]

julia> dict[Tuple{Int64, Rational}]
1-element Array{Any,1}:
 1

@yuyichao
Copy link
Contributor

Or even shorter

julia> Tuple{Int64, Rational} == Tuple{TypeVar(:T, Signed, true), Base.GMP.BigInt}

true

julia> @which Tuple{Int64, Rational} == Tuple{TypeVar(:T, Signed, true), Base.GMP.BigInt}
ERROR: no unique matching method for the specified argument types

@JeffBezanson
Copy link
Member

Ok, that's pretty weird. But you are really not supposed to construct TypeVars directly.

@yuyichao
Copy link
Contributor

But you are really not supposed to construct TypeVars directly.

I agree. In his case, the (bounded) TypeVar is from the signature of the method (which happens pretty often when people trying to figure out about julia internals and I certainly have done that). Unbound TypeVar doesn't trigger this issue directly.

@Jutho
Copy link
Contributor

Jutho commented Jun 30, 2015

So the problem is only with the bound TypeVars that I extract from the functions? I will try to change them to be unbound when processing the function's argument types.

@yuyichao
Copy link
Contributor

Well, changing them to unbound changes the meaning of the function signatures....

@JeffBezanson
Copy link
Member

I'd like to try to fix this properly, since == is clearly doing something wrong. However we must keep priorities in mind.

@Jutho
Copy link
Contributor

Jutho commented Jun 30, 2015

There is certainly more at play and is not restricted to issues with TypeVar. At a fresh Julia REPL, the following behavior is correct:

a = Tuple{Int32,Unsigned}
b = Tuple{Rational,Rational}
a == b  # => false

However, at a fresh Julia REPL, copy pasting the code analyzeconvert from my gist above, running it once, and then repeating the same statements, I now get:

function analyzeconvert()
# copy from https://gist.github.com/Jutho/373f2c65ec129790426e
end
types,table,dict = analyzeconvert()
a = Tuple{Int32,Unsigned}
b = Tuple{Rational,Rational}
a == b  # => true

@yuyichao
Copy link
Contributor

I think it is just a combination of the issue compareing bound TypeVar and #265

@yuyichao
Copy link
Contributor

As shown in my comment above #11425 (comment), the dispatch system is obviously confused and the method table is probably not in a better situation.

I agree with @JeffBezanson though that this by itself is not of high priority (unless it is causing other issues...)

@Jutho
Copy link
Contributor

Jutho commented Jun 30, 2015

Ok thanks, I have been bitten by not recognizing #265 before.

@yuyichao
Copy link
Contributor

Actually it might be slightly different since #265 is more about directly linking the function and this one is probably about bad specialization in the method cache.....

@tkelman
Copy link
Contributor

tkelman commented Jul 2, 2015

Another thing I noticed, changed by the fix here: compare the timing of test/misc.jl at 769df42 of 24.32 seconds to right beforehand at 96cc970 of 1.07 seconds

tkelman pushed a commit that referenced this issue Jun 17, 2016
ref #11425

(cherry picked from commit e529058)
one piece from #15575, modified to not use @_pure_meta for release-0.4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster regression Regression in behavior compared to a previous version
Projects
None yet
Development

No branches or pull requests

6 participants