slowdown of atomics on 0.7 / fix llvmcall #27694

Closed
chethega opened this issue Jun 20, 2018 · 6 comments · Fixed by #28172
Labels: performance (Must go faster), regression (Regression in behavior compared to a previous version)

@chethega
Contributor

Somehow all atomics became really slow on 0.7, and they now allocate. Example:

julia> using Base.Threads
julia> using BenchmarkTools
julia> a = Atomic{Int64}(0);

julia> @btime atomic_add!($a,$1);
  9.177 ns (0 allocations: 0 bytes)
julia> versioninfo()
Julia Version 0.6.3

julia> @btime atomic_add!($a,$1);
  129.263 ns (1 allocation: 16 bytes)
julia> versioninfo()
Julia Version 0.7.0-alpha.136

The difference is very visible in the @code_llvm output: instead of the hand-written llvmcall IR, we get some weird giant blob of compiler output.
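For anyone who wants to check this without BenchmarkTools, here is a minimal sketch that counts the bytes allocated by a single atomic_add! call. The helper name count_allocs is hypothetical (not part of Base), and the call is wrapped in a function so that the measurement is not affected by global-scope dispatch. A nonzero result on an affected build would be consistent with the regression reported above.

```julia
using Base.Threads

# Hypothetical helper: returns the number of bytes allocated by one
# atomic_add! call on `x`, after a warm-up call to force compilation.
function count_allocs(x::Atomic{Int64})
    atomic_add!(x, 1)                    # warm up (compiles the method)
    return @allocated atomic_add!(x, 1)  # measure the second call only
end

a = Atomic{Int64}(0)
n = count_allocs(a)
println("bytes allocated by atomic_add!: ", n)
```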

@vtjnash
Member

vtjnash commented Jun 20, 2018

Looks like bad lowering (the call expression isn't getting linearized). The llvmcall "function" is one of the last special-case syntaxes that doesn't really behave properly. Similar to #26297.
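To illustrate what "linearized" means here, a small sketch using an ordinary nested call (the function name nested is made up for illustration). In the lowered form, the inner call is normally hoisted into its own statement and the outer call refers to that result, rather than containing another call expression as an argument. This shows only the general idea, not the specific llvmcall lowering path.

```julia
using InteractiveUtils  # provides @code_lowered outside the REPL

# Illustrative function with a nested call expression.
nested(x) = sin(cos(x))

# The lowered code lists cos(x) as a separate statement,
# and sin is then applied to that statement's result.
lowered = @code_lowered nested(1.0)
println(lowered)
```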

JeffBezanson added this to the 1.0.x milestone on Jun 29, 2018
JeffBezanson changed the title from "slowdown of atomics on 0.7" to "slowdown of atomics on 0.7 / fix llvmcall" on Jun 29, 2018
@JeffBezanson
Member

Adding to 1.0.x; a couple of times I've run into cases where poking the compiler a bit causes it to blow up on llvmcall due to its unsettled IR form.

@chethega
Contributor Author

chethega commented Jun 29, 2018

The following reproduces the bug and shows a workaround:

julia> using Base: llvmcall
julia> mutable struct foo
       val::Int64
       end
julia> function getval_1(x::foo)
       return llvmcall("%z = inttoptr i64 %0 to i64*\n %zz = load i64, i64* %z \n ret i64 %zz", Int64, Tuple{Ptr{Nothing}}, pointer_from_objref(x))
       end
julia> function getval_2(x::foo)
       ptr = pointer_from_objref(x)  # workaround: bind the argument to a local before the llvmcall
       return llvmcall("%z = inttoptr i64 %0 to i64*\n %zz = load i64, i64* %z \n ret i64 %zz", Int64, Tuple{Ptr{Nothing}}, ptr)
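       end

julia> a = foo(0);  # assumed definition: the transcript never shows `a`; any foo instance works here
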
julia> @code_llvm getval_2(a)
; Function getval_2
; Location: REPL[49]:2
define i64 @julia_getval_2_33505(%jl_value_t addrspace(10)* nonnull dereferenceable(8)) {
top:
; Function pointer_from_objref; {
; Location: pointer.jl:143
  %1 = addrspacecast %jl_value_t addrspace(10)* %0 to %jl_value_t addrspace(11)*
  %2 = addrspacecast %jl_value_t addrspace(11)* %1 to %jl_value_t*
;}
; Location: REPL[49]:3
  %z.i = bitcast %jl_value_t* %2 to i64*
  %zz.i = load i64, i64* %z.i, align 8
  ret i64 %zz.i
}


julia> @code_llvm getval_1(a)
#[truncated garbage]

julia> versioninfo()
Julia Version 0.7.0-beta.58
Commit 8e7e6fc00a (2018-06-27 18:34 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-5###U CPU @ 2.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, broadwell)
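For reference, the IR string in both functions just reinterprets the object address as an i64 pointer and loads from it. A plain-Julia sketch of the same read is below. It is meant only as a cross-check of what getval_1 and getval_2 should return, not as a replacement for the llvmcall. The names Foo and getval_ref are made up for this sketch, and it assumes that val, as the only field, sits at offset 0 of the object.

```julia
# Stand-in for the `foo` type above, under a different name to avoid clashes.
mutable struct Foo
    val::Int64
end

# Reads the first 8 bytes of the object through a raw pointer,
# which is what the inttoptr + load IR does.
function getval_ref(x::Foo)
    GC.@preserve x begin                          # keep x alive while the raw pointer is in use
        ptr = Ptr{Int64}(pointer_from_objref(x))  # reinterpret the object address
        return unsafe_load(ptr)                   # load the Int64 at offset 0
    end
end

obj = Foo(42)
println(getval_ref(obj) == obj.val)
```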

@mohamed82008
Contributor

mohamed82008 commented Jul 11, 2018

Will/can this be addressed any time soon?

KristofferC added the performance and regression labels on Jul 11, 2018
@mohamed82008
Contributor

Also, the same slowdown shows up with SpinLock:

julia> VERSION
v"0.6.4"

julia> using BenchmarkTools

julia> l = Threads.SpinLock()
Base.Threads.TatasLock(Base.Threads.Atomic{Int64}(0))

julia> f(l) = (Threads.lock(l); Threads.unlock(l))
f (generic function with 1 method)

julia> @btime f($l)
  9.116 ns (0 allocations: 0 bytes)

julia> VERSION
v"0.7.0-beta.212"

julia> using BenchmarkTools

julia> l = Threads.SpinLock()
Base.Threads.TatasLock(Base.Threads.Atomic{Int64}(0))

julia> f(l) = (Threads.lock(l); Threads.unlock(l))
f (generic function with 1 method)

julia> @btime f($l)
  192.185 ns (3 allocations: 48 bytes)

@JeffBezanson
Member

Adding to 0.7 since this blocks SIMD.jl.
