@test / @testset can have a significant overhead on 0.5 compared to 0.4 #18077
This seems to be due to
|
Accepted as a "real issue" that should be addressed somehow. Not sure if we can target 0.5.x or if this needs to be fixed on 0.6, but optimistically tagging as 0.5.x. |
We do need to see if we can either fix this in 0.5.x or at least provide a reasonable recipe to work around it. |
FWIW, putting the expression from the |
Does this have anything to do with it, and/or is it known, or am I doing something silly?
using Base.Test
let
@noinline child1() = return nothing
parent1() = child1()
println(code_typed(parent1, ()))
code_llvm(parent1, ())
end
@testset "foobar" begin
@noinline child2() = return nothing
parent2() = child2()
println(code_typed(parent2, ()))
code_llvm(parent2, ())
end
|
So I reduced the above to:
let
@noinline child1() = return nothing
parent1() = child1() # static dispatch
println(code_lowered(parent1, ()))
end
try
@noinline child2() = return nothing
parent2() = child2() # dynamic dispatch
println(code_lowered(parent2, ()))
end
Anyway, for a quick test similar to @KristofferC's, I wrapped the testset's code block in an anonymous function, which reduced e.g. the math tests' runtime and memory usage significantly:
Maybe we should just go with something like this for now?
diff --git a/base/test.jl b/base/test.jl
index 9b37e87..0a481b4 100644
--- a/base/test.jl
+++ b/base/test.jl
@@ -705,6 +705,7 @@ function testset_beginend(args, tests)
# finally removing the testset and giving it a chance to take
# action (such as reporting the results)
quote
+ (()->begin
ts = $(testsettype)($desc; $options...)
push_testset(ts)
try
@@ -716,6 +717,7 @@ function testset_beginend(args, tests)
end
pop_testset()
finish(ts)
+ end)()
end
end
@@ -777,6 +779,7 @@ function testset_forloop(args, testloop)
end
end
quote
+ (()->begin
arr = Array{Any,1}(0)
first_iteration = true
local ts
@@ -790,6 +793,7 @@ function testset_forloop(args, testloop)
end
end
arr
+ end)()
end
end |
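The same idea can also be applied from a test file, without patching base/test.jl. This is a minimal sketch, assuming only the standard Test library (Base.Test on 0.5/0.6); `run_foobar_checks`, `child_fn`, and `parent_fn` are hypothetical names chosen for illustration. The closures live inside an ordinary function, so the @testset body contains only a single call:

```julia
using Test  # on 0.5/0.6 this would be `using Base.Test`

# Hypothetical helper (not part of Base or Test): the closures are defined
# inside a regular function body instead of directly inside the @testset block.
function run_foobar_checks()
    @noinline child_fn() = nothing
    parent_fn() = child_fn()
    @test parent_fn() === nothing
    return parent_fn  # returned only so the closure can be inspected afterwards
end

@testset "foobar" begin
    run_foobar_checks()
end
```

Whether this recovers all of the lost performance depends on the Julia version; it is a workaround sketch, not a substitute for fixing the macro.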
This is much better on 0.6, still terrible on 0.5. |
Ok to close this now? |
Let me run a comparison on 0.7 with |
Yes, no longer any significant slowdown of
Although not related to the issue, there is a significant compilation regression on 0.7 compared to 0.6. These are timings from testsets on 0.6 and 0.7:
0.6:
0.7:
So the time to run the tests is up by 80% and allocations by 350%. |
I just tried adding ~14x slower on v0.6.2 vs. ~21x slower on master (although I have not fixed all deprecations yet).
0.6.2:
_
_ _ _(_)_ | A fresh approach to technical computing
(_) | (_) (_) | Documentation: https://docs.julialang.org
_ _ _| |_ __ _ | Type "?help" for help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 0.6.2 (2017-12-13 18:08 UTC)
_/ |\__'_|_|_|\__'_| |
|__/ | x86_64-apple-darwin17.3.0
julia> @time Pkg.test("ERFA")
INFO: Testing ERFA
WARNING: year outside range(1000:3000)
INFO: ERFA tests passed
12.574542 seconds (4.92 M allocations: 258.743 MiB, 1.19% gc time)
julia> @time Pkg.test("ERFA")
INFO: Testing ERFA
WARNING: year outside range(1000:3000)
Test Summary: | Pass Total
ERFA | 1311 1311
INFO: ERFA tests passed
178.497327 seconds (413 allocations: 22.969 KiB)
Master:
_
_ _ _(_)_ | A fresh approach to technical computing
(_) | (_) (_) | Documentation: https://docs.julialang.org
_ _ _| |_ __ _ | Type "?help" for help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 0.7.0-DEV.3590 (2018-01-27 08:16 UTC)
_/ |\__'_|_|_|\__'_| | Commit eea727c0c (1 day old master)
|__/ | x86_64-apple-darwin17.4.0
julia> @time Pkg.test("ERFA")
[ Info: Testing ERFA
WARNING: year outside range(1000:3000)
[ Info: ERFA tests passed
15.921993 seconds (8.73 M allocations: 486.941 MiB, 2.40% gc time)
julia> @time Pkg.test("ERFA")
[ Info: Testing ERFA
WARNING: year outside range(1000:3000)
Test Summary: | Pass Total
ERFA | 1311 1311
[ Info: ERFA tests passed
348.921708 seconds (760 allocations: 35.781 KiB) |
FWIW, the repro I posted above still generates horrible code:
_ _ _(_)_ | A fresh approach to technical computing
(_) | (_) (_) | Documentation: https://docs.julialang.org
_ _ _| |_ __ _ | Type "?help" for help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 0.7.0-DEV.3413 (2018-01-15 17:38 UTC)
_/ |\__'_|_|_|\__'_| | master/5e2ff127f8* (fork: 2372 commits, 193 days)
|__/ | x86_64-pc-linux-gnu
julia> using Test
julia> let
@noinline child1() = return nothing
parent1() = child1()
println(code_typed(parent1, ()))
code_llvm(parent1, ())
end
Any[CodeInfo(:(begin
return
end))=>Nothing]
; Function parent1
; Location: REPL[2]:3
define void @julia_parent1_57576() {
top:
ret void
}
julia> @testset "foobar" begin
@noinline child2() = return nothing
parent2() = child2()
println(code_typed(parent2, ()))
code_llvm(parent2, ())
end
Any[CodeInfo(:(begin
SSAValue(0) = (Core.getfield)(#self#, :child2)::Core.Box
SSAValue(1) = (Core.isdefined)(SSAValue(0), :contents)::Bool
unless SSAValue(1) goto 5
goto 8
5:
NewvarNode(:(child2@_3))
child2@_3
8:
SSAValue(2) = (Core.getfield)(SSAValue(0), :contents)
SSAValue(3) = (SSAValue(2))()
return SSAValue(3)
end))=>Any]
; Function parent2
; Location: REPL[3]:3
define nonnull %jl_value_t addrspace(10)* @japi1_parent2_57626(%jl_value_t addrspace(10)*, %jl_value_t addrspace(10)**, i32) #0 {
top:
...
br i1 %19, label %err, label %pass
pass: ; preds = %top
...
%21 = call nonnull %jl_value_t addrspace(10)* @jl_apply_generic(%jl_value_t addrspace(10)** nonnull %3, i32 1)
...
ret %jl_value_t addrspace(10)* %21
err: ; preds = %top
call void @jl_undefined_var_error(%jl_value_t addrspace(12)* addrspacecast (%jl_value_t* inttoptr (i64 140091673833888 to %jl_value_t*) to %jl_value_t addrspace(12)*))
unreachable
}
Or, using BenchmarkTools:
julia> let
@noinline child1() = return nothing
parent1() = child1()
display(@benchmark $parent1())
end
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 0.015 ns (0.00% GC)
median time: 0.020 ns (0.00% GC)
mean time: 0.020 ns (0.00% GC)
maximum time: 0.073 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @testset "foobar" begin
@noinline child2() = return nothing
parent2() = child2()
display(@benchmark $parent2())
end
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 13.524 ns (0.00% GC)
median time: 13.663 ns (0.00% GC)
mean time: 13.732 ns (0.00% GC)
maximum time: 28.546 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 998 |
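A quicker check than reading the full code_typed dump is to look only at the inferred return type of the outer closure. This is a minimal sketch based on the reduced let/try repro earlier in the thread, assuming the unexported helper `Base.return_types` is available in the Julia version at hand; the names `inner_let`, `outer_let`, `inner_try`, and `outer_try` are illustrative only:

```julia
# Closure pair defined in a `let` block; the outer closure is handed back.
outer_let = let
    @noinline inner_let() = nothing
    outer_let_local() = inner_let()
    outer_let_local
end

# Same pair, but defined inside a `try` block as in the reduced repro.
outer_try = try
    @noinline inner_try() = nothing
    outer_try_local() = inner_try()
    outer_try_local
catch
    nothing
end

# A concrete type such as Nothing suggests static dispatch; Any suggests the
# captured function ended up boxed and is called dynamically.
println("let: ", Base.return_types(outer_let, ()))
println("try: ", Base.return_types(outer_try, ()))
```

Whether the two cases differ depends on the Julia version, so no expected output is shown here.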
Hoping to target compiler performance for 1.0 and 1.0.x. |
I'm not sure what's actionable here. @maleadt's example generates good code now, and the original package (ContMechTensors.jl) doesn't load for me on master. |
I'll try it again at some time. |
I retested my example above on the current master and got:
LGTM 🎉 Thanks y'all! |
We noticed that the tests in https://github.com/KristofferC/ContMechTensors.jl got a lot slower on 0.5 than on 0.4.
On 0.4 the first run takes ~50 seconds and the second takes 1.2 seconds.
On 0.5-rc1+1 the first run takes ~140 seconds and the second takes ~20 seconds.
After removing all the @testset blocks and changing @test to @assert, the second run on 0.5 takes ~2 seconds. If I only remove the @testset blocks, the second run takes ~14 seconds on 0.5. In the profile, the calls in type inference that take the most time are typeinf_frame -> stupdate! -> issubstate. It seems that both @test and @testset are now causing type inference to do a lot of extra work?