add tmapreduce #1
I have made a preliminary write-up of map-reduce. I will try to run comprehensive tests next week. The problem with threaded map-reduce is that we have to assume that … There are no such issues with …
Hi!
Sure! As you can see in the code, I have … The only thing I am waiting for before moving forward is for Julia 0.7 to be out, so that I can be sure I am optimizing against a stabilized target (as you probably know, the threading model in Julia is experimental, https://discourse.julialang.org/t/future-of-base-threads/10440, and might change in the future). Anyway, help would be appreciated. As I wrote in the other issue you commented on, I plan to talk about these things at JuliaCon 2018, so it would be great to work out a best practice to share with everyone.
I tried the …
The code can be found in https://github.com/mohamed82008/KissThreading.jl/tree/fixmapreduce, which is based on PR #2.
On the other hand, a complete rework using static load division was made in https://github.com/mohamed82008/KissThreading.jl/tree/rework, and all of … I noticed the …
So dynamic load balancing should be better if it works, but it is causing allocations in the … However, the inference bug is not causing the performance hit, since the static version of … The following timings were generated from:

```julia
n = 10000000
a = rand(n)

println("threaded tmapreduce: $(Threads.nthreads()) threads")
tmapreduce(log, +, 0., a)
@time tmapreduce(log, +, 0., a)

println("threaded tmapadd: $(Threads.nthreads()) threads")
tmapadd(log, 0., a)
@time tmapadd(log, 0., a)

println("unthreaded")
mapreduce(log, +, a, init=0.)
@time mapreduce(log, +, a, init=0.)
```

Static:

```
threaded tmapreduce: 4 threads
  0.034197 seconds (30 allocations: 1008 bytes)
threaded tmapadd: 4 threads
  0.034678 seconds (16 allocations: 384 bytes)
unthreaded
  0.106056 seconds (18 allocations: 816 bytes)
```

Dynamic:

```
threaded tmapreduce: 4 threads
  0.647810 seconds (6.33 M allocations: 93.911 MiB)
threaded tmapadd: 4 threads
  0.636600 seconds (5.87 M allocations: 86.418 MiB)
unthreaded
  0.104249 seconds (18 allocations: 816 bytes)
```
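For readers following along, the static load division approach can be sketched roughly as below. This is a minimal illustration, not the actual KissThreading.jl implementation: `tmapreduce_static` is a hypothetical name, and it assumes `op` is associative and `v0` is a neutral starting value for each chunk.

```julia
# Sketch of a statically partitioned threaded mapreduce: the input is
# split into one contiguous chunk per thread up front, each thread
# reduces its own chunk, and the partial results are combined at the end.
function tmapreduce_static(f, op, v0, A::AbstractArray)
    nt = Threads.nthreads()
    results = Vector{typeof(v0)}(undef, nt)
    len, rem = divrem(length(A), nt)
    Threads.@threads for t in 1:nt
        # chunk boundaries are fixed before the loop runs (static division)
        lo = (t - 1) * len + min(t - 1, rem) + 1
        hi = t * len + min(t, rem)
        acc = v0
        @inbounds for i in lo:hi
            acc = op(acc, f(A[i]))
        end
        results[t] = acc
    end
    return reduce(op, results)
end
```

Because only one partial result is stored per thread, allocations stay on the order of `nthreads()`, which matches the low allocation counts in the static timings above.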
Relevant: JuliaLang/julia#27694
Thanks. I was waiting for 0.7 to stabilize before updating the package and then registering it. In particular, given a new … But …
Wow, the version with batches is doing really well: linear scaling. It is still allocating, but the speed is good. I also found that a good heuristic for the batch size is …

```
threaded tmapreduce: 4 threads
  0.024689 seconds (350 allocations: 6.000 KiB)
threaded tmapadd: 4 threads
  0.026095 seconds (342 allocations: 5.859 KiB)
unthreaded
  0.099201 seconds (18 allocations: 816 bytes)
```
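A batched dynamic load balancing scheme of the kind discussed here can be sketched with an atomic cursor that threads use to claim fixed-size batches of indices. This is a rough illustration under my own naming, not the PR's code; the `batch` keyword stands in for whatever batch-size heuristic is chosen.

```julia
# Sketch of batched dynamic load balancing: each thread repeatedly
# claims the next batch of indices via an atomic counter, so faster
# threads naturally pick up more work.
function tmapreduce_batched(f, op, v0, A::AbstractArray; batch = 1024)
    nt = Threads.nthreads()
    results = fill(v0, nt)
    cursor = Threads.Atomic{Int}(1)  # next unclaimed index
    n = length(A)
    Threads.@threads for t in 1:nt
        acc = v0
        while true
            lo = Threads.atomic_add!(cursor, batch)  # returns the old value
            lo > n && break
            hi = min(lo + batch - 1, n)
            @inbounds for i in lo:hi
                acc = op(acc, f(A[i]))
            end
        end
        results[t] = acc
    end
    return reduce(op, results)
end
```

The batch size trades scheduling overhead against load balance: very small batches contend on the atomic counter, while very large batches degenerate toward static division.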