Parallel postprocessing on Vorna #27
```
Worker 16 terminated.
ERROR: LoadError: ProcessExitedException(16)
Stacktrace:
[1] sync_end(c::Channel{Any})
@ Base ./task.jl:369
[2] macro expansion
@ ./task.jl:388 [inlined]
[3] _require_callback(mod::Base.PkgId)
@ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/Distributed.jl:76
[4] #invokelatest#2
@ ./essentials.jl:708 [inlined]
[5] invokelatest
@ ./essentials.jl:706 [inlined]
[6] require(uuidkey::Base.PkgId)
@ Base ./loading.jl:920
[7] require(into::Module, mod::Symbol)
@ Base ./loading.jl:901
[8] top-level scope
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/macros.jl:204
in expression starting at /wrk/users/hongyang/result/demo_1d2d_parallel_pyplot.jl:6
From worker 16: OpenBLAS blas_thread_init: pthread_create failed for thread 5 of 8: Resource temporarily unavailable
From worker 16: OpenBLAS blas_thread_init: RLIMIT_NPROC 200 current, 257413 max
```
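The OpenBLAS message suggests each worker spawns its own BLAS thread pool and collectively hits the per-user process limit. A common mitigation (a sketch, not necessarily what was done here) is to pin BLAS to a single thread on every worker:

```julia
using Distributed
@everywhere using LinearAlgebra

# Keep each worker's OpenBLAS pool at one thread so the total
# number of threads stays below RLIMIT_NPROC.
@everywhere BLAS.set_num_threads(1)
```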
... I found that ClusterManagers.jl is the recommended way to handle cross-node cluster jobs. I've tested it on Vorna with 2 nodes / 32 cores successfully. One thing that might look a little annoying is that a standard-output file is created for every process launched by Julia. Maybe there's also a way to turn it off? Also note that someone complained about the slowness when calling ...
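For reference, a minimal launch sketch with ClusterManagers.jl; the partition name and time limit are placeholders, and the extra keyword arguments are forwarded to srun as flags:

```julia
using Distributed, ClusterManagers

# Request 32 workers over 2 nodes through Slurm. Each worker writes
# its stdout to a separate job output file in the working directory,
# which is the per-process file noted above.
addprocs(SlurmManager(32), N=2, partition="short", t="00:30:00")

@everywhere println("worker $(myid()) ready")
```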
Another observation: as time goes by, generating each plot takes longer and longer on Vorna. I'm not sure whether this is related to the machine or to a memory leak, and I'm not sure how to address it. Shall I turn on the profiler?
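If profiling is the next step, a minimal sketch with Julia's built-in Profile stdlib (note that this samples only the local process, not the Distributed workers):

```julia
using Profile

# Sample the plotting script and print a call tree of where time goes,
# hiding frames that were hit fewer than 10 times.
@profile include("demo_2d_parallel_pyplot.jl")
Profile.print(mincount=10)
```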
After looking more into multiprocessing, I realized that my current implementation with a broadcasting pattern like

```julia
const cmap = matplotlib.cm.turbo
@everywhere cmap = $cmap
```

is actually type-unstable: `cmap` ends up as a non-constant global on the workers, which can be verified by attempting to change its type afterwards. I need to think of a better way to handle parameters. Making the common parameters type-stable already gives some improvement in memory usage. For example, for the 2D parallel contour plots with PyPlot, the previous version has

```
julia> @time include("demo_2d_parallel_pyplot.jl")
Total number of files: 3
Running with 1 workers...
From worker 2: filename = ./bulk.0001347.vlsv
From worker 2: filename = ./bulk.0001348.vlsv
From worker 2: filename = ./bulk.0001492.vlsv
┌ Warning: Less than 1GB free memory detected. Using memory-mapped I/O!
└ @ Vlasiator ~/.julia/packages/Vlasiator/muKF0/src/vlsv/vlsvreader.jl:127
Finished!
54.947133 seconds (26.79 M allocations: 1.597 GiB, 1.19% gc time, 5.08% compilation time)
```

while the modified version has

```
julia> @time include("/home/hongyang/Vlasiator/Vlasiator.jl/examples/demo_2d_parallel_pyplot.jl")
Total number of files: 3
Running with 1 workers...
From worker 2: file = ./bulk.0001347.vlsv
From worker 2: file = ./bulk.0001348.vlsv
From worker 2: file = ./bulk.0001492.vlsv
┌ Warning: Less than 1GB free memory detected. Using memory-mapped I/O!
└ @ Vlasiator ~/.julia/packages/Vlasiator/muKF0/src/vlsv/vlsvreader.jl:127
Finished!
53.315913 seconds (17.11 M allocations: 1.021 GiB, 0.65% gc time, 5.20% compilation time)
```
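One way to make such shared parameters type-stable (a sketch of the general pattern, not necessarily the exact change made here) is to declare them `const` on the workers too, or to avoid globals and pass them as function arguments:

```julia
using Distributed
addprocs(2)
@everywhere using PyPlot

# Declaring the interpolated value `const` on every worker fixes its
# type, so the global is no longer type-unstable.
const cmap = matplotlib.cm.turbo
@everywhere const cmap = $cmap

# Hypothetical helper: parameters passed as function arguments are
# always type-stable, with no reliance on globals.
@everywhere plot_one(file, cm) = println("would plot $file with $(cm.name)")
pmap(f -> plot_one(f, cmap), ["bulk.0001347.vlsv", "bulk.0001348.vlsv"])
```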
Now we have learned how to pass common parameters to workers in a type-stable way. I would consider this done.
The parallel contour plotting still gets slower as time progresses. Now I have a hypothesis: by default, Matplotlib keeps appending artists to the canvas rather than replacing them, so with more frames you are overlapping data and creating a huge memory burden. Confirmed by tests. Old method (overlapping):

```
Total number of files: 3
Running with 1 workers...
From worker 2: filename = ./bulk.0001347.vlsv
From worker 2: filename = ./bulk.0001348.vlsv
┌ Warning: Less than 1GB free memory detected. Using memory-mapped I/O!
└ @ Vlasiator ~/.julia/packages/Vlasiator/muKF0/src/vlsv/vlsvreader.jl:127
From worker 2: filename = ./bulk.0001492.vlsv
Finished in 38.37s.
60.756959 seconds (26.83 M allocations: 1.599 GiB, 1.22% gc time, 5.21% compilation time)
```

New method (no overlapping):

```
Total number of files: 3
Running with 1 workers...
From worker 2: file = ./bulk.0001347.vlsv
From worker 2: file = ./bulk.0001348.vlsv
┌ Warning: Less than 1GB free memory detected. Using memory-mapped I/O!
└ @ Vlasiator ~/.julia/packages/Vlasiator/muKF0/src/vlsv/vlsvreader.jl:127
From worker 2: file = ./bulk.0001492.vlsv
Finished in 26.08s.
50.044524 seconds (17.15 M allocations: 1.023 GiB, 0.70% gc time, 6.37% compilation time)
```

This change makes it 13x faster with 8 workers on Vorna!
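In PyPlot terms, "no overlapping" can be achieved by reusing one figure and clearing the axes before each frame instead of stacking a new pcolormesh on top of the old ones (a sketch with stand-in data, not the actual script):

```julia
using PyPlot

fig, ax = subplots()
for (i, data) in enumerate([rand(64, 64) for _ in 1:3])  # stand-in frames
    ax.cla()             # drop the previous frame's artists
    ax.pcolormesh(data)  # draw only the current frame
    savefig("frame_$i.png")
end
close(fig)
```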
For some unknown reason, Julia cannot launch more than 8 worker processes on Vorna, whose nodes have 16 cores on 2 CPU sockets. Weird. (The RLIMIT_NPROC message from OpenBLAS above hints that a per-user process limit may be involved.)
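To check whether the per-user process limit is what blocks additional workers, the current limits can be inspected from Julia on Linux:

```julia
# The OpenBLAS log above reported "RLIMIT_NPROC 200 current"; this
# prints the matching soft/hard limits for the running session.
for line in eachline("/proc/self/limits")
    occursin("Max processes", line) && println(line)
end
```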