Parallel postprocessing on Vorna #27

Closed
henry2004y opened this issue Sep 9, 2021 · 5 comments
Labels
bug Something isn't working

Comments

henry2004y (Owner) commented Sep 9, 2021

For some unknown reason, Julia cannot launch more than 8 worker processes on Vorna, which has 16 cores per node split across 2 CPUs. Weird.

henry2004y added the bug label Sep 9, 2021

henry2004y commented Sep 9, 2021

Worker 16 terminated.
ERROR: LoadError: ProcessExitedException(16)
Stacktrace:
 [1] sync_end(c::Channel{Any})
   @ Base ./task.jl:369
 [2] macro expansion
   @ ./task.jl:388 [inlined]
 [3] _require_callback(mod::Base.PkgId)
   @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/Distributed.jl:76
 [4] #invokelatest#2
   @ ./essentials.jl:708 [inlined]
 [5] invokelatest
   @ ./essentials.jl:706 [inlined]
 [6] require(uuidkey::Base.PkgId)
   @ Base ./loading.jl:920
 [7] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:901
 [8] top-level scope
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/macros.jl:204
in expression starting at /wrk/users/hongyang/result/demo_1d2d_parallel_pyplot.jl:6
      From worker 16:	OpenBLAS blas_thread_init: pthread_create failed for thread 5 of 8: Resource temporarily unavailable
      From worker 16:	OpenBLAS blas_thread_init: RLIMIT_NPROC 200 current, 257413 max
...
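
The OpenBLAS messages above hint that every worker tries to start its own BLAS thread pool and runs into the RLIMIT_NPROC process limit. As an aside (this workaround is not from the thread, just a common mitigation), BLAS can be capped to a single thread on each worker:

using Distributed, LinearAlgebra

addprocs(8)                          # hypothetical: 8 local workers on one Vorna node
@everywhere using LinearAlgebra
@everywhere BLAS.set_num_threads(1)  # keep each worker from spawning its own BLAS thread pool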

I found that ClusterManagers.jl is the recommended way to handle cross-node cluster jobs. I have tested it on Vorna with 2 nodes (32 cores) successfully. One slightly annoying thing is that a standard output file is created for every worker process Julia launches. Maybe there is also a way to turn it off?
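
For reference, here is a minimal launch sketch with ClusterManagers.jl; the Slurm partition, walltime, and node count are placeholders, and the keyword arguments are forwarded to srun:

using Distributed, ClusterManagers

# Request 32 workers across 2 nodes through Slurm (flags are illustrative).
addprocs(SlurmManager(32), nodes=2, partition="short", t="00:30:00")

@everywhere using Vlasiator, PyPlot   # load the packages on every worker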

Also note that someone complained about the slowness of the first @everywhere call: with a job of over 1000 cores, the first @everywhere call took more than an hour.


henry2004y commented Sep 9, 2021

Another observation is that, as time goes by, generating each plot takes longer and longer on Vorna. I am not sure whether this is related to the machine itself or to a memory leak.

I am not sure how to address this. Shall I turn on the profiler?


henry2004y commented Sep 30, 2021

After looking more into multiprocessing, I realized that my current implementation, which broadcasts parameters with a pattern like

const cmap = matplotlib.cm.turbo
@everywhere cmap = $cmap

is actually type-unstable. This can be verified by changing the type of cmap on a remote process: Julia does not complain, because the @everywhere assignment creates a non-constant global, and that is bad for performance.

I need to think of a better way to handle parameters.
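
One possible way to make such parameters type-stable (a sketch only, assuming the values can be fixed once when the workers start) is to declare them as constants on every process instead of interpolating into a plain global:

using Distributed
@everywhere using PyPlot

# A `const` global has a fixed type on every worker, unlike the
# `@everywhere cmap = $cmap` pattern, which creates an untyped global.
@everywhere const cmap = matplotlib.cm.turbo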


By making the common parameters type-stable, we get some improvement in memory usage. For example, for the 2D parallel contour plots with PyPlot, the previous version has

julia> @time include("demo_2d_parallel_pyplot.jl")
Total number of files: 3
Running with 1 workers...
      From worker 2:	filename = ./bulk.0001347.vlsv
      From worker 2:	filename = ./bulk.0001348.vlsv
      From worker 2:	filename = ./bulk.0001492.vlsv
┌ Warning: Less than 1GB free memory detected. Using memory-mapped I/O!
└ @ Vlasiator ~/.julia/packages/Vlasiator/muKF0/src/vlsv/vlsvreader.jl:127
Finished!
 54.947133 seconds (26.79 M allocations: 1.597 GiB, 1.19% gc time, 5.08% compilation time)

while the modified version has

julia> @time include("/home/hongyang/Vlasiator/Vlasiator.jl/examples/demo_2d_parallel_pyplot.jl")
Total number of files: 3
Running with 1 workers...
      From worker 2:	file = ./bulk.0001347.vlsv
      From worker 2:	file = ./bulk.0001348.vlsv
      From worker 2:	file = ./bulk.0001492.vlsv
┌ Warning: Less than 1GB free memory detected. Using memory-mapped I/O!
└ @ Vlasiator ~/.julia/packages/Vlasiator/muKF0/src/vlsv/vlsvreader.jl:127
Finished!
 53.315913 seconds (17.11 M allocations: 1.021 GiB, 0.65% gc time, 5.20% compilation time)

henry2004y added a commit that referenced this issue Sep 30, 2021
henry2004y (Owner Author)

Now we have learned

  • how to improve memory usage by taking advantage of type stability
  • how to run multi-node jobs with ClusterManagers

I would consider this done.

henry2004y reopened this Oct 6, 2021

henry2004y commented Oct 6, 2021

The parallel contour plotting still gets slower as time progresses. Now I have a hypothesis: Matplotlib's default behavior is to append new artists to the existing canvas rather than replace them, so with more frames the plots pile up on top of each other and create a huge memory burden.
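
If that hypothesis holds, a minimal sketch of the kind of fix is to reuse one figure and clear its axes between frames; `files` and the actual plotting call below are placeholders:

using PyPlot

fig, ax = plt.subplots()
for file in files                     # `files`: the list of .vlsv files (assumption)
    ax.cla()                          # clear the previous frame instead of drawing on top of it
    # ... plot the contour for `file` into `ax` ...
    fig.savefig("frame_$(basename(file)).png")
end
close(fig)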


Confirmed by tests.
Old method (overlapping):

Total number of files: 3
Running with 1 workers...
      From worker 2:	filename = ./bulk.0001347.vlsv
      From worker 2:	filename = ./bulk.0001348.vlsv
┌ Warning: Less than 1GB free memory detected. Using memory-mapped I/O!
└ @ Vlasiator ~/.julia/packages/Vlasiator/muKF0/src/vlsv/vlsvreader.jl:127
      From worker 2:	filename = ./bulk.0001492.vlsv
Finished in 38.37s.
 60.756959 seconds (26.83 M allocations: 1.599 GiB, 1.22% gc time, 5.21% compilation time)

New method (no overlapping):

Total number of files: 3
Running with 1 workers...
      From worker 2:	file = ./bulk.0001347.vlsv
      From worker 2:	file = ./bulk.0001348.vlsv
┌ Warning: Less than 1GB free memory detected. Using memory-mapped I/O!
└ @ Vlasiator ~/.julia/packages/Vlasiator/muKF0/src/vlsv/vlsvreader.jl:127
      From worker 2:	file = ./bulk.0001492.vlsv
Finished in 26.08s.
 50.044524 seconds (17.15 M allocations: 1.023 GiB, 0.70% gc time, 6.37% compilation time)

This change makes it 13x faster with 8 workers on Vorna!
