Parallelization of Wedge Product #29796
comment:7
This is my very first approach, simply copied from the previous ones. However, I noticed that in lower dimensions the parallelization is even slower. Furthermore, one could improve this process a little further by considering only distinct indices from the beginning (see the check in the loop). I appreciate any help, since I am not familiar with effective parallelization.
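The distinct-index check mentioned above can be illustrated with a small stand-alone sketch. The dictionaries `cmp_s` and `cmp_o` below are toy stand-ins for the component dictionaries used in the ticket's code; only index pairs whose concatenation has no repeated index survive, since a repeated index makes the corresponding wedge term vanish.

```python
# Toy component dictionaries standing in for cmp_s._comp / cmp_o._comp.
cmp_s = {(0, 1): 2, (0, 2): 3}   # components of a 2-form s
cmp_o = {(2,): 5, (1,): 7}       # components of a 1-form o

# Keep only pairs with pairwise-distinct combined indices:
# a repeated index makes the wedge term zero, so it is never dispatched.
ind_list = [(ind_s, ind_o)
            for ind_s in cmp_s
            for ind_o in cmp_o
            if len(ind_s + ind_o) == len(set(ind_s + ind_o))]
```

Filtering before chunking means no worker wastes time on terms that are identically zero.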
comment:9
Some computations in 4 dimensions actually got slightly worse: from around 8 seconds to 15 seconds. In contrast, more complicated computations in 6 dimensions show a good improvement. However, I noticed that the CPUs are not fully engaged; their load varies between roughly 20% and 80% the whole time. Hence there is much room for improvement. I would appreciate any suggestions; I feel a little lost here.
comment:10
Replying to @mjungmath:
I would say that the behaviour you observe is due to the computation not being fully parallelized in the current code. Indeed, in the final lines the computation [...]
comment:11
Interestingly, I dropped the summation completely, and still the computation takes longer than without parallelization. This is odd, isn't it? Even this modification doesn't improve anything:

```python
ind_list = [(ind_s, ind_o) for ind_s in cmp_s._comp
            for ind_o in cmp_o._comp
            if len(ind_s + ind_o) == len(set(ind_s + ind_o))]
nproc = Parallelism().get('tensor')
if nproc != 1:
    # Parallel computation
    lol = lambda lst, sz: [lst[i:i + sz] for i in
                           range(0, len(lst), sz)]
    ind_step = max(1, int(len(ind_list) / nproc))
    local_list = lol(ind_list, ind_step)
    # list of input parameters:
    listParalInput = [(cmp_s, cmp_o, ind_part) for ind_part in
                      local_list]

    @parallel(p_iter='multiprocessing', ncpus=nproc)
    def paral_wedge(s, o, local_list_ind):
        partial = []
        for ind_s, ind_o in local_list_ind:
            ind_r = ind_s + ind_o
            partial.append([ind_r, s._comp[ind_s] * o._comp[ind_o]])
        return partial

    for ii, val in paral_wedge(listParalInput):
        for jj in val:
            cmp_r[[jj[0]]] = jj[1]
else:
    # Sequential computation
    for ind_s, ind_o in ind_list:
        ind_r = ind_s + ind_o
        cmp_r[[ind_r]] += cmp_s._comp[ind_s] * cmp_o._comp[ind_o]
```

I am fully aware that this leads to wrong results and that the summation should somehow be covered within the parallelization. Nevertheless, this seems strange to me.
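One way the summation could be moved into the workers is for each chunk to accumulate its own partial sums keyed by the result index, with the master then merging the per-chunk dictionaries via `+=`. The sketch below is a hypothetical stand-alone illustration with plain dictionaries and a plain `map` standing in for the parallel dispatch; it is not the ticket's actual code.

```python
from collections import defaultdict
from functools import reduce

# Toy components standing in for cmp_s._comp / cmp_o._comp.
cmp_s = {(0, 1): 2, (1, 2): 3}
cmp_o = {(2,): 5, (0,): 7}

def wedge_chunk(chunk):
    # Each worker sums its own terms locally, keyed by the result index.
    partial = defaultdict(int)
    for ind_s, ind_o in chunk:
        ind_r = ind_s + ind_o
        partial[ind_r] += cmp_s[ind_s] * cmp_o[ind_o]
    return partial

def merge(acc, part):
    # The master combines the per-chunk partial sums with +=.
    for ind, val in part.items():
        acc[ind] += val
    return acc

ind_list = [(s, o) for s in cmp_s for o in cmp_o
            if len(s + o) == len(set(s + o))]
chunks = [ind_list[i::2] for i in range(2)]   # split work into two chunks
cmp_r = reduce(merge, map(wedge_chunk, chunks), defaultdict(int))
```

Because each worker returns already-summed partials, the final merge touches each result index only once per chunk rather than once per term.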
comment:12
Besides this odd fact, do you have any ideas on how the summation could be parallelized, too?
comment:15
Setting new milestone based on a cursory review of ticket status, priority, and last modification date. |
comment:16
By the way, why don't we use a map-reduce approach? See for example: https://towardsdatascience.com/a-beginners-introduction-into-mapreduce-2c912bb5e6ac
Apparently, the wedge product is not performed on multiple cores when parallel computation is enabled. Analogous to the component-wise computation of general tensors, I am adding this feature for the wedge product of alternating forms, too.
CC: @egourgoulhon @tscrim @mkoeppe
Component: geometry
Keywords: differential_forms, parallel
Author: Michael Jung
Branch/Commit: u/gh-mjungmath/wedge_product_parallel @
6303e7c
Issue created by migration from https://trac.sagemath.org/ticket/29796