More options on substreams #1561

jychoi-hpc · 2019-06-27T19:47:47Z

This is a feature request on aggregation (substreams).

I am wondering if we can add an option to set stride on aggregation (say, streamstride=Y).

Currently, if I run N processes and set substreams to X, ranks 0 through N/X-1 are aggregated. I would like an option to aggregate every Y-th process (i.e., rank 0, Y-1, 2*Y-1, etc).

This will be helpful on Summit (and with SSD). In particular, XGC re-orders ranks, and currently there is no way to write XGC restart data using one aggregator per node.

Any comment or suggestion will be appreciated.

williamfgc · 2019-06-27T19:52:30Z

@jychoi-hpc correct me if I've misunderstood the request: would substreams=N/stride be what you're looking for?

chuckatkins · 2019-06-27T20:26:21Z

I think it would be super useful if we could even generalize this a little more and have different grouping strategies. Something similar to the process distribution option that job schedulers have. We could allow something like:

block - consecutive grouping like is done now (0,1,2,3)(4,5,6,7)(8,9,10,11)
cyclic - round robin style (0,3,6,9)(1,4,7,10)(2,5,8,11)
plane=N - a mix of the two, so say N=2 then (0,1,6,7)(2,3,8,9)(4,5,10,11)
random - self explanatory

Implementing all of them would likely not be much different from implementing one extra. Worth considering, I think.
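To make the three deterministic strategies concrete, here is a minimal sketch that reproduces the groupings listed above for 12 ranks in 3 groups. The function name `group_ranks` and its parameters are hypothetical and are not part of any ADIOS2 API; the sketch also assumes the rank count divides evenly by the number of groups.

```python
def group_ranks(n_ranks, n_groups, strategy="block", plane=1):
    """Return n_groups lists of ranks for a (hypothetical) strategy name."""
    if n_ranks % n_groups != 0:
        raise ValueError("this sketch assumes n_ranks is divisible by n_groups")
    per_group = n_ranks // n_groups
    groups = [[] for _ in range(n_groups)]
    for rank in range(n_ranks):
        if strategy == "block":
            # Consecutive ranks share a group.
            g = rank // per_group
        elif strategy == "cyclic":
            # Ranks are dealt out round robin, one at a time.
            g = rank % n_groups
        elif strategy == "plane":
            # Chunks of `plane` consecutive ranks are dealt out round robin.
            g = (rank // plane) % n_groups
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        groups[g].append(rank)
    return groups


print(group_ranks(12, 3, "block"))
print(group_ranks(12, 3, "cyclic"))
print(group_ranks(12, 3, "plane", plane=2))
```

Since the three cases differ only in how a rank is mapped to a group index, this is consistent with the point that supporting several strategies costs little more than supporting one.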

jychoi-hpc · 2019-06-27T20:39:45Z

I like what @chuckatkins suggested.

Here is a simple case for using block and cyclic. The following two cases show layouts of ranks over 3 nodes (6 MPI processes per node).

Case A
node1:  0  1  2  3  4  5
node2:  6  7  8  9 10 11
node3: 12 13 14 15 16 17

Case B
node1:  0  3  6  9 12 15
node2:  1  4  7 10 13 16
node3:  2  5  8 11 14 17

If one wants to use one aggregator per node, one can use the block approach for Case A and the cyclic approach for Case B.
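The mapping from layout to strategy can be checked with a small sketch. It encodes the two layouts above and tests which grouping puts exactly the ranks of one node into each group. All names here (`block_groups`, `cyclic_groups`, `matching_strategy`) are hypothetical helpers for illustration only.

```python
def block_groups(n_ranks, n_groups):
    # Consecutive ranks per group: (0..size-1), (size..2*size-1), ...
    size = n_ranks // n_groups
    return [list(range(g * size, (g + 1) * size)) for g in range(n_groups)]


def cyclic_groups(n_ranks, n_groups):
    # Round robin: group g holds ranks g, g + n_groups, g + 2*n_groups, ...
    return [list(range(g, n_ranks, n_groups)) for g in range(n_groups)]


def matching_strategy(node_layout):
    """Return the strategy whose groups coincide with the nodes, or None."""
    n_ranks = sum(len(node) for node in node_layout)
    n_groups = len(node_layout)
    wanted = sorted(sorted(node) for node in node_layout)
    if sorted(block_groups(n_ranks, n_groups)) == wanted:
        return "block"
    if sorted(cyclic_groups(n_ranks, n_groups)) == wanted:
        return "cyclic"
    return None


case_a = [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17]]
case_b = [[0, 3, 6, 9, 12, 15], [1, 4, 7, 10, 13, 16], [2, 5, 8, 11, 14, 17]]

print(matching_strategy(case_a))
print(matching_strategy(case_b))
```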

germasch · 2019-06-27T20:45:36Z

One potential option that would be very flexible might be to allow the application to pass in the split communicator itself, in addition to the default behavior that does the MPI_Comm_split internally in MPIAggregator.
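As a rough illustration of what an application-side split could look like, the sketch below simulates the grouping rule of `MPI_Comm_split` in plain Python: ranks that pass the same color end up in the same new communicator, ordered by key with ties broken by the old rank. The helper `comm_split` is hypothetical and only mimics that rule; it is not MPI and not ADIOS2 code. The colors are assumed to be node indices for the Case B layout.

```python
def comm_split(colors, keys=None):
    """Simulate MPI_Comm_split grouping: returns {color: [old ranks in new order]}."""
    if keys is None:
        keys = [0] * len(colors)
    groups = {}
    for rank, color in enumerate(colors):
        groups.setdefault(color, []).append(rank)
    # Within a color, order by key, then by rank in the old communicator.
    return {
        color: sorted(members, key=lambda r: (keys[r], r))
        for color, members in groups.items()
    }


# Assumption: rank r runs on node r % 3, as in Case B.
node_of_rank = [r % 3 for r in range(18)]
per_node = comm_split(node_of_rank)

# The first rank of each new communicator could act as that node's aggregator.
aggregators = {node: members[0] for node, members in per_node.items()}
print(per_node)
print(aggregators)
```

In a real MPI program, `MPI_Comm_split_type` with `MPI_COMM_TYPE_SHARED` is one way to obtain a per-node communicator without computing colors by hand.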

chuckatkins · 2019-06-27T20:51:13Z

Even cooler would be a way to also have a user-defined callback at each step to adjust the aggregation. That is of particular interest for viz, where something like an isosurface is sparse but the nodes where the data is dense change across steps. That would obviously be much more work, but the initial support for different fixed aggregation strategies could lay the groundwork for it later on.

Initially though just having a few fixed strategies would be a great addition.

williamfgc · 2019-06-28T12:00:21Z

@jychoi-hpc thanks for the example, let's start (after the release) with the case needed by XGC. Thanks!

pnorbert · 2019-06-29T18:50:42Z

This sounds okay to me but the most generic option is that we allow for passing a communicator in Open(). So an application can order the processes for the I/O in any way, e.g. to have consecutive ranks on one node.

jychoi-hpc added the enhancement label Jun 27, 2019

williamfgc self-assigned this Jun 28, 2019

chuckatkins closed this as completed in 90e56b2 Mar 1, 2020
