Merge pull request #3095 from vicentebolea/add-mpi-dataplane
SST: Add MPI SST dataplane
pnorbert authored Jul 20, 2022
2 parents b7832af + 2440d18 commit ee3bfc8
Showing 174 changed files with 1,686 additions and 231 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/everything.yml
@@ -114,6 +114,9 @@ jobs:
compiler: cuda
parallel: serial
constrains: build_only
+- os: el8
+compiler: gcc10
+parallel: mpich

steps:
- uses: actions/checkout@v3
3 changes: 3 additions & 0 deletions cmake/DetectOptions.cmake
@@ -369,6 +369,9 @@ if(ADIOS2_USE_SST AND NOT WIN32)
set(ADIOS2_SST_HAVE_CRAY_DRC TRUE)
endif()
endif()
+if(ADIOS2_HAVE_MPI)
+set(ADIOS2_SST_HAVE_MPI TRUE)
+endif()
endif()

# DAOS
59 changes: 59 additions & 0 deletions docs/user_guide/source/advanced/ecp_hardware.rst
@@ -0,0 +1,59 @@
######################
ADIOS2 in ECP hardware
######################

ADIOS2 is widely used on ECP (Exascale Computing Project) HPC (high-performance
computing) systems. Some ADIOS2 features need specific workarounds to run
successfully on them.

OLCF CRUSHER
============

SST MPI Data Transport
----------------------

MPI Data Transport relies on the client-server features of MPI, which are
currently supported in Cray-MPI implementations with some caveats. Here are the
observed issues and their workarounds (if any):

**MPI_Finalize** will block the system process in the "Writer/Producer" ADIOS2
instance. The reason is that the Producer ADIOS2 instance internally calls
`MPI_Open_port`; even after `MPI_Close_port` has been called, `MPI_Finalize`
still considers the port to be in use and therefore blocks the process. The
workaround is to call `MPI_Barrier(MPI_COMM_WORLD)` instead of
`MPI_Finalize()`.

**srun does not understand MPMD instructions.** Simply disable the MPMD tests
with the flag `-DADIOS2_RUN_MPI_MPMD_TESTS=OFF`.

**Tests time out.** Since every test is launched with srun, the scheduling time
can exceed the default test timeout. Use a large timeout (5 minutes) when
running your tests.

Examples of launching ADIOS2 SST unit tests using MPI DP:

.. code-block:: bash

   # We omit some of the srun (SLURM) arguments which are specific to the project
   # you are working on. Note that you could avoid calling srun directly by
   # setting the CMake variable `MPIEXEC_EXECUTABLE`.

   # Launch a simple writer test instance
   srun {PROJFLAGS} -N 1 /gpfs/alpine/proj-shared/csc331/vbolea/ADIOS2-build/bin/TestCommonWrite SST mpi_dp_test CPCommPattern=Min,MarshalMethod=BP5

   # On another terminal launch multiple instances of the Reader test
   srun {PROJFLAGS} -N 2 /gpfs/alpine/proj-shared/csc331/vbolea/ADIOS2-build/bin/TestCommonRead SST mpi_dp_test

Alternatively, you can configure your CMake build to use srun directly:

.. code-block:: bash

   cmake . -DMPIEXEC_EXECUTABLE:FILEPATH="/usr/bin/srun" \
     -DMPIEXEC_EXTRA_FLAGS:STRING="-A{YourProject} -pbatch -t10" \
     -DMPIEXEC_NUMPROC_FLAG:STRING="-N" \
     -DMPIEXEC_MAX_NUMPROCS:STRING="-8" \
     -DADIOS2_RUN_MPI_MPMD_TESTS=OFF

   cmake --build .
   ctest

   # monitor your jobs
   watch -n1 squeue -l -u $USER
4 changes: 2 additions & 2 deletions docs/user_guide/source/engines/sst.rst
@@ -157,7 +157,7 @@ the underlying network communication mechanism to use for exchanging
data in SST. Generally this is chosen by SST based upon what is
available on the current platform. However, specifying this engine
parameter allows overriding SST's choice. Current allowed values are
-**"RDMA"** and **"WAN"**. (**ib** and **fabric** are accepted as
+**"MPI"**, **"RDMA"**, and **"WAN"**. (**ib** and **fabric** are accepted as
equivalent to **RDMA** and **evpath** is equivalent to **WAN**.)
Generally both the reader and writer should be using the same network
transport, and the network transport chosen may be dictated by the
@@ -288,7 +288,7 @@ BeginStep timeouts) and writer-side rules (like queue limit behavior) apply.
QueueLimit integer **0** (no queue limits)
QueueFullPolicy string **Block**, Discard
ReserveQueueLimit integer **0** (no queue limits)
-DataTransport string **default varies by platform**, RDMA, WAN
+DataTransport string **default varies by platform**, MPI, RDMA, WAN
WANDataTransport string **sockets**, enet, ib
ControlTransport string **TCP**, Scalable
NetworkInterface string **NULL**
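
The accepted `DataTransport` values and aliases described in the sst.rst hunks above can be summarized as a small lookup. The following is a minimal sketch, not ADIOS2 code: `NormalizeDataTransport` is a hypothetical helper, and matching values case-insensitively is an assumption of the sketch rather than something the documentation states.

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of ADIOS2): map a user-supplied DataTransport
// value onto one of the canonical names listed in the documentation
// (MPI, RDMA, WAN). Case-insensitive matching is an assumption of this sketch.
std::string NormalizeDataTransport(std::string value)
{
    // Lower-case the input so that "MPI", "mpi", and "Mpi" compare equal.
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    if (value == "mpi")
        return "MPI";
    // The documentation lists ib and fabric as equivalent to RDMA.
    if (value == "rdma" || value == "ib" || value == "fabric")
        return "RDMA";
    // The documentation lists evpath as equivalent to WAN.
    if (value == "wan" || value == "evpath")
        return "WAN";

    throw std::invalid_argument("unknown DataTransport value: " + value);
}
```

For example, `NormalizeDataTransport("fabric")` yields `"RDMA"`, and an unrecognized value raises `std::invalid_argument`.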
1 change: 1 addition & 0 deletions docs/user_guide/source/index.rst
@@ -38,6 +38,7 @@ Funded by the `Exascale Computing Project (ECP) <https://www.exascaleproject.org
advanced/memory_management
advanced/gpu_aware
advanced/plugins
+advanced/ecp_hardware

.. toctree::
:caption: Ecosystem Tools
3 changes: 2 additions & 1 deletion examples/basics/globalArray/globalArray_write.cpp
@@ -42,7 +42,8 @@ int main(int argc, char *argv[])
{
int rank = 0, nproc = 1;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
#endif
3 changes: 2 additions & 1 deletion examples/basics/joinedArray/joinedArray_write.cpp
@@ -42,7 +42,8 @@ int main(int argc, char *argv[])
int rank = 0;
#if ADIOS2_USE_MPI
int nproc = 1;
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
#endif
3 changes: 2 additions & 1 deletion examples/basics/localArray/localArray_write.cpp
@@ -47,7 +47,8 @@ int main(int argc, char *argv[])
int rank = 0;
#if ADIOS2_USE_MPI
int nproc = 1;
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
#endif
3 changes: 2 additions & 1 deletion examples/basics/values/values_write.cpp
@@ -30,7 +30,8 @@ int main(int argc, char *argv[])
{
int rank = 0, nproc = 1;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
#endif
@@ -27,7 +27,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
11 changes: 10 additions & 1 deletion examples/heatTransfer/read/heatRead.cpp
@@ -61,7 +61,16 @@ void Compute(const std::vector<double> &Tin, std::vector<double> &Tout,

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+std::string engineName = std::string(argv[argc - 1]);
+
+int threadSupportLevel = MPI_THREAD_SINGLE;
+if (engineName == "SST")
+{
+threadSupportLevel = MPI_THREAD_MULTIPLE;
+}
+
+MPI_Init_thread(&argc, &argv, threadSupportLevel, &provided);

/* When writer and reader is launched together with a single mpirun command,
the world comm spans all applications. We have to split and create the
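
The hunk above picks the requested thread level from the last command-line argument and does not inspect `provided` afterwards. A minimal sketch of both steps follows, written without MPI so that it compiles on its own. The `ThreadLevel` enum is a stand-in for the MPI constants, and `RequestedThreadLevel` and `ThreadLevelSatisfied` are hypothetical helpers that are not part of this commit.

```cpp
#include <string>

// Stand-in for the MPI thread-support constants so the sketch compiles without
// an MPI installation. The MPI standard orders the real constants the same way:
// MPI_THREAD_SINGLE < MPI_THREAD_FUNNELED < MPI_THREAD_SERIALIZED < MPI_THREAD_MULTIPLE.
enum ThreadLevel
{
    Single = 0,
    Funneled = 1,
    Serialized = 2,
    Multiple = 3
};

// Hypothetical helper: choose the level to request from the last command-line
// argument, as the hunk does, while making the no-argument case explicit.
ThreadLevel RequestedThreadLevel(int argc, const char *const argv[])
{
    if (argc < 2 || argv[argc - 1] == nullptr)
    {
        return Single;
    }
    return std::string(argv[argc - 1]) == "SST" ? Multiple : Single;
}

// Hypothetical helper: true when the level granted by the MPI library is at
// least the level that was requested.
bool ThreadLevelSatisfied(ThreadLevel requested, ThreadLevel provided)
{
    return provided >= requested;
}
```

In a real program the value written to `provided` by `MPI_Init_thread` would be compared against the requested level in the same way, and the application could report an error or fall back to another transport when the check fails.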
12 changes: 11 additions & 1 deletion examples/heatTransfer/write/main.cpp
@@ -37,7 +37,17 @@ void printUsage()

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+
+std::string engineName = std::string(argv[argc - 1]);
+
+int threadSupportLevel = MPI_THREAD_SINGLE;
+if (engineName == "SST")
+{
+threadSupportLevel = MPI_THREAD_MULTIPLE;
+}
+
+MPI_Init_thread(&argc, &argv, threadSupportLevel, &provided);

/* When writer and reader is launched together with a single mpirun command,
the world comm spans all applications. We have to split and create the
3 changes: 2 additions & 1 deletion examples/hello/bpAttributeWriter/helloBPAttributeWriter.cpp
@@ -20,7 +20,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
3 changes: 2 additions & 1 deletion examples/hello/bpFWriteCRead/CppReader.cpp
@@ -15,7 +15,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
3 changes: 2 additions & 1 deletion examples/hello/bpFWriteCRead/CppWriter.cpp
@@ -15,7 +15,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
3 changes: 2 additions & 1 deletion examples/hello/bpFlushWriter/helloBPFlushWriter.cpp
@@ -20,7 +20,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
3 changes: 2 additions & 1 deletion examples/hello/bpReader/helloBPReader.cpp
@@ -21,7 +21,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
3 changes: 2 additions & 1 deletion examples/hello/bpReader/helloBPReaderHeatMap2D.cpp
@@ -30,7 +30,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
3 changes: 2 additions & 1 deletion examples/hello/bpReader/helloBPReaderHeatMap3D.cpp
@@ -34,7 +34,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
3 changes: 2 additions & 1 deletion examples/hello/bpTimeWriter/helloBPTimeWriter.cpp
@@ -21,7 +21,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPPutDeferred.cpp
@@ -35,7 +35,8 @@ int main(int argc, char *argv[])
int rank, size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPSZ.cpp
@@ -34,7 +34,8 @@ int main(int argc, char *argv[])
int rank, size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPSubStreams.cpp
@@ -22,7 +22,8 @@ int main(int argc, char *argv[])
{
int rank, size;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPWriter.c
@@ -39,7 +39,8 @@ int main(int argc, char *argv[])
int rank, size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPWriter.cpp
@@ -23,7 +23,8 @@ int main(int argc, char *argv[])
{
int rank, size;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
3 changes: 2 additions & 1 deletion examples/hello/datamanReader/helloDataManReader.cpp
@@ -32,7 +32,8 @@ void PrintData(std::vector<T> &data, size_t step)
int main(int argc, char *argv[])
{
// initialize MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &mpiRank);
MPI_Comm_size(MPI_COMM_WORLD, &mpiSize);

3 changes: 2 additions & 1 deletion examples/hello/datamanWriter/helloDataManWriter.cpp
@@ -51,7 +51,8 @@ std::vector<T> GenerateData(const size_t step)
int main(int argc, char *argv[])
{
// initialize MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &mpiRank);
MPI_Comm_size(MPI_COMM_WORLD, &mpiSize);

3 changes: 2 additions & 1 deletion examples/hello/dataspacesReader/helloDataSpacesReader.cpp
@@ -28,7 +28,8 @@ int main(int argc, char *argv[])
int size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
3 changes: 2 additions & 1 deletion examples/hello/dataspacesWriter/helloDataSpacesWriter.cpp
@@ -25,7 +25,8 @@ int main(int argc, char *argv[])
int size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
3 changes: 2 additions & 1 deletion examples/hello/hdf5Reader/helloHDF5Reader.cpp
@@ -66,7 +66,8 @@ void ReadData(adios2::IO h5IO, adios2::Engine &h5Reader,

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
3 changes: 2 additions & 1 deletion examples/hello/hdf5Writer/helloHDF5Writer.cpp
@@ -19,7 +19,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
@@ -83,7 +83,8 @@ int main(int argc, char *argv[])
{
int rank = 0, size = 1;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
adios2::ADIOS adios(MPI_COMM_WORLD);