SST: Add MPI SST dataplane #3095

Merged
merged 6 commits into from
Jul 20, 2022
3 changes: 3 additions & 0 deletions .github/workflows/everything.yml
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,9 @@ jobs:
compiler: cuda
parallel: serial
constrains: build_only
- os: el8
compiler: gcc10
parallel: mpich

steps:
- uses: actions/checkout@v3
Expand Down
3 changes: 3 additions & 0 deletions cmake/DetectOptions.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -369,6 +369,9 @@ if(ADIOS2_USE_SST AND NOT WIN32)
set(ADIOS2_SST_HAVE_CRAY_DRC TRUE)
endif()
endif()
if(ADIOS2_HAVE_MPI)
set(ADIOS2_SST_HAVE_MPI TRUE)
endif()
endif()

# DAOS
Expand Down
59 changes: 59 additions & 0 deletions docs/user_guide/source/advanced/ecp_hardware.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
######################
ADIOS2 in ECP hardware
######################

ADIOS2 is widely used on ECP (Exascale Computing Project) HPC (high-performance
computing) systems. Some ADIOS2 features need specific workarounds to run
successfully on them.

OLCF CRUSHER
============

SST MPI Data Transport
----------------------

The MPI Data Transport relies on the client-server features of MPI, which
Cray-MPI implementations currently support with some caveats. The issues
observed so far, and their workarounds (if any), are:

**MPI_Finalize** blocks the process in the "Writer/Producer" ADIOS2 instance.
The Producer ADIOS2 instance internally calls `MPI_Open_port`, and even after
`MPI_Close_port` has been called, `MPI_Finalize` still appears to consider the
port to be in use and therefore blocks the process. The workaround is to call
`MPI_Barrier(MPI_COMM_WORLD)` instead of `MPI_Finalize()`.

**srun does not understand MPMD instructions.** Disable the MPMD tests with the
flag `-DADIOS2_RUN_MPI_MPMD_TESTS=OFF`.

**Tests time out.** Since every test is launched with srun, the scheduling time
can exceed the default test timeout. Use a large timeout (5 minutes) when
running your tests.

Examples of launching ADIOS2 SST unit tests using the MPI DP:

.. code-block:: bash

    # We omit some of the srun (SLURM) arguments that are specific to the
    # project you are working on. Note that you can avoid calling srun directly
    # by setting the CMake variable `MPIEXEC_EXECUTABLE`.

    # Launch a simple writer test instance
    srun {PROJFLAGS} -N 1 /gpfs/alpine/proj-shared/csc331/vbolea/ADIOS2-build/bin/TestCommonWrite SST mpi_dp_test CPCommPattern=Min,MarshalMethod=BP5

    # On another terminal, launch multiple instances of the Reader test
    srun {PROJFLAGS} -N 2 /gpfs/alpine/proj-shared/csc331/vbolea/ADIOS2-build/bin/TestCommonRead SST mpi_dp_test

Alternatively, you can configure your CMake build to use srun directly:

.. code-block:: bash

    cmake . -DMPIEXEC_EXECUTABLE:FILEPATH="/usr/bin/srun" \
          -DMPIEXEC_EXTRA_FLAGS:STRING="-A{YourProject} -pbatch -t10" \
          -DMPIEXEC_NUMPROC_FLAG:STRING="-N" \
          -DMPIEXEC_MAX_NUMPROCS:STRING="8" \
          -DADIOS2_RUN_MPI_MPMD_TESTS=OFF

    cmake --build .
    ctest

    # Monitor your jobs
    watch -n1 squeue -l -u $USER
4 changes: 2 additions & 2 deletions docs/user_guide/source/engines/sst.rst
Original file line number Diff line number Diff line change
Expand Up @@ -157,7 +157,7 @@ the underlying network communication mechanism to use for exchanging
data in SST. Generally this is chosen by SST based upon what is
available on the current platform. However, specifying this engine
parameter allows overriding SST's choice. Current allowed values are
-**"RDMA"** and **"WAN"**. (**ib** and **fabric** are accepted as
+**"MPI"**, **"RDMA"**, and **"WAN"**. (**ib** and **fabric** are accepted as
equivalent to **RDMA** and **evpath** is equivalent to **WAN**.)
Generally both the reader and writer should be using the same network
transport, and the network transport chosen may be dictated by the
Expand Down Expand Up @@ -288,7 +288,7 @@ BeginStep timeouts) and writer-side rules (like queue limit behavior) apply.
QueueLimit integer **0** (no queue limits)
QueueFullPolicy string **Block**, Discard
ReserveQueueLimit integer **0** (no queue limits)
-DataTransport string **default varies by platform**, RDMA, WAN
+DataTransport string **default varies by platform**, MPI, RDMA, WAN
WANDataTransport string **sockets**, enet, ib
ControlTransport string **TCP**, Scalable
NetworkInterface string **NULL**
Expand Down
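The alias rules in the documentation hunk above can be sketched in C++ as follows. This is an illustration only, not SST's own parsing code: `CanonicalDataTransport` and `MakeSstParams` are hypothetical helper names, exact-case matching is an assumption, and a plain `std::map` stands in for `adios2::Params` so that the sketch builds without the ADIOS2 headers.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// adios2::Params is an alias for std::map<std::string, std::string>; the plain
// type is used here so the sketch builds without the ADIOS2 headers.
using Params = std::map<std::string, std::string>;

// Hypothetical helper: map the documented aliases onto the canonical names.
// Matching is exact-case, which is an assumption of this sketch.
std::string CanonicalDataTransport(const std::string &value)
{
    if (value == "MPI" || value == "RDMA" || value == "WAN")
    {
        return value;
    }
    if (value == "ib" || value == "fabric")
    {
        return "RDMA"; // documented as equivalent to RDMA
    }
    if (value == "evpath")
    {
        return "WAN"; // documented as equivalent to WAN
    }
    throw std::invalid_argument("unknown DataTransport value: " + value);
}

// Hypothetical helper: engine parameters that request a given data plane.
Params MakeSstParams(const std::string &dataTransport)
{
    return Params{{"DataTransport", CanonicalDataTransport(dataTransport)}};
}
```

In an application, the resulting map would typically be handed to `adios2::IO::SetParameters` after `io.SetEngine("SST")`, on both the writer and the reader side, since the documentation says both should generally use the same transport.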
1 change: 1 addition & 0 deletions docs/user_guide/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ Funded by the `Exascale Computing Project (ECP) <https://www.exascaleproject.org
advanced/memory_management
advanced/gpu_aware
advanced/plugins
advanced/ecp_hardware

.. toctree::
:caption: Ecosystem Tools
Expand Down
3 changes: 2 additions & 1 deletion examples/basics/globalArray/globalArray_write.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,8 @@ int main(int argc, char *argv[])
{
int rank = 0, nproc = 1;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
#endif
Expand Down
3 changes: 2 additions & 1 deletion examples/basics/joinedArray/joinedArray_write.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,8 @@ int main(int argc, char *argv[])
int rank = 0;
#if ADIOS2_USE_MPI
int nproc = 1;
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
#endif
Expand Down
3 changes: 2 additions & 1 deletion examples/basics/localArray/localArray_write.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,8 @@ int main(int argc, char *argv[])
int rank = 0;
#if ADIOS2_USE_MPI
int nproc = 1;
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
#endif
Expand Down
3 changes: 2 additions & 1 deletion examples/basics/values/values_write.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,8 @@ int main(int argc, char *argv[])
{
int rank = 0, nproc = 1;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
#endif
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
11 changes: 10 additions & 1 deletion examples/heatTransfer/read/heatRead.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,16 @@ void Compute(const std::vector<double> &Tin, std::vector<double> &Tout,

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+std::string engineName = std::string(argv[argc - 1]);
+
+int threadSupportLevel = MPI_THREAD_SINGLE;
+if (engineName == "SST")
+{
+    threadSupportLevel = MPI_THREAD_MULTIPLE;
+}
+
+MPI_Init_thread(&argc, &argv, threadSupportLevel, &provided);

/* When writer and reader is launched together with a single mpirun command,
the world comm spans all applications. We have to split and create the
Expand Down
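The hunk above requests a thread-support level based on the engine name but never inspects `provided`. A minimal sketch of the selection rule and the missing check is shown below. The `ThreadLevel` enum is a stand-in for the MPI constants so that the sketch builds without `<mpi.h>`, and both function names are hypothetical.

```cpp
#include <string>

// Stand-in for the MPI thread-support constants so the sketch builds without
// <mpi.h>. The order mirrors MPI_THREAD_SINGLE < MPI_THREAD_FUNNELED <
// MPI_THREAD_SERIALIZED < MPI_THREAD_MULTIPLE.
enum class ThreadLevel
{
    Single = 0,
    Funneled,
    Serialized,
    Multiple
};

// Mirrors the rule in the hunk: only the SST engine asks for full thread
// support; every other engine name falls back to the single-threaded level.
ThreadLevel RequiredThreadLevel(const std::string &engineName)
{
    return engineName == "SST" ? ThreadLevel::Multiple : ThreadLevel::Single;
}

// True when the level reported by the MPI library covers the requested one.
bool ThreadLevelSatisfied(ThreadLevel required, ThreadLevel provided)
{
    return static_cast<int>(provided) >= static_cast<int>(required);
}
```

In the real program the same comparison can be made directly on the integers returned by `MPI_Init_thread`, because the MPI standard defines the thread-level constants as monotonically increasing. How to react when the check fails (abort, or fall back to another data transport) is left open here.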
12 changes: 11 additions & 1 deletion examples/heatTransfer/write/main.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,17 @@ void printUsage()

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+
+std::string engineName = std::string(argv[argc - 1]);
+
+int threadSupportLevel = MPI_THREAD_SINGLE;
+if (engineName == "SST")
+{
+    threadSupportLevel = MPI_THREAD_MULTIPLE;
+}
+
+MPI_Init_thread(&argc, &argv, threadSupportLevel, &provided);

/* When writer and reader is launched together with a single mpirun command,
the world comm spans all applications. We have to split and create the
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpAttributeWriter/helloBPAttributeWriter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpFWriteCRead/CppReader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpFWriteCRead/CppWriter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpFlushWriter/helloBPFlushWriter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpReader/helloBPReader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpReader/helloBPReaderHeatMap2D.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpReader/helloBPReaderHeatMap3D.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpTimeWriter/helloBPTimeWriter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPPutDeferred.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,8 @@ int main(int argc, char *argv[])
int rank, size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPSZ.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,8 @@ int main(int argc, char *argv[])
int rank, size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPSubStreams.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,8 @@ int main(int argc, char *argv[])
{
int rank, size;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPWriter.c
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,8 @@ int main(int argc, char *argv[])
int rank, size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/bpWriter/helloBPWriter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,8 @@ int main(int argc, char *argv[])
{
int rank, size;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/datamanReader/helloDataManReader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,8 @@ void PrintData(std::vector<T> &data, size_t step)
int main(int argc, char *argv[])
{
// initialize MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &mpiRank);
MPI_Comm_size(MPI_COMM_WORLD, &mpiSize);

Expand Down
3 changes: 2 additions & 1 deletion examples/hello/datamanWriter/helloDataManWriter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,8 @@ std::vector<T> GenerateData(const size_t step)
int main(int argc, char *argv[])
{
// initialize MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &mpiRank);
MPI_Comm_size(MPI_COMM_WORLD, &mpiSize);

Expand Down
3 changes: 2 additions & 1 deletion examples/hello/dataspacesReader/helloDataSpacesReader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,8 @@ int main(int argc, char *argv[])
int size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/dataspacesWriter/helloDataSpacesWriter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,8 @@ int main(int argc, char *argv[])
int size;

#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/hdf5Reader/helloHDF5Reader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,8 @@ void ReadData(adios2::IO h5IO, adios2::Engine &h5Reader,

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
3 changes: 2 additions & 1 deletion examples/hello/hdf5Writer/helloHDF5Writer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,8 @@

int main(int argc, char *argv[])
{
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,8 @@ int main(int argc, char *argv[])
{
int rank = 0, size = 1;
#if ADIOS2_USE_MPI
-MPI_Init(&argc, &argv);
+int provided;
+MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
adios2::ADIOS adios(MPI_COMM_WORLD);
Expand Down