Read from one (out of many) ranks. #856

Closed · guj opened this issue Dec 21, 2020 · 2 comments · Fixed by #862
guj (Contributor) commented Dec 21, 2020

Describe the bug
In an MPI program, it is not possible to open a file-based series and then read from only one of the ranks, e.g. rank 0.

To Reproduce
Compilable/executable code example to reproduce the problem:

#include <openPMD/openPMD.hpp>
#include <mpi.h>

#include <iostream>
#include <memory>
#include <cstddef>


using std::cout;
using namespace openPMD;

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int mpi_size;
    int mpi_rank;

    MPI_Comm_size(MPI_COMM_WORLD, &mpi_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);
    {
        Series series = Series(
            "../samples/8a_parallel_3Db_%T.bp",
            Access::READ_ONLY,
            MPI_COMM_WORLD);

        MeshRecordComponent E_x = series.iterations[1].meshes["B"]["x"];

        Offset chunk_offset = {
            static_cast<long unsigned int>(mpi_rank) + 1, 1, 1};
        Extent chunk_extent = {2, 2, 1};

        // only rank 0 issues a read request and flushes it
        if (0 == mpi_rank)
        {
            auto chunk_data = E_x.loadChunk<double>(chunk_offset, chunk_extent);
            series.flush();
        }
    }

    MPI_Finalize();

    return 0;
}


Expected behavior
When running with 1 rank, the read is successful.
When running with 2 ranks, the program hangs.

Reason:
When the Series is opened for reading, Series.cpp's readFileBased() is called (by all ranks). This function closes the file again once it finishes. When rank 0 later wants to read a variable, readFileBased() is invoked by rank 0 alone upon flush(), which then tries to open the iteration again. However, because rank 1 is inactive, the open, being a collective call, cannot proceed successfully.

Software Environment

  • openPMD-api: current development version, with the ADIOS2 backend
    (did not check with HDF5)
franzpoeschel (Contributor) commented

Some background:
In ADIOS, opening a new engine (i.e. a file) is a collective operation. This explains why the first access to an iteration in file-based iteration layout is collective, too. This is true for readers as well as for writers (a writer structured like Junmin's example would fail in the same way).

Now, such a reading pattern actually used to work until we merged #822. This was because Series::readFileBased would simply open all iterations one after another during parsing and never bother closing any files, leading to errors like OSError: [Errno 24] Too many open files, first observed here. As Junmin points out, subsequent reading operations were not collective, since all those collective operations had already been performed in a batch during parsing.

So we should discuss whether to treat this as a bug or not. If the following circumstances are given:

  • Opening a file is a collective operation in the backend
  • File-based iteration layout is used

…should accessing an iteration be a collective operation or not?

  • No: This would be hard to implement. We cannot know read/write patterns up front, hence doing something as drastic as keeping thousands of file handles open seems unavoidable. Likely impossible to implement for readers.
  • Yes: This would break reading workflows that were previously possible (albeit somewhat accidentally).

Proposal:

  1. We should remove this conditional. Opening a file only in case it is dirty is asking for trouble in parallel situations. We should ensure that the following code will open the iteration's file:
     auto iteration = series.iterations[0];
     series.flush(); // iteration is not dirty on our rank, but it might be on another one
     This also means that using Series::flush() would revert back to the behavior before #822 (Allow reading a file-based series with many iterations without crashing the number of file handles). Meaning that we should provide alternatives:
  2. With the arguments above, accessing a file-based Series should be done in a pattern where all ranks step through the iterations collectively. Iteration::close() should be used in favor of Series::flush(). Additionally, introduce a new call Iteration::flush() to only flush the current iteration, and promote its usage over Series::flush(). (A sketch follows after this list.)
  3. Since item 1 would lead to some pitfalls, we could alternatively replace this conditional by one that checks for iterations that have been accessed via series.iterations[i].
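For illustration, a minimal sketch of what such a collectively stepping read pattern could look like. Iteration::close() already exists; everything else (which iterations are visited, which rank requests data) is only an example built on the reproducer above:

#include <openPMD/openPMD.hpp>
#include <mpi.h>

#include <memory>

using namespace openPMD;

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int mpi_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);

    {
        Series series = Series(
            "../samples/8a_parallel_3Db_%T.bp",
            Access::READ_ONLY,
            MPI_COMM_WORLD);

        // all ranks step through the iterations collectively
        for (auto &i : series.iterations)
        {
            Iteration &iteration = i.second;
            MeshRecordComponent B_x = iteration.meshes["B"]["x"];

            std::shared_ptr<double> chunk_data;
            if (0 == mpi_rank)
            {
                // only this rank actually requests data
                chunk_data = B_x.loadChunk<double>({1, 1, 1}, {2, 2, 1});
            }

            // collective: flushes outstanding loads and closes the
            // iteration's file on every rank
            iteration.close();
        }
    }

    MPI_Finalize();
    return 0;
}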

Neither suggestion would directly fix Junmin's code snippet, but either one would make it fixable by replacing:

        if (0 == mpi_rank)
        {
            auto chunk_data = E_x.loadChunk<double>(chunk_offset, chunk_extent);
            series.flush();
        }

with:

        if (0 == mpi_rank)
        {
            auto chunk_data = E_x.loadChunk<double>(chunk_offset, chunk_extent);
        }
        series.flush();
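This way, every rank reaches Series::flush(), so the collective open of the iteration's file can proceed even though only rank 0 actually requested data.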

ax3l (Member) commented Dec 22, 2020

Discussed offline: implementing a (collective) Iteration::open() can solve this use case; it only affects file-based encoding.
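For illustration, a sketch of how the reproducer could look with the proposed call. Iteration::open() does not exist yet at this point; the assumption is that it performs the collective file open up front, so that a later flush from a single rank no longer needs any collective operation:

#include <openPMD/openPMD.hpp>
#include <mpi.h>

using namespace openPMD;

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int mpi_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);

    {
        Series series = Series(
            "../samples/8a_parallel_3Db_%T.bp",
            Access::READ_ONLY,
            MPI_COMM_WORLD);

        // proposed API: collectively open the iteration's file on all ranks
        series.iterations[1].open();

        if (0 == mpi_rank)
        {
            MeshRecordComponent E_x =
                series.iterations[1].meshes["B"]["x"];
            auto chunk_data =
                E_x.loadChunk<double>({1, 1, 1}, {2, 2, 1});
            // no collective open is pending anymore, so flushing on a
            // single rank should be safe
            series.flush();
        }
    }

    MPI_Finalize();
    return 0;
}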
