-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] History collection give out of bound error with EmbeddedSurfaceSubRegion. #1356
Comments
Just reproduced this on quartz, I'll take a look. |
Bypassing the initial issue -- that the set we're collecting from is initially empty -- is relatively straightforward. The bigger issue is that later on when we go to collect the output the buffer won't be allocated since it thinks the size is 0. I'm in the middle of triaging what is needed to add the functionality (some of it is in place, but some things remain to be worked out). I'll update when I have an idea what will be required to get the new functionality up and running. |
Let me know if I can help. I am also working on syncrhonizing data across ranks in mpi runs in #1354. I wonder if we can do something at that stage for the |
I spent some time this morning looking at this. Changing collection operations to allow the collected data to change size is pretty straightforward and doesn't require a big change. The difficulty is writing the data to file since we buffer our collection operations internally until we go to write a bunch of them at once. This requires accounting for a number of things in with MPI+HDF5 in terms of data psuedo-partitioning, dataspace sizing and changing the way we write the data from all-at-once to incremental (can't hyperslab at dataspace in HDF5 with one of the dimensions changing extent, so we treat each collected 'row' as a hyperslab and write them incrementally). So we'll have to change our buffering + file writing operations. I am part way through what needs to get done but it will probably take a couple days to get everything put together since there are a number of edge cases we can run into and its best to handle them without breaking the cases out. Mostly these amount to dealing with data movement/creation/deletion across MPI between write operations:
In particular it is dealing with these in the context of HDF5 that represent the biggest issue since HDF5 has some caveats w.r.t. using MPIO for simultaneous write operations. These can all be handled by a single operation to be sure, just need to make a number of adjustments to the current way we buffer + write. |
Thanks for looking into this. I think it is important to fix this. Do you think that writing data incrementally will also allow us to start writing to the HDF5 files info related to the solver performance (i.e., iteration count etc) ? This would be a big plus. I don't know much about neither HDF5 nor MPIO but I think that for the use case I have in mind some of the things you listed should never happen and this may simplify things a bit. Below my comment about each difficult scenario you described.
I d like to hear @rrsettgast and @corbett5 opinions. Also, I wonder if this functionality could be useful for @homel1 's work on the particles. I guess particles may introduce some of the cases you mentioned. In particular I think they can move from one rank to another. |
Describe the bug
Out of bound error in the pack collection when running cases with
EmbeddedSurfaceSubRegion
.To Reproduce
Just try to run Tutorial 10 with a debug build.
The text was updated successfully, but these errors were encountered: