release/public-v1: fix grib2 read errors with large task counts (draft PR) #155
…sk 0, let all other processes wait for it to complete
@GeorgeGayno-NOAA Should we decide to proceed with this PR, I'd suggest pulling in your bugfix PR for develop that corrected the calls (arguments) to
I have no problem with that. I am sure @DusanJovic-NOAA would agree.
sfc_climo_gen. Issue ufs-community#140.
Thanks, @GeorgeGayno-NOAA. I cherry-picked the commits relevant for release/public-v1 from #148, resolved the conflicts and updated this PR. Would you please take a look? I'll test it with the MRW App on a few platforms so that we can make sure it is working as intended.
@climbfuji Your updates look correct.
@GeorgeGayno-NOAA this works as expected, tested on Cheyenne with both GNU and Intel and on Orion with Intel. Can you please approve and merge?
ee918e4 to 790a760
Thanks, @GeorgeGayno-NOAA - merged and retagged, pushed the tag to the repo.
This PR is an attempt to fix the grib2 read errors that occur randomly with large task counts on several machines. The issue is described at length in ufs-community/ufs-mrweather-app#190, together with a suggestion from @GeorgeGayno-NOAA in ufs-community/ufs-mrweather-app#190 (comment) (implemented here).
Changes in this PR:
MPI_ABORT calls in various files (from @DusanJovic-NOAA)

Testing:
@ligiabernardet @panll FYI
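
For readers unfamiliar with the pattern referred to in the commit message ("let all other processes wait for it to complete"), here is a minimal sketch of the general idea: one task performs a step alone while every other task waits at a barrier before continuing. It is not the code in this PR. It uses Python threads as stand-ins for MPI tasks, and the names `run_tasks` and `exclusive_step` are hypothetical, chosen only for illustration.

```python
import threading


def run_tasks(num_tasks, exclusive_step):
    """Run num_tasks workers; only task 0 executes exclusive_step.

    Threads stand in for MPI tasks here (an assumption made for
    illustration; the real code is an MPI program).
    """
    barrier = threading.Barrier(num_tasks)
    shared = {}                  # filled in by task 0 only
    seen = [None] * num_tasks    # what each task observed after the barrier

    def worker(rank):
        if rank == 0:
            try:
                shared["result"] = exclusive_step()
            except Exception:
                # Break the barrier so the other tasks do not wait forever.
                barrier.abort()
                raise
        # Every task, including task 0, blocks here until all have arrived,
        # so no task proceeds before task 0 has finished its step.
        barrier.wait()
        seen[rank] = shared["result"]

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print(run_tasks(4, lambda: "done"))
```

In an MPI program the same shape would typically be a check on the task's rank followed by a barrier (for example `MPI_Barrier`) or a broadcast of the result; which of these the PR uses is not stated in this conversation.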