checking of nans in export field bundles #377
Please choose either FB_check_for_nans or fldbun_check_for_nans.
I will test UFS, but I'd like to wait until after the WW3 unstructured mesh gets committed early this week. |
@DeniseWorthen - thanks! @uturuncoglu @denise - This should have no impact on the UFS, since I have inserted an #ifdef so that this is only relevant in CESM. The issue is that no code in UFS implements the shr_infnan_mod logic that we are using in CESM. So until that type of functionality exists, UFS should just call this routine and immediately return. @uturuncoglu - for CMCC this code can be activated, since they are also using the share code that CESM is using. |
@mvertens Thanks. I hadn't actually looked at the changes yet ;-) |
Thanks a lot for your work on this, @mvertens - this will be very nice to have! |
@mvertens We have the shr_infnan_mod.F90 available in our CDEPS repo so I will try to get this working on our side for a later PR.
@jedwards4b In med_methods_mod, we have
should that be
? |
This should not have passed testing - I'm looking into it now. |
@mvertens @jedwards4b I've been able to turn this feature on for UFS now, but I think I'm seeing a fairly big impact on run time. I ran our c96-1deg ocean+ice case for 5 days with this feature on and off. To turn it off, I just inserted a return at the top of the routine. Was a similar impact seen in CESM? I'm wondering how best to make this optional for UFS, since it would be a nice feature to have available. One way I thought of is
Or, there could be a config variable to toggle. Or, it could be controlled w/ a #ifdef DEBUG. Any thoughts? |
@DeniseWorthen Thanks for looking at the performance impact of this change. I will run a similar test with PFS.ne30pg3_t061.B1850MOM and let you know. I think it would be best to add a namelist variable to control it - I'll open an issue to track that. |
For a 20 day run with PFS.ne30pg3_t061.B1850MOM |
wow, much less impact for you. |
@jedwards4b - thanks for this test. I think it's worth keeping this check with this small a hit, particularly since we don't turn on floating point trapping in production runs. Do you agree? I think it should also be discussed at a CSEG meeting to make sure there is agreement to keep this, given that there is a cost. |
Another follow-up. It turns out that the impact UFS was seeing in run time w/ this feature on was not due to the NaN checking, but w/ the switching of the default PIO settings in an earlier PR (#378). If PIO settings are not provided, med_io had been using
This was changed to
If I revert that change, turning on the check-for-nan feature has negligible impact on wall-clock time for our 5-day test (in fact, it was a tiny bit faster, but that is just noise). I don't understand the PIO settings and how to choose them, but it looks like I can specify our old settings as a configuration using
@DeniseWorthen Thanks for looking into this further. The subset rearranger scales better but is slower at small task counts. |
@jedwards4b Interesting! Gerhard has been working w/ Jun on various scaling issues w/ the coupled model, and one of the issues he highlighted was slow I/O in both CMEPS and CICE. We have not been using PIO for CICE but started to look into it after Gerhard's evaluations. It sounds like we should also test using the subset rearranger. |
Description of changes
checking of nans in export field bundles #377
Specific notes
Introduced a new method, med_methods_FB_check_for_nans, in med_methods_mod, and made all of the med_phases_prep_xxx_mod.F90 files call it. If NaNs are found in any field, the model will abort, but it will first print out the number of NaNs found for each field on each processor.
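The count-report-abort behavior described above can be illustrated with a minimal sketch. This is not the CMEPS Fortran implementation: it is written in Python purely to show the logic, and the function names, the `rank` argument, and the field names (`field_a`, `field_b`) are hypothetical stand-ins for the ESMF field bundle and PET information used in the real code.

```python
import math


def count_nans(values):
    """Return how many entries in a flat sequence of floats are NaN."""
    return sum(1 for v in values if math.isnan(v))


def flatten(data):
    """Flatten a 1D list or a 2D list of lists into a single flat list."""
    flat = []
    for item in data:
        if isinstance(item, (list, tuple)):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


def check_for_nans(fields, rank=0):
    """Report NaN counts per field, then abort if any were found.

    fields: dict mapping a (hypothetical) field name to 1D or 2D data.
    rank:   hypothetical processor id, used only in the printed report.
    """
    report = {}
    for name, data in fields.items():
        n = count_nans(flatten(data))
        if n > 0:
            report[name] = n

    # Print the statistics for every offending field before aborting.
    for name, n in report.items():
        print(f"rank {rank}: field {name} has {n} NaN(s)")

    if report:
        raise RuntimeError("NaNs found in fields: " + ", ".join(sorted(report)))
    return report
```

The point the sketch tries to capture is that all fields are scanned and reported before the abort, so a single failed run shows every affected field rather than only the first one.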
Contributors other than yourself, if any: @jedwards4b
CMEPS Issues Fixed: #370
Are changes expected to change answers? bfb
Any User Interface Changes (namelist or namelist defaults changes)? None
Testing performed
Verified that if NaNs were set in the 1d and 2d fields in the call to med_phases_prep_atm, then PET files were written with the right statistics and the model aborted.