Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix WW3 NUOPC cap to allow using radiation stress components #2367

Conversation

uturuncoglu
Copy link
Collaborator

@uturuncoglu uturuncoglu commented Jul 16, 2024

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • Commit 'test_changes.list' from previous step

Description:

The current version of cap used by WW3 model does not allow coupling through the radiation stress components due to incomplete implementation. This PR fixes the issue related to radiation stress components in WW3 NUOPC cap (new mesh cap). Note that there is no any coupled RT that uses radiation stress components to couple WW3 with ocean component under UFS Weather Model. The changes are tested with DATM+SCHSIM+WW3 coupling under fork of ufo-weather-model used by UFS Coastal Application.

Commit Message:

* UFSWM - sync WW3
  * AQM - 
  * CDEPS - 
  * CICE - 
  * CMEPS - 
  * CMakeModules - 
  * FV3 - 
    * ccpp-physics - 
    * atmos_cubed_sphere - 
  * GOCART - 
  * HYCOM - 
  * MOM6 - 
  * NOAHMP - 
  * WW3 - fix incomplete implementation of radiation stress components
  * stochastic_physics - 

Priority:

  • Normal

Git Tracking

UFSWM:

  • None

Sub component Pull Requests:

UFSWM Blocking Dependencies:

  • None

Changes

Regression Test Changes (Please commit test_changes.list):

Since ww3 restart file is modified, the PR will affect the baseline for following tests:
hafs_regional_atm_wav
atmwav_control_noaero_p8

Input data Changes:

  • None.

Library Changes/Upgrades:

  • No Updates

Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • Gaea
    • Derecho
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

@uturuncoglu
Copy link
Collaborator Author

uturuncoglu commented Jul 16, 2024

@JessicaMeixner-NOAA I run full RT suite on Hercules and we have two failed test. Here are the details,

hafs_regional_atm_wav is failing

baseline dir = /work/noaa/epic/hercules/UFS-WM_RT/NEMSfv3gfs/develop-20240624/hafs_regional_atm_wav_intel
working dir  = /work2/noaa/stmp/tufuk/stmp/tufuk/FV3_RT/rt_4057563/hafs_regional_atm_wav_intel
Checking test hafs_regional_atm_wav_intel results ....
 Comparing atmf006.nc .....USING NCCMP......OK
 Comparing sfcf006.nc .....USING NCCMP......OK
 Comparing 20190829.060000.out_grd.ww3 .....USING CMP......OK
 Comparing 20190829.060000.out_pnt.ww3 .....USING CMP......OK
 Comparing ufs.hafs.ww3.r.2019-08-29-21600 .....USING CMP......NOT IDENTICAL
 Comparing ufs.hafs.cpl.r.2019-08-29-21600.nc .....USING NCCMP......OK

  0: The total amount of wall time                        = 823.460869
  0: The maximum resident set size (KB)                   = 990128

Test hafs_regional_atm_wav_intel FAIL Tries: 2

** atmwav_control_noaero_p8:**

baseline dir = /work/noaa/epic/hercules/UFS-WM_RT/NEMSfv3gfs/develop-20240624/atmwav_control_noaero_p8_intel
working dir  = /work2/noaa/stmp/tufuk/stmp/tufuk/FV3_RT/rt_4057563/atmwav_control_noaero_p8_intel
Checking test atmwav_control_noaero_p8_intel results ....
 Comparing sfcf000.nc .....USING NCCMP......OK
 Comparing sfcf012.nc .....USING NCCMP......OK
 Comparing atmf000.nc .....USING NCCMP......OK
 Comparing atmf012.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.coupler.res .....USING CMP......OK
 Comparing RESTART/20210322.180000.fv_core.res.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_core.res.tile1.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_core.res.tile2.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_core.res.tile3.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_core.res.tile4.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_core.res.tile5.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_core.res.tile6.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_srf_wnd.res.tile1.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_srf_wnd.res.tile2.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_srf_wnd.res.tile3.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_srf_wnd.res.tile4.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_srf_wnd.res.tile5.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_srf_wnd.res.tile6.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_tracer.res.tile1.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_tracer.res.tile2.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_tracer.res.tile3.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_tracer.res.tile4.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_tracer.res.tile5.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.fv_tracer.res.tile6.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.phy_data.tile1.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.phy_data.tile2.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.phy_data.tile3.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.phy_data.tile4.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.phy_data.tile5.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.phy_data.tile6.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.sfc_data.tile1.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.sfc_data.tile2.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.sfc_data.tile3.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.sfc_data.tile4.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.sfc_data.tile5.nc .....USING NCCMP......OK
 Comparing RESTART/20210322.180000.sfc_data.tile6.nc .....USING NCCMP......OK
 Comparing RESTART/ufs.atmw.cpl.r.2021-03-22-64800.nc .....USING NCCMP......OK
 Comparing 20210322.180000.out_pnt.ww3 .....USING CMP......OK
 Comparing 20210322.180000.out_grd.ww3 .....USING CMP......OK
 Comparing ufs.atmw.ww3.r.2021-03-22-64800 .....USING CMP......NOT IDENTICAL

I am not sure since those files are not netcdf but maybe this is related with having extra fields in the files. Is there any way to check time. Maybe I could run a restart test if there is to see it is changing answer or not.

@uturuncoglu
Copy link
Collaborator Author

@JessicaMeixner-NOAA Maybe it would be nice to run this two test with head of develop with updated WW3 (dev/ufs-weather-model without merging fix)

@JessicaMeixner-NOAA
Copy link
Collaborator

@JessicaMeixner-NOAA Maybe it would be nice to run this two test with head of develop with updated WW3 (dev/ufs-weather-model without merging fix)

I'm not sure I understand this @uturuncoglu I'm not sure what you mean by merging fix?

The changes do look odd to me. I don't know why these two tests would change but not others.

@uturuncoglu
Copy link
Collaborator Author

@JessicaMeixner-NOAA Okay. Sorry. I have just checked and UFS WM is using head of dev/ufs-weather-model. So, it is fine. I am not sure why this two test is showing answer change in the restart file. Maybe other WW3 tests are not checking the restart file and they looks passed. Not sure. Is there any way to check those file to see what is different. I have just check the sizes of the file in baseline and the run directory and they are same.

@uturuncoglu
Copy link
Collaborator Author

@JessicaMeixner-NOAA I also checked cpld_control_p8 and it seems it is not checking *.ww3.r.* files. So, it is normal not to fail. Those tests could also have issue with *.ww3.r.* files. The strange thing is that cpld_restart_p8 is passing without any issue and if there is a something related to the restart file, it must fail. Right?

@JessicaMeixner-NOAA
Copy link
Collaborator

@uturuncoglu I would not expect these changes to change anything for the restarts, so this is really odd to me. Unfortunately the restart files themselves are binary and difficult to track down what exactly is causing issues. We know about some issues in the restart file for unstructured grids (which is why they are not included as a baseline in those tests, a fix for that should be coming soon), but both cases you have failing are structured, and we do not know of any issues there.

@DeniseWorthen
Copy link
Collaborator

DeniseWorthen commented Jul 17, 2024

@uturuncoglu We don't compare the ww3 restarts for tests like the cpld_control because they do show as different for some reason. The fact that the restart test passes for the various global coupled tests argues for the difference to be in something that doesn't impact the actual data values.

The only thing I can see that your PR does is change the field names---which are aliased anyway. Can you try adding the fields

call fldlist_add(fldsFrWav_num, fldsFrWav, 'Sw_wavsuu')

as

call fldlist_add(fldsFrWav_num, fldsFrWav, 'wavsuu')

to check whether these two tests would fail? Your coastal fieldsExchange would still work, but for consistency we'd prefer state variables to be named like Sw as in your PR. But if not changing the name gave B4B then something weird is going on.

Neither of the two tests you mention failing have restart tests. I think you could build one for the atmwav, but I don't think regional MOM6 restarts B4B on the MOM6 side.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Thanks. This PR also adds missing fields to state as follows https://github.com/oceanmodeling/WW3/blob/e8b3ef615eb332198846cd7848d093672504cf30/model/src/wav_import_export.F90#L148. I could try to use original names and see what happens. I'll update you about it.

@DeniseWorthen
Copy link
Collaborator

DeniseWorthen commented Jul 17, 2024

@uturuncoglu You're right of course...sorry for not looking more closely. But your change is B4B with the tests which have restart tests though, right? Meaning both control and restart tests pass.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Yes. That is right. Restart tests passes without any issue. It is just a simple change. I am not sure it affects restart files of those two tests. Anyway, I am trying to your suggestion at this point to see what happens.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen I confirmed that I have same issue even if I use old variables names. hafs_regional_atm_wav test is facing same way and complain about restart file. I'll do couple of more test to remove different part of change to track down which one causing issue.

@DeniseWorthen
Copy link
Collaborator

DeniseWorthen commented Jul 17, 2024

@uturuncoglu OK. I think it's going to be difficult to track down what exactly is changing. I think the fact that a) all tests that actually test restart are B4B and b) all files except the WW3 restart file are B4B means it isn't really changing answers, it's just changing some bit in the restart file.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen If i remove following part of the code, the test passes without any issue. https://github.com/oceanmodeling/WW3/blob/e8b3ef615eb332198846cd7848d093672504cf30/model/src/wav_import_export.F90#L148. I am not sure how adding advertising new fields are changing the restart file at this point. Since, those are not connected in the ocean side, it must be fine. It is very puzzling for me. @mvertens do you have any idea? This is new mesh cap that is also used by NCAR CESM and you might have some suggestion.

@DeniseWorthen
Copy link
Collaborator

@uturuncoglu Removing just Sw_wavsuu and leaving the others produces B4B restarts?

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Let me check.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen yes, if I comment out Sw_wavsuu and keep others it is working fine. Do you know why?

@DeniseWorthen
Copy link
Collaborator

@uturuncoglu I don't have any immediate insight. We've never had these variables turned on for the global applications.

The thing I might try next is to see if a debug case for atmwav_control_noaero_p8_intel (you'd need to create this baseline for a debug compile) w/ and w/o the extra stress fields.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Okay. Let me try that.

@DeniseWorthen
Copy link
Collaborator

DeniseWorthen commented Jul 22, 2024

@uturuncoglu Just a data point----I checked that if you ask WW3 to write out the stress components, the fields in the wave history files are identical to what shows up in the coupler history files, so there doesn't seem to be any issue w/ how the cap is calculating these fields (the SR was lifted from the wmesmf code).

Note there is a typo that needs fixing

diff --git a/model/src/wav_grdout.F90 b/model/src/wav_grdout.F90
index 4583070d..4cde5e7e 100644
--- a/model/src/wav_grdout.F90
+++ b/model/src/wav_grdout.F90
@@ -223,8 +223,8 @@ contains

     !  6   Wave-ocean layer
     gridoutdefs(6,1:25) = [ &
-         varatts( "SXY  ", "SXX       ", "Radiation stresses xx                           ", "N m-1     ", "  ", .false.) , &
-         varatts( "SXY  ", "SYY       ", "Radiation stresses yy                           ", "N m-1     ", "  ", .false.) , &
+         varatts( "SXX  ", "SXX       ", "Radiation stresses xx                           ", "N m-1     ", "  ", .false.) , &
+         varatts( "SYY  ", "SYY       ", "Radiation stresses yy                           ", "N m-1     ", "  ", .false.) , &
          varatts( "SXY  ", "SXY       ", "Radiation stresses xy                           ", "N m-1     ", "  ", .false.) , &
          varatts( "TWO  ", "TAUOX     ", "Wave to ocean momentum flux x                   ", "m2 s-2    ", "  ", .false.) , &
          varatts( "TWO  ", "TAUOY     ", "Wave to ocean momentum flux y                   ", "m2 s-2    ", "  ", .false.) , &

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Thanks. I see the issue. Do you want me to add it to this PR? I still need to check the debug test.

@DeniseWorthen
Copy link
Collaborator

@uturuncoglu I think it makes sense to add the fix to your PR, since it is related to using the rad stresses for your application. Thanks.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen I tested code with atmwav_control_noaero_p8_debug test that I introduced based on your suggestion. It looks it is fine and passing without any issue.

@DeniseWorthen
Copy link
Collaborator

@uturuncoglu Your PR in debug mode can reproduce a debug baseline created from the current ufs-weather-model branch. Is that right?

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Yes. I created test and its baseline and run again to check. It was passing. Even with your recent changes related with writing SXX and SYY, the same two tests are still failing. Do you want me to double check?

@DeniseWorthen
Copy link
Collaborator

@uturuncoglu No need to repeat, I just want to be sure I understand your test. You say the same two tests are failing. So the current dev/u-w-m branch passes against it's own baseline in debug mode. But the PR branch in debug mode still fails comparison against that baseline.

I'm at a loss right now as to why adding these fields might change the restart file.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen I used my PR branch for all the tests but I commented the section that adds new variables and the tests are passing with it but if I add new variables then it fails. It is strange but only first variable (Sw_wavsuu) is causing issue. If I remember correctly I included Sw_wavsvv and Sw_wavsuv only (without adding Sw_wavsuu) and baseline was passing. So, it is definitely odd.

@DeniseWorthen
Copy link
Collaborator

@uturuncoglu I built a restart test for the atmwav test and tested it w/ your additional fields added to the field list. The restart reproduces the control case, and the ww3 restart file is also B4B at hour=12.

@DeniseWorthen
Copy link
Collaborator

DeniseWorthen commented Jul 28, 2024

@uturuncoglu Please test this fix. In my test, the wave restarts for the atmwav test with and w/o the extra fields are B4B.

I have a second fix which appears to mostly resolve the issues w/ the cpld tests, where ww3 restarts themselves were not B4B so they were never included in the comparison list.

diff --git a/model/src/w3iorsmd.F90 b/model/src/w3iorsmd.F90
index 24d9a280..8c813edf 100644
--- a/model/src/w3iorsmd.F90
+++ b/model/src/w3iorsmd.F90
@@ -1376,6 +1376,7 @@ CONTAINS
         TICE(1) = -1
         TICE(2) =  0
         TRHO(1) = -1
+        trho(2) =  0
         TIC1(1) = -1
         TIC1(2) =  0
         TIC5(1) = -1

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Okay. Let me test in my side too an add this fix to the PR. Thanks for your help.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Still same for me.

baseline dir = /work/noaa/epic/hercules/UFS-WM_RT/NEMSfv3gfs/develop-20240624/hafs_regional_atm_wav_intel
working dir  = /work2/noaa/stmp/tufuk/stmp/tufuk/FV3_RT/rt_988576/hafs_regional_atm_wav_intel
Checking test hafs_regional_atm_wav_intel results ....
 Comparing atmf006.nc .....USING NCCMP......OK
 Comparing sfcf006.nc .....USING NCCMP......OK
 Comparing 20190829.060000.out_grd.ww3 .....USING CMP......OK
 Comparing 20190829.060000.out_pnt.ww3 .....USING CMP......OK
 Comparing ufs.hafs.ww3.r.2019-08-29-21600 .....USING CMP......NOT IDENTICAL
 Comparing ufs.hafs.cpl.r.2019-08-29-21600.nc .....USING NCCMP......OK

Let me push the change an you could double check it is fine and working in your side.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Here is the commit for the fix: NOAA-EMC/WW3@5debd42

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen Do you think if baseline has issue?

@DeniseWorthen
Copy link
Collaborator

@uturuncoglu Did you create a new baseline (without the added fields) and compare against that? So, new baseline w/ trho(2)=0 with top of dev/ufs-weather-model and then run against that w/ your added fields.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen No, let me create a one and check against it.

@DeniseWorthen
Copy link
Collaborator

OK. I was only testing the atmwav_control_noaero_p8_intel test. I didn't test the failing hafs case.

@uturuncoglu
Copy link
Collaborator Author

@DeniseWorthen @JessicaMeixner-NOAA JFYI, hafs_regional_atm_wav test is passed after your fix. I think this PR will only modify baseline for hafs_regional_atm_wav and atmwav_control_noaero_p8. So, I'll update the PR description.

@uturuncoglu
Copy link
Collaborator Author

@JessicaMeixner-NOAA @DeniseWorthen I am closing this since we have new PR (#2396) now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants