aerosol fields do not reproduce when fhmax=4,fhzero=2 #1190

Open
DeniseWorthen opened this issue Apr 25, 2022 · 34 comments
Labels: bug (Something isn't working), EPIC Support Requested


@DeniseWorthen
Collaborator

DeniseWorthen commented Apr 25, 2022

Description

To reduce the time required by the updated cpld_bmark_p8 test with the mesh cap for PR #1131, I've tried to reduce fhmax to 4 and restart the model from hour 2.

All files reproduce except for the atmf004.tile[1-6].nc and fv_tracer.res.tile[1-6].nc restart files. These files differ only in the following fields: nh3, nh4a, no3an1, no3an2, no3an3, pm25, pm10.

To Reproduce:

A test branch using the current cpld_control_c96_p8, modified to run for fhmax=4, is here: branch. This test produces the same field differences as those in the updated cpld_bmark_p8 test.

The control and restart cases in the test branch can be run using the oRT command:

./opnReqTest -n cpld_control_c96_p8 -c rst -ek

This will use ecflow and keep the run directory.

@DeniseWorthen DeniseWorthen added the bug Something isn't working label Apr 25, 2022
@junwang-noaa
Collaborator

junwang-noaa commented Apr 27, 2022

@weiyuan-jiang May I ask if there is any restriction on the restart intervals for the species of nh and no3?

@bbakernoaa
Collaborator

@DeniseWorthen Does this include the updated compiler flags? I can't find that issue/PR right now, but I believe that @rmontuoro had fixed this issue.

@DeniseWorthen
Collaborator Author

DeniseWorthen commented Apr 27, 2022

I can get restart reproducibility with the current configuration which uses fhmax in intervals of 6 (depending on the test). It is when reducing the fhmax to 4 (and fhzero to either 1 or 2) that the aerosol fields are not reproducing.

@JessicaMeixner-NOAA
Collaborator

If you use Dusan's PR: #1171 does it help? I believe that's the issue/fix Barry is referring to.

@weiyuan-jiang
Collaborator

> @weiyuan-jiang May I ask if there is any restriction on the restart intervals for the species of nh and no3?

Sorry, I cannot answer the question, but I can ask around for you.

@DeniseWorthen
Collaborator Author

Thanks @JessicaMeixner-NOAA, I understood which fix Barry was referring to.

I can test Dusan's compile options. However, since aerosols reproduce using fhmax=6,fhzero=6, I would be surprised if that explains why it is not reproducing at fhmax=4, fhzero=2.

@DeniseWorthen
Collaborator Author

I tested using oRT after merging Dusan's release_flags branch and obtained the same non-reproducing aerosol fields.

@DeniseWorthen
Collaborator Author

I've updated the test branch to try a 3/1/4 restart test. The oRT enforces the restart time at FHMAX/2, so the 3/1/4 test cannot be run with the oRT. Also, because of MOM6 Issue #90, the comparison of MOM6 restarts will need to be removed if the 3/1/4 test otherwise reproduces.

@mathomp4

mathomp4 commented May 2, 2022

Query for someone in GEOS-land, do you have a "descriptive" explanation for the variables in play here? I've never run UFS so I'm a bit in the dark. 😄 I'm sort of guessing they are like our DT (time steps)?

@weiyuan-jiang
Collaborator

Are the restart files the only input files? Are there any Extdata in the tests? @junwang-noaa

@junwang-noaa
Collaborator

@mathomp4 The test case is a C96 global coupled forecast case. The atmosphere time step is 720 s; it does not change between the control (fh0->4hr from a cold start) and the restart test (fh0->2 from a cold start, then fh2->4 with restart). In the restart test, the forecast restarts at fh=2 using the restart files and continues to run for 2 hours to reach fh=4hr.
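
To make the segment lengths concrete, here is a small sketch of the step arithmetic implied by the 720 s time step. The helper name `steps_in` is made up for illustration and is not part of UFS or the oRT; the only input taken from the thread is the 720 s time step and the 4-hour, 2+2 and 3+1 splits.

```python
DT_ATMOS = 720  # atmosphere time step in seconds, as stated in the comment above


def steps_in(hours: int, dt: int = DT_ATMOS) -> int:
    """Return the number of atmosphere time steps in a segment of `hours`."""
    seconds = hours * 3600
    if seconds % dt != 0:
        raise ValueError("segment length is not a whole number of time steps")
    return seconds // dt


control_steps = steps_in(4)                # single run, fh0 -> fh4
split_2_2 = steps_in(2) + steps_in(2)      # fh0 -> fh2, restart, fh2 -> fh4
split_3_1 = steps_in(3) + steps_in(1)      # fh0 -> fh3, restart, fh3 -> fh4

print(control_steps, split_2_2, split_3_1)
```

Both splits cover the same number of steps as the control, so a difference between the 2+2 and 3+1 cases cannot come from the step count alone.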

@DeniseWorthen
Collaborator Author

@mathomp4 The fhmax is the forecast length. In this case, we are running the model forward 4 hours and writing a restart for the components at hour 2. Using the restarts at hour 2, the model is run from hour 2 to hour 4. What I am comparing are the FV3 tracer restart files and the model forecast files between the initial run (hours 0:4) and the restart run (hours 2:4).

I can get the aerosol fields to reproduce if I do the same test using a restart at hour 3. In this case I'm still running the model 4 hours but I'm using a restart from hour 3 to restart to run the final 1 hour.

fhzero is the interval when accumulated fields are re-zeroed. I've actually tested w/ both fhzero=1 and 2, so I think it is not really a fhzero issue.

@SMoorthi-emc
Contributor

SMoorthi-emc commented May 2, 2022 via email

@DeniseWorthen
Collaborator Author

Thanks @SMoorthi-emc. I think I did have fhout set to either 2 (for fhzero=2) or 1 (for fhzero=1) but I will recheck.

@weiyuan-jiang I'm not sure how to answer your question. I have a run directory on hera here

/scratch1/NCEPDEV/stmp2/Denise.Worthen/FV3_OPNREQ_TEST/opnReqTest_14673/cpld_control_c96_p8_std_base

@mathomp4

mathomp4 commented May 2, 2022

Okay. I can confirm this on the GEOS end it seems. I ran a start-stop run of 4 hours vs 2+2 and I'm getting restart failures as well. I guess my nightly tests never picked up on this because my 'default' regression start-stop test is 24 vs 18+6...and there's a lot of 3s in that.

I've pinged @bena-nasa about this as well as @weiyuan-jiang and @tclune from our group knowing this.

@bena-nasa

bena-nasa commented May 2, 2022

Hi All,
there appears to be a hard coded 3 hourly frequency here
https://github.com/GEOS-ESM/GOCART/blob/v2.0.6/ESMF/GOCART2G_GridComp/NI2G_GridComp/NI2G_GridCompMod.F90#L393
and here:
https://github.com/GEOS-ESM/GOCART/blob/v2.0.6/ESMF/GOCART2G_GridComp/SU2G_GridComp/SU2G_GridCompMod.F90#L480

in gocart2g

If I change this to a 2-hour frequency, then a run of 4 hours vs 2 + 2 passes our start-stop regression. So this seems suspicious and could explain why something involving 2 hours is misbehaving (just speculation for UFS since I can't test, but it certainly explains why our own regression failed if the run length was not a multiple of 3 hours). It seems the run segment length needs to be a multiple of this interval, or perhaps something needs to be saved in a checkpoint that currently is not, and the logic for this needs to be tightened. I'll open an issue in the gocart repository.
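
The behaviour described above can be mimicked with a toy model. This is only a sketch under loud assumptions, not GOCART code: it assumes the alarm rings at the start of every run segment and every `interval` hours after that, that the quantity refreshed by the alarm is not written to the checkpoint, and it uses a one-hour step for brevity. The names `oxidant_at`, `run_segment` and `reproduces` are hypothetical.

```python
def oxidant_at(hour: int) -> float:
    """Hypothetical stand-in for whatever is refreshed when the alarm rings."""
    return 1.0 + 0.1 * hour


def run_segment(tracer: float, start: int, stop: int, interval: int) -> float:
    """Advance the toy tracer from `start` to `stop` (hours, one step per hour).

    Assumption: the alarm is referenced to the segment start, and the
    refreshed value lives only in memory (it is lost on restart).
    """
    oxidant = 0.0
    for hour in range(start, stop):
        if (hour - start) % interval == 0:  # alarm rings
            oxidant = oxidant_at(hour)
        tracer += oxidant
    return tracer


def reproduces(fhmax: int, restart_hour: int, interval: int = 3) -> bool:
    """Compare a single run against a run restarted at `restart_hour`."""
    control = run_segment(0.0, 0, fhmax, interval)
    first = run_segment(0.0, 0, restart_hour, interval)
    restarted = run_segment(first, restart_hour, fhmax, interval)
    return control == restarted


print(reproduces(4, 2))              # 4 vs 2+2 with a 3-hour alarm
print(reproduces(4, 3))              # 4 vs 3+1 with a 3-hour alarm
print(reproduces(4, 2, interval=2))  # 4 vs 2+2 with a 2-hour alarm
```

In this toy setup the 2+2 split with a 3-hour alarm does not match the control, while the 3+1 split and the 2-hour alarm do, which is the same pattern reported in this thread. Whether the real mechanism is identical is not established here.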

@weiyuan-jiang
Collaborator

Here are Arlindo's comments. Quote: "There was a reason why the 3 hour alarm was hardwired, as not to give the user the illusion that they could specify any other value. An easier solution may involve changing the way we handle these oxidants. The way this oxidant is "recycled" always appeared contrived in my opinion. So, stop trying to find a way to address this in code. There is no deep mandate to keep this algorithm. Let us discuss this in our aerosol group meeting."

@tclune

tclune commented May 2, 2022

@weiyuan-jiang when did he make that comment?

@bena-nasa

@junwang-noaa
Collaborator

@bena-nasa @weiyuan-jiang May I ask if there is any update on this issue? Thanks

@weiyuan-jiang
Collaborator

I am not aware of any update on this issue. @junwang-noaa

@bena-nasa

@junwang-noaa
Our best thought is that the issue is this 3-hourly frequency hard coded in gocart (see the issue linked above in the gocart repo). I think the issue is two-fold: the alarm needs to be created with a fixed reference time, and an extra field needs to be in the checkpoint file. Unfortunately, I was having some misbehaviour with the ESMF alarms when I tried to fix this. In that issue Arlindo commented that perhaps the algorithm itself needs to be changed altogether, but I have not heard anything more on that.
I was on vacation the last several days. I can take a second look at fixing the current algorithm as is; maybe I did something wrong in my first attempt.
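
As a rough illustration of the two-part idea (a fixed reference time for the alarm, plus an extra field in the checkpoint), here is a self-contained toy sketch. It is not the GOCART implementation and does not use ESMF; `Checkpoint`, `oxidant_at`, `run_segment` and `ref_hour` are hypothetical names, and the one-hour step is a simplification.

```python
from dataclasses import dataclass


def oxidant_at(hour: int) -> float:
    """Hypothetical stand-in for whatever is refreshed when the alarm rings."""
    return 1.0 + 0.1 * hour


@dataclass
class Checkpoint:
    hour: int       # forecast hour at which the state was written
    tracer: float   # toy prognostic tracer
    oxidant: float  # the extra field that has to survive a restart


def run_segment(state: Checkpoint, stop: int,
                interval: int = 3, ref_hour: int = 0) -> Checkpoint:
    """Advance from `state.hour` to `stop`, one step per hour.

    Assumption: the alarm rings relative to a fixed `ref_hour`, not
    relative to the start of the current segment.
    """
    tracer, oxidant = state.tracer, state.oxidant
    for hour in range(state.hour, stop):
        if (hour - ref_hour) % interval == 0:  # alarm rings
            oxidant = oxidant_at(hour)
        tracer += oxidant
    return Checkpoint(stop, tracer, oxidant)


start = Checkpoint(hour=0, tracer=0.0, oxidant=0.0)
control = run_segment(start, 4)
for restart_hour in (1, 2, 3):
    restarted = run_segment(run_segment(start, restart_hour), 4)
    print(restart_hour, restarted == control)
```

With both changes in place, the toy run gives the same final state for any restart hour, because a restarted segment neither re-rings the alarm early nor loses the last refreshed value.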

@junwang-noaa
Collaborator

A related issue, #1207, was created to allow the model to restart at fh=3hr and write out restart files at the end of the forecast, fh=4.

@junwang-noaa
Collaborator

junwang-noaa commented Oct 11, 2022 via email

@junwang-noaa
Collaborator

Since the 3-hourly frequency is hardcoded in GOCART, some code changes are required on the GOCART side to allow this capability. I will close the issue at this time.

@mathomp4

mathomp4 commented Jul 3, 2023

> Since the 3-hourly frequency is hardcoded in GOCART, some code changes are required on the GOCART side to allow this capability. I will close the issue at this time.

@junwang-noaa I think this was fixed (or at least partially) by @bena-nasa in GEOS-ESM/GOCART#224? That PR got into GOCART v2.2.0.

@junwang-noaa junwang-noaa reopened this Jul 3, 2023
@junwang-noaa
Collaborator

@mathomp4 That is great! Currently we have a PR with GOCART pointing to the develop branch as of 5/4 ("Ensure GOCART2G can run without the NI component"). Do we need to make additional changes in the GOCART configurations when switching to GOCART v2.2.0?

@mathomp4

mathomp4 commented Jul 3, 2023

@junwang-noaa What hash are you pointing to? I can look at what's different.

Also, I suppose I'd say use v2.2.1, as that has a bug fix on top of 2.2.0.

@junwang-noaa
Collaborator

It is this version.

@mathomp4

mathomp4 commented Jul 3, 2023

Okay. So v2.1.4 essentially. I think you should be able to go to v2.2.1 without any big issues that I can see (famous last words).

@junwang-noaa
Collaborator

Thanks for checking. I will update and test, will let you know if I run into any issues.

@zach1221
Collaborator

zach1221 commented Mar 4, 2024

> Thanks for checking. I will update and test, will let you know if I run into any issues.

Hi, @junwang-noaa. Can this issue be closed, or is there further work required here?

@junwang-noaa
Collaborator

I haven't had a chance to finalize it. Will EPIC test it?

@zach1221
Collaborator

zach1221 commented Mar 4, 2024

> I don't have a chance to finalize it. Will EPIC test it?

Yes, I can test.

10 participants