Add C48L127 atmosphere only test and turn on the control_csawmg test on jet/cheyenne #724
Yes, please.
…On Tue, Aug 3, 2021 at 9:04 AM Dom Heinzeller ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In tests/fv3_conf/control_run.IN
<#724 (comment)>
:
> @@ -67,10 +69,17 @@ else
fi
-cp @[INPUTDATA_ROOT]/${inputdir}/INPUT/aerosol.dat .
-cp @[INPUTDATA_ROOT]/${inputdir}/INPUT/co2historicaldata_201*.txt .
-cp @[INPUTDATA_ROOT]/${inputdir}/INPUT/sfc_emissivity_idx.txt .
-cp @[INPUTDATA_ROOT]/${inputdir}/INPUT/solarconstant_noaa_an.txt .
+if [ $NPX = 49 ]; then
Are these input files (.txt files) different for low-resolution data? I
didn't know they were resolution-dependent. This seems confusing to me.
Or is it just that the directory structure is different for FV3_input_data48?
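
One way to settle a question like this is to compare the files in the two input directories byte for byte. The following is only a minimal sketch: the function name `compare_fix_files` and the directory arguments are hypothetical, and the actual layout of the input data directories is not known from this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): report whether the
# named files are byte-identical in two directories.
# Usage sketch, with assumed directory variables:
#   compare_fix_files "$DIR_DEFAULT/INPUT" "$DIR_C48/INPUT" aerosol.dat sfc_emissivity_idx.txt
compare_fix_files() {
  local dir_a="$1" dir_b="$2"
  shift 2
  local f status=0
  for f in "$@"; do
    # cmp -s is silent; it returns non-zero if the files differ
    # or if either file cannot be read.
    if cmp -s "$dir_a/$f" "$dir_b/$f"; then
      echo "identical: $f"
    else
      echo "differs or missing: $f"
      status=1
    fi
  done
  return "$status"
}
```

If every file is reported identical, the branch on `$NPX` would reflect a different directory layout and not resolution-dependent data.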
------------------------------
In tests/rt.conf
<#724 (comment)>
:
> @@ -59,7 +60,7 @@ COMPILE | -DAPP=ATM -DCCPP_SUITES=FV3_GFS_v16_RRTMGP,FV3_GFS_v16_csawmg,FV3_GFS_
RUN | control_rrtmgp | | fv3 |
#RUN | control_rrtmgp_2threads | | |
#RUN | control_rrtmgp_c192 | | fv3 |
-RUN | control_csawmg | - jet.intel cheyenne.intel | fv3 |
+RUN | control_csawmg | - cheyenne.intel | fv3 |
Curious if this now works on Cheyenne, too. Shall we try?
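
For readers unfamiliar with the machine column in the diff above, here is a minimal sketch of how such a column could be interpreted. It assumes that a leading `-` means "run everywhere except the listed machines", a leading `+` means "run only on the listed machines", and an empty column means "no restriction". These semantics and the function name `test_enabled` are assumptions for illustration, not something this thread confirms.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decide whether a test is enabled on a machine,
# given the machine column of an rt.conf RUN line.
# Assumed semantics (not confirmed by this thread):
#   ""            -> enabled everywhere
#   "- a b ..."   -> enabled everywhere except a, b, ...
#   "+ a b ..."   -> enabled only on a, b, ...
test_enabled() {
  local machine_id="$1" column="$2"
  local -a fields
  # Split the column on whitespace; an empty column yields an empty array.
  read -r -a fields <<< "$column"
  if (( ${#fields[@]} == 0 )); then
    return 0
  fi
  local mode="${fields[0]}" m listed=1
  for m in "${fields[@]:1}"; do
    if [[ "$m" == "$machine_id" ]]; then
      listed=0
    fi
  done
  case "$mode" in
    -) (( listed != 0 )) ;;  # enabled only if the machine is NOT listed
    +) (( listed == 0 )) ;;  # enabled only if the machine IS listed
    *) return 2 ;;           # unrecognised column format
  esac
}

# Compare the old and new control_csawmg columns from the diff.
for column in "- jet.intel cheyenne.intel" "- cheyenne.intel"; do
  for machine in jet.intel cheyenne.intel; do
    if test_enabled "$machine" "$column"; then
      echo "[$column] $machine: enabled"
    else
      echo "[$column] $machine: skipped"
    fi
  done
done
```

Under these assumed semantics, the changed line turns the test on for jet.intel while still skipping cheyenne.intel.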
|
Co-authored-by: Jiande Wang <jiande.wang@noaa.gov>
Testing this now. |
@junwang-noaa the test now runs on Cheyenne with Intel and is run-to-run reproducible. I also tried it with GNU, but I get the following FPE from all tasks:
|
@SMoorthi-emc @junwang-noaa @DusanJovic-NOAA I looked at the GNU problems with csawmg_debug. It crashes right away in the dycore init, it seems, see first screenshot. The second screenshot shows that two variables Two questions.
How about 1., any insights? |
I got around the crash in the dycore by changing
to
so that
The reason is that some of the variables, e.g. |
@climbfuji Sorry, I forgot to change iaer in control_csawmg_debug. I just updated the code. Do you use the new iaer (1011) in the control_csawmg_debug test? |
I think the debug test didn't have the correct iaer option. But it runs (and always ran) with Intel, just not with GNU. Will try again with GNU, but I think we should change the hord options in |
Ok, now I am back to the very old / original error I have seen a few years back, memory corruption:
Will try to track it down. Since this happened with GNU 8, 9 and 10, it is unlikely that it is a compiler bug. |
Co-authored-by: Minsuk Ji <minsuk.ji@noaa.gov>
@SMoorthi-emc Are you OK to change the dycore namelist for hord variables as Dom pointed out? |
Machine: orion |
Baseline did not get generated at all, although 92 baseline generation tests passed. @BrianCurtis-NOAA can you please copy /work/noaa/stmp/bcurtis/stmp/bcurtis/FV3_RT/REGRESSION_TEST_INTEL to /work/noaa/nems/emc.nemspara/RT/NEMSfv3gfs/develop-20210805/INTEL? |
Machine: gaea |
@@ -560,6 +561,8 @@ export FNALBC="'global_snowfree_albedo.bosu.t126.384.190.rg.grb',"
export FNVETC="'global_vegtype.igbp.t126.384.190.rg.grb',"
export FNSOTC="'global_soiltype.statsgo.t126.384.190.rg.grb',"
export FNSMCC="'global_soilmgldas.t126.384.190.grb',"
Do we need both FNSMCC and FNSMCC_control?
Yes at this time. The non-control related tests are using FNSMCC. I hope we can unify them when all the global atm tests are updated.
slurmstepd: error: *** JOB 268863209 ON nid00665 CANCELLED AT 2021-08-05T17:03:15 DUE TO TIME LIMIT *** |
Machine: cheyenne |
@BrianCurtis-NOAA this is the same failing test as in the previous commit, when something (we still don't know exactly which script) got killed around 1am MT after being in the queue for 800 minutes. |
@climbfuji I checked whether PBS has a timeout setting for the queue, and I couldn't find information to confirm it either way. At first glance it's either a PBS timeout or just a coincidence that the machine killed those jobs at around 1AM after they had been in the queue for 800 minutes. |
Sorry, my explanation was poor. The 800 minutes were from the previous failure, I didn't check how long it was in the queue this time before it got killed. We need to monitor if using the economy queue delays the jobs too much. The "process scrubber" on Cheyenne kills the cron jobs around 1am MT, if I remember correctly. |
Co-authored-by: Benjamin.Blake EMC <Benjamin.Blake@v71a1.ncep.noaa.gov> Co-authored-by: Benjamin.Blake EMC <Benjamin.Blake@v72a1.ncep.noaa.gov> Co-authored-by: Benjamin.Blake EMC <Benjamin.Blake@v71a3.ncep.noaa.gov> Co-authored-by: chan-hoo <chan-hoo.jeon@noaa.gov>
PR Checklist
This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.
This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR
An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR are specified below.
If new or updated input data is required by this PR, it is clearly stated in the text of the PR.
Instructions: All subsequent sections of text should be filled in as appropriate.
The information provided below allows the code managers to understand the changes relevant to this PR, whether those changes are in the ufs-weather-model repository or in a subcomponent repository. Ufs-weather-model code managers will use the information provided to add any applicable labels, assign reviewers and place it in the Commit Queue. Once the PR is in the Commit Queue, it is the PR owner's responsibility to keep the PR up-to-date with the develop branch of ufs-weather-model.
Description
This PR adds a C48L127 atmosphere-only test to the regression test suite. The test requires few resources, which facilitates infrastructure/workflow development.
Issue(s) addressed
Link the issues to be closed with this PR, whether in this repository, or in another repository.
(Remember, issues must always be created before starting work on a PR branch!)
Testing
How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)
Dependencies
fv3atm PR#356
ufs-weather-model PR#724