Update CMEPS for latest ESCOMP/master; Update PIO to 2.5.2; Refactor modules #492

Merged
merged 102 commits into ufs-community:develop from feature/updcmeps on Apr 9, 2021

Conversation

DeniseWorthen
Collaborator

@DeniseWorthen DeniseWorthen commented Mar 29, 2021

PR Checklist

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR
    are specified below.

  • If new or updated input data is required by this PR, it is clearly stated in the text of the PR.

Instructions: All subsequent sections of text should be filled in as appropriate.

The information provided below allows the code managers to understand the changes relevant to this PR, whether those changes are in the ufs-weather-model repository or in a subcomponent repository. Ufs-weather-model code managers will use the information provided to add any applicable labels, assign reviewers and place it in the Commit Queue. Once the PR is in the Commit Queue, it is the PR owner's responsibility to keep the PR up-to-date with the develop branch of ufs-weather-model.

Description

Provide a detailed description of what this PR does. What bug does it fix, or what feature does it add? Is a change of answers expected from this PR? Are any library updates included in this PR (modulefiles etc.)?

  • Updates CMEPS to the latest emc/develop
  • Removes shr_pio_mod.F90 from CMEPS-interface/CMakeLists.txt
  • Removes two files (pio_in and med_modelio.nml) from tests/parm and run directory
  • Updates pio module to 2.5.2 for all platforms
  • Replaces SUITE_NAME with CCPP_SUITE in coupled tests
  • Refactors modules into ufs_common and ufs_common_debug (PR Refactor modulefiles #482)
  • Switches to the h-nems area on jet
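
The SUITE_NAME to CCPP_SUITE rename listed above can be illustrated with a minimal sketch. The helper name `rename_suite_variable` and the sample text are hypothetical and are not taken from the repository; the sketch only shows a whole-word rename that leaves longer variable names untouched.

```python
import re
from typing import Tuple

# Hypothetical helper, not part of ufs-weather-model: rename the shell
# variable SUITE_NAME to CCPP_SUITE in the text of a test definition.
# The word boundaries keep longer names such as MY_SUITE_NAME_X untouched,
# because "_" counts as a word character.
_OLD_NAME = re.compile(r"\bSUITE_NAME\b")


def rename_suite_variable(text: str) -> Tuple[str, int]:
    """Return the updated text and the number of replacements made."""
    return _OLD_NAME.subn("CCPP_SUITE", text)


# Illustrative input only; the real test files may look different.
sample = (
    "export SUITE_NAME=FV3_GFS_v15p2_couplednsst\n"
    "export NSTF_NAME=2,1,0,0,0\n"
)
updated, count = rename_suite_variable(sample)
print(updated, end="")
print("replacements:", count)
```

A count of zero for a coupled test file would suggest the file already uses CCPP_SUITE or never set the variable at all.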

Issue(s) addressed

Link the issues to be closed with this PR, whether in this repository, or in another repository.
(Remember, issues must always be created before starting work on a PR branch!)

CMEPS #37
UFS weather #415
UFS weather #428
UFS weather #510

Testing

  • The full RT suite (intel and gnu) was run on hera on 4/5/21 and all tests are B4B with current baselines (develop-20210401).
  • The full RT also passed on Gaea against develop-20210401.

How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss_cray
  • wcoss_dell_p3

Dependencies

  • waiting on CMEPS PR #36

co-author: @binli2337
co-author: @DusanJovic-NOAA

use updcmeps branch of CMEPS-interface/CMEPS
add file to CMakeLists
shorten name of nems.configure file for coupled model
clean up white space
add rahul's fix for optionally loading fv3_debug if exists and
debug=y
update to emc/develop after reversion of med_io_mod changes
update baseline to develop-20201106; skip-ci
verify changes to cmeps in feature/bulk branch do not affect
baselines for cpld model
@BrianCurtis-NOAA
Collaborator

Machine: orion
Compiler: intel
Job: RT
Repo location: /work/noaa/nems/emc.nemspara/autort/pr/602704560/20210409084509/ufs-weather-model
Please manually delete: /work/noaa/stmp/bcurtis/stmp/bcurtis/FV3_RT/rt_275314
Test cpld_2threads 076 failed in check_result failed
Test cpld_2threads 076 failed in run_test failed
Please make changes and add the following label back:
orion-intel-RT
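
A report like the one above lists the same test once per failed stage. As a rough sketch, the failing tests and their stages could be collected as follows. The line pattern is inferred from the two lines shown in this comment only, and `failed_stages` is a hypothetical name, not part of the auto-RT tooling.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Pattern inferred from the two report lines quoted in this thread; the real
# auto-RT report format may contain other line shapes that this ignores.
_FAILED = re.compile(r"^Test\s+(\S+)\s+(\d+)\s+failed in\s+(\S+)")


def failed_stages(report: str) -> Dict[str, List[str]]:
    """Map each failed test name to the stages it failed in, in report order."""
    stages: Dict[str, List[str]] = defaultdict(list)
    for line in report.splitlines():
        match = _FAILED.match(line.strip())
        if match:
            name, _number, stage = match.groups()
            stages[name].append(stage)
    return dict(stages)


report = """\
Test cpld_2threads 076 failed in check_result failed
Test cpld_2threads 076 failed in run_test failed
"""
print(failed_stages(report))
```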

@DeniseWorthen
Collaborator Author

I repeated the failed orion test by copying the cpld_control and cpld_2threads run directories and running both again. Both tests passed manual comparison of the mediator restart file w/ the develop-20210406 cpld_control test.

I will keep a copy of the auto-rt log and re-run the 2threads test manually.
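
The thread does not say which tool was used for the manual comparison. As a minimal sketch of a bit-for-bit (b4b) check, assuming plain byte equality is the criterion, two files can be compared by size and SHA-256 digest. The function names are hypothetical, and a field-by-field netCDF comparison could give a different verdict if metadata differ.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large restart files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_b4b(candidate: Path, baseline: Path) -> bool:
    """Return True when the two files are byte-identical (assumed b4b criterion)."""
    # A size mismatch already rules out identity and avoids hashing both files.
    if candidate.stat().st_size != baseline.stat().st_size:
        return False
    return sha256_of(candidate) == sha256_of(baseline)
```

Usage would be along the lines of `is_b4b(Path(run_dir_file), Path(baseline_file))`, where both paths are supplied by the person running the check.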

* cpld_bmarkfrac_v16_nsst test failed in original RT
because the variable SUITE_NAME had not been updated
to CCPP_SUITE. The test was repeated with the correct
variable and passed
* cpld_2thread test failed in original auto-rt. The
run directory was copied and the test re-run. The mediator
restart file compared b4b with the existing baseline file.
The test was re-run manually and passed. The log was
appended to the auto-rt log
@DeniseWorthen DeniseWorthen added the "Ready for Commit Queue" and "Waiting for Reviews" labels Apr 9, 2021
@junwang-noaa
Collaborator

@DeniseWorthen Has CI been run? Otherwise I think it's ready for commit. Thanks for all the testing!

@DeniseWorthen
Collaborator Author

> @DeniseWorthen Has CI been run? Otherwise I think it's ready for commit. Thanks for all the testing!

I ran CI when I committed the cheyenne.gnu log but it doesn't look like it worked.

@MinsukJi-NOAA
Contributor

> > @DeniseWorthen Has CI been run? Otherwise I think it's ready for commit. Thanks for all the testing!
>
> I ran CI when I committed the cheyenne.gnu log but it doesn't look like it worked.

The thr and dbg tests failed. Since they are the longest-running jobs, I believe they failed because the next commit automatically stopped the ec2 instances. I am working on a PR to fix this recurring issue in the CI test.

@DeniseWorthen DeniseWorthen merged commit ade15dd into ufs-community:develop Apr 9, 2021
MinsukJi-NOAA pushed a commit to MinsukJi-NOAA/ufs-weather-model that referenced this pull request Apr 12, 2021
* First test

* Change branch name

* github event name...

* change to tests/ci dir

* change repo name

* print out stderr

* try recursive checkout

* checkout myself

* in a hurry

* fetch owner id

* Make setup depend on prcheck

* update with fv3 & ccpp/physics updates in order to turn the NSST model on in the coupled model (Replace PR ufs-community#453) (ufs-community#483)

* point to fv3 branch
* Add one more test cpld_control_nsst to test added ccpp suite (FV3_GFS_v15p2_couplednsst)
* Modify rt.conf to add FV3_GFS_v15p2_couplednsst
* Modify rt.conf to add a new test, cpld_bmarkfrac_v16_nsst,  remove test cpld_control_nsst
* Modify tests/tests/cpld_bmarkfrac_v16_nsst: 1. cpld_bmarkfrac_v16 to cpld_bmarkfrac_v16_nsst. 2. export NSTF_NAME=2,1,0,0,0 to export nstf_name=2,1,0,0,0.
* Modify input.benchmark_v16.nml.IN & cpld_bmarkfrac_v16_nsst for a consistent definition of nstf_name namelist
* Modify cpld_bmarkfrac_v16_nsst by moving the NSTF_NAME to the namelist field updates section
* RegressionTests_orion.intel.log of the rt run and BL_DATE=20210406 in rt.sh
* RT JOBS PASSED: hera.intel. Log file uploaded.
* run-ci, commit 7 RegressionTest log files
* Push RegressionTests_wcoss_dell_p3.log

Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>

* remove develop

* quiet git commands. run-ci

* Update CMEPS for latest ESCOMP/master; Update PIO to 2.5.2; Refactor modules (ufs-community#492)


* update ufs for CMEPS master with PIO changes
* update pio to 2.5.2 across platforms
* replace variable SUITE_NAME with CCPP_SUITE
* Merge remote-tracking branch 'DusanJovic/module_common' into feature/updcmeps
* switch to h-nems area on jet

Co-authored-by: Bin Li <Bin.Li@gaea13.ncrc.gov>
Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>

* debug this again...

* debug again

* debug continues

* missed repo field

* remove debug related texts

* remove -x flag. run-ci

* reduce sleep time

* diag

* diag2

* Fix pr_uid

* diag again

Co-authored-by: XuLi-NOAA <55100838+XuLi-NOAA@users.noreply.github.com>
Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>
Co-authored-by: Denise Worthen <denise.worthen@noaa.gov>
Co-authored-by: Bin Li <Bin.Li@gaea13.ncrc.gov>
Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>
MinsukJi-NOAA pushed a commit to MinsukJi-NOAA/ufs-weather-model that referenced this pull request Apr 12, 2021
* First test

* Change branch name

* github event name...

* change to tests/ci dir

* change repo name

* print out stderr

* try recursive checkout

* checkout myself

* in a hurry

* fetch owner id

* Make setup depend on prcheck

* update with fv3 & ccpp/physics updates in order to turn the NSST model on in the coupled model (Replace PR ufs-community#453) (ufs-community#483)

* point to fv3 branch
* Add one more test cpld_control_nsst to test added ccpp suite (FV3_GFS_v15p2_couplednsst)
* Modify rt.conf to add FV3_GFS_v15p2_couplednsst
* Modify rt.conf to add a new test, cpld_bmarkfrac_v16_nsst,  remove test cpld_control_nsst
* Modify tests/tests/cpld_bmarkfrac_v16_nsst: 1. cpld_bmarkfrac_v16 to cpld_bmarkfrac_v16_nsst. 2. export NSTF_NAME=2,1,0,0,0 to export nstf_name=2,1,0,0,0.
* Modify input.benchmark_v16.nml.IN & cpld_bmarkfrac_v16_nsst for a consistent definition of nstf_name namelist
* Modify cpld_bmarkfrac_v16_nsst by moving the NSTF_NAME to the namelist field updates section
* RegressionTests_orion.intel.log of the rt run and BL_DATE=20210406 in rt.sh
* RT JOBS PASSED: hera.intel. Log file uploaded.
* run-ci, commit 7 RegressionTest log files
* Push RegressionTests_wcoss_dell_p3.log

Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>

* remove develop

* quiet git commands. run-ci

* Update CMEPS for latest ESCOMP/master; Update PIO to 2.5.2; Refactor modules (ufs-community#492)


* update ufs for CMEPS master with PIO changes
* update pio to 2.5.2 across platforms
* replace variable SUITE_NAME with CCPP_SUITE
* Merge remote-tracking branch 'DusanJovic/module_common' into feature/updcmeps
* switch to h-nems area on jet

Co-authored-by: Bin Li <Bin.Li@gaea13.ncrc.gov>
Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>

* debug this again...

* debug again

* debug continues

* missed repo field

* remove debug related texts

* remove -x flag. run-ci

* reduce sleep time

* diag

* diag2

* Fix pr_uid

* diag again

* still debugging

* lets see if this works

* use context as main yml key

Co-authored-by: XuLi-NOAA <55100838+XuLi-NOAA@users.noreply.github.com>
Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>
Co-authored-by: Denise Worthen <denise.worthen@noaa.gov>
Co-authored-by: Bin Li <Bin.Li@gaea13.ncrc.gov>
Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>
MinsukJi-NOAA pushed a commit to MinsukJi-NOAA/ufs-weather-model that referenced this pull request Apr 14, 2021
* First test

* Change branch name

* github event name...

* change to tests/ci dir

* change repo name

* print out stderr

* try recursive checkout

* checkout myself

* in a hurry

* fetch owner id

* Make setup depend on prcheck

* update with fv3 & ccpp/physics updates in order to turn the NSST model on in the coupled model (Replace PR ufs-community#453) (ufs-community#483)

* point to fv3 branch
* Add one more test cpld_control_nsst to test added ccpp suite (FV3_GFS_v15p2_couplednsst)
* Modify rt.conf to add FV3_GFS_v15p2_couplednsst
* Modify rt.conf to add a new test, cpld_bmarkfrac_v16_nsst,  remove test cpld_control_nsst
* Modify tests/tests/cpld_bmarkfrac_v16_nsst: 1. cpld_bmarkfrac_v16 to cpld_bmarkfrac_v16_nsst. 2. export NSTF_NAME=2,1,0,0,0 to export nstf_name=2,1,0,0,0.
* Modify input.benchmark_v16.nml.IN & cpld_bmarkfrac_v16_nsst for a consistent definition of nstf_name namelist
* Modify cpld_bmarkfrac_v16_nsst by moving the NSTF_NAME to the namelist field updates section
* RegressionTests_orion.intel.log of the rt run and BL_DATE=20210406 in rt.sh
* RT JOBS PASSED: hera.intel. Log file uploaded.
* run-ci, commit 7 RegressionTest log files
* Push RegressionTests_wcoss_dell_p3.log

Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>

* remove develop

* quiet git commands. run-ci

* Update CMEPS for latest ESCOMP/master; Update PIO to 2.5.2; Refactor modules (ufs-community#492)


* update ufs for CMEPS master with PIO changes
* update pio to 2.5.2 across platforms
* replace variable SUITE_NAME with CCPP_SUITE
* Merge remote-tracking branch 'DusanJovic/module_common' into feature/updcmeps
* switch to h-nems area on jet

Co-authored-by: Bin Li <Bin.Li@gaea13.ncrc.gov>
Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>

* debug this again...

* debug again

* debug continues

* missed repo field

* remove debug related texts

* remove -x flag. run-ci

* reduce sleep time

* diag

* diag2

* Fix pr_uid

* diag again

* still debugging

* lets see if this works

* use context as main yml key

* Try pull request now. run-ci

* minor change. run-ci

* typo fix

* debug 1

* debug2

* minor mods. run-ci

* check another dir

* sleep

* sigh

* sigh2

* increase time again

* test again

Co-authored-by: XuLi-NOAA <55100838+XuLi-NOAA@users.noreply.github.com>
Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>
Co-authored-by: Denise Worthen <denise.worthen@noaa.gov>
Co-authored-by: Bin Li <Bin.Li@gaea13.ncrc.gov>
Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>
@DeniseWorthen DeniseWorthen deleted the feature/updcmeps branch June 1, 2021 12:21
pjpegion pushed a commit to NOAA-PSL/ufs-weather-model that referenced this pull request Apr 4, 2023
Add 'valid time' variable using ISO string format to netcdf history files.
Use double precision variable to set value of 'time' attribute in wrt comp import state
Update ccpp/physics (setting surface-related interstitial variables for SCM prescribed-surface-flux mode)
Update inline_post_stub.F90 subroutine interfaces to match inline_post.F90

Co-authored-by: Ted Mansell <ted.mansell@noaa.gov>
Co-authored-by: Grant Firl <grant.firl@noaa.gov>
epic-cicd-jenkins pushed a commit that referenced this pull request Apr 17, 2023
…A_3km pre-defined domain, update timestep and MPI settings (#492)

## DESCRIPTION OF CHANGES: 
This PR accomplishes three things: 

1. A new pre-defined domain (RRFS_NA_3km) has been added to the SRW App. Nodes/core settings must be modified for chgres_cube and post due to the size of this domain.  A WE2E test was added and more information on all of these settings can be found within the related config.sh script (tests/baseline_configs/config.grid_RRFS_NA_3km.sh). 
2. The default k_split value is updated for a faster model integration. With k_split=2, we see model integration ~30% faster than the previous settings for the same weather model hash. **This will not affect physics suites that have specified other k_split values**
3. In order to properly run the above domain with the intended FV3_RRFS_v1alpha physics suite, the weather model needed to be updated to a more recent hash. This more up-to-date weather model version also has renamed the FV3_GFS_v16beta suite to FV3_GFS_v16; this required a number of changes to the workflow and end-to-end tests. In addition, several changes to default settings are occurring in this PR. Changes have also been made to k/n_split values in the namelist template which optimize run time.  CPUS_PER_TASK_RUN_FCST is changed from "4" to "2" in this PR.  Setting this field to "4" was doubling the requested nodes for the run_fcst task.  For example, a 3-km CONUS run that normally requests 25 nodes (based on predefined layout_x/y values) was asking for 50, simply because CPUS_PER_TASK_RUN_FCST=4, which was unacceptable.  When it is set to "2", the number of nodes remains unchanged, in line with the layout_x/y values.  EMC is using CPUS_PER_TASK_RUN_FCST=2 for their runs, so this should be uncontroversial.
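
The node doubling described in item 3 can be checked with a small arithmetic sketch. It assumes the scheduler reserves CPUS_PER_TASK_RUN_FCST cores for every MPI task and rounds up to whole nodes. The task count (500) and cores per node (40) are hypothetical values chosen only to reproduce the 25 and 50 node figures quoted above; they are not taken from the PR.

```python
def nodes_requested(mpi_tasks: int, cpus_per_task: int, cores_per_node: int) -> int:
    """Whole nodes needed when each MPI task reserves cpus_per_task cores."""
    total_cores = mpi_tasks * cpus_per_task
    # Integer ceiling division: round up to a whole number of nodes.
    return -(-total_cores // cores_per_node)


# Hypothetical inputs, picked so the result matches the figures in the text.
MPI_TASKS = 500
CORES_PER_NODE = 40

print(nodes_requested(MPI_TASKS, 2, CORES_PER_NODE))  # CPUS_PER_TASK_RUN_FCST=2
print(nodes_requested(MPI_TASKS, 4, CORES_PER_NODE))  # CPUS_PER_TASK_RUN_FCST=4
```

Under this model the request scales linearly with the per-task core count, which is consistent with the doubling reported when the value was "4". Actual scheduler accounting may differ by platform.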

This PR will need to be accompanied by changes in the ufs-srweather-app for updating the weather model hash and incorporating some necessary build changes (including compiling with 32-bit reals by default); this PR has been created (ufs-community/ufs-srweather-app#140) but it is still a draft pending some platform-specific fixes and the merger of this PR.

## TESTS CONDUCTED: 
For the initial changes for PR #480, multiple tests on Hera were run, including a full 36-hr forecast here: /scratch2/BMC/det/beck/FV3-LAM/expt_dirs/test_RRFS_NA_3km_36hr

With the additional changes and updates to the weather model, and updates to the Hera environment file, all end-to-end tests (aside from nco tests) were run on Hera (intel). There were a few pre-existing failures, and aside from an occasional GST failure due to wallclock time issues (see #490) the only new failures were for grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GSD_SAR, grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_HRRR, and grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_RRFS_v1beta, which all had a new failure in make_ics and make_lbcs. Currently investigating this issue, though it is almost certainly related to the build environments which need to be addressed in ufs-community/ufs-srweather-app#140

## CONTRIBUTORS: 
@JeffBeck-NOAA authored the half of these changes originating from #480, and offered the following credits on his original PR:

Thanks are due to @JamesAbeles-NOAA for his recommendations for build/namelist changes and help troubleshooting run times.  Thanks to @BenjaminBlake-NOAA and @JacobCarley-NOAA for their help with the domain configuration.
Labels
No Baseline Change
Ready for Commit Queue: The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.
Waiting for Reviews: The PR is waiting for reviews from associated component PR's.
Projects
None yet
7 participants