Update CMEPS for latest ESCOMP/master; Update PIO to 2.5.2; Refactor modules #492
use updcmeps branch of CMEPS-interface/CMEPS
add file to CMakeLists
shorten name of nems.configure file for coupled model
clean up white space
add rahul's fix for optionally loading fv3_debug if exists and debug=y
update to emc/develop after reversion of med_io_mod changes
update baseline to develop-20201106; skip-ci
verify changes to cmeps in feature/bulk branch do not affect baselines for cpld model
skip-ci
Machine: orion
I repeated the failed orion test by copying the cpld_control and cpld_2threads run directories and running both again. Both tests passed manual comparison of the mediator restart file w/ the develop-20210406 cpld_control test. I will keep a copy of the auto-rt log and re-run the 2threads test manually.
* The cpld_bmarkfrac_v16_nsst test failed in the original RT because the variable SUITE_NAME had not been updated to CCPP_SUITE. The test was repeated with the correct variable and passed.
* The cpld_2threads test failed in the original auto-rt. The run directory was copied and the test re-run; the mediator restart file compared b4b with the existing baseline file. The test was then re-run manually and passed. The log was appended to the auto-rt log.
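The bit-for-bit (b4b) check of the mediator restart file against the baseline, described above, can be sketched as a byte-level comparison. This is only a minimal illustration and not the comparison that rt.sh itself performs; the function name and any paths passed to it are hypothetical.

```python
import hashlib
from pathlib import Path


def files_bit_identical(path_a, path_b, chunk_size=1 << 20):
    """Return True when both files contain exactly the same bytes (a b4b match)."""

    def digest(path):
        h = hashlib.sha256()
        with Path(path).open("rb") as f:
            # Read in chunks so large restart files are not loaded into memory at once.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    # Files of different sizes can never be bit-identical, so skip hashing.
    if Path(path_a).stat().st_size != Path(path_b).stat().st_size:
        return False
    return digest(path_a) == digest(path_b)
```

It would be called with the restart file from the re-run directory and the corresponding baseline file. A byte-level check like this is stricter than a field-by-field netCDF comparison, since any difference in file metadata also counts as a mismatch.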
@DeniseWorthen Has CI been run? Otherwise I think it's ready for commit. Thanks for all the testing!
I ran CI when I committed the cheyenne.gnu log but it doesn't look like it worked.
The thr and dbg tests failed. Since they are the jobs that take the longest, I believe they failed because the next commit automatically stopped the ec2 instances. I am working on a PR to fix this recurring issue in the CI test.
* First test
* Change branch name
* github event name...
* change to tests/ci dir
* change repo name
* print out stderr
* try recursive checkout
* checkout myself
* in a hurry
* fetch owner id
* Make setup depend on prcheck
* update with fv3 & ccpp/physics updates in order to turn the NSST model on in the coupled model (Replace PR ufs-community#453) (ufs-community#483)
* point to fv3 branch
* Add one more test cpld_control_nsst to test added ccpp suite (FV3_GFS_v15p2_couplednsst)
* Modify rt.conf to add FV3_GFS_v15p2_couplednsst
* Modify rt.conf to add a new test, cpld_bmarkfrac_v16_nsst, remove test cpld_control_nsst
* Modify tests/tests/cpld_bmarkfrac_v16_nsst: 1. cpld_bmarkfrac_v16 to cpld_bmarkfrac_v16_nsst. 2. export NSTF_NAME=2,1,0,0,0 to export nstf_name=2,1,0,0,0.
* Modify input.benchmark_v16.nml.IN & cpld_bmarkfrac_v16_nsst for a consistent definition of nstf_name namelist
* Modify cpld_bmarkfrac_v16_nsst by moving the NSTF_NAME to the namelist field updates section
* RegressionTests_orion.intel.log of the rt run and BL_DATE=20210406 in rt.sh
* RT JOBS PASSED: hera.intel. Log file uploaded.
* run-ci, commit 7 RegressionTest log files
* Push RegressionTests_wcoss_dell_p3.log
Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>
* remove develop
* quiet git commands. run-ci
* Update CMEPS for latest ESCOMP/master; Update PIO to 2.5.2; Refactor modules (ufs-community#492)
* update ufs for CMEPS master with PIO changes
* update pio to 2.5.2 across platforms
* replace variable SUITE_NAME with CCPP_SUITE
* Merge remote-tracking branch 'DusanJovic/module_common' into feature/updcmeps
* switch to h-nems area on jet
Co-authored-by: Bin Li <Bin.Li@gaea13.ncrc.gov>
Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>
* debug this again...
* debug again
* debug continues
* missed repo field
* remove debug related texts
* remove -x flag. run-ci
* reduce sleep time
* diag
* diag2
* Fix pr_uid
* diag again
* still debugging
* lets see if this works
* use context as main yml key
* Try pull request now. run-ci
* minor change. run-ci
* typo fix
* debug 1
* debug2
* minor mods. run-ci
* check another dir
* sleep
* sigh
* sigh2
* increase time again
* test again
Co-authored-by: XuLi-NOAA <55100838+XuLi-NOAA@users.noreply.github.com>
Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>
Co-authored-by: Denise Worthen <denise.worthen@noaa.gov>
Co-authored-by: Bin Li <Bin.Li@gaea13.ncrc.gov>
Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>
Add 'valid time' variable using ISO string format to netcdf history files.
Use double precision variable to set value of 'time' attribute in wrt comp import state
Update ccpp/physics (setting surface-related interstitial variables for SCM prescribed-surface-flux mode)
Update inline_post_stub.F90 subroutine interfaces to match inline_post.F90
Co-authored-by: Ted Mansell <ted.mansell@noaa.gov>
Co-authored-by: Grant Firl <grant.firl@noaa.gov>
…A_3km pre-defined domain, update timestep and MPI settings (#492)
## DESCRIPTION OF CHANGES:
This PR accomplishes three things:
1. A new pre-defined domain (RRFS_NA_3km) has been added to the SRW App. Nodes/core settings must be modified for chgres_cube and post due to the size of this domain. A WE2E test was added and more information on all of these settings can be found within the related config.sh script (tests/baseline_configs/config.grid_RRFS_NA_3km.sh).
2. The default k_split value is updated for a faster model integration. With k_split=2, we see model integration ~30% faster than the previous settings for the same weather model hash. **This will not affect physics suites that have specified other k_split values**
3. In order to properly run the above domain with the intended FV3_RRFS_v1alpha physics suite, the weather model needed to be updated to a more recent hash. This more up-to-date weather model version has also renamed the FV3_GFS_v16beta suite to FV3_GFS_v16; this required a number of changes to the workflow and end-to-end tests.
In addition, several changes to default settings are occurring in this PR. Changes have also been made to k/n_split values in the namelist template which optimize run time.
CPUS_PER_TASK_RUN_FCST is changed from "4" to "2" in this PR. Setting this field to "4" was doubling the requested nodes for the run_fcst task. For example, a 3-km CONUS run that normally requests 25 nodes (based on predefined layout_x/y values) was asking for 50, simply because CPUS_PER_TASK_RUN_FCST=4, which was unacceptable. When it is set to "2", the number of nodes remains unchanged, in line with the layout_x/y values. EMC is using CPUS_PER_TASK_RUN_FCST=2 for their runs, so this should be uncontroversial.
This PR will need to be accompanied by changes in the ufs-srweather-app for updating the weather model hash and incorporating some necessary build changes (including compiling with 32-bit reals by default); this PR has been created (ufs-community/ufs-srweather-app#140) but it is still a draft pending some platform-specific fixes and the merger of this PR.
## TESTS CONDUCTED:
For the initial changes for PR #480, multiple tests on Hera were run, including a full 36-hr forecast here: /scratch2/BMC/det/beck/FV3-LAM/expt_dirs/test_RRFS_NA_3km_36hr
With the additional changes and updates to the weather model, and updates to the Hera environment file, all end-to-end tests (aside from nco tests) were run on Hera (intel). There were a few pre-existing failures, and aside from an occasional GST failure due to wallclock time issues (see #490) the only new failures were for grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GSD_SAR, grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_HRRR, and grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_RRFS_v1beta, which all had a new failure in make_ics and make_lbcs. Currently investigating this issue, though it is almost certainly related to the build environments which need to be addressed in ufs-community/ufs-srweather-app#140
## CONTRIBUTORS:
@JeffBeck-NOAA authored the half of these changes originating from #480, and offered the following credits on his original PR: Thanks are due to @JamesAbeles-NOAA for his recommendations for build/namelist changes and help troubleshooting run times. Thanks to @BenjaminBlake-NOAA and @JacobCarley-NOAA for their help with the domain configuration.
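The node doubling described above for CPUS_PER_TASK_RUN_FCST can be illustrated with a small worked calculation. This is a sketch under stated assumptions: the task count and cores-per-node figure are hypothetical values chosen to reproduce the 25 and 50 nodes quoted in the description, and the formula is a generic one rather than the workflow's actual node calculation.

```python
import math


def nodes_requested(mpi_tasks, cpus_per_task, cores_per_node):
    """Nodes needed when every MPI task reserves cpus_per_task cores."""
    return math.ceil(mpi_tasks * cpus_per_task / cores_per_node)


# Hypothetical values, not taken from the PR: picked so the results
# match the 25 vs. 50 nodes mentioned in the description.
MPI_TASKS = 500
CORES_PER_NODE = 40

for cpus_per_task in (2, 4):
    n = nodes_requested(MPI_TASKS, cpus_per_task, CORES_PER_NODE)
    print(f"CPUS_PER_TASK_RUN_FCST={cpus_per_task}: {n} nodes")
```

With these assumed numbers, doubling the cores reserved per task doubles the node request even though the number of MPI tasks implied by layout_x/y is unchanged.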
PR Checklist
This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.
This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR
An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR are specified below.
If new or updated input data is required by this PR, it is clearly stated in the text of the PR.
Instructions: All subsequent sections of text should be filled in as appropriate.
The information provided below allows the code managers to understand the changes relevant to this PR, whether those changes are in the ufs-weather-model repository or in a subcomponent repository. Ufs-weather-model code managers will use the information provided to add any applicable labels, assign reviewers and place it in the Commit Queue. Once the PR is in the Commit Queue, it is the PR owner's responsibility to keep the PR up-to-date with the develop branch of ufs-weather-model.
Description
Provide a detailed description of what this PR does. What bug does it fix, or what feature does it add? Is a change of answers expected from this PR? Are any library updates included in this PR (modulefiles etc.)?
Issue(s) addressed
Link the issues to be closed with this PR, whether in this repository, or in another repository.
(Remember, issues must always be created before starting work on a PR branch!)
CMEPS #37
UFS weather #415
UFS weather #428
UFS weather #510
Testing
How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)
Dependencies
co-author: @binli2337
co-author: @DusanJovic-NOAA