[develop] Integrate jinja-enabled config files and add rrfs workflow #2

Closed
wants to merge 131 commits into from


danielabdi-noaa
Owner

DESCRIPTION OF CHANGES:

This PR builds on the work done in PR ufs-community#701 to integrate Jinja support into the config and workflow files.
It also adds the RRFS workflow, templated with Jinja code that mirrors its old XML counterpart. I tested the RRFS workflow only in
a setup where none of the RRFS tasks are added, so although it generates the workflow, it needs to be retested as each PR goes in.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

  • hera.intel
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

DEPENDENCIES:

DOCUMENTATION:

ISSUE:

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

LABELS (optional):

A Code Manager needs to add the following labels to this PR:

  • Work In Progress
  • bug
  • enhancement
  • documentation
  • release
  • high priority
  • run_ci
  • run_we2e_fundamental_tests
  • run_we2e_comprehensive_tests
  • Needs Cheyenne test
  • Needs Jet test
  • Needs Hera test
  • Needs Orion test
  • help wanted

CONTRIBUTORS (optional):

MichaelLueken and others added 30 commits November 21, 2022 12:13
* new script to run and monitor e2e tests without cron

* fixed find command and directed system output to /dev/null

* Modify run_WE2E_tests.sh

 - When on Cheyenne, set use_cron_to_relaunch=false
 - When use_cron_to_relaunch=false, output a message at the end of the script
   describing how to run the new run_srw_tests.py script to manage the tests
 - Regardless of other variables, print a message at the end of the script
   showing the user where the test directory is

* Updated .cicd/scripts/srw_test.sh to work with ush/run_srw_tests.py

Co-authored-by: Michael Kavulich, Jr <kavulich@ucar.edu>
Co-authored-by: michael.lueken <Michael.Lueken@noaa.gov>
…nity#498)

* update build process img

* update img

* Revert changes

This reverts commit b839658.

* BuildSRW updates - esp mac/linux

* remove Contributor's Guide

* update Glossary and RunSRW

* update build/run process images

* remove outdated images; add new

* Components & CCPP updates, misc

* ConfigWflow updates

* Container updates

* update FAQ

* graphics updates

* I/O updates

* Intro updates

* Grid ch updates

* Grid ch updates

* Quickstart Updates

* Rocoto edits

* WE2E ch

* update supported tests

* update fix file list

* minor update

* minor edits

* minor edits

* misc minor fixes

* change /home to /Users/gillianpetro

* fix geopotential height

* update hpc-stack doc link

* nperlin changes

* nperlin changes

Co-authored-by: gspetro <gillian.s.petro@gmail.com>
* Update build_linux_gnu.lua

Use the srw_common module with the standard list of modules

* Update wflow_linux.lua

* Update build_linux_gnu.lua
* Add level for data: section indicating ics_lbcs.

* Add the NaturalEarth shape files.

* Add plotting to the workflow.

* Fix some deprecation warnings.

* Don't plot 500mb vars if it is not available.

* Add modulefiles for plot_allvars task.

* Add a test case for plotting graphics.

* Add diff plotting.

* Remove ush/Python folder.

* Turn off plotting by default.

* Bug fix plot_diff call.

* Add missing diff plot script.

* Make plotting work in NCO mode.

* Avoid positional arguments in the plotting scripts.
Some bug fixes.

* Make COMOUT_REF a template to compare multiple dates and cycles correctly.

* Use logging to capture output from plot scripts in log files.

* Turn off debug level, too much information.

* Ignore warnings in plotting.

* Bug fix diff plotter: mismatch in number of ticks, and ticklabels.

* Increase wallclock time, bugfix orion FIXshp.

* Add some default values to config.community/nco for convenience.
…#500)

* add Tutorial chapter

* update build process img

* update img

* Revert changes

This reverts commit b839658.

* misc. changes

* add Indy VX sample case

* update VX cases data/config

* VXcases run instructions

* Updates to Compare section

* minor fixes

* minor fixes

* add clarifications; remove comments

* remove accidental changes to plotting files

* rm Tutorial ch (for separate PR)

* rm accidental v2.1.0 refs in BuildSRW

* clarify plotting instructions

* minor corrections

* minor fixes

* add conda activate command

* Eddie's edits

* change wtime to 5h

* Update prerequisites, data locations, headings

Co-authored-by: gspetro <gillian.s.petro@gmail.com>
ufs-community#503)

* ensure arbitrary restart_interval

* remove wtime change in we2e

Co-authored-by: chan-hoo <chan-hoo.jeon@clogin04.cactus.wcoss2.ncep.noaa.gov>
Co-authored-by: chan-hoo <chan-hoo.jeon@clogin01.cactus.wcoss2.ncep.noaa.gov>
* Updated modules to include EPIC bin path

* updated vx maxtries

* updated met and metplus paths

Co-authored-by: Edward Snyder <Edward.Snyder@noaa.com>
…y#510)

* add gfs v16.3 to data list on hpss

* fix typo

Co-authored-by: chan-hoo jeon <chan-hoo.jeon@dlogin03.dogwood.wcoss2.ncep.noaa.gov>
…ommunity#505)

Global variable use has been removed in setup.py, and reduced in generate_FV3LAM_wflow.py. The use of globals is a carry-over from the bash era of this utility, and does not meet modern coding standards.
…unity#528)

* update ConfigWorkflow.rst w/plotting info

* add plotting info to RunSRW

* add Template Vars chapter

* add plotting info to RunSRW; remove Graphics ch

* add sphinx_rtd extension

* add docutils requirement to render bullet points correctly

* reformat plotting section

* update requirements to render bullets

* fix requirements syntax

* update sphinx requirements

* asterisk to dash bullets

* revert changes to fix bullet points

* minor edits to RunSRW plotting section

* remove Graphics chapter

* remove/update references to Graphics chapter

* fix typos

* change /Users/gillianpetro to

Co-authored-by: gspetro <gillian.s.petro@gmail.com>
…ity#522)

Use TEST_PREGEN_BASEDIR to set DOMAIN_PREGEN_BASEDIR if provided for a
machine.
* add sphinx_rtd_theme to conf.py

* pin sphinx_rtd_theme==0.5.1 in req file

* pin docutils==0.17

* pin Pygments-2.13.0

* pinned sphinx_rtd_theme==1.1.1

* pin sphinx==5.3.0

* pin importlib-metadata==5.2.0

* pin pillow==9.3.0

* set docutils==0.16

* comment out unwanted pinnings

* remove unnecessary changes for formatting fix

* add back sphinx_rtd_theme extension in conf.py

Co-authored-by: gspetro <gillian.s.petro@gmail.com>
… rocoto. (ufs-community#508)

* Increase precision of degs_per_radian to 15 digits.

* Use generic date util.

* Add fake slurm commands for rocoto usage on linux.

* Modify machine files for linux and mac.

* Modify linux and macos wflow modules.

* Fix unittest.

* Remove openmpi module loading in linux/mac build modulefile.

* Fix sacct.

* Fix crontab unspecified USER issue.

* Add EXTRN_MDL_DATA_STORES to macos.

* Add more states to squeue/sacct.

* Add a taskthrottle=1 option for linux/mac.

* Don't specify number of processes for mpirun.

* Get exit code directly instead of from log file.

* Set taskthrottle to 1000 by default.

* Fix linux lmod path bug.

* Set stack size to unlimited for linux/mac.

* Fix unittest.
…1, jjob and ex- scripts (ufs-community#536)

* Add j-job and ex- scripts files for online-cmaq

* reviewers comments on ufs-srw-app

* remove unnecessary commands

Co-authored-by: chan-hoo <chan-hoo.jeon@clogin01.cactus.wcoss2.ncep.noaa.gov>
Co-authored-by: chan-hoo <chan-hoo.jeon@clogin04.cactus.wcoss2.ncep.noaa.gov>
* Fix plotting bug.

* Add plotting test case to fundamental tests in both nco/community mode.
Currently the domain to plot is hard-coded to conus in the plotting scripts. This PR adds the capability to choose either conus or regional or both.
…nvalid entries (ufs-community#559)

This is actually two fixes/enhancements, one required by the other:

1) Fix error message that appears when user specifies an invalid key in their config.yaml file. The current version references undefined variables and appears to be a copy/paste error from some other exception.

2) In order to achieve the above neatly, I had to change the behavior of the python_utils function check_structure_dict(). Rather than simply printing the invalid key/value pair, it now returns a dictionary of invalid key/value pairs (that is empty if all keys are valid). I also fixed a minor bug here: even though this function claimed to detect all invalid entries, it actually only printed the first before returning. Now all invalid entries are returned.
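
The behavior described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the actual python_utils code: the function name `find_invalid_entries`, its argument names, and the sample config keys are assumptions made for the example.

```python
def find_invalid_entries(user_cfg, valid_cfg):
    """Collect every key/value pair in user_cfg that has no counterpart in valid_cfg.

    Hypothetical stand-in for check_structure_dict(); nested dictionaries are
    checked recursively. An empty result means all keys are valid.
    """
    invalid = {}
    for key, value in user_cfg.items():
        if key not in valid_cfg:
            # Record the entry and keep going, so that every invalid
            # entry is reported rather than only the first one.
            invalid[key] = value
        elif isinstance(value, dict) and isinstance(valid_cfg[key], dict):
            invalid.update(find_invalid_entries(value, valid_cfg[key]))
    return invalid


# Illustrative data only; these are not the real default settings.
defaults = {"workflow": {"EXPT_SUBDIR": "", "CCPP_PHYS_SUITE": ""}, "user": {"MACHINE": ""}}
user = {"workflow": {"EXPT_SUBDIR": "test", "EXPT_SUBDR": "typo"}, "usr": {"MACHINE": "hera"}}

invalid = find_invalid_entries(user, defaults)
if invalid:
    print(f"Invalid config entries: {invalid}")
```

Returning a dictionary instead of printing lets the caller build one error message that lists all offending keys.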

---------

Co-authored-by: Zachary Moon <zmoon92@gmail.com>
…ion) (ufs-community#562)

This PR restores the previous behavior of the variable EXPT_BASEDIR (Item 2 is the behavior that was previously broken), which has the following effect on the experiment directory EXPTDIR:

1) If EXPT_BASEDIR is not set or set to a null value, the default value (${HOMEdir}/../expt_dirs) will be used
2) If EXPT_BASEDIR is set to a relative path (i.e. the first character is not /), the user-specified path will be appended to the default value ${HOMEdir}/../expt_dirs (for example, if the user specifies EXPT_BASEDIR=some/relative/path in their config.yaml, it will be updated to EXPT_BASEDIR=${HOMEdir}/../expt_dirs/some/relative/path in the workflow)
3) If EXPT_BASEDIR is set to an absolute path, that path will be used as entered

After the above logic is applied, EXPTDIR will be created by joining the paths EXPT_BASEDIR and EXPT_SUBDIR as usual.
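
The three cases above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in setup.py: the function name `resolve_expt_basedir`, the sample paths, and the choice to normalize the `..` component are all made up for the example.

```python
import posixpath


def resolve_expt_basedir(expt_basedir, homedir):
    """Apply the three EXPT_BASEDIR rules (hypothetical helper)."""
    # Default location: ${HOMEdir}/../expt_dirs, normalized for readability.
    default = posixpath.normpath(posixpath.join(homedir, "..", "expt_dirs"))
    if not expt_basedir:
        # Case 1: unset or null, so use the default.
        return default
    if not expt_basedir.startswith("/"):
        # Case 2: relative path, appended to the default.
        return posixpath.join(default, expt_basedir)
    # Case 3: absolute path, used as entered.
    return expt_basedir


homedir = "/work/ufs-srweather-app"  # assumed example location
basedir = resolve_expt_basedir("some/relative/path", homedir)
# EXPTDIR is then EXPT_BASEDIR joined with EXPT_SUBDIR.
exptdir = posixpath.join(basedir, "my_expt")
print(exptdir)
```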
This PR introduces two new scripts to the repository: run_WE2E_tests.py and monitor_jobs.py. The purpose of these scripts is to eventually provide a pythonic replacement for the current workflow end-to-end test submission script. Additionally, the monitor_jobs function gives the capability to monitor and submit jobs automatically via the command line or a batch job, rather than relying on crontab entries.
…ecast files (ufs-community#566)

This PR enables running of only the SRW App's deterministic verification (vx) tasks on staged forecast files from previous runs of the App. It partially resolves Issue ufs-community#565 (it resolves the issue for deterministic vx but not ensemble vx).

Specific changes:

* Update lua module file for vx tasks to suppress "Logging error" messages in vx task log files.
* Rename experiment variable MODEL to VX_FCST_MODEL_NAME to clarify that this is the name of the forecast model in the context of verification (and which will be used in the vx output files). This requires updates to most (all?) of the METplus configuration files and the verification ex-scripts.
* Create the new variable VX_FCST_INPUT_BASEDIR to allow the user to specify a directory in which to look for staged forecast output (instead of running a forecast).
* Modify the rocoto template xml (FV3LAM_wflow.xml) to make dependencies of vx tasks on post-processing tasks appear only when the post tasks are enabled.
* Add a new WE2E test category subdirectory named verification in which to group all vx tests (since more vx tasks will be coming in future PRs). Move the two existing tests MET_verification and MET_ensemble_verification from wflow_features to verification, and add a new test named MET_verification_only_vx to test the capability that this PR introduces.
Note: The new WE2E test MET_verification_only_vx requires new data, specifically post-processed forecast output from the SRW App. This data needs to be staged on each platform; currently, it is located in a personal directory on Hera.
…y#583)

PR ufs-community#566 changed the variable "MODEL" to a more descriptive name, but failed to make this change in config.community.yaml. The unit tests for generate_FV3LAM_wflow.py make use of this file as an input config.yaml, so they are now failing due to this incorrect variable name. This wasn't caught because prior to ufs-community#558 the unit tests were broken for a different reason.

This change simply makes the appropriate rename, which should fix the failing unit test. Also created an f-string that was missed in a setup.py error message.
)

* Add the CCPP physics suite FV3_GFS_v17_p8 to the UFS SRW App.
* Add a new WE2E test for the new suite.
* Remove Hera from the Jenkins pipeline (no account for EPIC currently on this machine).

---------

Co-authored-by: Michael Lueken <michael.lueken@noaa.gov>
Co-authored-by: chan-hoo <chan-hoo.jeon@clogin02.cactus.wcoss2.ncep.noaa.gov>
To be consistent with the naming convention used for variables that specify per-task properties (e.g. PPN_RUN_FCST, WTIME_RUN_FCST, MAXTRIES_RUN_FCST), rename variables that store the task names (e.g. RUN_FCST_TN) so that the "TN" part is at the beginning, e.g. TN_RUN_FCST.
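
As a rough illustration of the renaming rule, the sketch below moves a trailing "_TN" to a leading "TN_". The helper name `move_tn_to_front` and the regular expression are assumptions for this example; the PR itself does not say how the rename was carried out.

```python
import re

# Matches names that end in "_TN", e.g. RUN_FCST_TN (assumed pattern).
_TN_SUFFIX = re.compile(r"^(?P<task>[A-Z0-9_]+)_TN$")


def move_tn_to_front(name):
    """Turn RUN_FCST_TN into TN_RUN_FCST; leave other names untouched."""
    match = _TN_SUFFIX.match(name)
    return f"TN_{match.group('task')}" if match else name


for old in ("RUN_FCST_TN", "PPN_RUN_FCST"):
    print(old, "->", move_tn_to_front(old))
```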
…ty#584)

This PR adds a Tutorial chapter to the SRW Documentation. There are descriptions for 5 severe weather events and a full tutorial for the first one (2019061518).
This PR also includes:

* Updates to the Glossary
* An introduction to SSH and scp data transfer, which will help users to download tutorial plots from HPC systems to their local system for viewing.
* Minor fixes/updates in other chapters for spelling/grammar/accuracy.

---------

Co-authored-by: gspetro <gillian.s.petro@gmail.com>
Co-authored-by: Michael Kavulich <kavulich@ucar.edu>
danielabdi-noaa and others added 28 commits April 1, 2023 23:17
…fs-community#690)

When pipeline files are archived to an S3 bucket, retrieving a file via a browser attempts to render/display files with known extensions. A browser doesn't generally understand what to do with a .log extension (e.g. build.log). For ease of use in the CI Dashboard, which is a static HTML page, the S3-archived build log needs a .txt extension (e.g. build.txt).
)

Add an upgrade option alongside delete, rename, and quit when checking for a preexisting experiment directory (or another directory). For EXPTDIR, the behavior differs from that of other directories: the "reuse" option behaves like rsync. Workflow generation is handled by Python code in the SRW App, so check_for_preexist_dir.py handles that case. For other directories, the shell version of the script is used, and reuse means reusing the existing directory. Apparently this option is sometimes useful for RRFS_dev1 workflow runs.
* Add a flag DO_FCST_RESTART turning on/off the restart option of the forecast task.
* Add a python script update_restart_input_nml_file.py replacing the six parameters related to the restart option in the FV3 input.nml file.
* Add a parameter fhrot to model_configure.
* Update the configuration files model_configure and nems.configure with the latest format.
* This capability is an NCO requirement.

---------

Co-authored-by: chan-hoo <chan-hoo.jeon@clogin04.cactus.wcoss2.ncep.noaa.gov>
Co-authored-by: Christina.Holt <Christina.Holt@noaa.gov>
Following the migration to a new site for Jenkins, communication between Jenkins and Hera/Jet has been lost. Guidance from this afternoon (April 5, 2023) says that communication should be restored by CoB today. Reactivating Jet in the Jenkinsfile now.
…ty#715)

Separating words with "," in the bash case statement in devbuild.sh does not work on macOS Monterey with bash version 5.2.15. For generality, it is better to use "|" to separate words in the bash case statement.
Update the workflow to use the pre-combined point source data files (NOAA-EMC/AQM-utils#4) in order to reduce runtime for Online-CMAQ with explicit point source on. This is a breaking change (if using explicit point source) since the invocation of the point source data merge tool has changed slightly.