Update lmod, miniconda, task modulefiles for Gaea #353
@EdwardSnyder-NOAA I got a |
May or may not need changes in load_modules_run_task.sh
@danielabdi-noaa that was a typo, which I fixed in yesterday's commit. Currently, this PR crashes partway through building the SRW App, with an error while building the UPP. I haven't had time to look into this error yet.
* Add preamble script from global workflow.
* Call preamble script in j-jobs and ex-scripts
* Call preamble in other scripts.
* Make names of j-jobs and ex-scripts consistent.
* Working towards nco vars in table 1.
* Change default bin directory to exec
* Append FATAL ERROR to print_err_msg_exit.
* Replace some cp, cd, mkdir calls with their corresponding _vrfy versions
* Add job and jobid to the job-card.
* Add cyc and subcyc to rocoto xml
* Add a j-job preamble script for setpdy.
* Add a j-job postamble as well.
* Define some Table 1 vars in setup.
* Remove unused SRC_DIR, and rename others
* Rename CYCLE_BASEDIR to COMIN_BASEDIR
* Create the NCO root directories in setup.
* Remove source machine file wrapper.
* Bug fix in job_preamble.
* Make make_ics/lbcs use DATA directory properly.
* Make run_fcst use DATA directory properly.
* Make run_post use DATA directory properly.
* Make make_grid use DATA properly (untested).
* Make make_sfc_climo use DATA properly (untested).
* Make make_orog use DATA properly (untested).
* Bug fix for non-NCO mode.
* Don't pass arguments from j-jobs to ex-scripts.
* Make forecast and post-output go to COMOUT.
* Remove CYCLE_DIR and use COMIN instead.
* Bug fix for community mode.
* Append cyc to COMIN in NCO mode.
* Fix rocoto run_post dependency with run_fcst issue.
* Use OPSROOT instead of PTMP and STMP.
* Move nco vars in config_defaults.
* Move logdir location to COMROOT.
* Set all root directories to EXPTDIR in community mode.
* Use pgmout and pgmerr.
* Fix inline post.
* Make pgmout/err redirection work with community mode.
* Use print_err in get_obs_mrms.
* Add prep_step.
* Add post_step.
* Add dbn_alert to post-processed grib2 output.
* Download extrn files directly to COMIN.
* Make make_ics/lbcs directly output to COMIN.
* Change names of extrn_mdl_var_defns files.
* Name fixes for DO_ENSEMBLE=false, dyn/phy
* Don't create symlinks to grib2 files in NCO mode.
* Append rrfs to make_ics/lbcs output.
* Modify extrn_mdl_var_defns names.
* Move forecast output to DATA/RUN.PDY. This location can be used to store output of other tasks as well.
* Move templates to parm.
* Fix for new parm location.
* Move metplus one level up.
* Fixes for community mode.
* Rename SCRIPTSDIR and JOBSDIR.
* Move all FIX** directories into a fix/ directory.
* Make FIXrrfs be EXPTDIR for community mode.
* Symlink upp and ufs_utils parm files to top level parm directory.
* Remove UPP_DIR and UFS_UTILS_DIR.
* Define cycle with subcyc when it is non-zero.
* Don't delete COMIN_BASEDIR if it already exists.
* Disassociate NCO mode from pre-generated grid.
* Don't choose fix location based on RUN_ENVIR.
* Bug fix in make_lbcs.
* Add flag to symlink or copy fix files.
* Change slurm log file locations
* Minor fix for inline post in nco mode.
* Add unique workflow ID to avoid clashes between different runs, while keeping the relation between different tasks, which PID cannot do.
* Make verification tasks NCO compliant.
* Pass RUN_ENVIR to we2e script.
* Fixes for merge conflicts.
* Add versions for wcoss2.
* Fix symlinks.
* Minor changes.
* Move grid/orog/sfcc completion files to EXPTDIR/grid/orog etc.
* Output modified namelist file with seeds in current directory.
* Fixes for unittests.
* Bugfix wrf_io version
* Fix CI issue with bin locations.
* Allow NCO root directories to be set individually.
* Don't append workflow id in community mode.
* Add helper script to rename model e.g. rrfs->aqm
* Bug fixes and naming changes for consistency.
* Replace instances of USHrrfs etc with a generic USHdir etc.
* Add unittest for whole workflow now that the merge made it possible.
* Remove unused process_args utility.
* Remove hard coded paths from configs.
* Don't replace existing var value with None.
* Add config.nco to unittest.
* Fix for Orion issue.
* Fix default OPSROOT location in run_we2e.
* Modify setup_we2e script to run fundamental tests on all machines.
* Fix conflicting ics/lbcs temp location by moving to DATA.
* Bug fix in load_modules taken from PR #353.
* Specify default shell instead of symlinking.
* Turn off grid/orog/sfc_climo tasks for NCO test cases.
* Use PDY and cyc in ex-scripts.
* Remove CDATE from xml and define in job_preamble.
* Use machine specific list of tests if available.
* Run all tests in community mode so that the last NCO test case gets reported as finished.
* Minor changes
* Avoid using preamble in functions.
* Use preamble in function too.
* Turn on debugging for utility functions.
* Turn on debug & verbose in CI.
* Turn off set -e for init_env
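For readers unfamiliar with the `_vrfy` convention mentioned in the commit list above, here is a minimal sketch of the idea: wrap a command so that a failure aborts the script with a `FATAL ERROR` message. This is an illustration only, not the regional_workflow implementation. The helper `run_vrfy` is a hypothetical name, and the bodies of `print_err_msg_exit`, `mkdir_vrfy`, and `cp_vrfy` are assumptions about how such wrappers could look.

```shell
#!/usr/bin/env bash
# Sketch only: these bodies are assumed, not copied from the workflow's ush scripts.

# Print a message prefixed with FATAL ERROR on stderr, then exit non-zero.
print_err_msg_exit() {
  printf 'FATAL ERROR: %s\n' "$*" >&2
  exit 1
}

# Hypothetical helper: run any command and abort if it returns non-zero.
run_vrfy() {
  "$@" || print_err_msg_exit "Call to \"$*\" failed."
}

# Verified variants of common commands, built on the helper above.
mkdir_vrfy() { run_vrfy mkdir "$@"; }
cp_vrfy()    { run_vrfy cp "$@"; }
```

With wrappers like these, `mkdir_vrfy -p "$DATA"` behaves like `mkdir -p` on success and stops the job with a clear message on failure, without relying on `set -e`.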
The list of modules to be loaded needs updates.
Fixed a typo
@danielabdi-noaa - the module build_gaea_intel needed updates; it should be working now!
Thanks for fixing the problem. I am able to build the PR successfully on Gaea now, so approving.
Waiting for PR-830 to be approved in the regional_workflow repo before merging these changes into the develop branch.
Updated following recent tests of Met verification, as in ufs-srweather-app repository PR-353: ufs-community/ufs-srweather-app#353
* Lmod/8.7.12 init, updated miniconda3 - for Gaea: Added new (Lmod/8.7.12) initialization wrapper script to the ENV_INIT_SCRIPTS_FPS variable; added the PROJ_LIB and PATH variables referring to a PROJ package location in the updated miniconda3/4.12.0 and the regional_workflow environment that contain the PROJ package
* update a new location of miniconda3/4.12.0, rocoto: Specify a new location of the miniconda3/4.12.0 with the regional_workflow environment containing all the necessary packages, and the rocoto/1.3.3 module installed on Gaea under EPIC role account: /lustre/f2/dev/wpo/role.epic/contrib/
* update run_vx.local: Updates to the MetPlus verification script; not yet officially supported in the release of public-v2. These changes and test of the MetPlus script were done by @EdwardSnyder-NOAA
* added missing argument for the ./etc/lmod-setup.sh script: A bug found by @EdwardSnyder-NOAA; a separate PR to be created into the develop branch
* Gaea: Lmod/8.7.12 initialization using a wrapper script, under role.epic account
* Gaea: initialize Lmod/8.7.12 using a wrapper script
* Delete get_extrn_lbcs.local: A redundant module; it is placed under ./regional_workflow/modulefiles/tasks/gaea/ instead.
* Update load_modules_run_task.sh
* Update run_vx.local: Updated following recent tests of Met verification, as in ufs-srweather-app repository PR-353: ufs-community/ufs-srweather-app#353
* Update gaea.sh: Updated Met Installation locations on Gaea
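The PROJ_LIB and PATH change described in the first commit above could look roughly like the sketch below. It assumes the usual conda layout, where an environment lives under `<conda_root>/envs/<env_name>` and PROJ data files sit in its `share/proj` directory. The function name `set_proj_env` and its arguments are hypothetical; the actual settings in gaea.sh may differ.

```shell
#!/usr/bin/env bash
# Sketch only: set_proj_env and its arguments are illustrative, not taken from gaea.sh.
set_proj_env() {
  local conda_root="$1"   # top of the miniconda3 install (assumed layout)
  local env_name="$2"     # e.g. regional_workflow
  local env_prefix="${conda_root}/envs/${env_name}"

  # Point PROJ at the data directory shipped inside the conda environment.
  export PROJ_LIB="${env_prefix}/share/proj"

  # Prepend the environment's bin directory unless it is already on PATH.
  case ":${PATH}:" in
    *":${env_prefix}/bin:"*) ;;
    *) export PATH="${env_prefix}/bin:${PATH}" ;;
  esac
}
```

A call such as `set_proj_env "$CONDA_ROOT" regional_workflow`, with `CONDA_ROOT` set to wherever miniconda3/4.12.0 is installed, would then make the PROJ package from that environment visible to the verification tasks. The PATH check keeps repeated sourcing from growing PATH.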
DESCRIPTION OF CHANGES:
Updated lmod, hpc-stack, miniconda, and task modulefiles so that we can run the WE2E tests on Gaea. Two PRs were created that address these issues in more detail for the release/public-v2 branch of the SRW App (SRW: #352 and Regional Workflow: #830). The quick tests and MET_verification tests will be conducted on Gaea via the Jenkins pipeline.
Type of change
TESTS CONDUCTED:
DEPENDENCIES:
DOCUMENTATION:
ISSUE:
CHECKLIST
LABELS (optional):
A Code Manager needs to add the following labels to this PR:
CONTRIBUTORS (optional):
@natalie-perlin for her help with the configuration and installations of: hpc-stack, lmod, and miniconda3 on Gaea, as well as testing and troubleshooting the MET_verification WE2E test.