Update FVCOM code to handle subdomain restart files using multiple cores. #624
@hu5970 - One of the checks is failing because Doxygen has detected some undocumented variables. When I compiled your branch setting
Does this test need to be updated or expanded to test this new logic: https://github.com/ufs-community/UFS_UTILS/tree/develop/tests/fvcom_tools
@@ -105,7 +105,7 @@ subroutine alloc_obsbase(this,numvar,ifquality)
if(this%ifquality) allocate(this%quality(numvar))

else
write(*,*) 'alloc_obsbase Error: dimension must be larger than 0:', numvar
It looks like the only change in this file is write(*,*) -> write(6,*) for many lines. Is the output not shown correctly with write(*,*)?
write(6,*) will direct the information to different stdout files for each core. write(*,*) will write all information to one stdout file. write(*,*) is the same as write(6,*) in a single-core job, but it makes the stdout file very messy when multiple cores are used.
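To make the distinction concrete, here is a minimal sketch in Fortran. It assumes nothing about the FVCOM tools themselves: the module name `stdout_demo` and the routines `star_is_unit6` and `report` are hypothetical. The Fortran standard only says that `*` refers to the processor's default output unit, exposed as `output_unit` in `iso_fortran_env`. That unit is numbered 6 on common compilers, but this is not guaranteed, and whether unit 6 ends up in a separate stdout file per core depends on the compiler runtime and the job launcher.

```fortran
module stdout_demo
  use, intrinsic :: iso_fortran_env, only: output_unit
  implicit none
contains

  !> Return .true. when the default output unit (the one addressed
  !! by `*`) is numbered 6. Common, but not required by the standard.
  logical function star_is_unit6()
    star_is_unit6 = (output_unit == 6)
  end function star_is_unit6

  !> Write a rank-tagged message to unit 6 (hypothetical helper).
  subroutine report(rank, msg)
    integer, intent(in) :: rank          ! MPI rank, or any task id
    character(len=*), intent(in) :: msg  ! text to print
    write(6,'(a,i0,a,a)') 'rank ', rank, ': ', trim(msg)
  end subroutine report

end module stdout_demo
```

Calling `report(rank, 'reading subdomain')` from each MPI task would tag every line with its rank, which keeps interleaved output readable whichever unit it lands in.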
@@ -49,7 +50,7 @@ program process_FVCOM

! Grid variables

character*180 :: geofile
character*180 :: thisfv3file
character*2 :: workPath
I can't seem to find why 'geofile' was changed?
geofile is a local variable for file names. It is used for several kinds of restart files, so I think it was renamed just to avoid confusion.
@dmwright526, do you mind reviewing this PR? Thank you!
@hu5970 'develop' was updated yesterday. So don't forget to update your branch accordingly.
@GeorgeGayno-NOAA Thanks, I will sync with develop after getting the major review comments addressed.
@@ -73,6 +74,8 @@ program process_FVCOM
character(len=180) :: wcstart
character(len=180) :: inputFVCOMselStr
character(len=180), dimension(:), allocatable :: args
integer :: fv3_io_layout_y
@hu5970, from what I understand here, the domain is divided only in the y-direction, not the x-direction, for the MPI application. Is this true? Also, this now requires the size of each subgrid in the y-direction to be passed to this program in the regional_workflow make_ics step. Is there a way to pass this automatically, without user intervention?
Yes, only the y-direction is divided into subdomains, to speed up initialization. The x-direction can also be divided, but the suggestion from FV3LAM experts is to divide Y (that is, to set io_layout(2) > 1).
In this code, the y-dimension of each subdomain is determined by reading the Y coordinate from the grid_spec of each subdomain, so no user intervention is needed. Also, multiple subdomains are used only for warm start cycles. The cold start cycle still uses a single big domain for initialization. That is why I moved the FVCOM run script out of make_ics and into the prep step (a step that prepares the background for each data assimilation cycle).
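The index bookkeeping this implies can be sketched as follows. This is an illustration, not the code in the PR: the module `subdomain_layout`, the routine `subdomain_y_ranges`, and the argument `ny_sub` are hypothetical, and it assumes the y size of each subdomain has already been read (for example from the Y coordinate in each subdomain's grid_spec). Given those sizes, it computes where each subdomain starts and ends in the y-direction of the full domain.

```fortran
module subdomain_layout
  implicit none
contains

  !> Compute the global y-index range covered by each subdomain,
  !! given the local y size of every subdomain in order.
  subroutine subdomain_y_ranges(ny_sub, ybegin, yend)
    integer, intent(in)  :: ny_sub(:)             ! y size of each subdomain
    integer, intent(out) :: ybegin(size(ny_sub))  ! first global y index
    integer, intent(out) :: yend(size(ny_sub))    ! last global y index
    integer :: i, offset

    offset = 0
    do i = 1, size(ny_sub)
      ybegin(i) = offset + 1
      yend(i)   = offset + ny_sub(i)
      offset    = yend(i)   ! next subdomain starts right after this one
    end do
  end subroutine subdomain_y_ranges

end module subdomain_layout
```

For three subdomains with y sizes 3, 3 and 4, this gives ybegin = 1, 4, 7 and yend = 3, 6, 10.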
That makes sense. We will just need to make sure that this works with a cold start and that, if a 5th input argument is needed, it is added to the documentation.
@hu5970 and @dmwright526 - I have not seen any activity on this PR for a week. What is the status?
Please let me know if I need to do something. I do not know how to run Doxygen and the tests, but would like to learn if there are instructions. Thanks.
Any more comments or actions needed for this PR?
Your branch is out of date with 'develop'. You will need to do a merge first. Also, one of the unit tests is failing because of Doxygen errors.
Did you address these errors?
@GeorgeGayno-NOAA, I think @hu5970 would like some help with Doxygen. Is there anyone who could work with him on that?
@hu5970 Do you need assistance?
These errors can be eliminated by adding definitions for ybegin and yend. For example:
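The example from the original comment is not preserved here. As a rough sketch of the Doxygen style for Fortran variables, assuming ybegin and yend are integer indices (their actual types and meanings in the PR may differ, and the module name `doxygen_demo` and the descriptions are hypothetical), a trailing `!<` comment documents the variable declared on the same line:

```fortran
module doxygen_demo
  implicit none
  ! Descriptions below are placeholders, not the wording used in the PR.
  integer :: ybegin !< First y index of the subdomain handled by this task.
  integer :: yend   !< Last y index of the subdomain handled by this task.
end module doxygen_demo
```

Doxygen reports a variable as undocumented when it has no such comment, so giving each one a short description is usually enough to clear the warning.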
@GeorgeGayno-NOAA @JeffBeck-NOAA Thanks for the help. I will fix the Doxygen issue following your instructions and sync with the current develop branch.
8f3b73d to 193598d
193598d to ea85976
Can someone check why those checks have been waiting for 24 hours? Thanks,
@hu5970, the workflow run needs to be approved (@GeorgeGayno-NOAA) to run because it's your first time contributing.
@hu5970 I just started them for you.
@GeorgeGayno-NOAA @kgerheiser Thanks.
@JeffBeck-NOAA and @dmwright526 - Has @hu5970 addressed your comments?
I know we turned off the unit test because it won't run under GNU (see #596). So I thought I would run it on Hera using Intel. But the branch won't compile and neither does 'develop'. We should fix this.
@GeorgeGayno-NOAA Do you know if the failure is due to fvcom tools or is it a larger issue? The overall additions from @hu5970 look good from my end.
@GeorgeGayno-NOAA, I reviewed the PR to the extent of my knowledge, and deferred to @dmwright526 for the rest, so I'm glad to see he's happy with things. I'll wait for an update on the Hera Intel unit test, then approve.
The unit test won't compile because of missing arguments. Here is what I did to get it to compile:
But I don't know how to initialize these new variables.
@GeorgeGayno-NOAA The unit test will fail at this point because the dummy dataset does not include the ice thickness and surface roughness length, and the unit test was not updated for them. These features were added recently to fix a crash in the MYNN surface layer scheme. I am working on another issue right now, but I will work on getting the dataset and unit test updated tomorrow to fix that issue.
@GeorgeGayno-NOAA @JeffBeck-NOAA @hu5970 I was able to update the unit test and create the new unit test data to reflect the update in variables ingested from FVCOM and the new MPI support. The new fvcom unit test ran successfully on Jet with Intel. How do you want to proceed with this given the number of simultaneously moving pieces here? The new unit test will require an update to the data on the EMC ftp. I know there is an open PR for changing how the unit test data is downloaded, so I want to make sure these changes are in line with that. The easiest might be to attach the new unit test to this PR.
Yes, let's include the new unit test under this PR. Where are your code changes? I may be able to check them into @hu5970's branch. And where is the new dataset that needs to be hosted on the EMC ftp?
@GeorgeGayno-NOAA They can be found here. The ftst_readfvcomnetcdf.F90, fvcom_unittest.nc, and sfcdata_unittest.nc are the files that are needed.
@dmwright526 Thanks, I will take a look. @hu5970 Do you mind if I update your fork with his unit test updates?
@dmwright526 I was able to get the unit test to run successfully on Hera. Thanks. @KateFriedman-NOAA Can you replace the files here: https://ftp.emc.ncep.noaa.gov/static_files/public/UFS/ufs_utils/unit_tests/fvcom_tools/ with the new versions here (Hera): /scratch2/NCEPDEV/stmp1/George.Gayno/UFS_UTILS.dmwright/tests/fvcom_tools
@GeorgeGayno-NOAA Please feel free to update my fork for updating the unit test. Thanks, Ming
to check new variables ingested from FVCOM and for the new MPI support. Fixes ufs-community#623.
@GeorgeGayno-NOAA Done! While on Mars:
@dmwright526 I added your updated regression test at 2a7788c. It compiles and runs for me on Hera. Thanks. I will not include the new data files under this PR. I can handle that under #630. If you give your approval, I can merge this today.
Sorry, Kate. I just meant the two netcdf files (those ending in .nc).
@GeorgeGayno-NOAA Sorry, I didn't see the "pending" on my comments... The changes look great on my end. Thank you for the help on this!
No worries, I've removed the other files that got pushed to RZDM and left just those two .nc files:
Add code to FVCOM to handle the subdomain restart files when the FV3LAM has io_layout(2) >1.
Add code to use several cores to process subdomain restart files to speed up the process.
Also, change write(*,*) to write(6,*) to write the running information into the stdout file for each core.
This is to address issue #623.