Update WW3, Turn on Gaea for waves, Create template for ww3_multi.inp #544

JessicaMeixner-NOAA
Collaborator

@JessicaMeixner-NOAA JessicaMeixner-NOAA commented Apr 26, 2021

PR Checklist

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR
    are specified below.

  • If new or updated input data is required by this PR, it is clearly stated in the text of the PR.

Description

This PR turns on wave tests for S2SW and ATMW on gaea. It turns on ATMW tests for wcoss-cray as well. The ww3_multi.inp file was templated. The S2SW tests were load balanced (#461). The WW3 input file directory used to create the mod_def files was updated to include building ww3_grid. A 2 degree mod_def is also created, so that low resolution test cases can use a lower resolution wave model and spend even fewer resources on waves.

In addition, we switch the v16 coupled tests to output gaussian because of the pio issue (#509).

  • New input directory for WW3 is needed
  • New baselines are needed because the number of MPI tasks has changed for the S2SW tests and issue 427 has not yet been solved.
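
As an illustration of what templating ww3_multi.inp involves, here is a minimal sketch of placeholder substitution. The `@[NAME]` placeholder syntax, the `fill_template` helper, and the variable names and values are all assumptions made for this example; the PR text does not show the template syntax or the variables actually used.

```python
import re

# Assumed placeholder form: @[NAME]. This is illustrative only.
PLACEHOLDER = re.compile(r"@\[([A-Za-z_][A-Za-z0-9_]*)\]")


def fill_template(template, values):
    """Replace each @[NAME] placeholder with values[NAME].

    Raises KeyError if the template names a placeholder that has no value,
    so a missing setting fails loudly instead of leaving a stray marker.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder @[{name}]")
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical template lines, not the real ww3_multi.inp contents.
template = "grid: @[WW3_GRID]\ntasks: @[WW3_TASKS]\n"
filled = fill_template(template, {"WW3_GRID": "glo_2deg", "WW3_TASKS": "30"})
print(filled)
```

With a template like this, each regression test can supply its own values (for example a coarser grid or a different task count) without keeping a separate copy of the input file.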

Issue(s) addressed

Testing

How were these changes tested? So far on orion, hera and gaea
What compilers / HPCs was it tested with? intel
Are the changes covered by regression tests? Yes
Have regression tests and unit tests (utests) been run? No
On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel (baselines were created for all tests, but verification of 9 tests failed due to disk quota).
  • wcoss_cray
  • wcoss_dell_p3

Dependencies

No dependencies

co-authors: @DeniseWorthen

JessicaMeixner-NOAA and others added 30 commits February 8, 2021 16:36
simplify some of the CMake and add future flexibility for gnu
This reverts commit 7b826d4.
 Conflicts:
	tests/fv3_conf/ccpp_gfdlmp_run.IN
	tests/fv3_conf/ccpp_multigases_run.IN
	tests/fv3_conf/cpld_bmark_run.IN
	tests/rt.conf
	tests/tests/fv3_gfdlmprad
	tests/tests/fv3_gfdlmprad_atmwav
@JessicaMeixner-NOAA
Collaborator Author

wcoss-dell baseline has been created, now running against the new baseline

@BrianCurtis-NOAA
Collaborator

Machine: jet
Compiler: intel
Job: BL
Repo location: /lfs4/HFIP/h-nems/emc.nemspara/autort/pr/623399322/20210506174517/ufs-weather-model
Please manually delete: /lfs4/HFIP/h-nems/emc.nemspara/RT_RUNDIRS/emc.nemspara/FV3_RT/rt_101822
Test cpld_control_c384 007 failed failed
Test cpld_control_c384 007 failed in run_test failed
Please make changes and add the following label back:
jet-intel-BL

@DeniseWorthen
Collaborator

The jet failure was in the baseline creation. I think it was just one of those flakey jet things. I see in err

[225:x471] unexpected DAPL connection event 0x4008 from 284
Fatal error in PMPI_Wait: Internal MPI error!, error stack:

@BrianCurtis-NOAA I assume what I need to do is manually move all the baselines that did complete, manually create the missing baseline, and then auto-RT. Is that correct?

@BrianCurtis-NOAA
Collaborator

Yes. Correct.

@BrianCurtis-NOAA
Collaborator

Machine: jet
Compiler: intel
Job: RT
Repo location: /lfs4/HFIP/h-nems/emc.nemspara/autort/pr/623399322/20210506231511/ufs-weather-model
Please make changes and add the following label back:
jet-intel-RT

Collaborator

@junwang-noaa junwang-noaa left a comment

Code changes look good to me; we can commit it when all RTs are done.

@DeniseWorthen
Collaborator

DeniseWorthen commented May 7, 2021

The Jet failure was disk quota. I can see in the rt logs that all jobs that ran passed, but I can't cat them into a single file.
But, I don't think all tests ran. I think we're missing 5 tests:

datm_control_cfsr,datm_control_gefs,datm_bulk_gefs,datm_mx025_gefs,datm_cdeps_debug_cfsr

based on grepping for 'Disk quota exceeded' in the run*.logs
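
The log check described above can be sketched in Python as follows. This is only an illustration: the `logs_containing` helper, the `run*.log` glob pattern, and the demo file names are assumptions, not the actual commands or log names used on Jet.

```python
import tempfile
from pathlib import Path


def logs_containing(log_dir, needle="Disk quota exceeded", pattern="run*.log"):
    """Return the names of log files under log_dir whose text contains needle."""
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a log holds non-UTF-8 bytes.
        if needle in path.read_text(encoding="utf-8", errors="replace"):
            hits.append(path.name)
    return hits


# Self-contained demonstration with made-up log files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "run_test_a.log").write_text("write failed: Disk quota exceeded\n")
    Path(tmp, "run_test_b.log").write_text("test passed\n")
    quota_hits = logs_containing(tmp)

print(quota_hits)
```

Comparing the returned names against the full test list is one way to separate tests that never ran because of the quota from tests that ran and passed.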

JessicaMeixner-NOAA and others added 5 commits May 7, 2021 11:19
--- Note the first set of regression tests, the compile for cpld
 timed out so I re-ran just the coupled test.  The first log is
on top and the second log with just the coupled is on the bottom
…A/ufs-weather-model into makewavetemplateinput
@DeniseWorthen
Collaborator

will merge when ci tests have completed

@BrianCurtis-NOAA BrianCurtis-NOAA merged commit 68cb84b into ufs-community:develop May 7, 2021
@JessicaMeixner-NOAA JessicaMeixner-NOAA deleted the makewavetemplateinput branch June 7, 2021 13:30
@MinsukJi-NOAA MinsukJi-NOAA mentioned this pull request Jul 14, 2021
Labels

  • Baseline Updates: Current baselines will be updated.
  • New Input Data Req'd: This PR requires new data to be synced across platforms.
  • Waiting for Reviews: The PR is waiting for reviews from associated component PRs.

Development

Successfully merging this pull request may close these issues.

  • adjust some cpld v16 tests for forecast length and output type
  • load balance wave tests
6 participants