Create new baselines #184

Closed
ligiabernardet opened this issue Sep 2, 2020 · 7 comments

@ligiabernardet
Collaborator

Create new baselines on the machines for which the App regression test is set up (Cheyenne and Stampede) so that users' runs can match the baselines.

@uturuncoglu
Collaborator

@ligiabernardet The app is updated (ufs-release-v1.1 branch), and we have the following issues:

  • I am getting errors from CHGRES in some cases at C768 resolution, but they could be system-related: sometimes resubmitting the job does produce the input files for the model. For details, see issue #179 (CHGRES error when processing GRIB2 data).

  • I tested the app on Stampede using the Intel compiler. All tests pass except those at C768 resolution, which are still in the queue. The new baseline folder is /work/02503/edwardsj/UFS/ufs_baselines/20200904-intel, but I still need to complete the C768 tests.

  • I also tested the app on Cheyenne with both the Intel and GNU compilers (both use MPT).

    • GNU: All tests pass. I did not get any GNU build errors related to the radiation_aerosols.f file. The new baseline is in /glade/p/cesmdata/cseg/ufs_baselines/20200904_gnu.

    • INTEL: All tests pass except the ones listed below. In these cases CHGRES runs without any problem, but the run dies without reporting any particular error. It could be a system issue, because the same tests run without any problem on Stampede, which also uses the Intel compiler. I tried running them again manually, but the result is the same. I'll try again and let you know if I find anything new. The baseline folder, without the results of these three tests, is /glade/p/cesmdata/cseg/ufs_baselines/20200904_intel/

        SMS_Lh3.C192.GFSv15p2.cheyenne_intel (Overall: PEND) details:
          PEND SMS_Lh3.C192.GFSv15p2.cheyenne_intel RUN
        SMS_Lh3.C384.GFSv15p2.cheyenne_intel (Overall: PEND) details:
          PEND SMS_Lh3.C384.GFSv15p2.cheyenne_intel RUN
        SMS_Lh3_D.C384.GFSv16beta.cheyenne_intel (Overall: PEND) details:
          PEND SMS_Lh3_D.C384.GFSv16beta.cheyenne_intel RUN
      

Besides those problems, which could be platform-specific, the app is ready for your testing. I also tested with three different input types (GRIB2, NEMSIO and NETCDF). Please let me know how it goes.
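The status listing above has a regular shape, so the unfinished tests can be pulled out with a short filter. This is only a sketch: `unfinished_tests` is a hypothetical helper name, and it assumes every summary line looks like `<test> (Overall: <STATUS>) details:` as in the listing.

```shell
# Sample input: the three summary blocks quoted in the comment above.
status_text='SMS_Lh3.C192.GFSv15p2.cheyenne_intel (Overall: PEND) details:
  PEND SMS_Lh3.C192.GFSv15p2.cheyenne_intel RUN
SMS_Lh3.C384.GFSv15p2.cheyenne_intel (Overall: PEND) details:
  PEND SMS_Lh3.C384.GFSv15p2.cheyenne_intel RUN
SMS_Lh3_D.C384.GFSv16beta.cheyenne_intel (Overall: PEND) details:
  PEND SMS_Lh3_D.C384.GFSv16beta.cheyenne_intel RUN'

# Hypothetical helper: read a status listing on stdin and print
# "<test> <status>" for each summary line whose overall status is not PASS.
unfinished_tests() {
  awk '/\(Overall: / {
         status = $3
         sub(/\)$/, "", status)      # strip the trailing ")" from "PEND)"
         if (status != "PASS") print $1, status
       }'
}

printf '%s\n' "$status_text" | unfinished_tests
```

Feeding it the real status output instead of the sample text should give one line per test that still needs attention.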

@uturuncoglu
Collaborator

The failed tests on Cheyenne use 252 cores for C384 and 180 cores for C192. It is strange that other tests at the same resolutions pass without any problem, which suggests a system issue.

@uturuncoglu uturuncoglu self-assigned this Sep 8, 2020
@uturuncoglu
Collaborator

@ligiabernardet @climbfuji Here are the instructions to run the full test suite.

git clone https://github.com/ufs-community/ufs-mrweather-app.git
cd ufs-mrweather-app
./manage_externals/checkout_externals
cd cime/scripts/

# To run on Cheyenne with GNU
qcmd -l walltime=3:00:00 -- "export UFS_DRIVER=nems; CIME_MODEL=ufs ./create_test --xml-testlist ../../src/model/FV3/cime/cime_config/testlist.xml --xml-machine cheyenne --xml-compiler gnu --workflow ufs-mrweather_wo_post -j 4 --walltime 03:00:00"
# To run on Cheyenne with Intel
qcmd -l walltime=3:00:00 -- "export UFS_DRIVER=nems; CIME_MODEL=ufs ./create_test --xml-testlist ../../src/model/FV3/cime/cime_config/testlist.xml --xml-machine cheyenne --xml-compiler intel --workflow ufs-mrweather_wo_post -j 4 --walltime 03:00:00"
# To run both GNU and Intel together
qcmd -l walltime=3:00:00 -- "export UFS_DRIVER=nems; CIME_MODEL=ufs ./create_test --xml-testlist ../../src/model/FV3/cime/cime_config/testlist.xml --xml-machine cheyenne --workflow ufs-mrweather_wo_post -j 4 --walltime 03:00:00"

# To run on Stampede (needs to be run in 3 parts due to limitations on Stampede)
export UFS_DRIVER=nems
./create_test --xml-testlist ../../src/model/FV3/cime/cime_config/testlist.xml --xml-machine stampede2-skx  --workflow ufs-mrweather_wo_post -j 4 --walltime 03:00:00 --xml-category prealpha_p1
./create_test --xml-testlist ../../src/model/FV3/cime/cime_config/testlist.xml --xml-machine stampede2-skx  --workflow ufs-mrweather_wo_post -j 4 --walltime 03:00:00 --xml-category prealpha_p2
./create_test --xml-testlist ../../src/model/FV3/cime/cime_config/testlist.xml --xml-machine stampede2-skx --workflow ufs-mrweather_wo_post -j 4 --walltime 03:00:00 --xml-category prealpha_p3
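The three Stampede invocations differ only in the `--xml-category` value, so they can be driven from a loop. A minimal sketch, reusing only the flags shown above: `stampede_cmd` and the `DRY_RUN` switch are hypothetical names added for illustration, and by default the commands are printed rather than executed.

```shell
# DRY_RUN=1 (the default here, an assumption for safe illustration) only prints.
DRY_RUN="${DRY_RUN:-1}"
TESTLIST="../../src/model/FV3/cime/cime_config/testlist.xml"

# Hypothetical helper: print the create_test command for one test category.
stampede_cmd() {
  echo "./create_test --xml-testlist ${TESTLIST} --xml-machine stampede2-skx --workflow ufs-mrweather_wo_post -j 4 --walltime 03:00:00 --xml-category $1"
}

export UFS_DRIVER=nems
for category in prealpha_p1 prealpha_p2 prealpha_p3; do
  cmd="$(stampede_cmd "$category")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"        # show what would be submitted
  else
    eval "$cmd"        # run from cime/scripts/, as in the instructions above
  fi
done
```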

To compare against an existing baseline, the command looks like the following:

CIME_MODEL=ufs ./create_test --xml-testlist ../../src/model/FV3/cime/cime_config/testlist.xml --xml-machine MACHINE --baseline-root /glade/p/cesmdata/cseg/ufs_baselines --compare BASELINE --workflow ufs-mrweather_wo_post -j 4

To generate a new baseline:

CIME_MODEL=ufs ./create_test --xml-testlist ../../src/model/FV3/cime/cime_config/testlist.xml --xml-machine MACHINE --baseline-root /glade/p/cesmdata/cseg/ufs_baselines -g BASELINE --workflow ufs-mrweather_wo_post -j 4
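The compare and generate commands above differ only in the baseline flag (`--compare` versus `-g`), so one helper can build either. This is a sketch, not part of the app: `baseline_cmd` and the `BASELINE_ROOT` override are hypothetical names, the default root is the Cheyenne path quoted above, and the function only prints the command so it can be inspected before running.

```shell
# Hypothetical helper: baseline_cmd compare|generate MACHINE BASELINE
# Prints the create_test command line; it does not run anything.
baseline_cmd() {
  mode="$1"; machine="$2"; name="$3"
  root="${BASELINE_ROOT:-/glade/p/cesmdata/cseg/ufs_baselines}"
  case "$mode" in
    compare)  flag="--compare" ;;   # compare against an existing baseline
    generate) flag="-g" ;;          # generate a new baseline
    *) echo "usage: baseline_cmd compare|generate MACHINE BASELINE" >&2
       return 1 ;;
  esac
  echo "CIME_MODEL=ufs ./create_test --xml-testlist ../../src/model/FV3/cime/cime_config/testlist.xml --xml-machine ${machine} --baseline-root ${root} ${flag} ${name} --workflow ufs-mrweather_wo_post -j 4"
}

# Example: print the compare command for the GNU baseline folder named earlier in this thread.
baseline_cmd compare cheyenne 20200904_gnu
```

Keeping the mode as an explicit argument makes it harder to overwrite a baseline by accident when only a comparison was intended.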

I'll make a PR for the app soon.

@uturuncoglu
Collaborator

@climbfuji Is there any NCEPLIBS installation on Orion that I could use for the UFS MR app?

@climbfuji
Collaborator

No, Orion is a configurable platform, and it is up to users to install NCEPLIBS themselves following the instructions in the usual place (NCEPLIBS-external, tag ufs-v1.1.0, doc/README_orion_intel.txt).

@uturuncoglu
Collaborator

@climbfuji Okay. Thanks

@ligiabernardet
Collaborator Author

We decided that we will NOT provide baselines for the users to compare the new runs against. There were no baselines with v1.0.0 and we'll keep it that way for ease of maintenance. Users can always create their own baselines.
