Exact restart fail on test with update to fates_s1.4.1_a3.0.0 #315

Closed
ekluzek opened this issue Jan 12, 2018 · 7 comments
Labels
science: bug Bugs that are specific to the implementation of a scientific model

Comments

@ekluzek
Collaborator

ekluzek commented Jan 12, 2018

An exact restart test fails after I updated the clm fates branch to fates_s1.4.1_a3.0.0. The test is ERS_D_Mmpi-serial_Ld5.1x1_brazil.I2000Clm50FatesGs.yellowstone_pgi.clm-fates. The same test passes with either the intel or nag compiler on hobart, but it also fails with the pgi compiler on hobart. Many fields differ between the base run and the restart run:

RMS AREA_PLANT 3.4742E-02 NORMALIZED 6.2319E-01
RMS AREA_TREES 3.4742E-02 NORMALIZED 6.2319E-01
RMS BTRAN 1.1822E-02 NORMALIZED 6.2323E-01
RMS GPP 1.0954E-08 NORMALIZED 1.8167E-02
RMS H2OSOI 1.6319E-05 NORMALIZED 9.6514E-05
RMS LITTER_IN 4.5475E-13 NORMALIZED 1.3018E-06
RMS LITTER_OUT 5.7231E-11 NORMALIZED 6.2319E-01
RMS NBP 1.4486E-08 NORMALIZED 1.5750E-02
RMS NEP 1.4486E-08 NORMALIZED 1.5750E-02
RMS NPP 1.3843E-08 NORMALIZED 4.7671E-02
RMS TLAI 1.5066E-03 NORMALIZED 1.9449E-02
RMS TOTLITC 3.4925E-10 NORMALIZED 1.7393E-06
RMS T_SCALAR 3.7401E-05 NORMALIZED 7.7218E-05
RMS AREA_PLANT 3.5480E-02 NORMALIZED 6.4070E-01
RMS AREA_TREES 3.5480E-02 NORMALIZED 6.4070E-01
RMS BTRAN 1.1685E-02 NORMALIZED 6.4141E-01
RMS GPP 9.6497E-09 NORMALIZED 1.8454E-02
RMS H2OSOI 6.3526E-05 NORMALIZED 3.8013E-04
RMS LITTER_IN 1.0232E-12 NORMALIZED 2.9305E-06
RMS LITTER_OUT 8.2481E-11 NORMALIZED 6.5991E-01
RMS NBP 3.9211E-08 NORMALIZED 4.4221E-02
RMS NEP 3.9211E-08 NORMALIZED 4.4221E-02
RMS NPP 3.6849E-08 NORMALIZED 9.3921E-02
RMS PFTbiomass 6.4373E-04 NORMALIZED 4.4718E-05
RMS TOTECOSYSC 1.4648E-03 NORMALIZED 1.5571E-06
RMS TOTLITC 2.1305E-06 NORMALIZED 5.7711E-03
RMS TOTSOMC 1.2207E-04 NORMALIZED 1.3388E-07
RMS T_SCALAR 1.5954E-04 NORMALIZED 3.0967E-04
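With this many fields differing, it can help to sort them by normalized RMS to see which ones diverge most. The following is a minimal sketch that assumes the comparison output has been reduced to lines of the form `RMS <field> <rms> NORMALIZED <normalized>`, as pasted above. The names `parse_rms_lines` and `rank_by_normalized` are hypothetical helpers, not part of any CLM or FATES tool.

```python
import re

# Matches lines such as "RMS AREA_PLANT 3.4742E-02 NORMALIZED 6.2319E-01".
# Extra whitespace between columns is tolerated.
RMS_LINE = re.compile(
    r"^\s*RMS\s+(?P<field>\S+)\s+(?P<rms>\S+)\s+NORMALIZED\s+(?P<norm>\S+)\s*$"
)


def parse_rms_lines(text):
    """Return a list of (field, rms, normalized_rms) tuples found in text."""
    rows = []
    for line in text.splitlines():
        match = RMS_LINE.match(line)
        if match:
            rows.append(
                (
                    match.group("field"),
                    float(match.group("rms")),
                    float(match.group("norm")),
                )
            )
    return rows


def rank_by_normalized(rows):
    """Sort rows so the largest normalized RMS difference comes first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


# A few lines copied from the output above.
sample = """\
RMS AREA_PLANT 3.4742E-02 NORMALIZED 6.2319E-01
RMS BTRAN 1.1822E-02 NORMALIZED 6.2323E-01
RMS GPP 1.0954E-08 NORMALIZED 1.8167E-02
RMS LITTER_IN 4.5475E-13 NORMALIZED 1.3018E-06
"""

ranked = rank_by_normalized(parse_rms_lines(sample))
for field, rms, norm in ranked:
    print(f"{field:12s} rms={rms:.4e} normalized={norm:.4e}")
```

In the sample, BTRAN and AREA_PLANT sort to the top with normalized differences above 0.6, while LITTER_IN sits near 1e-06.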

@ekluzek ekluzek added the science: bug Bugs that are specific to the implementation of a scientific model label Jan 12, 2018
@ekluzek
Collaborator Author

ekluzek commented Jan 12, 2018

The test with fates_s1.3.1_a2.0.0_n02_clm4_5_17_r265 passed the same exact restart test. On hobart_pgi, the test fails for fates_s1.4.0_a3.0.0_n04_clm4_5_17_r265 and passes for fates_s1.3.1_a2.0.0_n03_clm4_5_17_r265, so the changes between those two tags cause the issue. There are some small changes in clm there, as well as the update from fates_s1.3.1_a2.0.0 to fates_s1.4.0_a3.0.0_rev2.
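The pass/fail pattern across tags amounts to a bisection. Here is a minimal sketch of that search, assuming the tags are ordered oldest to newest, the first one passes, the last one fails, and the failure persists once introduced. `first_failing` is a hypothetical helper, and `passes` here is a stub that looks up the results reported in this comment; in practice it would run the ERS test against each tag.

```python
def first_failing(tags, passes):
    """Return the first tag in the ordered list for which passes(tag) is False.

    Assumes tags[0] passes, tags[-1] fails, and that every tag after the
    first failing one also fails.
    """
    lo, hi = 0, len(tags) - 1  # invariant: tags[lo] passes, tags[hi] fails
    if not passes(tags[lo]) or passes(tags[hi]):
        raise ValueError("need a passing first tag and a failing last tag")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(tags[mid]):
            lo = mid
        else:
            hi = mid
    return tags[hi]


# Tags and results as reported in this comment (oldest first).
tags = [
    "fates_s1.3.1_a2.0.0_n02_clm4_5_17_r265",
    "fates_s1.3.1_a2.0.0_n03_clm4_5_17_r265",
    "fates_s1.4.0_a3.0.0_n04_clm4_5_17_r265",
]
reported = {tags[0]: True, tags[1]: True, tags[2]: False}


def passes(tag):
    # Stub: returns the recorded result instead of running the test.
    return reported[tag]


print(first_failing(tags, passes))
```

With only three tags the search is trivial, but the same loop keeps the number of test runs logarithmic when there are many intermediate tags.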

@rgknox
Contributor

rgknox commented Jan 12, 2018

@ekluzek, when fates_s1.3.1_a2.0.0_n02_clm4_5_17_r265 passes, which fates tag is used for that one? Is it https://github.com/NGEET/fates/releases/tag/sci.1.3.1_api.2.0.0?

@ekluzek
Collaborator Author

ekluzek commented Jan 12, 2018

@rgknox yes, it uses fates_s1.3.1_a2.0.0, which is sci.1.3.1_api.2.0.0. But I updated an earlier comment to point to the version that introduces the failure, which is the update from fates_s1.3.1_a2.0.0 to fates_s1.4.0_a3.0.0. I don't think the CLM changes are likely to be the problem. I looked through the differences in fates, especially for new variables that aren't being set. The diff is large (over 4k lines), so it's difficult to say. But I did notice that the variable trunk_product, in subroutine CWD_Input in biogeochem/EDPhysiologyMod.F90, isn't set for all conditions.

@ekluzek
Collaborator Author

ekluzek commented Jan 12, 2018

In the version in fates_clm, the compset and testmods names are different. But I reproduced a failure with fates-clm at clm4_5_15_r234_v1.1.0_fatesAPI_v3.0.0 and fates at sci.1.4.1_api.3.0.0 for ERS_D_Mmpi-serial_Ld5.1x1_brazil.ICLM45ED.hobart_pgi.clm-edTest. This also shows that the problem isn't limited to CLM50, as in my original test, but also affects CLM45 and the older CLM version that's in fates-clm.
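To make it easier to see how two test names differ, the dot-separated parts can be split into test type and modifiers, grid, compset, machine, compiler, and testmods. This is a minimal sketch, assuming every name has exactly five dot-separated parts and that the compiler follows the last underscore in the machine part. `ParsedName`, `parse_test_name`, and `differing_fields` are hypothetical helpers, not the test framework's own parser.

```python
from typing import NamedTuple, Tuple


class ParsedName(NamedTuple):
    test_type: str
    modifiers: Tuple[str, ...]
    grid: str
    compset: str
    machine: str
    compiler: str
    testmods: str


def parse_test_name(name):
    """Split TYPE_MODS.GRID.COMPSET.MACHINE_COMPILER.TESTMODS into its parts."""
    parts = name.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dot-separated parts, got {len(parts)}")
    case, grid, compset, machine_compiler, testmods = parts
    test_type, *modifiers = case.split("_")
    # The compiler is whatever follows the last underscore.
    machine, _, compiler = machine_compiler.rpartition("_")
    return ParsedName(
        test_type, tuple(modifiers), grid, compset, machine, compiler, testmods
    )


def differing_fields(a, b):
    """Return the names of the fields on which two parsed names differ."""
    return [field for field in a._fields if getattr(a, field) != getattr(b, field)]


original = parse_test_name(
    "ERS_D_Mmpi-serial_Ld5.1x1_brazil.I2000Clm50FatesGs.yellowstone_pgi.clm-fates"
)
reproduced = parse_test_name(
    "ERS_D_Mmpi-serial_Ld5.1x1_brazil.ICLM45ED.hobart_pgi.clm-edTest"
)

print(differing_fields(original, reproduced))
# ['compset', 'machine', 'testmods']
```

The two failing tests share the test type, modifiers, grid, and compiler, and differ only in compset, machine, and testmods.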

@billsacks
Member

billsacks commented Mar 8, 2018

In my testing for ESCOMP/CTSM#311 (using billsacks/ctsm@f9ff7cf) I got

FAIL ERS_D_Mmpi-serial_Ld5.1x1_brazil.I2000Clm45FatesGs.cheyenne_intel.clm-Fates COMPARE_base_rest

The differences between the base and restart run are in just one field:

 RMS NPP_BY_AGE                       3.2910E-07            NORMALIZED  8.1129E-02

I think the only relevant difference in this branch is that I am using a different fates initial conditions file, which is likely substantially different from the one that had been used before (see notes in the above-referenced PR for details).

@ekluzek and @rgknox do you think it's reasonable to attribute that failure to this issue, and to add this test to the ExpectedFails list in CTSM?

(For details, see /glade/scratch/sacks/tests_0306-2105c/ERS_D_Mmpi-serial_Ld5.1x1_brazil.I2000Clm45FatesGs.cheyenne_intel.clm-Fates.GC.0306-2105c_i on cheyenne.)

@rgknox
Contributor

rgknox commented Mar 25, 2019

Ok, putting the pieces together here.
My take is that maybe we should mark this as an expected fail, and see how these tests react to a more recent version of fates.
Is it true that only the 1x1 brazil test is reporting these errors, and that it is the only fates test that uses a finidat file?

@glemieux
Contributor

Looking at the fates expected failures list, these tests pass on the I2000Clm50FatesCru version of the tests listed (we don't use the Clm45 version of the tests anymore). @ekluzek do you think it's OK to close this out?

@rgknox rgknox closed this as completed May 19, 2022