[BUG/ISSUE] Low values of MSA in 13.0.0 #72

Closed
msulprizio opened this issue Jan 5, 2021 · 21 comments

@msulprizio
Contributor

Daniel Jacob wrote:

In comparing GCHP and GCC 13.0 masses I notice that GCHP has essentially no MSA (>99% less than GCC, which seems to have reasonable amounts). There are no comparison plots for MSA but it’s a pretty simple species – it’s produced by DMS+OH and is removed by aerosol deposition. I suspect a simple bug.

This issue was originally discovered in the 1-year benchmarks for 13.0.0 but can also be observed in the 1-month benchmarks as shown below.

From the GCC 13.0.0 vs GCHP 13.0.0 Global Mass Table:

###############################################################################
### Global mass (Gg) at end of simulation (Trop only)                       ###
### Ref = GCC_13.0.0-beta.0; Dev = GCHP_13.0.0-beta.0                       ###
###############################################################################
                              Ref                 Dev      Dev - Ref    % diff
MSA          :          41.327873            1.609690     -39.718183   -96.105

From the GCHP 12.9.0 vs GCHP 13.0.0 Global Mass Table:

###############################################################################
### Global mass (Gg) at end of simulation (Trop only)                       ###
### Ref = GCHP_12.9.0; Dev = GCHP_13.0.0-beta.0                             ###
###############################################################################
                              Ref                 Dev      Dev - Ref    % diff
MSA          :          43.370704            1.609690     -41.761014   -96.289
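
The difference columns in these tables can be reproduced from the Ref and Dev masses. A minimal sketch, assuming the table reports the absolute difference as Dev - Ref and the percent difference as 100 * (Dev - Ref) / Ref; the function name `mass_diff` is hypothetical and is not part of GCPy:

```python
def mass_diff(ref_gg, dev_gg):
    """Return (Dev - Ref, percent difference relative to Ref)."""
    abs_diff = dev_gg - ref_gg
    pct_diff = 100.0 * abs_diff / ref_gg
    return abs_diff, pct_diff


# MSA masses (Gg) copied from the two tables above.
gcc_vs_gchp = mass_diff(41.327873, 1.609690)
gchp_12_vs_13 = mass_diff(43.370704, 1.609690)

print(f"GCC 13.0.0 vs GCHP 13.0.0:  {gcc_vs_gchp[0]:.6f} Gg, {gcc_vs_gchp[1]:.3f} %")
print(f"GCHP 12.9.0 vs GCHP 13.0.0: {gchp_12_vs_13[0]:.6f} Gg, {gchp_12_vs_13[1]:.3f} %")
```

Both comparisons show the same Dev mass, so the deficit is specific to GCHP 13.0.0 rather than to the choice of reference.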
@msulprizio msulprizio added the category: Bug Something isn't working label Jan 5, 2021
@msulprizio
Contributor Author

@lizziel ran internal benchmarks for GCHP alpha tags. See http://ftp.as.harvard.edu/gcgrid/geos-chem/validation/GCHPctm/. This issue dates back to at least 13.0.0-alpha.5, which is the first version for which we have global mass tables.

###############################################################################
### Global mass (Gg) at end of simulation (Trop only)                       ###
### Ref = GCHP_13.0.0-alpha.5; Dev = GCHP_13.0.0-alpha.6                    ###
###############################################################################
                              Ref                 Dev      Dev - Ref    % diff
MSA          :           1.531496            1.531496       0.000000     0.000

@yantosca yantosca self-assigned this Jan 5, 2021
@yantosca
Contributor

I am currently looking into this. Short runs (~1 hr) produce roughly the same amount of MSA in GCHP as an equivalent run in GEOS-Chem Classic. I am now running longer simulations to see whether the loss of MSA is steady over time.

@yantosca
Contributor

Looking at the 10-year benchmarks with GCHP-13.0.0-beta.2 and GCC-13.0.0-beta.2:

###################################################################################
### Global mass (Gg) at end of simulation (Trop + Strat)                        ###
### Ref = GCC_13.0.0-beta.2; Dev = GCHP_13.0.0-beta.2                           ###
###################################################################################
                                    Ref                 Dev     Dev - Ref    % diff
MSA                :          86.406128            1.997560    -84.408568   -97.688

The problem exists in the 1st month of the 10-year spinup for GCHP. I wonder if this is a bad value in the GCHP restart file.

@yantosca
Contributor

It appears that the MSA in the GCHP restart file used to initialize the 10-year spinup is corrupted. I looked at the initial restart files from the GCC_13.0.0-beta.2 and GCHP_13.0.0-beta.2 10-year spinups, i.e.:

  • GCC_13.0.0-beta.2/GEOSChem.Restart.20100101_0000z.nc4
  • GCHP_13.0.0-beta.2/gcchem_internal_checkpoint.restart.20100101_000000.nc4

and this is the plot at the surface level:
[Figure: msa]

which results in the following whole-atmosphere totals:

###################################################################################
### Global mass (Gg) at end of simulation (Trop + Strat)                        ###
### Ref = GCC_13.0.0-beta.2; Dev = GCHP_13.0.0-beta.2                           ###
###################################################################################
                                    Ref                 Dev     Dev - Ref    % diff
MSA                :          86.406128            1.997560    -84.408568   -97.688

It seems like MSA might be the only species affected in this way.
Attachment: GlobalMass_TropStrat.txt

This might have been a regridding issue (if the initial restart file was regridded from a GEOS-Chem Classic output), or it may point to a problem in the code that ended up saving out bad MSA values.

@yantosca
Contributor

yantosca commented Jan 15, 2021

For comparison, here is a plot of the same GCC restart file vs. the GCHP restart file that ships out-of-the-box when you create a GCHP run directory. As you can see, the MSA in the GCHP restart doesn't exactly match, but this may be because it is for a different year/season. But at least the MSA concentrations are comparable to each other.

[Figure: msa2]

So somewhere along the line the MSA in the GCHP restart file used to initialize the GCHP 1-yr and 10-yr simulations was corrupted.

@msulprizio
Contributor Author

Thanks, @yantosca! I think it's safe to say this is an issue in the benchmarks only because of a bad value in the restart file. We can make sure that the default restart file and restart files used in future benchmarks have more appropriate values for MSA. We've added MSA to the Sulfur benchmark category in the GCPy benchmark plots so we can keep an eye on this species.

@yantosca
Contributor

I concur, @msulprizio. I think we can close this issue out, as it is an initialization problem rather than a bug in the code.

@msulprizio
Contributor Author

This issue needs more investigation. The low values for MSA persist in the 10-year GCHP benchmarks. We would expect MSA to rebound even if the 10-year simulation was initialized on 01/01/2010 with a bad value for MSA in the restart file. MSA has a lifetime of ~1 week.
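
To make the expected rebound concrete, here is a minimal sketch. It assumes MSA behaves like a single well-mixed box with a constant source and first-order loss with a 7-day e-folding lifetime, which is a simplification and not the model's actual chemistry. Under that assumption, any deficit relative to steady state decays as exp(-t / tau). The helper `deficit_remaining` is hypothetical:

```python
import math


def deficit_remaining(days, lifetime_days=7.0):
    """Fraction of an initial deficit from steady state left after `days`.

    Assumes dM/dt = S - M / tau with a constant source S, so that
    M(t) - M_ss = (M(0) - M_ss) * exp(-t / tau).
    """
    return math.exp(-days / lifetime_days)


for label, days in [("1 week", 7), ("1 month", 30), ("1 year", 365)]:
    pct = 100.0 * deficit_remaining(days)
    print(f"{label:>8}: {pct:.4f} % of the initial deficit remains")
```

Under this assumption a bad initial value would be almost entirely forgotten within the first month, so a deficit that persists for years points away from the restart file.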

@msulprizio msulprizio reopened this Feb 12, 2021
@lizziel
Contributor

lizziel commented Feb 19, 2021

For what it's worth, I took a look at the 1-month benchmark initial restarts we use from ExtData/GEOSCHEM_RESTARTS. They match very well (see below). Taking a look at the 1-month benchmark run should therefore be sufficient to find the MSA sink.

GC_12.9.0 folder, July 1st 2016 restart comparison:
[Figure: MSA_rst_GC_12 9]

GC_13.0.0 folder, July 1st 2019 restart comparison:
[Figure: MSA_rst_GC_13 0]

@lizziel
Contributor

lizziel commented Mar 9, 2021

I isolated the problem to dry deposition in the model. I ran a 1-month benchmark with all processes on except dry deposition, and the global masses agree much better (4% diff versus -84% in the original benchmark). Differences for other species are notably lower as well, so I think this issue goes beyond MSA. I am looking into what is causing it.

@lizziel
Contributor

lizziel commented Mar 9, 2021

@yantosca, I haven't been able to reproduce the MSA plot you posted earlier in this thread. How did you generate that?

@yantosca
Contributor

yantosca commented Mar 9, 2021

@ELundgren, I think I used the examples/diagnostics/compare_diags.py script to create a sixplot of the restart file.

@lizziel
Contributor

lizziel commented Mar 9, 2021

That should work if the levels are flipped for GCHP, and it looks like they were. The odd thing is the GCC plot you have looks off. Was that for a month other than July?
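
A minimal sketch of the level flip mentioned here, assuming the GCHP checkpoint stores the vertical dimension top-down while the GC-Classic restart stores it surface-first. The column values are made up for illustration and are not read from any file, and `flip_levels` is a hypothetical helper:

```python
def flip_levels(column):
    """Return a vertical column with its level order reversed."""
    return column[::-1]


# Hypothetical 4-level columns (arbitrary units), not real model output.
gcc_column = [9.0, 5.0, 2.0, 0.5]    # index 0 = surface
gchp_column = [0.5, 2.0, 5.0, 9.0]   # index 0 = top of atmosphere

gchp_surface_first = flip_levels(gchp_column)

# After the flip, index 0 is the surface in both columns.
print(gcc_column[0], gchp_surface_first[0])
```

Without the flip, a "surface" plot of the GCHP file would actually show the top model level, which could by itself make a species look nearly absent.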

@yantosca
Contributor

yantosca commented Mar 9, 2021

@lizziel In the SpeciesRst plot I was using the restart files that come from out of the box. I think it might have been a July vs. January thing. But in your plot the restart files line up better.

@lizziel
Contributor

lizziel commented Mar 9, 2021

Hmm, GCHP and GCC should not have different out-of-the-box restarts. I wonder if this is related to this issue just reported about GEOS-Chem using Jan 1 restart instead of July 1. GCHP always links to July 1 restarts for full chem simulations in out-of-the-box run directories. I'll follow that other issue to see what the resolution is so we don't keep discussing in this low MSA issue thread.

@yantosca
Contributor

yantosca commented Mar 9, 2021

I think that was a problem in the 12.9.x series, that the rundir for GCC used a January restart file.

This problem is fixed in the GC_13.0.0 version, as we pull from a 20190701 restart file for both GCC and GCHP.

Speaking of which, there is a minor bug in the ./createRunDir.sh that points to a restart file for 20160701 (I think it happened in a very recent Git merge). I'll fix that. I also will add some code to download the restart file from s3://gcgrid if you are creating a rundir on AWS.

@lizziel
Contributor

lizziel commented Mar 10, 2021

Strange, I don't recall GCC benchmark simulations ever using a January restart file. We've always benchmarked July for 1-month runs and the directory was set up for that by default.

Regardless, back to the MSA issue... I upgraded from GCHP 13.0.0-beta.0 tag to the GCHP 13.0.0-final branch and the issue appears to be fixed. There have been a lot of updates going in lately and I am not sure what fixed it. Perhaps it does not matter as long as it is fixed prior to the release.

We will wait until the 1-month benchmark of 13.0.0-final to officially confirm there are no problems with MSA in GCHP before closing this issue.

@lizziel
Contributor

lizziel commented Mar 18, 2021

This issue was also present in the 1-month benchmark for 13.0.0-rc.2. I did two fresh benchmark runs of that tag with daily checkpoints enabled. The only difference between the two runs is the executable; one was built with gfortran 8.3 and the other ifort 18. The 1-month runs are still in progress. However, comparing the initial restart with the checkpoint on day 4 shows MSA loss is occurring only in the Intel run, not GNU. This explains why I thought the issue was fixed in GCHP 13.0.0-final, since I used GNU while the 1-month benchmark for beta.0 used Intel.

MSA comparison for run using executable built with Intel compiler (ifort18):
[Figure: MSA_intel]

MSA comparison for run using executable built with GNU compiler (gfortran 8.3):
[Figure: MSA_GNU]

I am not sure where the issue is occurring, or if other species are impacted. Once the 1-month runs are done I will produce comparison plots: (1) GNU vs Intel for GCHP, and (2) GCHP vs GCC to see how this improves the benchmark results. I will post the plots on the ftp site.

@yantosca
Contributor

Thanks @lizziel. I tested with GNU as well, which explains why I didn't see any loss of MSA.

@lizziel
Contributor

lizziel commented Mar 19, 2021

1-month run comparisons are now available to view:

  • GCHP with gfortran versus GC-Classic with ifort
  • GCHP with gfortran versus GCHP with ifort

Most noteworthy is the difference in global mass in MSA and DMS between compiling GCHP with gfortran and ifort. MSA global mass with gfortran is 56.5 Gg, versus 8.3 with Intel. DMS global mass with gfortran is 271 Gg, versus 412 with Intel. Compared to GC-Classic, MSA and DMS masses agree much better after building GCHP with gfortran. For reference, GC-Classic had 49.3 Gg MSA and 244 Gg DMS.
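
The masses quoted above can be turned into percent differences against GC-Classic to show how much closer the gfortran build is. A minimal sketch using only the numbers in this comment; the dictionary layout and the `pct_diff` helper are hypothetical:

```python
# Global masses (Gg) quoted in the comment above.
masses = {
    "MSA": {"GC-Classic": 49.3, "GCHP gfortran": 56.5, "GCHP ifort": 8.3},
    "DMS": {"GC-Classic": 244.0, "GCHP gfortran": 271.0, "GCHP ifort": 412.0},
}


def pct_diff(dev, ref):
    """Percent difference of dev relative to ref."""
    return 100.0 * (dev - ref) / ref


results = {}
for species, values in masses.items():
    ref = values["GC-Classic"]
    for build in ("GCHP gfortran", "GCHP ifort"):
        results[(species, build)] = pct_diff(values[build], ref)
        print(f"{species} {build:<14} {results[(species, build)]:+7.1f} % vs GC-Classic")
```

With these numbers the gfortran build lands within roughly 15% of GC-Classic for both species, while the ifort build is off by about -83% for MSA and +69% for DMS.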

I isolated the difference to flexchem in the chemistry component, but did not go further than that.

I wonder if there is an issue with the compiler flags we are using for Intel. This would explain why we didn't see big diffs in MSA and DMS between GCHP and GC-Classic when building with GNU Make (pre-13 versions for GCHP). The early 13-alphas for GCHP used the same core GEOS-Chem code as GCHP 12.7+, but gave different results for these species.

Default flags are set in ESMA_cmake/intel.cmake. I tried changing the optimization from O3 to O2, but that did not seem to change things. More investigation is needed.

@lizziel
Contributor

lizziel commented Apr 13, 2021

This issue is fixed in 13.1 as a side effect of other updates that went in. The fix likely came in via geoschem/geos-chem#663, with an update to treat REAL as REAL*8 in KPP when using the Intel compiler, although this has not been verified.
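
For readers unfamiliar with why the default kind of REAL could matter, here is a minimal sketch of how a single-precision value can drop a small increment that double precision retains. It only illustrates general floating-point behavior; it is not KPP code and does not confirm that this was the cause here. `to_float32` is a hypothetical helper that rounds through a 4-byte float using the standard `struct` module:

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


increment = 1.0e-8  # a small change relative to 1.0

double_sum = 1.0 + increment
single_sum = to_float32(to_float32(1.0) + to_float32(increment))

print("double precision keeps the increment:", double_sum != 1.0)
print("single precision keeps the increment:", single_sum != 1.0)
```

The increment is smaller than the spacing between adjacent 32-bit floats near 1.0, so the single-precision sum rounds back to exactly 1.0.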

@lizziel lizziel closed this as completed Apr 13, 2021