[BUG/ISSUE] Problem with choosing the correct Restart file #648

Closed
pkasibhatla opened this issue Mar 9, 2021 · 9 comments
Labels: category: Bug (Something isn't working)

@pkasibhatla

pkasibhatla commented Mar 9, 2021

Report a GEOS-Chem bug or technical issue

Describe the bug:

I was doing some test runs with V13. My run starts on 20190701. My run directory contains a restart for both 20190701 and 20190101. Though the run starts on 20190701, the run seems to use the restart for 20190101.
I can tell this is the case because when I run for a day and look at the output GEOSChem.SpeciesConc*nc4 file, it looks like the Jan field.

Expected behavior:

Run should use the restart file for 20190701

@yantosca
Contributor

yantosca commented Mar 9, 2021

Thanks @pkasibhatla for writing. Are you on the main branch of the GCClassic wrapper (version 13)? When I created a run directory it pulled a restart file for July.

Can you list the steps that you took to generate the run directory? Also if you can post the rundir.version file from the run directory, that has the commit at which the run directory was created.

@yantosca yantosca mentioned this issue Mar 10, 2021
@yantosca yantosca self-assigned this Mar 10, 2021
@pkasibhatla
Author

pkasibhatla commented Mar 10, 2021

@yantosca, yes I am on the main branch of version 13 - I followed steps 1, 3, and 4 from http://wiki.seas.harvard.edu/geos-chem/index.php/Downloading_GEOS-Chem_source_code_(13.0.0_and_later_versions).

I created the run directory for the fullchem run following the steps in http://wiki.seas.harvard.edu/geos-chem/index.php/Creating_run_directories_for_GEOS-Chem_13.0.0_and_later - except for the last step, where I answered 'n' to 'Do you want to track run directory changes with git?'.

This did not pull the required restart file. So I downloaded the 20190101 and 20190701 fullchem restart files from http://ftp.as.harvard.edu/gcgrid/data/ExtData/GEOSCHEM_RESTARTS/GC_13.0.0/ and renamed them to GEOSChem.Restart.20190101_0000z.nc4 and GEOSChem.Restart.20190701_0000z.nc4.

I then noticed that though I was running for a day starting on 20190701, the run was using the GEOSChem.Restart.20190101_0000z.nc4 restart file. Only when I deleted this restart file, did it use the correct GEOSChem.Restart.20190701_0000z.nc4 restart file.
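One way to rule out a naming mismatch between the start date and the restart file is to build the expected file name and test for it in the run directory. This is only a minimal sketch, assuming the GEOSChem.Restart.YYYYMMDD_0000z.nc4 pattern used in this thread; the two function names are hypothetical and not part of GEOS-Chem.

```shell
# Hypothetical helper: build the restart file name expected for a
# start date (YYYYMMDD) at 00:00 UTC, following the pattern in this thread.
expected_restart_name() {
    printf 'GEOSChem.Restart.%s_0000z.nc4\n' "$1"
}

# Hypothetical helper: succeed only if the expected restart file
# exists in the given run directory.
has_restart_for_date() {
    rundir="$1"
    start_date="$2"
    [ -f "${rundir}/$(expected_restart_name "${start_date}")" ]
}
```

For example, `has_restart_for_date . 20190701 || echo "no restart for start date"` run from the run directory would flag a misspelled or missing July file.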

Here is the rundir.version file:

This run directory was created with /hpc/home/psk9/GCClassic/src/GEOS-Chem/run/GCClassic/createRunDir.sh.

GEOS-Chem repository version information:

  Remote URL: https://github.com/geoschem/geos-chem.git
  Branch: HEAD
  Commit: Update end year for meteorology fields, GFAS, and BCs to 2021 in HEMCO_Config.rc
  Date: Wed Jan 6 17:14:52 2021 -0500
  User: Melissa Sulprizio
  Hash: dc4999053

@yantosca
Contributor

Hi @pkasibhatla. Can you also check if you have a ~/.geoschem/config file? The first time that you run createRunDir.sh it should create this file (if it's not there already). It should have this line:

export GC_DATA_ROOT=/n/holylfs/EXTERNAL_REPOS/GEOS-CHEM/gcgrid/data/ExtData

which is the path to the ExtData directory where all the GEOS-Chem data is stored. If you don't have that, the createRunDir.sh script will not be able to find the restart files.

You can regenerate ~/.geoschem/config by removing it and then running createRunDir.sh again.
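A quick check of whether the data root actually contains a GEOSCHEM_RESTARTS folder could look like the sketch below. The function name is hypothetical; the only assumption taken from this thread is that restart files live under GEOSCHEM_RESTARTS inside the directory named by GC_DATA_ROOT.

```shell
# Hypothetical helper: report whether GEOSCHEM_RESTARTS exists under
# the data root passed as the first argument.
check_restart_root() {
    data_root="$1"
    if [ -z "${data_root}" ]; then
        echo "GC_DATA_ROOT is not set"
        return 1
    fi
    if [ -d "${data_root}/GEOSCHEM_RESTARTS" ]; then
        echo "found ${data_root}/GEOSCHEM_RESTARTS"
    else
        echo "missing ${data_root}/GEOSCHEM_RESTARTS"
        return 1
    fi
}
```

Typical use would be to source the config first (`. ~/.geoschem/config`) and then call `check_restart_root "${GC_DATA_ROOT}"`.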

@pkasibhatla
Author

@yantosca, yes I created this the first time I did this. It has the line
export GC_DATA_ROOT=/work/psk9/Data/ExtData

@yantosca
Contributor

What could be going on is this. In the 13.0.0-rc code, the run directory script will try to pull the restart file from the 12.9.0 folder of ExtData/GEOSCHEM_RESTARTS. This is because the 1-year benchmarks weren't finished when the 13.0.0-rc was released. The script looks like this:

#--------------------------------------------------------------------
# Copy sample restart file to run directory
#--------------------------------------------------------------------
if [[ "x${sim_name}" == "xfullchem" ]]; then
    # Use restart file saved out from latest 1-year benchmark
    sample_rst=${GC_DATA_ROOT}/GEOSCHEM_RESTARTS/GC_12.9.0/GEOSChem.Restart.fullchem.20160701_0000z.nc4
elif [[ ${sim_name} = "TransportTracers" ]]; then
    # Use restart file saved out from latest 1-year benchmark
    sample_rst=${GC_DATA_ROOT}/GEOSCHEM_RESTARTS/GC_12.8.0/GEOSChem.Restart.TransportTracers.20170101_0000z.nc4
else
    sample_rst=${GC_DATA_ROOT}/GEOSCHEM_RESTARTS/v2018-11/initial_GEOSChem_rst.${grid_res}_${sim_name}.nc
fi
if [[ -f ${sample_rst} ]]; then
    cp ${sample_rst} ${rundir}/GEOSChem.Restart.${startdate}_0000z.nc4
else
    printf "\n  -- No sample restart provided for this simulation."
    printf "\n     You will need to provide an initial restart file or disable"
    printf "\n     GC_RESTARTS in HEMCO_Config.rc to initialize your simulation"
    printf "\n     with default background species concentrations.\n"
fi

and the GEOSCHEM_RESTARTS folder on the Harvard server looks like this:

drwxrwsr-x+ 2 msulprizio jacob_gcst  4096 2020-08-27 11:47 GC_12.4.0/
drwxrwsr-x+ 2 msulprizio jacob_gcst  4096 2020-08-27 11:47 GC_12.6.0/
drwxrwsr-x+ 3 msulprizio jacob_gcst  4096 2020-08-28 17:49 GC_12.8.0/
drwxrwsr-x+ 3 msulprizio jacob_gcst  4096 2020-08-28 17:46 GC_12.9.0/
drwxrwsr-x+ 3 msulprizio jacob_gcst  4096 2021-02-10 16:24 GC_13.0.0/
-rw-rw-r--+ 1 msulprizio jacob_gcst  1410 2021-02-10 11:54 README
drwxrwsr-x+ 2 georep     jacob_gcst 20480 2019-06-07 16:55 v2015-09/
drwxrwsr-x+ 3 georep     jacob_gcst 16384 2020-08-25 09:42 v2016-07/
drwxrwsr-x+ 2 msulprizio jacob_gcst 12288 2020-08-03 14:48 v2018-11/
drwxrwsr-x+ 2 msulprizio jacob_gcst  4096 2021-02-10 16:25 v2020-02/

For the 13.0.0-final code (which is still pending benchmark approval), we will update the createRunDir.sh so that it pulls restarts from GEOSCHEM_RESTARTS/GC_13.0.0.

So long story short, you might need to download the ExtData/GEOSCHEM_RESTARTS/GC_12.9.0 folder for use with the 13.0.0-rc code.

@msulprizio msulprizio changed the title [BUG/ISSUE] Problem with choosing the correct Resart file [BUG/ISSUE] Problem with choosing the correct Restart file Mar 10, 2021
@pkasibhatla
Author

I guess what I still don't understand is this:

  1. If I don't have a restart file in my run directory, the run fails;
  2. If I have only GEOSChem.Restart.20190701_0000z.nc4 in my run directory, the run succeeds and seems to use this restart;
  3. If I have only GEOSChem.Restart.20190101_0000z.nc4 or both GEOSChem.Restart.20190101_0000z.nc4 and GEOSChem.Restart.20190701_0000z.nc4, the run seems to use GEOSChem.Restart.20190101_0000z.nc4 incorrectly and the run succeeds.

I don't understand why the run neither fails (since I do not have /work/psk9/Data/ExtData/GEOSCHEM_RESTARTS) nor uses the correct restart file that is available in the run directory. Why does it use the wrong restart file?

@msulprizio
Contributor

Hi @pkasibhatla. Can you attach your HEMCO and GEOS-Chem log files? If you could run a quick simulation with verbose and warnings set to 3 in HEMCO_Config.rc and provide that HEMCO log file that would give us more information. With verbose set to 3 the HEMCO log file should list for each file the dates it finds, preferred date, and selected date.
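Before rerunning, it may help to confirm what the two settings are currently set to. The sketch below is an assumption-laden convenience, not a GEOS-Chem tool: the function name is hypothetical, and it assumes each setting sits on its own line of the form `Verbose: <n>` or `Warnings: <n>` in HEMCO_Config.rc.

```shell
# Hypothetical helper: print the Verbose and Warnings settings found in
# a HEMCO_Config.rc-style file.
# ASSUMPTION: each setting is on its own line, written as "Name: value".
show_hemco_log_levels() {
    config_file="$1"
    grep -E '^[[:space:]]*(Verbose|Warnings):' "${config_file}"
}
```

Running `show_hemco_log_levels HEMCO_Config.rc` in the run directory would then list the two lines to edit.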

If you check the entry in HEMCO_Config.rc for the restart file, by default it is:

* SPC_           ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 SpeciesRst_?ALL?    $YYYY/$MM/$DD/$HH CYS xyz 1 * - 1 1

The C in the time cycle flag CYS tells HEMCO to cycle, that is, to find the closest available date if the file for the simulation date does not exist. I'm wondering if there was a typo in the July restart filename for your original run, which caused HEMCO to select the January restart file as the next closest date available. To remove this behavior, you can change CYS to EY, which tells HEMCO to only use a restart file for the exact date. (Note: The Y tells HEMCO to use the simulation year and not the emissions year, if specified, while the S tells HEMCO to skip a variable that is not found in the restart file, so that species defaults to background concentrations.)
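The flag change described above can be previewed with sed before editing the file by hand. This is a sketch only: the function name is hypothetical, and it assumes the restart entry is the line that mentions GEOSChem.Restart and that the flag is surrounded by single spaces, as in the default entry quoted above.

```shell
# Hypothetical helper: print a copy of a HEMCO_Config.rc-style file with
# the time cycle flag replaced on restart-file entries (e.g. CYS -> EY).
# Only lines mentioning "GEOSChem.Restart." are changed; the input file
# itself is left untouched.
set_restart_time_cycle() {
    config_file="$1"
    old_flag="$2"
    new_flag="$3"
    sed "/GEOSChem\.Restart\./ s/ ${old_flag} / ${new_flag} /" "${config_file}"
}
```

For example, `set_restart_time_cycle HEMCO_Config.rc CYS EY > HEMCO_Config.rc.new` writes the modified copy to a new file, which can be compared with `diff` before replacing the original.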

@pkasibhatla
Author

Hi @msulprizio, I went back and redid everything and things seem to be working ok now. So perhaps I did have the file names wrong. Attached below are 3 geos.log files and 3 HEMCO.log files.

*_1 is the case where the run directory contains only GEOSChem.Restart.20190701_0000z.nc4;

*_2 is where it contains only GEOSChem.Restart.20190101_0000z.nc4

*_3 is where it contains both GEOSChem.Restart.20190101_0000z.nc4 and GEOSChem.Restart.20190701_0000z.nc4.

If you agree that this all looks ok, we can close this issue.

Thanks so much for your help.
geos.log_1.txt
geos.log_2.txt
geos.log_3.txt
HEMCO.log_1.txt
HEMCO.log_2.txt
HEMCO.log_3.txt

@msulprizio
Contributor

Hi @pkasibhatla. This all looks OK to me too.

Also, I pushed a fix to 13.0.1 so that the restart file entries in HEMCO_Config.rc use the EFY time cycle flag. This tells HEMCO to make sure the restart file date matches the simulation date and forces HEMCO to stop with an error if it does not. That should hopefully avoid issues like the one you reported. See GitHub issue #667 for more details on that fix.
