[BUG/ISSUE] dry run for data download w/ 13.0.1 not recognizing files are missing past day 2 of month for MERRA-2 global and GEOS-FP nested NA #686

Closed
hmhorow opened this issue Apr 8, 2021 · 12 comments

@hmhorow

hmhorow commented Apr 8, 2021

Description of the problem

I cloned the GEOS-Chem code for version 13.0.1. Following the wiki, I created two run directories (one for GEOS-FP 0.25x0.3125 North America, one for MERRA-2 global 4x5), compiled the code, and did a dry run in each. The GEOS-FP 0.25x0.3125 NA dry run covered July 2019; the MERRA-2 global 4x5 dry run covered January - April 2018. I just noticed that in both cases the meteorological files flagged as required and missing stop on the 2nd day of the first month. The dry-run log files list no files past day 2 of the initial month, so the download scripts driven by those logs did not fetch anything past day 2 either. I looked for reports of a similar issue but haven't found any.

GEOS-FP files listed in the dry run log file, and downloaded for July 2019:
GEOSFP.20190701.A1.025x03125.nc
GEOSFP.20190701.A3cld.025x03125.nc
GEOSFP.20190701.A3dyn.025x03125.nc
GEOSFP.20190701.A3mstC.025x03125.nc
GEOSFP.20190701.A3mstE.025x03125.nc
GEOSFP.20190701.I3.025x03125.nc
GEOSFP.20190702.I3.025x03125.nc

The same happens for MERRA-2, January - April 2018: there is only one folder, for January 2018, with these files:
MERRA2.20180101.A1.4x5.nc4
MERRA2.20180101.A3cld.4x5.nc4
MERRA2.20180101.A3dyn.4x5.nc4
MERRA2.20180101.A3mstC.4x5.nc4
MERRA2.20180101.A3mstE.4x5.nc4
MERRA2.20180101.I3.4x5.nc4
MERRA2.20180102.I3.4x5.nc4

I followed exactly the same workflow as with 13.0.0-rc, which recognized all the needed files: for those dry runs (different time periods and meteorological fields) I was able to download multiple full months with every day of meteorological data. After switching to 13.0.1, both dry runs seem to stop on day 2, after the I3 file.
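One quick way to see how far a dry-run log reaches is to pull the dates out of the met-field file names it mentions. The sketch below assumes file names of the form shown above (`GEOSFP.YYYYMMDD.…` or `MERRA2.YYYYMMDD.…`); the function name `met_dates_in_log` is made up for illustration and is not part of GEOS-Chem.

```shell
# Hypothetical helper: list the distinct YYYYMMDD dates of met-field files
# named in a dry-run log.
#   $1 = dry-run log file
#   $2 = met-field file prefix, e.g. GEOSFP or MERRA2
met_dates_in_log() {
    # First grep keeps only "<prefix>.<8 digits>." fragments,
    # second grep keeps just the 8-digit date; sort -u removes repeats.
    grep -o "$2\.[0-9]\{8\}\." "$1" | grep -o '[0-9]\{8\}' | sort -u
}
```

Called as `met_dates_in_log log.dryrun GEOSFP` (the log name here is a placeholder), a log showing the symptom described above would print only two dates for the first month.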

GEOS-Chem version

13.0.1

Description of modifications

none

Log files

Software versions

  • CMake version: 3.19.2
  • Compilers (Intel or GNU, and version): gcc 9.3.0
  • NetCDF version: 4.7.4 (C); 4.5.3 (Fortran)
@hmhorow hmhorow added the category: Bug Something isn't working label Apr 8, 2021
@yantosca
Contributor

yantosca commented Apr 8, 2021

Thanks @hmhorow. Yes, I can see that the 0.25 x 0.3125 dry run is only picking up a day of data:

HEMCO: REQUIRED FILE NOT FOUND /projects/horowitz_group/GEOSChem_input_data/ExtData/GEOS_0.25x0.3125/GEOS_FP/2019/07/GEOSFP.20190701.A1.025x03125.nc
HEMCO: REQUIRED FILE NOT FOUND /projects/horowitz_group/GEOSChem_input_data/ExtData/GEOS_0.25x0.3125/GEOS_FP/2019/07/GEOSFP.20190701.A3cld.025x03125.nc
HEMCO: REQUIRED FILE NOT FOUND /projects/horowitz_group/GEOSChem_input_data/ExtData/GEOS_0.25x0.3125/GEOS_FP/2019/07/GEOSFP.20190701.A3dyn.025x03125.nc
HEMCO: REQUIRED FILE NOT FOUND /projects/horowitz_group/GEOSChem_input_data/ExtData/GEOS_0.25x0.3125/GEOS_FP/2019/07/GEOSFP.20190701.A3mstC.025x03125.nc
HEMCO: REQUIRED FILE NOT FOUND /projects/horowitz_group/GEOSChem_input_data/ExtData/GEOS_0.25x0.3125/GEOS_FP/2019/07/GEOSFP.20190701.A3mstE.025x03125.nc
HEMCO: REQUIRED FILE NOT FOUND /projects/horowitz_group/GEOSChem_input_data/ExtData/GEOS_0.25x0.3125/GEOS_FP/2019/07/GEOSFP.20190701.I3.025x03125.nc
HEMCO: REQUIRED FILE NOT FOUND /projects/horowitz_group/GEOSChem_input_data/ExtData/GEOS_0.25x0.3125/GEOS_FP/2019/07/GEOSFP.20190702.I3.025x03125.nc

I think it might be related to this commit in HEMCO, which was added in 13.0.1: geoschem/HEMCO@1506fb7. That commit addresses feature request #667.

Long story short, we introduced a new HEMCO flag (EFY, which means "exact year, force error if not found, only read once"). This seems to be the flag for the met field entries in your HEMCO_Config.rc file:

# --- A1 fields ---
* ALBEDO    $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC     ALBEDO   1980-2021/1-12/1-31/*/+30minute EFY xy  1  * -  1 1
* CLDTOT    $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC     CLDTOT   1980-2021/1-12/1-31/*/+30minute EFY xy  1  * -  1 1
* EFLUX     $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC     EFLUX    1980-2021/1-12/1-31/*/+30minute EFY xy  1  * -  1 1
* EVAP      $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC     EVAP     1980-2021/1-12/1-31/*/+30minute EFY xy  1  * -  1 1
* FRSEAICE  $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC     FRSEAICE 1980-2021/1-12/1-31/*/+30minute EFY xy  1  * -  1 1

We needed EFY to force an error if the user's simulation start date didn't match the restart file date. Otherwise the simulation could start anyway and use default background mixing ratios for species (which may not be the desired outcome). Prasad Kasibhatla originally reported this in issue #648.

In any case, try changing the EFY in those met field entries of HEMCO_Config.rc to ECY, which will cause the data to be read continuously. Then do another dry run and see if that picks up the rest of the missing met field files for July 2019 GEOS-FP 0.25 x 0.3125. If that works, we might have to issue another patch version, 13.0.2, and change the default settings for met fields in the template HEMCO_Config.rc files.
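For anyone trying this by hand, the swap can be limited to the met-field entries so the restart entry keeps its own flag. This is only a sketch under two assumptions: each entry sits on a single line of HEMCO_Config.rc, and met-field entries are the lines that contain the literal string `$METDIR`, as in the excerpt above. The function name `swap_met_flag` is hypothetical. It writes to a new file rather than editing in place, so the original config stays untouched.

```shell
# Hypothetical helper: change the time-cycle flag on met-field entries only.
#   $1 = input HEMCO_Config.rc
#   $2 = output file (the input is left unmodified)
#   $3 = flag to replace, e.g. EFY
#   $4 = replacement flag, e.g. ECY
swap_met_flag() {
    # The address /\$METDIR/ selects lines containing the literal "$METDIR".
    # The flag is matched with a space on each side, so a longer flag that
    # merely starts with the same letters is not touched.
    sed -e '/\$METDIR/'" s/ $3 / $4 /" "$1" > "$2"
}
```

For example, `swap_met_flag HEMCO_Config.rc HEMCO_Config.rc.new EFY ECY`, followed by a `diff` of the two files, shows exactly which lines changed before anything is overwritten.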

Cc: @msulprizio @jimmielin @lizziel

@hmhorow
Author

hmhorow commented Apr 8, 2021

Thanks so much for your help!! I will try this out ASAP and get back to you.

@hmhorow
Author

hmhorow commented Apr 8, 2021

I replaced EFY with ECY for all of the met fields (but not the restart file, I kept that EFY).
Here's the revised HEMCO_Config.rc:

I tried this for both the MERRA-2 4x5 and GEOS-FP 0.25x0.3125 dry runs that I mentioned previously. Both of them segfaulted with 'invalid memory reference'. Here are the error files:
MERRA-2: dryrun_201801_201804.e2015124.txt
GEOS-FP: nest_dryrun07.e2015126.txt

For reference, the first error is at line 344 of hco_driver_mod.F90, which starts here:
    IF ( HcoState%Options%HcoWritesDiagn .AND. .NOT. ERROR ) THEN
       CALL HcoDiagn_Write( HcoState, .FALSE., RC )
       CALL HcoDiagn_Write( HcoState, .TRUE., RC )
    ENDIF

I'm confident the only thing I changed was the HEMCO_Config.rc file; I did not change input.geos or the run scripts that I had previously sent along. But I don't know if there's anything wonky happening on our campus cluster.

@yantosca
Contributor

yantosca commented Apr 9, 2021

Thanks @hmhorow. I just spun up a cloud instance to do a dry run and confirmed that EFY is not the proper setting. It should be ECY (use exact year, read continuously).

If I grep for the A1 met field in my dry-run logfile, I now get:

426:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190701.A1.4x5.nc4
531:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190702.A1.4x5.nc4
544:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190703.A1.4x5.nc4
557:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190704.A1.4x5.nc4
570:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190705.A1.4x5.nc4
583:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190706.A1.4x5.nc4
596:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190707.A1.4x5.nc4
609:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190708.A1.4x5.nc4
622:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190709.A1.4x5.nc4
635:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190710.A1.4x5.nc4
648:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190711.A1.4x5.nc4
661:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190712.A1.4x5.nc4
674:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190713.A1.4x5.nc4
687:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190714.A1.4x5.nc4
700:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190715.A1.4x5.nc4
713:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190716.A1.4x5.nc4
726:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190717.A1.4x5.nc4
739:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190718.A1.4x5.nc4
752:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190719.A1.4x5.nc4
765:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190720.A1.4x5.nc4
778:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190721.A1.4x5.nc4
791:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190722.A1.4x5.nc4
804:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190723.A1.4x5.nc4
817:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190724.A1.4x5.nc4
830:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190725.A1.4x5.nc4
843:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190726.A1.4x5.nc4
856:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190727.A1.4x5.nc4
869:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190728.A1.4x5.nc4
882:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190729.A1.4x5.nc4
895:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190730.A1.4x5.nc4
908:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/07/MERRA2.20190731.A1.4x5.nc4
1053:HEMCO: REQUIRED FILE NOT FOUND /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190801.A1.4x5.nc4

So as you can see, ECY is now picking up all the files for the month of my dry run.
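Rather than scanning a long grep listing by eye, the same check can be scripted. The sketch below is an illustration only: `missing_days` is a made-up name, and it assumes file names of the form `<prefix>.YYYYMMDD.<collection>.<rest>` as in the listing above. It prints each day of the month that has no file of the given collection anywhere in the log, so empty output means every day is covered.

```shell
# Hypothetical helper: report days of a month with no file of a given
# met-field collection mentioned in a dry-run log.
#   $1 = dry-run log file
#   $2 = year and month as YYYYMM, e.g. 201907
#   $3 = number of days in that month, e.g. 31
#   $4 = collection name, e.g. A1
missing_days() {
    local day stamp
    day=1
    while [ "$day" -le "$3" ]; do
        # Build the YYYYMMDD stamp with a zero-padded day.
        stamp=$(printf '%s%02d' "$2" "$day")
        # Print the stamp only if no ".<stamp>.<collection>." appears in the log.
        grep -q "\.${stamp}\.$4\." "$1" || echo "$stamp"
        day=$((day + 1))
    done
}
```

A call such as `missing_days log.dryrun 201907 31 A1` (with `log.dryrun` standing in for the actual log name) would list days 2 through 31 for a log with the original problem.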

I think we'll have to issue a patch version 13.0.2 ASAP to fix this. Thanks for bringing this to our attention!

@yantosca
Contributor

yantosca commented Apr 9, 2021

After consulting with the GCST, ECY might not be correct, as it will keep cycling to the closest available, which is not what we want. I think we have to add a new flag in HEMCO to cover the met fields. EFY was correct but then the behavior changed in 13.0.1.

Thanks for your patience, we'll fix it soon!

@hmhorow
Author

hmhorow commented Apr 9, 2021

Thanks Bob!

@yantosca
Contributor

yantosca commented Apr 9, 2021

Hi @hmhorow. For now I have a fix in the bugfix/bmy/TimeCycleEFYO branch. You can get it with:

git clone -b bugfix/bmy/TimeCycleEFYO https://github.com/geoschem/GCClassic.git
cd GCClassic
git submodule update --init --recursive
cd run
./createRunDir.sh  # follow the prompts etc

or if you want to get that into an existing code directory you can do:

cd GCClassic # or whatever your top-level code dir is named
git fetch
git checkout bugfix/bmy/TimeCycleEFYO
git submodule update --init --recursive
cd run
./createRunDir.sh  # follow the prompts

I probably won't have time to push version 13.0.2 tonight but I'll do it 1st thing Monday. This will also give me time to run some integration tests.

Long story short:

  1. The EFY flag has been restored to its prior behavior and is used for the met fields.
  2. The new EFYO flag (EFY + "read only once") is used only for the restart file.
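To confirm that a freshly generated run directory follows this split, the flags can be listed straight from HEMCO_Config.rc. This is a rough sketch, not an official tool: `show_time_cycle_flags` is a hypothetical name, and it assumes that entries are single lines whose sixth whitespace-separated field is the time-cycle flag, as in the met-field and restart entries quoted in this thread.

```shell
# Hypothetical helper: print "<container name> <time-cycle flag>" for every
# met-field entry (lines containing the literal "$METDIR") and every restart
# entry (lines containing "SpeciesRst_") in a HEMCO_Config.rc file.
#   $1 = HEMCO_Config.rc
show_time_cycle_flags() {
    awk '/\$METDIR/ || /SpeciesRst_/ { print $2, $6 }' "$1"
}
```

Piping the result through `awk '{ print $2 }' | sort | uniq -c` gives a count per flag, which makes an unexpected flag on a met-field entry easy to spot.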

@hmhorow
Author

hmhorow commented Apr 9, 2021

Thank you so much @yantosca! I will try out updating from my existing 13.0.1 code directory.

@hmhorow
Author

hmhorow commented Apr 12, 2021

FYI, I tried the bugfix/bmy/TimeCycleEFYO branch: I made a new run directory and did a dry run with no changes to anything except input.geos (to set the time period). The dry run now lists all the files needed for the entire period, and they downloaded successfully based on the log file from that dry run. Thanks!

@yantosca
Contributor

Thanks @hmhorow. I am fixing a minor issue that was uncovered during the integration tests and should be able to push 13.0.2 later today. Thank you for reporting this issue! Good catch!

@yantosca
Contributor

Also note, the integration tests revealed the need to modify this statement in the run directory generation file run/GCClassic/createRunDir.sh so that certain specialty simulations will not halt with an error:

# Sample restarts for several simulations do not contain all species. For those
# simulations, print a warning and change the time cycle option in HEMCO config
# so that we do not force an error if not found (i.e. EFYO --> EY)
if [[ "x${sim_extra_option}" == "xaciduptake"        ||
      "x${sim_extra_option}" == "xmarinePOA"         ||
      "x${sim_extra_option}" == "xcomplexSOA_SVPOA"  ||
      "x${sim_extra_option}" == "xAPM"               ||
      "x${sim_name}"         == "xPOPs"              ||
      "x${sim_name}"         == "xtagCH4"            ||
      "x${sim_name}"         == "xtagO3"             ]]; then
    old="SpeciesRst_?ALL?    \$YYYY/\$MM/\$DD/\$HH EFYO"
    new="SpeciesRst_?ALL?    \$YYYY/\$MM/\$DD/\$HH EY  "
    sed_ie "s|${old}|${new}|" HEMCO_Config.rc

    printf "\n  -- The sample restart provided for this simulation may not"
    printf "\n     contain all species defined in this simulation. Missing"
    printf "\n     species will be assigned default background concentrations."
    printf "\n     Check your GEOS-Chem log file for details. As always, it"
    printf "\n     is recommended that you spin up your simulation to ensure"
    printf "\n     proper initial conditions.\n"
fi
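The excerpt above calls `sed_ie`, which is defined elsewhere in createRunDir.sh and is not shown here. To try the substitution in isolation, a stand-in is needed. The version below is an assumption for illustration, not the script's own definition: it applies one sed expression and writes the result back through a temporary file, which avoids relying on the differing `sed -i` syntax of GNU and BSD sed.

```shell
# Hypothetical stand-in for sed_ie (the real definition is not shown in the
# excerpt): apply one sed expression to a file and write the result back.
#   $1 = sed expression, e.g. "s|old|new|"
#   $2 = file to modify
sed_ie() {
    local tmp rc
    tmp=$(mktemp) || return 1
    # Only overwrite the target if sed itself succeeded.
    sed -e "$1" "$2" > "$tmp" && cat "$tmp" > "$2"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Same search and replacement strings as in the excerpt. In a basic regular
# expression the "?" characters and the mid-pattern "$" match literally.
old="SpeciesRst_?ALL?    \$YYYY/\$MM/\$DD/\$HH EFYO"
new="SpeciesRst_?ALL?    \$YYYY/\$MM/\$DD/\$HH EY  "
```

With these in place, `sed_ie "s|${old}|${new}|" HEMCO_Config.rc` on a copy of the config changes only the restart entry's flag from EFYO to EY. The two trailing spaces in `new` keep the columns after the flag aligned.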

I will push this fix today and run a new set of integration tests. Then I'll release 13.0.2.

@yantosca
Contributor

This issue is now fixed and included in GEOS-Chem 13.0.2. Please update to this version when it is convenient.

I will close out this issue for now but please keep us posted of any other problems you encounter.
