CMORizer for NASA MERRA reanalysis #3039

Merged 14 commits on Aug 18, 2023

@axel-lauer axel-lauer commented Feb 6, 2023

Description

This PR adds scripts for downloading and formatting selected variables from the NASA MERRA reanalysis (not to be confused with MERRA2). As not all variables of interest are contained in the CREATE-IP version of this dataset, the original files provided by NASA are used. These are provided in .hdf-eos format only.

This PR also fixes a small bug in the shared function format_time (esmvaltool/cmorizers/data/formatters/utilities.ncl) that prevented all steps of the time coordinate from being initialized correctly. This fix is needed for this formatting script to work properly.

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.

New or updated data reformatting script

@axel-lauer axel-lauer changed the title CMORier for NASA MERRA reanalysis CMORizer for NASA MERRA reanalysis Feb 6, 2023

@valeriupredoi valeriupredoi left a comment

cheers muchly @axel-lauer 🍺 just a quick observation (pun intended) from me. Also - do we want to tell @glpotter we are also working on this? He is involved with our MERRA2 data handling and CREATE-IP project too 👍

datestr = datestr + tostring(mm)

fname = systemfunc("ls " + input_dir_path + "MERRA???.prod.assim." + \
SOURCEFILE(vv) + datestr + ".hdf")

@valeriupredoi valeriupredoi Feb 16, 2023

I see the file extension is hardwired here, but the downloader also accepts options for nc or nc4. Is there a file handler anywhere else that converts/extracts those to HDF5, or will MERRA never come in as netCDF?

Contributor Author

The original MERRA data are only available in .hdf format. Unfortunately... Since the stuff is archived, I do not expect this to change... But I am happy to change ".hdf" to ".*" if you like that better.

Contributor

aha gotcha! But then I'd not turn on the download of any nc or nc4 files at download point (via cmd line option) since those files will be ignored, if they do get downloaded at all - either that, or add an "*" option here, but in that case will the file handling have to change inside the NCL code ie to load and process nc vs HDF5? (sorry, I still don't speak any NCL 😁 )
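
The filename matching discussed in this thread can be illustrated outside NCL. Below is a minimal Python sketch (not the PR's code) that assumes the pattern from the snippet above. The function names, the example file names and the trailing dot in the source-file string are hypothetical. It shows that with a hardwired `hdf` extension a downloaded `.nc4` file is never picked up, while a `*` extension would match it, so the reader would then have to cope with both formats.

```python
from fnmatch import fnmatchcase


def merra_pattern(source_file, datestr, ext="hdf"):
    """Mirror the pattern the NCL snippet hands to `ls` (without the directory)."""
    return "MERRA???.prod.assim." + source_file + datestr + "." + ext


def select_files(names, source_file, datestr, ext="hdf"):
    """Return the names matching the pattern, sorted for reproducibility."""
    pattern = merra_pattern(source_file, datestr, ext)
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Illustrative file names only; they are not taken from the PR.
example_names = [
    "MERRA100.prod.assim.tavgM_2d_rad_Nx.197901.hdf",
    "MERRA100.prod.assim.tavgM_2d_rad_Nx.197901.nc4",
    "MERRA100.prod.assim.tavgM_2d_rad_Nx.197902.hdf",
]

# Hardwired extension: only the HDF file for the requested month matches.
hdf_only = select_files(example_names, "tavgM_2d_rad_Nx.", "197901")
# Wildcard extension: the nc4 file for the same month matches as well.
any_ext = select_files(example_names, "tavgM_2d_rad_Nx.", "197901", ext="*")
```

With the wildcard, the selection alone no longer guarantees a single input format, which is the concern raised above.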

@axel-lauer

> cheers muchly @axel-lauer 🍺 just a quick observation (pun intended) from me. Also - do we want to tell @glpotter we are also working on this? He is involved with our MERRA2 data handling and CREATE-IP project too 👍

I am aware of this. Unfortunately, the CREATE-IP version of the MERRA data (at least the one that is published to ESGF) does not contain the cloud variables I am interested in...

@remi-kazeroni remi-kazeroni left a comment

Thanks for your contribution, @axel-lauer! The code looks good to me from a technical point of view. I just have 2 minor comments you might want to check. I have run the downloader, cmorizer and recipe_check_obs on data for one year and all works fine.

Before that PR is merged, could you please move the raw and CMORized data to the right folder on Levante? I assume you have already downloaded the large dataset and CMORized all years available.

Edit: the remaining codacy issues can be ignored.

esmvaltool/recipes/examples/recipe_check_obs.yml (outdated, resolved)

; Global attributes
SOURCE = "https://goldsmr3.gesdisc.eosdis.nasa.gov/data/MERRA_MONTHLY/"
REF = "doi: "
Contributor

Does the doi need to be added here?

Contributor Author

I think that is fine, because the actual doi of the variable written to the output file will be added at the time the file is written (that is because every variable might have a different doi). I can see that this is confusing and moved the definition of REF to where the output file is actually written.
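
To illustrate the point about per-variable references, here is a hedged Python sketch (the formatter itself is NCL, and this is not its code). `build_global_attributes`, the attribute keys, the variable names and the placeholder DOI strings are hypothetical. Only `SOURCE` and the `"doi: "` prefix come from the snippet above.

```python
SOURCE = "https://goldsmr3.gesdisc.eosdis.nasa.gov/data/MERRA_MONTHLY/"


def build_global_attributes(variable, doi_by_variable):
    """Assemble the attributes right before writing, so each file gets its own DOI."""
    return {
        "source": SOURCE,
        "reference": "doi: " + doi_by_variable[variable],
    }


# Placeholder DOIs, deliberately not real identifiers.
placeholder_dois = {"clt": "<doi-for-clt>", "pr": "<doi-for-pr>"}
attrs_clt = build_global_attributes("clt", placeholder_dois)
attrs_pr = build_global_attributes("pr", placeholder_dois)
```

Defining the reference next to the write step keeps a single top-level `REF = "doi: "` from looking like an unfinished attribute.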

axel-lauer and others added 2 commits August 17, 2023 07:15
@axel-lauer

> Before that PR is merged, could you please move the raw and CMORized data to the right folder on Levante? I assume you have already downloaded the large dataset and CMORized all years available.

Done.

@hb326 hb326 left a comment

@axel-lauer: looks good from my side! I checked some of the available files on Levante, and also ran the formatter to see if it would work. I only found one little thing that I commented on. Otherwise it is good to be merged in my opinion! :)

base = getenv("cmor_tables")

CMOR_TABLE = getenv("cmor_tables") + \
(/"/cmip5/Tables/CMIP5_" + MIP, "/cmip5/Tables/CMIP5_" + MIP, \ ; 3d asm
Contributor

Here, I think the expression "getenv("cmor_tables")" could be replaced by "base" which is defined two lines earlier.

Contributor Author

Thanks for checking! You are totally right, this is indeed redundant. I deleted the definition of base as this variable is not used anywhere.
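
The redundancy discussed here amounts to reading the environment variable once and reusing the result. A minimal Python sketch of that idea follows. The actual script is NCL and, per the reply above, simply dropped the unused variable. `cmor_table_paths`, the MIP names and the stand-in path are assumptions made for illustration.

```python
import os


def cmor_table_paths(mips, env=None):
    """Look up `cmor_tables` once and build one CMIP5 table path per MIP."""
    env = os.environ if env is None else env
    base = env["cmor_tables"]
    return [base + "/cmip5/Tables/CMIP5_" + mip for mip in mips]


# A stand-in environment keeps the example independent of the real setup.
paths = cmor_table_paths(["Amon", "Amon"], env={"cmor_tables": "/path/to/tables"})
```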

@remi-kazeroni remi-kazeroni requested a review from a team as a code owner August 18, 2023 13:31
@remi-kazeroni remi-kazeroni added this to the v2.10.0 milestone Aug 18, 2023
@remi-kazeroni remi-kazeroni merged commit b244fef into main Aug 18, 2023
1 of 2 checks passed
@remi-kazeroni remi-kazeroni deleted the cmorizer_nasa_merra_ncl branch August 18, 2023 14:03
ehogan added a commit that referenced this pull request Aug 30, 2023
…_RTW

* recipe_test_workflow_prototype: (237 commits)
  Add version to dataset in python example recipe to avoid "Unknown file format" issue on JASMIN (#3322)
  CMORizer for NASA MERRA reanalysis (#3039)
  Add `OBS-maintainers` team to documentation on OBS data maintenance and CMORizer reviews (#3335)
  Fixed provenance tracking for NCL multipanel PNGs (#3332)
  Cmorizer for NOAA-CIRES-20CR v3 reanalysis (clt, clwvi, hus, prw, rlut, rlutcs, rsut, rsutcs) (#3137)
  [Condalock] Update Linux condalock file (#3321)
  Slight refactoring of diag `galytska23/select_variables_for_tigramite.py` for generality and portability (for Changelog v2.10: authors: @valeriupredoi and @egalytska) (#3298)
  Removed recipe_carvalhais14nat from list of broken recipes (#3319)
  add Romain Beucher to CITATION as contributor (#3318)
  update `mamba` version in readthedocs configuration docs builds (#3310)
  [Github Actions] Compress all bash shell setters into one default option per workflow (#3315)
  [condalock] update conda lock creation Github Action workflow and ship updated (bot-generated) conda-lock file (#3307)
  Allow NCL unit conversion `kg s-1` -> `GtC y-1` (#3300)
  Add list of failing recipes for v2.9.0 release (#3294)
  Update diag_shapeselect.py to work with shapely v2 (#3283)
  Update release schedule after release of v2.9.0 (#3289)
  Add merge instructions to release instructions (#3292)
  Made sklearn test backwards-compatible with sklearn < 1.3 (#3285)
  Add release notes for v2.9 (#3266)
  Add release notes for v2.9 (#3266)
  ...
jvegreg pushed a commit that referenced this pull request Jan 14, 2024
Co-authored-by: Rémi Kazeroni <remi.kazeroni@dlr.de>
4 participants