Support for ERA5 data in native format #358

Merged
merged 45 commits into from
Feb 10, 2020

Conversation

bouweandela
Member

@bouweandela bouweandela commented Nov 4, 2019

This pull request makes it possible to use ERA5 data directly in ESMValCore. CMORization is implemented as a fix.

Related to issue ESMValGroup/ESMValTool#1120.

Related ESMValTool PR ESMValGroup/ESMValTool#1432

Tasks

  • Create an issue to discuss what you are going to do, if you haven't done so already (and add the link at the bottom)
  • This pull request has a descriptive title that can be used in a changelog
  • Add unit tests
  • Public functions should have a numpy-style docstring so they appear properly in the API documentation. For all other functions a one line docstring is sufficient.
  • If writing a new/modified preprocessor function, please update the documentation
  • Circle/CI tests pass. Status can be seen below your pull request. If the tests are failing, click the link to find out why.
  • Codacy code quality checks pass. Status can be seen below your pull request. If there is an error, click the link to find out why. If you suspect Codacy may be wrong, please ask by commenting.
  • Please use yamllint to check that your YAML files do not contain mistakes
  • If you make backward incompatible changes to the recipe format, make a new pull request in the ESMValTool repository and add the link below

Closes #447
Also closes #311

@mattiarighi added the enhancement (New feature or request) and cmor (Related to the CMOR standard) labels Nov 11, 2019
@bouweandela
Member Author

bouweandela commented Dec 23, 2019

TODO:

@bouweandela changed the base branch from development to master January 3, 2020 12:36
@Peter9192 self-assigned this Jan 9, 2020
@Peter9192 marked this pull request as ready for review February 1, 2020 11:01
@Peter9192
Contributor

Peter9192 commented Feb 1, 2020

@bouweandela and I addressed the remaining issues above, and this PR is now ready for review.

Verifying the output against the old cmorizer, I found some differences where I think the new version is actually better:

  • The old cmorizer leaves the time units unchanged (hours since 1990.....). The new one uses days since 1850..., which agrees with the CMIP6 coordinate definition from the CMOR table.
  • The old cmorizer writes the long_names for lon and lat without capitals. The new cmorizer uses capitals, in accordance with the CMIP table.
  • uas and vas are not in the CMIP6_E1hr table. Their variable definitions are looked up in another table, but the time coordinate is inferred from the E1hr table (time1 in this case, instead of 'time'). The result is that the new cmorizer does not add bounds (the old cmorizer just adds bounds to all coordinates). I think the new behaviour makes more sense, because uas and vas are instantaneous variables, but we can change this (e.g. just add bounds to all vars like the old cmorizer) if people disagree.
  • The new cmorizer addresses "time values after daily_statistics (or monthly_statistics)" (#398) by shifting the time points half an hour back. As a result, the time coordinate ends up with one point fewer: 1 January 00:00 is moved to the previous year, and the last point, 31 December 23:30, cannot be inferred because it would have to come from the next year's data.
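The effect described in the last bullet can be reproduced with a minimal sketch using only the standard library. This is not the code from the fix; the year (2001), the function name `shift_to_interval_midpoints`, and the assumption that a yearly file holds hourly time stamps from 1 January 00:00 through 31 December 23:00 are all chosen here for illustration.

```python
from datetime import datetime, timedelta


def shift_to_interval_midpoints(times, step=timedelta(hours=1)):
    """Move each time stamp back by half a step (hypothetical helper)."""
    return [t - step / 2 for t in times]


# Assumption: one non-leap year of hourly time stamps, 00:00 ... 23:00 each day.
year = 2001
hourly = [datetime(year, 1, 1) + timedelta(hours=h) for h in range(365 * 24)]

shifted = shift_to_interval_midpoints(hourly)

# Keep only the shifted points that still fall inside the requested year.
in_year = [t for t in shifted if t.year == year]

print(len(hourly), len(in_year))  # one point is lost
print(shifted[0])                 # first point now lies in the previous year
print(in_year[-1])                # last point available within the year
```

Under these assumptions the first shifted point falls on 31 December of the previous year, and the point that would close the year can only come from the following year's data.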

PS: Failing circleci tests are addressed in #452

@Peter9192 requested a review from mattiarighi February 1, 2020 11:02
@mattiarighi
Contributor

Asking @valeriupredoi to have a look at the changes in _data_finder.py. 🍺

@mattiarighi
Contributor

All variables successfully tested, except monthly/orog, see main_log.txt.

@Peter9192
Contributor

All variables successfully tested, except monthly/orog, see main_log.txt.

Thanks @mattiarighi ! I removed monthly/orog because all we use it for is to create an fx variable out of it, and since we're doing that for hourly already, we don't need it in daily and monthly as well.

@mattiarighi
Contributor

@valeriupredoi are you ok with this?

@valeriupredoi
Contributor

@valeriupredoi are you ok with this?

looking at it right now, hold yer horses 🐴

Contributor

@valeriupredoi valeriupredoi left a comment

OK a few minor comments - has @jvegasbsc reviewed this PR yet?

esmvalcore/cmor/_fixes/native6/era5.py (outdated)
esmvalcore/cmor/_fixes/native6/era5.py
esmvalcore/cmor/_fixes/native6/era5.py
esmvalcore/cmor/fix.py (outdated)
esmvalcore/preprocessor/_io.py
esmvalcore/_data_finder.py (outdated)
esmvalcore/_data_finder.py
@Peter9192
Contributor

I ran again with Bouwe's most recent changes and everything works fine. Think it can be merged now.

@mattiarighi merged commit 26b0484 into master Feb 10, 2020
@mattiarighi deleted the native-era5 branch February 10, 2020 15:16
@bascrezee
Contributor

bascrezee commented Feb 14, 2020

Nice work @bouweandela, @Peter9192 and others. I want to get started with this for the soil moisture variable (monthly data as provided by the CDS). Could you very briefly help me get started? E.g.:

-where to put the data? (we used to put download instructions at top of the cmorization scripts, where to put them for native6? --> I would be a big fan of having this included in ESMValCore)
-how to specify the dataset in the recipe?

Thanks.

@bouweandela
Member Author

Hi Bas,

where to put the data? (we used to put download instructions at top of the cmorization scripts, where to put them for native6? --> I would be a big fan of having this included in ESMValCore)

I guess we (again) failed to provide enough documentation ;-( The native6 project can be configured in the config-user.yml and config-developer.yml files with a DRS, similar to other projects. We still need to make it possible to have a dataset-specific DRS, so we can have more than just ERA5 under the native6 project:
https://github.com/ESMValGroup/ESMValTool/blob/master/config-user-example.yml#L15

native6:
  cmor_strict: false
  input_dir:
    default: 'Tier{tier}/{dataset}'
    era5cli: 'Tier{tier}/{dataset}'
  input_file:
    default: '{project}_{dataset}_{type}_{version}_{mip}_{short_name}[_.]*nc'
    era5cli: 'era5_{era5_name}*{era5_freq}.nc'
  output_file: '{project}_{dataset}_{type}_{version}_{mip}_{short_name}'
  cmor_type: 'CMIP6'
  cmor_default_table_prefix: 'CMIP6_'
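
To give a feel for how an input_file entry such as the era5cli one might be turned into a file search, here is a rough sketch. It is not ESMValCore's data finder: the helper `expand_pattern`, the facet values ('2m_temperature', 'hourly') and the candidate file names are all assumptions made up for this illustration.

```python
from fnmatch import fnmatchcase


def expand_pattern(template, facets):
    """Fill the {placeholders} of a DRS template (hypothetical helper)."""
    return template.format(**facets)


# Template copied from the era5cli input_file entry above.
template = 'era5_{era5_name}*{era5_freq}.nc'

# Assumed facet values, for illustration only.
facets = {'era5_name': '2m_temperature', 'era5_freq': 'hourly'}

pattern = expand_pattern(template, facets)

# Hypothetical file names; the year sits in the middle of the name,
# which the '*' wildcard in the pattern absorbs.
candidates = [
    'era5_2m_temperature_1990_hourly.nc',
    'era5_2m_temperature_1990_monthly.nc',
    'era5_total_precipitation_1990_hourly.nc',
]

matches = [name for name in candidates if fnmatchcase(name, pattern)]
print(pattern)
print(matches)
```

Only the first candidate matches here, since the other two differ in frequency or variable name.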

how to specify the dataset in the recipe?

For an example recipe using ERA5, see here: https://github.com/ESMValGroup/ESMValTool/blob/master/esmvaltool/recipes/cmorizers/recipe_era5.yml

Development

Successfully merging this pull request may close these issues.

Read ERA5 data in native format
Make datafinder understand dates in the middle of the file name
6 participants