Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compatibility of ERA5 pressure level variable(s) #1136

Closed
fserva opened this issue May 18, 2021 · 22 comments
Closed

compatibility of ERA5 pressure level variable(s) #1136

fserva opened this issue May 18, 2021 · 22 comments
Labels
bug Something isn't working cmor Related to the CMOR standard iris Related to the Iris package
Milestone

Comments

@fserva
Copy link
Contributor

fserva commented May 18, 2021

Hello all, I would need to process (six) hourly ERA5 data in order to prepare input for the zmnam recipe, which requires daily means geopotential height.

I've tried to process it by using https://github.com/ESMValGroup/ESMValTool/blob/master/esmvaltool/recipes/cmorizers/recipe_era5.yml, however an error is raised when building the cube: ERROR [15512] Failed to run fix_metadata([<iris 'Cube' of geopotential / (m**2 s**-2) (time: 124; pressure_level: 19; latitude: 721; longitude: 1440)>, <iris 'Cube' of geopotential / (m**2 s**-2) (time: 124; pressure_level: 19; latitude: 721; longitude: 1440)>, [...], <iris 'Cube' of geopotential / (m**2 s**-2) (time: 124; pressure_level: 19; latitude: 721; longitude: 1440)>], {'project': 'native6', 'dataset': 'ERA5', 'short_name': 'zg', 'mip': '6hrPlev', 'frequency': '6hr', 'check_level': <CheckLevels.DEFAULT: 3>}), likely due to iris This seems to be an iris problem iris.exceptions.CoordinateNotFoundError: 'Expected to find exactly 1 coordinate, but found none.'

I believe this may be due to the missing standard_names, see the header of the raw files:

dimensions:
	longitude = 1440 ;
	latitude = 721 ;
	level = 19 ;
	time = 124 ;
variables:
	float longitude(longitude) ;
		longitude:units = "degrees_east" ;
		longitude:long_name = "longitude" ;
	float latitude(latitude) ;
		latitude:units = "degrees_north" ;
		latitude:long_name = "latitude" ;
	int level(level) ;
		level:units = "hPa" ;
		level:long_name = "pressure_level" ;
	int time(time) ;
		time:units = "hours since 1900-01-01 00:00:00.0" ;
		time:long_name = "time" ;
		time:calendar = "gregorian" ;
	short z(time, level, latitude, longitude) ;
		z:scale_factor = 7.61496044931561 ;
		z:add_offset = 244700.965957275 ;
		z:_FillValue = -32767s ;
		z:missing_value = -32767s ;
		z:units = "m**2 s**-2" ;
		z:long_name = "Geopotential" ;
		z:standard_name = "geopotential" ;

Maybe assigning default attributes would be enough? See e.g. https://groups.google.com/g/scitools-iris/c/9IwC2Rr7xm

Please note the discussion started in: ESMValGroup/ESMValTool#1909

If anyone succeeded cmorizing ERA5 pressure level data, any hint would be useful.

FYI @remi-kazeroni @zklaus @bouweandela

@fserva fserva added bug Something isn't working iris Related to the Iris package cmor Related to the CMOR standard labels May 18, 2021
@bouweandela
Copy link
Member

Could you please post the complete stack trace? Or attach the file run/main_log_debug.txt?

@fserva
Copy link
Contributor Author

fserva commented May 19, 2021

Here it is, thanks.
main_log_debug.txt

@bouweandela
Copy link
Member

bouweandela commented May 27, 2021

I tried to reproduce this with hourly zg and it looks like the crash is due to the axis attribute being set to the empty string, because loading the variable definition from the CMOR table, it picks up zg with an alevel generic coordinate instead of the usual vertical pressure coordinate. This happens because the variable is not present in the E1hr table, so then it searches the other mip tables and ends up picking the wrong one.

@jvegasbsc Could you look into this? We would need some way to select the correct variable from the CMOR table, i.e. one that has e.g. "dimensions": "longitude latitude plev19 time" instead of "dimensions": "longitude latitude alevel time".

Here is an example recipe:

datasets:
  - {dataset: ERA5, project: native6, tier: 3, type: reanaly, version: 1}

diagnostics:

  timeseries:
    description: zg
    variables:
      zg:
        mip: E1hr
        frequency: 1hrPt
        start_year: 1990
        end_year: 1990
    scripts: null

@schlunma
Copy link
Contributor

This is described in #1029 and a possible solution is proposed in #1032.

@bouweandela
Copy link
Member

That looks like a different but related problem. In this case, zg occurs in the CMIP6 CMOR tables with both a generic and a non-generic vertical coordinate, depending on the table. To solve the problem here, it would be sufficient to be able to just specify that you want the version with the non-generic coordinate.

I realized that this is actually already possible, but it's a bit counter intuitive. The following recipe seems to work for me with ERA5 hourly geopotential height on pressure levels. Could you give it a try @fserva? Make sure to replace the frequency with the frequency that you actually need.

datasets:
  - {dataset: ERA5, project: native6, tier: 3, type: reanaly, version: 1}

diagnostics:

  timeseries:
    description: test zg
    variables:
      zg:
        mip: Amon
        frequency: 1hrPt
        start_year: 1990
        end_year: 1990
    scripts: null

@fserva
Copy link
Contributor Author

fserva commented May 28, 2021

Thanks @bouweandela, I've tried your recipe and I needed to do a change adding a documentation section, otherwise it stops immediately.

When using the data I have, I got an error on the time axis ValueError: invalid day number provided in cftime.DatetimeGregorian(2000, 6, 31, 0, 0, 0, 0), which is not present when using cdo (but it makes sense since 31 June does not exist). This seems caused by _fix_monthly_time_coord.

Suggestions or other issues I should consider? Thanks

@fserva
Copy link
Contributor Author

fserva commented May 28, 2021

I've tried various mip and frequency (in my case probably 6hrPt is the correct one?), but that did not help

@bouweandela
Copy link
Member

bouweandela commented May 28, 2021

It looks like that happens because we were only aware of hourly and monthly ERA5 data when we wrote the fix file. It sees that the timestep is more than an hour, so it wrongly assumes that it is monthly. How can I download the six hourly data?

in my case probably 6hrPt is the correct one?

Yes, I think so.

@fserva
Copy link
Contributor Author

fserva commented May 28, 2021

Yes, good point. In fact the highest frequency you can get is hourly, but through the CDS API and the web form one can select specific hours. In this case I took 6-hourly to save some space, but of course I can start from the hourly.

Maybe not a problem, but I see that above you mentioned plev19, while ERA5 provides 37 levels as far as I recall (and also in this case, I sub-selected some levels of interest).

@bouweandela
Copy link
Member

That the pressure levels are different than in the CMOR tables is not a problem, this is almost always the case for non-CMIP data, so long as the coordinate is sufficiently similar it should work.

If you are working on some institutional machine (e.g. Jasmin), you might be able to use ERA5 data that is already available instead of downloading it yourself, see also ESMValGroup/ESMValTool#2183.

@fserva
Copy link
Contributor Author

fserva commented Jul 3, 2021

Thanks Bouwe, sorry for the long time to respond.

Now I have access also to the space esmeval on JASMIN; would you suggest to use the default install (https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/install.html#pre-installed-versions-on-hpc-clusters) or would it be better to make a new local one (since I understand there are changes in the core packages ESMValGroup/ESMValTool#2198)?

@zklaus zklaus added this to the v2.4.0 milestone Jul 5, 2021
@fserva
Copy link
Contributor Author

fserva commented Jul 16, 2021

Hello @bouweandela from what I can see in /gws/nopw/j04/esmeval/obsdata-v2/Tier3/ERA5/ there is only precipitation.
I do not have access to Mistral, so I guess I will simply download the files in their native format for testing.

@bouweandela
Copy link
Member

Apparently there is not enough space on the group workspace on Jasmin, but there is ERA5 data available in another directory, see here for some recent discussion on the topic: #1246.

@fserva
Copy link
Contributor Author

fserva commented Jul 26, 2021

Ok, I see.

In the Jasmin disks there are surface variables and some model level variables under /badc/ecmwf-era5/data/oper/an_ml/, but the geopotential there is the surface orography short z(time, latitude, longitude). If I well recall it is possible to derive geopotential on pressure levels but that is maybe be too complex.

I have some hourly data for geopotential on pressure levels and I am planning to use those.

@zklaus
Copy link

zklaus commented Jul 27, 2021

It may also be possible to augment the data available on jasmin by asking the right people there. For that, it will be very useful to consult the ERA5 documentation to specify exactly which data one is interested in.

@fserva
Copy link
Contributor Author

fserva commented Aug 5, 2021

I made some test with #1136 (comment)

The run finished correctly, even if no file was produced (I guess it is normal). There were some warnings

WARNING There were warnings in variable zg:
plev: does not contain 100000.0 Pa
 plev: does not contain 70000.0 Pa
 plev: does not contain 25000.0 Pa

related to the fact that units of data downloaded from the CDS are hPa. Maybe they can be changed manually in advance, or there is a better procedure?

In the original files I have level = 10, 30, 50, 100, 500, 850 ;

@bouweandela
Copy link
Member

bouweandela commented Aug 6, 2021

Files are produced, but deleted after a successful run. You can set remove_preproc_dir: false if you want to keep the preprocessed files.

The warning is not about the units, but it tells you that some pressure levels that are mandatory for the zg variable's plev coordinate in the CMOR table are not present, this is only to be expected because this is not CMIP data.

@fserva
Copy link
Contributor Author

fserva commented Aug 6, 2021

Thanks Bouwe, your tip worked and the file is produced; levels are converted to Pa as for CMOR standard.
I was trying to obtain the daily means, so I took some definition from the cmorizer:

datasets:
  - {dataset: ERA5, project: native6, tier: 3, type: reanaly, version: 1}

preprocessors:
  daily_mean:
    daily_statistics:
      operator: mean

diagnostics:
  timeseries:
    description: test zg
    variables:
      zg:
        preprocessor: daily_mean
        mip: day
        frequency: 1hrPt
        start_year: 2006
        end_year: 2006
    scripts: null

Update: This worked (note I did not include add_one_day!

Note that mip: day, using mip: E1hr (table for hourly extra variables) failed.

@bouweandela
Copy link
Member

Note that mip: day, using mip: E1hr (table for hourly extra variables) failed.

It probably doesn't matter too much what mip you choose, but to signify your intent I think mip: E1hr would make most sense, because the variable specifies the input data. What was the error message?

@fserva
Copy link
Contributor Author

fserva commented Aug 6, 2021

Yes, I also think so. It's iris complaining about the coordinates (not new likely)...

2021-08-06 14:24:56,052 UTC [499434] ERROR   Failed to run fix_metadata([<iris 'Cube' of geopotential / (m**2 s**-2) (time: 8760; air_pressure: 6; latitude: 181; longitude: 360)>], {'preprocessor': 'daily_mean', 'mip': 'E1hr', 'frequency': '1hrPt', 'start_year': 2006, 'end_year': 2006, 'variable_group': 'zg', 'short_name': 'zg', 'diagnostic': 'timeseries', 'dataset': 'ERA5', 'project': 'native6', 'tier': 3, 'type': 'reanaly', 'version': 1, 'recipe_dataset_index': 0, 'alias': 'ERA5', 'original_short_name': 'zg', 'standard_name': 'geopotential_height', 'long_name': 'Geopotential Height', 'units': 'm', 'modeling_realm': ['atmos'], 'filename': '/work/users/serva/esmvaltool_output/recipe_cmor_era5_20210806_142455/preproc/timeseries/zg/native6_ERA5_reanaly_1_E1hr_zg_2006-2006.nc', 'check_level': <CheckLevels.DEFAULT: 3>})
2021-08-06 14:24:56,980 UTC [499434] INFO    Maximum memory used (estimate): 0.0 GB
2021-08-06 14:24:56,981 UTC [499434] INFO    Sampled every second. It may be inaccurate if short but high spikes in memory consumption occur.
2021-08-06 14:24:56,981 UTC [499434] ERROR   Program terminated abnormally, see stack trace below for more information:
Traceback (most recent call last):
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_main.py", line 433, in run
    fire.Fire(ESMValTool())
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_main.py", line 410, in run
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_main.py", line 104, in process_recipe
    recipe.run()
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_recipe.py", line 1434, in run
    self.tasks.run(max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_task.py", line 674, in run
    self._run_sequential()
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_task.py", line 685, in _run_sequential
    task.run()
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_task.py", line 252, in run
    self.output_files = self._run(input_files)
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/preprocessor/__init__.py", line 482, in _run
    product.apply(step, self.debug)
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/preprocessor/__init__.py", line 351, in apply
    self.cubes = preprocess(self.cubes, step, **self.settings[step])
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/preprocessor/__init__.py", line 295, in preprocess
    result.append(_run_preproc_function(function, items, settings))
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/preprocessor/__init__.py", line 281, in _run_preproc_function
    return function(items, **kwargs)
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/cmor/fix.py", line 114, in fix_metadata
    cube_list = fix.fix_metadata(cube_list)
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/cmor/_fixes/native6/era5.py", line 377, in fix_metadata
    cube = self._fix_coordinates(cube)
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/cmor/_fixes/native6/era5.py", line 329, in _fix_coordinates
    coord = cube.coord(axis=axis)
  File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/iris/cube.py", line 1829, in coord
    raise iris.exceptions.CoordinateNotFoundError(msg)
iris.exceptions.CoordinateNotFoundError: 'Expected to find exactly 1  coordinate, but found none.'

@bouweandela
Copy link
Member

Ok, yes, I forgot about that. This is the problem from the top post and it can indeed be solved by picking a table that does contain zg variable with the right coordinates #1136 (comment).

@fserva
Copy link
Contributor Author

fserva commented Aug 6, 2021

So the mip: day is a good choice in this context. For my needs the problem is then solved, I'll let you close the issue in case. Thanks all for the tips!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cmor Related to the CMOR standard iris Related to the Iris package
Projects
None yet
Development

No branches or pull requests

4 participants