To provide cloud-based access to ERA5 reanalysis data, Intertrust is working in conjunction with the AWS Public Dataset Program to publish and maintain regular updates of ERA5 data in S3.
This documentation outlines the dataset's details, available parameters, location and structure on S3, and includes examples of how to access and work with the data.
Please refer to the ECMWF website for the official ERA5 data documentation.
For the list of dataset updates and changes, please refer to the Changelog file.
ERA5 climate reanalysis provides a numerical assessment of the modern climate. It is produced by a process similar to regular numerical weather forecasting: a data assimilation and forecast loop that takes into account most of the available meteorological observations and analyses them with a state-of-the-art numerical model, producing a continuous, spatially consistent, and homogeneous dataset.
The dataset provides all essential atmospheric meteorological parameters, including air temperature, pressure, and wind at different altitudes, along with surface parameters such as rainfall and soil moisture content, and sea parameters such as sea-surface temperature and wave height. ERA5 provides data at a considerably higher spatial and temporal resolution than its predecessor, ERA-Interim. ERA5 consists of a high-resolution version with 31 km horizontal resolution and a reduced-resolution ensemble version with 10 members.
Data is currently available from 1979 onward and is updated monthly. As ECMWF is moving towards more frequent data updates, the Intertrust team will work to match the data refresh with the ECMWF source.
Source | ECMWF WebAPI |
Category | Climate Reanalysis |
Format | NetCDF |
License | Generated using Copernicus Climate Change Service Information 2018. See http://apps.ecmwf.int/datasets/licences/copernicus/ for additional information. |
Storage | Amazon S3 |
Location | Amazon Resource Name (ARN): arn:aws:s3:::era5-pds; AWS Region: us-east-1; URL: http://era5-pds.s3.amazonaws.com/ |
Update Frequency | New data is published monthly. The ERA5 Public Release Plan is available at http://climate.copernicus.eu/products/climate-reanalysis |
The table below lists the 18 ERA5 variables that are available on S3. All variables are surface or single level parameters sourced from the HRES sub-daily forecast stream.
Variable names differ slightly from those used by ECMWF. An explanation of how the variable names are derived is available here: https://github.com/planet-os/notebooks/blob/master/aws/variables_name_derivation.md
Variable Name | File Name | Variable type (fc/an) |
---|---|---|
10 metre U wind component | eastward_wind_at_10_metres.nc | an |
10 metre V wind component | northward_wind_at_10_metres.nc | an |
100 metre U wind component | eastward_wind_at_100_metres.nc | an |
100 metre V wind component | northward_wind_at_100_metres.nc | an |
2 metre dew point temperature | dew_point_temperature_at_2_metres.nc | an |
2 metre temperature | air_temperature_at_2_metres.nc | an |
2 metres maximum temperature since previous post-processing | air_temperature_at_2_metres_1hour_Maximum.nc | fc |
2 metres minimum temperature since previous post-processing | air_temperature_at_2_metres_1hour_Minimum.nc | fc |
Mean sea level pressure | air_pressure_at_mean_sea_level.nc | an |
Sea surface temperature | sea_surface_temperature.nc | an |
Mean wave period | sea_surface_wave_mean_period.nc | |
Mean direction of waves | sea_surface_wave_from_direction.nc | |
Significant height of combined wind waves and swell | significant_height_of_wind_and_swell_waves.nc | |
Snow density | snow_density.nc | an |
Snow depth | lwe_thickness_of_surface_snow_amount.nc | an |
Surface pressure | surface_air_pressure.nc | an |
Surface solar radiation downwards | integral_wrt_time_of_surface_direct_downwelling_shortwave_flux_in_air_1hour_Accumulation.nc | fc |
Total precipitation | precipitation_amount_1hour_Accumulation.nc | fc |
The date and time of the variable data is the valid time, with a mapping from forecast time to valid time corresponding to that outlined in Table 0 of the ECMWF ERA5 documentation. ERA5 can have two different versions of some variables: analysis or forecast. An analysis is a field in which observations from the same timestep are assimilated into the model data. A forecast, by contrast, is purely a model calculation. For example, variables such as 2 metre temperature and surface pressure are analysed at each timestep because enough near-surface observations are available. Precipitation, on the other hand, is an example of a forecast field. A full model analysis cycle is performed every 12 hours, at 06:00 and 18:00 UTC. For forecast fields, the first 12 forecast hours of each forecast run are used; the runs start at 06:00 and 18:00 UTC. A sample highlighting key times of this mapping is included below for reference.
Valid Date | Valid Time | Forecast Date | Forecast Run | Step |
---|---|---|---|---|
date | 00:00 | date - 1 | 18:00 | 6 |
date | 06:00 | date - 1 | 18:00 | 12 |
date | 07:00 | date | 06:00 | 1 |
date | 18:00 | date | 06:00 | 12 |
date | 19:00 | date | 18:00 | 1 |
date | 23:00 | date | 18:00 | 5 |
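The mapping in the table above can be sketched in Python. This is a minimal illustration, assuming hourly valid times in UTC and the rule stated above (runs at 06:00 and 18:00 UTC, with steps 1 to 12 used from each run). The function name `forecast_run_for` is hypothetical and not part of any ERA5 tooling.

```python
from datetime import datetime, timedelta


def forecast_run_for(valid_time):
    """Return (forecast run start, step in hours) for an hourly valid time in UTC."""
    if valid_time.minute or valid_time.second or valid_time.microsecond:
        raise ValueError("valid_time must fall on a whole hour")
    day = valid_time.replace(hour=0)
    # Candidate runs: 18:00 and 06:00 on the same date, then 18:00 on the previous date.
    candidates = (day.replace(hour=18), day.replace(hour=6), day - timedelta(hours=6))
    for run in candidates:
        step = (valid_time - run) // timedelta(hours=1)
        # Only the first 12 forecast hours of each run are used.
        if 1 <= step <= 12:
            return run, step
    raise ValueError("no forecast run covers this valid time")


run, step = forecast_run_for(datetime(2008, 1, 2, 7))
print(run.isoformat(), step)
```

Each whole hour falls within steps 1 to 12 of exactly one run, so the order in which the candidates are tried does not affect the result.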
If there are specific variables you would like to recommend for future inclusion, please contact datahub@intertrust.com.
The ERA5 dataset has been transformed to optimize access by specific variables and temporal ranges. To accommodate this, data is divided into distinct NetCDF granules organized by year, month, and variable name.
The data is structured as follows:
/{year}/{month}/main.nc
               /data/{var1}.nc
                    /{var2}.nc
                    /{....}.nc
                    /{varN}.nc
where year is expressed as four digits (YYYY) and month as two digits (MM). Individual data variables (var1 through varN) are named according to the NetCDF CF standard name conventions, plus any applicable additional information such as the vertical coordinate.
Granule variable structure and metadata attributes are stored in main.nc. This file contains coordinate and auxiliary variable data, and is also annotated using NetCDF CF metadata conventions.
A sample path for air temperature would take the following form:
/2008/01/data/air_temperature_at_2_metres.nc
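As a small sketch of this layout, the paths can be assembled from a year, a month, and a file name stem taken from the variable table. The helper names `granule_path` and `main_path` are hypothetical and used only for illustration.

```python
def granule_path(year, month, variable):
    """Build the path of one variable granule, e.g. /2008/01/data/<variable>.nc."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # Year is zero-padded to four digits and month to two, as in the layout above.
    return f"/{year:04d}/{month:02d}/data/{variable}.nc"


def main_path(year, month):
    """Build the path of the main.nc file holding coordinates and metadata."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return f"/{year:04d}/{month:02d}/main.nc"


print(granule_path(2008, 1, "air_temperature_at_2_metres"))
```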
To provide a means for correcting potential processing errors in individual granule files, bucket versioning will be used. This solution allows for consistent S3 file paths for end users of the data, and also allows for recovery of previous file versions if necessary. Should an issue occur that requires the rewriting of data granules, we will publish details of the incident as well as the affected files on the ERA5 dataset page.
In the unlikely event that a major update impacting the data structure or its dimensionality be required, such changes would be published as a distinct version of the dataset.
The data is publicly available in the ERA5 S3 bucket (era5-pds) and may be directly accessed there. Please note that the best transfer speeds will be achieved by accessing the data from an EC2 instance located in the same AWS region as the S3 bucket (us-east-1).
Data may be accessed via HTTP using the S3 REST API. To make a GET request, use the bucket name and the full key name for the object. For example, to download air temperature at 2 metres for January 2008, submit a GET request to the following URL: http://era5-pds.s3.amazonaws.com/2008/01/data/air_temperature_at_2_metres.nc
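A GET request of this kind could be issued with the Python standard library, as in the minimal sketch below. It assumes the bucket URL given above is reachable from your environment. The helper names `object_url` and `download_object` are hypothetical.

```python
import shutil
import urllib.request

BASE_URL = "http://era5-pds.s3.amazonaws.com"  # bucket URL from the dataset description


def object_url(key):
    """Join the bucket URL and an object key, tolerating a leading slash in the key."""
    return f"{BASE_URL}/{key.lstrip('/')}"


def download_object(key, destination):
    """Stream one object to a local file with a plain HTTP GET request."""
    with urllib.request.urlopen(object_url(key)) as response, open(destination, "wb") as out:
        # Copy in chunks so the whole granule is never held in memory.
        shutil.copyfileobj(response, out)
    return destination


print(object_url("2008/01/data/air_temperature_at_2_metres.nc"))
```

Calling `download_object("2008/01/data/air_temperature_at_2_metres.nc", "t2m_2008_01.nc")` would then write the granule to a local file. The local file name here is only an example.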
Another option is to use the AWS SDK or CLI. We’ve published a Jupyter notebook on GitHub that provides an example of how to access ERA5 data in Python using boto.
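For orientation, SDK access might look like the sketch below. This is not the code from the published notebook. It assumes the third-party boto3 package is installed and that the public bucket accepts anonymous (unsigned) requests, which is an assumption rather than something stated above. The helper names `to_object_key` and `download_with_boto3` are hypothetical.

```python
BUCKET = "era5-pds"   # bucket name from the dataset description
REGION = "us-east-1"  # bucket region from the dataset description


def to_object_key(path):
    """Convert a documented path such as '/2008/01/main.nc' to an S3 object key."""
    # S3 object keys in this bucket carry no leading slash.
    return path.lstrip("/")


def download_with_boto3(path, destination):
    """Download one granule without credentials (assumes anonymous access is allowed)."""
    # boto3 is a third-party package; it is imported here so that the helper
    # above stays usable when boto3 is not installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    client = boto3.client(
        "s3",
        region_name=REGION,
        config=Config(signature_version=UNSIGNED),  # send unsigned requests
    )
    client.download_file(BUCKET, to_object_key(path), destination)
    return destination


print(to_object_key("/2008/01/data/air_temperature_at_2_metres.nc"))
```

If anonymous access is not permitted in your environment, dropping the `config` argument makes boto3 fall back to your configured AWS credentials.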