Benchmarking recipes (Lauer et al.) #3598
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,140 @@ | ||
.. _recipe_benchmarking: | ||
|
||
Model Benchmarking | ||
================== | ||
|
||
Overview | ||
-------- | ||
|
||
These recipes and diagnostics are based on :ref:`recipe_monitor <recipe_monitor>`, which allows plotting arbitrary preprocessor output, i.e., arbitrary variables from arbitrary datasets. An extension of these diagnostics is used to benchmark a model simulation against other datasets (e.g. CMIP6). The benchmarking features are described in `Lauer et al.`_. | ||
|
||
.. _`Lauer et al.`: Lauer, A., Bock, L., Hassler, B., Jöckel, P., Ruhe, L., and Schlund, M.: Monitoring and benchmarking Earth System Model simulations with ESMValTool v2.12.0, Geosci. Model Dev. (submitted). | ||
|
||
Available recipes and diagnostics | ||
--------------------------------- | ||
|
||
Recipes are stored in `recipes/model_evaluation` | ||
|
||
* recipe_model_benchmarking_annual_cycle.yml | ||
* recipe_model_benchmarking_boxplots.yml | ||
* recipe_model_benchmarking_diurnal_cycle.yml | ||
* recipe_model_benchmarking_maps.yml | ||
* recipe_model_benchmarking_timeseries.yml | ||
* recipe_model_benchmarking_zonal.yml | ||
|
||
Diagnostics are stored in `diag_scripts/monitor/` | ||
|
||
* :ref:`multi_datasets.py | ||
<api.esmvaltool.diag_scripts.monitor.multi_datasets>`: | ||
Monitoring diagnostic to show multiple datasets in one plot (incl. biases). | ||
|
||
|
||
Recipe settings | ||
~~~~~~~~~~~~~~~ | ||
|
||
See :ref:`multi_datasets.py<api.esmvaltool.diag_scripts.monitor.multi_datasets>` for a list of all possible configuration options that can be specified in the recipe. | ||
|
||
.. note:: | ||
   Exactly one dataset (the dataset to be benchmarked) must specify the facet ``benchmark_dataset: True`` in its dataset entry in the recipe. For line plots (i.e. annual cycle, seasonal cycle, diurnal cycle, time series), it is recommended to specify a particular line color and line style for this dataset in the ``scripts`` section of the recipe so that it is easy to identify in the plot. In the example below, MIROC6 is the dataset to be benchmarked and ERA5 is used as a reference dataset. | ||
|
||
.. code-block:: yaml | ||
|
||
   scripts: | ||
     allplots: | ||
       script: monitor/multi_datasets.py | ||
       plot_folder: '{plot_dir}' | ||
       plot_filename: '{plot_type}_{real_name}_{mip}' | ||
       group_variables_by: variable_group | ||
       facet_used_for_labels: alias | ||
       plots: | ||
         diurnal_cycle: | ||
           annual_mean_kwargs: False | ||
           legend_kwargs: | ||
             loc: upper right | ||
           plot_kwargs: | ||
             'MIROC6': | ||
               color: red | ||
               label: '{alias}' | ||
               linestyle: '-' | ||
               linewidth: 2 | ||
               zorder: 4 | ||
             ERA5: | ||
               color: black | ||
               label: '{dataset}' | ||
               linestyle: '-' | ||
               linewidth: 2 | ||
               zorder: 3 | ||
             MultiModelPercentile10: | ||
               color: gray | ||
               label: '{dataset}' | ||
               linestyle: '--' | ||
               linewidth: 1 | ||
               zorder: 2 | ||
             MultiModelPercentile90: | ||
               color: gray | ||
               label: '{dataset}' | ||
               linestyle: '--' | ||
               linewidth: 1 | ||
               zorder: 2 | ||
             default: | ||
               color: lightgray | ||
               label: null | ||
               linestyle: '-' | ||
               linewidth: 1 | ||
               zorder: 1 | ||
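The requirement in the note above (exactly one dataset flagged with ``benchmark_dataset: True``) can be checked with a few lines of Python before a recipe is run. This is only a hypothetical sketch: the function ``find_benchmark_dataset`` and the list of dataset entries are assumptions made for illustration and are not part of ESMValTool.

```python
def find_benchmark_dataset(datasets):
    """Return the single dataset entry flagged as benchmark dataset.

    Hypothetical helper: `datasets` is a list of dicts that mimics the
    dataset entries of a recipe.
    """
    flagged = [d for d in datasets if d.get("benchmark_dataset", False)]
    if len(flagged) != 1:
        raise ValueError(
            f"Expected exactly one benchmark dataset, found {len(flagged)}"
        )
    return flagged[0]


# Made-up dataset entries following the example above.
datasets = [
    {"dataset": "MIROC6", "benchmark_dataset": True},
    {"dataset": "ERA5"},
]
benchmark = find_benchmark_dataset(datasets)
print(benchmark["dataset"])  # MIROC6
```

A list with no flagged entry, or with more than one, raises a ``ValueError`` in this sketch.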
|
||
Variables | ||
--------- | ||
|
||
Any variable can be used, but its number of dimensions must match what each plot type expects. | ||
|
||
References | ||
---------- | ||
|
||
* Lauer, A., L. Bock, B. Hassler, P. Jöckel, L. Ruhe, and M. Schlund: Monitoring and benchmarking Earth System Model simulations with ESMValTool v2.12.0, Geosci. Model Dev., xx, xxxx-xxxx, | ||
doi: xxx, 202x. | ||
|
||
Example plots | ||
------------- | ||
|
||
.. _fig_benchmarking_annual_cycle: | ||
.. figure:: /recipes/figures/benchmarking/annual_cycle.png | ||
:align: center | ||
:width: 16cm | ||
|
||
(Left) Multi-year global mean (2000-2004) of the seasonal cycle of near-surface temperature in K from a simulation of MIROC6 and the reference dataset HadCRUT5 (black). The thin gray lines show individual CMIP6 models used for comparison, the dashed gray lines show the 10% and 90% percentiles of these CMIP6 models. (Right) same as (left) but for area-weighted RMSE of near-surface temperature. The light blue shading shows the range of the 10% to 90% percentiles of RMSE values from the ensemble of CMIP6 models used for comparison. Created with recipe_model_benchmarking_annual_cycle.yml. | ||
|
||
.. _fig_benchmarking_boxplots: | ||
.. figure:: /recipes/figures/benchmarking/boxplots.png | ||
:align: center | ||
:width: 16cm | ||
|
||
(Left) Global area-weighted RMSE (smaller=better), (middle) weighted Pearson’s correlation coefficient (higher=better) and (right) weighted Earth mover’s distance (smaller=better) of the geographical pattern of 5-year means of different variables from a simulation of MIROC6 (red cross) in comparison to the CMIP6 ensemble (boxplot). Reference datasets for calculating the three metrics are: near-surface temperature (tas): HadCRUT5, surface temperature (ts): HadISST, precipitation (pr): GPCP-SG, air pressure at sea level (psl): ERA5, shortwave (rsut) and longwave (rlut) radiative fluxes at TOA and shortwave (swcre) and longwave (lwcre) cloud radiative effects: CERES-EBAF. Each box indicates the range from the first quartile to the third quartile, the vertical lines show the median, and the whiskers show the minimum and maximum values, excluding the outliers. Outliers are defined as being outside 1.5 times the interquartile range. Created with recipe_model_benchmarking_boxplots.yml. | ||
|
||
.. _fig_benchmarking_diurn_cycle: | ||
.. figure:: /recipes/figures/benchmarking/diurnal_cycle.png | ||
:align: center | ||
:width: 10cm | ||
|
||
Area-weighted RMSE of the annual mean diurnal cycle (year 2000) of precipitation averaged over the tropical ocean (ocean grid cells in the latitude belt 30°S to 30°N) from a simulation of MIROC6 compared with ERA5 data (black). The light blue shading shows the range of the 10% to 90% percentiles of RMSE values from the ensemble of CMIP6 models used for comparison. Created with recipe_model_benchmarking_diurnal_cycle.yml. | ||
|
||
.. _fig_benchmarking_map: | ||
.. figure:: /recipes/figures/benchmarking/map.png | ||
:align: center | ||
:width: 10cm | ||
|
||
5-year annual mean (2000-2004) area-weighted RMSE of the precipitation rate in mm day-1 from a simulation of MIROC6 compared with GPCP-SG data. The stippled areas mask grid cells where the RMSE is smaller than the 90% percentile of RMSE values from an ensemble of CMIP6 models. Created with recipe_model_benchmarking_maps.yml. | ||
|
||
.. _fig_benchmarking_timeseries: | ||
.. figure:: /recipes/figures/benchmarking/timeseries.png | ||
:align: center | ||
:width: 16cm | ||
|
||
(Left) Time series from 2000 through 2014 of global average monthly mean temperature anomalies (reference period 2000-2009) of the near-surface temperature in K from a simulation of MIROC6 (red) and the reference dataset HadCRUT5 (black). The thin gray lines show individual CMIP6 models used for comparison, the dashed gray lines show the 10% and 90% percentiles of these CMIP6 models. (Right) same as (left) but for area-weighted RMSE of the near-surface air temperature. The light blue shading shows the range of the 10% to 90% percentiles of RMSE values from the ensemble of CMIP6 models used for comparison. Created with recipe_model_benchmarking_timeseries.yml. | ||
|
||
.. _fig_benchmarking_zonal: | ||
.. figure:: /recipes/figures/benchmarking/zonal.png | ||
:align: center | ||
:width: 10cm | ||
|
||
5-year annual mean bias (2000-2004) of the zonally averaged temperature in K from a historical simulation of MIROC6 compared with ERA5 reanalysis data. The stippled areas mask grid cells where the absolute bias (:math:`\lvert BIAS \rvert`) is smaller than the maximum of the absolute 10% (:math:`\lvert p10 \rvert`) and the absolute 90% (:math:`\lvert p90 \rvert`) percentiles from an ensemble of CMIP6 models, i.e. grid cells without stippling satisfy :math:`\lvert BIAS \rvert \geq \max(\lvert p10 \rvert, \lvert p90 \rvert)`. Created with recipe_model_benchmarking_zonal.yml. |
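The stippling criterion described in the caption above can be sketched in plain Python. This is an illustration under stated assumptions only: the function name ``is_stippled`` and the sample values are hypothetical and not taken from ESMValTool, and the diagnostic itself works on gridded data rather than on single numbers.

```python
def is_stippled(bias, p10, p90):
    """Return True if a grid cell would be masked (stippled).

    Hypothetical helper: a cell is stippled when |bias| is smaller than
    max(|p10|, |p90|), following the caption's description.
    """
    threshold = max(abs(p10), abs(p90))
    return abs(bias) < threshold


# Made-up sample values (K) for three grid cells: (bias, p10, p90).
cells = [(0.5, -1.0, 2.0), (-3.0, -1.0, 2.0), (2.0, -1.0, 2.0)]
flags = [is_stippled(*cell) for cell in cells]
print(flags)  # [True, False, False]
```

In this sketch a bias exactly equal to the threshold is not stippled, because the comparison is strict.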
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,6 +27,11 @@ | |
produce multi panel plots for data with `shape_id` or `region` | ||
coordinates of length > 1. Supported coordinates: `time`, `shape_id` | ||
(optional) and `region` (optional). | ||
- Diurnal cycle (plot type ``diurnal_cycle``): Generate a diurnal cycle | ||
plot (time series of the hourly climatology from 0 to 24 hours). It will | ||
produce multi panel plots for data with `shape_id` or `region` | ||
coordinates of length > 1. Supported coordinates: `time`, `shape_id` | ||
(optional) and `region` (optional). | ||
|
||
Configuration options in recipe | ||
------------------------------- | ||
|
@@ -39,10 +44,10 @@ | |
monitor configuration file can be found :ref:`here <monitor_config_file>`. | ||
plots: dict, optional | ||
Plot types plotted by this diagnostic (see list above). Dictionary keys | ||
must be ``clim``, ``seasonclim``, ``monclim``, ``timeseries`` or | ||
``annual_cycle``. Dictionary values are dictionaries used as options for | ||
the corresponding plot. The allowed options for the different plot types | ||
are given below. | ||
must be ``clim``, ``seasonclim``, ``monclim``, ``timeseries``, | ||
``annual_cycle`` or ``diurnal_cycle``. Dictionary values are dictionaries | ||
used as options for the corresponding plot. The allowed options for the | ||
different plot types are given below. | ||
plot_filename: str, optional | ||
Filename pattern for the plots. | ||
Defaults to ``{plot_type}_{real_name}_{dataset}_{mip}_{exp}_{ensemble}``. | ||
|
@@ -98,6 +103,10 @@ | |
---------------------------------------------------- | ||
None | ||
|
||
Configuration options for plot type ``diurnal_cycle`` | ||
----------------------------------------------------- | ||
None | ||
|
||
.. hint:: | ||
|
||
Extra arguments given to the recipe are ignored, so it is safe to use yaml | ||
|
@@ -166,6 +175,7 @@ def compute(self): | |
|
||
self.timeseries(cube, var_info) | ||
self.plot_annual_cycle(cube, var_info) | ||
self.plot_diurnal_cycle(cube, var_info) | ||
self.plot_monthly_climatology(cube, var_info) | ||
self.plot_seasonal_climatology(cube, var_info) | ||
self.plot_climatology(cube, var_info) | ||
|
@@ -280,6 +290,57 @@ def plot_annual_cycle(self, cube, var_info): | |
caption=caption, | ||
) | ||
|
||
def plot_diurnal_cycle(self, cube, var_info): | ||
"""Plot the diurnal cycle according to configuration. | ||
|
||
The key 'diurnal_cycle' must be passed to the 'plots' option in the | ||
configuration. | ||
|
||
Parameters | ||
---------- | ||
cube: iris.cube.Cube | ||
Data to plot. Must be 1D with time or 2D with an extra 'shape_id' | ||
or 'region' coordinate. In the latter case, a multi-panel plot with | ||
one panel per region is produced. | ||
var_info: dict | ||
Variable's metadata from ESMValTool | ||
|
||
Warning | ||
------- | ||
The hourly climatology is done inside the function so the users can | ||
plot both the timeseries and the diurnal cycle in one go | ||
""" | ||
if 'diurnal_cycle' not in self.plots: | ||
This seems a curious place to put the control statement, rather than where the method is called. However, this seems to be an existing design decision.
That is right, I just followed the structure that was there already. |
||
return | ||
cube = climate_statistics(cube, period='hour') | ||
|
||
plotter = PlotSeries() | ||
plotter.outdir = self.get_plot_folder(var_info) | ||
plotter.img_template = self.get_plot_path('diurnalcycle', var_info, | ||
add_ext=False) | ||
plotter.filefmt = self.cfg['output_file_type'] | ||
region_coords = ('shape_id', 'region') | ||
options = { | ||
'xlabel': '', | ||
'xlimits': None, | ||
'suptitle': 'Diurnal cycle', | ||
} | ||
for region_coord in region_coords: | ||
if cube.coords(region_coord): | ||
plotter.multiplot_cube(cube, 'hour', region_coord, **options) | ||
return | ||
plotter.plot_cube(cube, 'hour', **options) | ||
caption = (f"Diurnal cycle of {var_info[n.LONG_NAME]} of " | ||
f"dataset {var_info[n.DATASET]} (project " | ||
f"{var_info[n.PROJECT]}) from {var_info[n.START_YEAR]} to " | ||
f"{var_info[n.END_YEAR]}.") | ||
self.record_plot_provenance( | ||
self.get_plot_path('diurnalcycle', var_info), | ||
var_info, | ||
'Diurnal cycle', | ||
caption=caption, | ||
) | ||
|
||
def plot_monthly_climatology(self, cube, var_info): | ||
"""Plot the monthly climatology as a multipanel plot. | ||
|
||
|
I presume it doesn't show the 0 (=24) hour twice, though that would be a valid choice if it did and was documented as such.
In fact the example diurnal plot in this PR only appears to plot hours 1 to 22 inclusive. Is that a data or code limitation?
That is a data limitation. We wrote 0-24 hours because, if data are available, one could plot 0:00:00 to 23:59:59.
The 3-hourly CMIP6 output used in the example provides data at 1.30 (0.00-3.00), 4.30 (3.00-6.00), ..., 22.30 (21.00-24.00). CMIP6 data are converted to full hours by the preprocessor `resample_hours` so that they can be compared to ERA5 data (provided at full hours). The preprocessor `distance_metric`, applied afterwards, requires all datasets to have the same time coordinate.
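
The time stamps mentioned above can be reproduced with a small, self-contained Python sketch. It only illustrates the arithmetic of 3-hourly interval midpoints and does not call ESMValTool; all variable names are made up for illustration.

```python
# 3-hourly means cover the intervals 0-3 h, 3-6 h, ..., 21-24 h.
interval_hours = 3
starts = range(0, 24, interval_hours)

# Each value is time-stamped at the midpoint of its interval.
midpoints = [start + interval_hours / 2 for start in starts]
print(midpoints)  # [1.5, 4.5, 7.5, 10.5, 13.5, 16.5, 19.5, 22.5]

# Format the midpoints as clock times (HH:MM).
labels = [f"{int(m):02d}:{int(m % 1 * 60):02d}" for m in midpoints]
print(labels[0], labels[-1])  # 01:30 22:30
```

The first and last time stamps are 01:30 and 22:30, which is consistent with a plotted curve that stops short of hours 0 and 24.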