Skip to content

Commit

Permalink
Fix injection of added parameters (#1981)
Browse files Browse the repository at this point in the history
### What kind of change does this PR introduce?

The issue was that a YAML definition of an indicator was not able to
inject (set) the `indexer` parameter on indicators that inherited from
`IndexingIndicator` (i.e. indicators that rely on a class-level
implementation of the indexing, instead of compute-level one).

This reorganizes how these class-added arguments are handled. It also
renames them "added" arguments instead of "injected" to clear the
confusion. "injection" now only means an indicator argument that is set
when creating a derived indicator (that what most YAML definitions do).
"added" is an argument that is not in the compute function, but added by
the class.

The example YAML is modified and a test is added using this
modification.

### Does this PR introduce a breaking change?
No.

The feature was not really documented and now it works instead of not.
  • Loading branch information
Zeitsperre authored Oct 31, 2024
2 parents 5bf3e9b + d2be937 commit 5e7bd3d
Show file tree
Hide file tree
Showing 5 changed files with 95 additions and 40 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ Bug fixes
* Fixed a bug where the units could be changed before a conversion of the magnitudes could occur. Conversion of units for multivariate ``DataArray`` is now properly handled in `sdba.TrainAdjust` and `sdba.Adjust`. (:pull:`1972`).
* Fixed a units formatting bug with indicators that output "delta" Celsius degrees. (:pull:`1973`).
* Corrected the ``"choices"`` of parameter ``op`` in the docstring of ``frost_free_spell_max_length``. (:pull:`1977`).
* Reorganised how ``Indicator`` subclasses can added arguments to the call signature. Injecting such arguments now works. For xclim's subclasses, this bug only affected the ``indexer`` argument of indicators subclassing ``xc.core.indicator.IndexingIndicator``. (:pull:`1981`).

v0.53.1 (2024-10-21)
--------------------
Expand Down
5 changes: 4 additions & 1 deletion docs/notebooks/example/example.yml
Original file line number Diff line number Diff line change
Expand Up @@ -12,10 +12,13 @@ variables:
description: Precipitation flux on the outer surface of the forest
standard_name: precipitation_flux_onto_canopy
indicators:
RX1day:
RX1day_summer:
base: rx1day
cf_attrs:
long_name: Highest 1-day precipitation amount
parameters:
indexer:
month: [5, 6, 7, 8, 9]
context: hydro
RX5day_canopy:
base: max_n_day_precipitation_amount
Expand Down
69 changes: 50 additions & 19 deletions docs/notebooks/extendxclim.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,8 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
"nbsphinx": "hidden",
"tags": []
},
"outputs": [],
"source": [
Expand Down Expand Up @@ -89,7 +90,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import xarray as xr\n",
Expand Down Expand Up @@ -214,7 +217,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from xclim.core.indicator import registry\n",
Expand All @@ -231,7 +236,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"display(Code(tg_mean_c.__doc__, language=\"rst\"))"
Expand All @@ -249,7 +256,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"tg_mean_c.__module__ == xclim.atmos.tg_mean.__module__"
Expand All @@ -265,7 +274,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Passing module\n",
Expand All @@ -277,7 +288,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Conventional class inheritance, uses the current module name\n",
Expand Down Expand Up @@ -319,7 +332,8 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
"nbsphinx": "hidden",
"tags": []
},
"outputs": [],
"source": [
Expand All @@ -338,7 +352,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# These variables were generated by a hidden cell above that syntax-colored them.\n",
Expand All @@ -362,7 +378,7 @@
"\n",
"</div>\n",
"\n",
"- `RX1day` is simply the same as `registry['RX1DAY']`, but with an updated `long_name`.\n",
"- `RX1day` is as `registry['RX1DAY']`, but with an updated `long_name` and an injected argument : its `indexer` arg is now set to only compute over may to september.\n",
"- `RX5day_canopy` is based on `registry['MAX_N_DAY_PRECIPITATION_AMOUNT']`, changed the `long_name` and injects the `window` and `freq` arguments.\n",
" * It also requests a different variable than the original indicator : `prveg` instead of `pr`. As xclim doesn't know about `prveg`, a definition is given in the `variables` section.\n",
"- `R75pdays` is based on `registry['DAYS_OVER_PRECIP_THRESH']`, injects the `thresh` argument and changes the description of the `per` argument.\n",
Expand Down Expand Up @@ -397,7 +413,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from importlib.resources import files\n",
Expand Down Expand Up @@ -431,7 +449,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import xclim as xc\n",
Expand All @@ -444,7 +464,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"docstring = f\"{example.__doc__}\\n---\\n\\n{xc.indicators.example.R99p.__doc__}\"\n",
Expand All @@ -461,7 +483,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from xclim.testing import open_dataset\n",
Expand Down Expand Up @@ -505,7 +529,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"out = xr.merge(outs)\n",
Expand All @@ -527,7 +553,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from xclim.core.indicator import build_indicator_module, registry\n",
Expand Down Expand Up @@ -565,7 +593,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"print(xc.indicators.awesome.__doc__)"
Expand All @@ -575,7 +605,8 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
"nbsphinx": "hidden",
"tags": []
},
"outputs": [],
"source": [
Expand All @@ -597,7 +628,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.5"
"version": "3.12.6"
},
"toc": {
"base_numbering": 1,
Expand Down
6 changes: 6 additions & 0 deletions tests/test_modules.py
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,12 @@ def test_custom_indices(open_dataset):
example_path / "example.yml", name="ex4", mode="ignore"
)

# Check that indexer was added and injected correctly
assert "indexer" not in ex1.RX1day_summer.parameters
assert ex1.RX1day_summer.injected_parameters["indexer"] == {
"month": [5, 6, 7, 8, 9]
}


@pytest.mark.requires_docs
def test_indicator_module_translations():
Expand Down
54 changes: 34 additions & 20 deletions xclim/core/indicator.py
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,8 @@
In the following, the section under `<identifier>` is referred to as `data`. When creating indicators from
a dictionary, with :py:meth:`Indicator.from_dict`, the input dict must follow the same structure of `data`.
Note that kwargs-like parameters like ``indexer`` must be injected as a dictionary (``param data`` above should be a dictionary).
When a module is built from a yaml file, the yaml is first validated against the schema (see xclim/data/schema.yml)
using the YAMALE library (:cite:p:`lopker_yamale_2022`). See the "Extending xclim" notebook for more info.
Expand Down Expand Up @@ -301,7 +303,7 @@ class Indicator(IndicatorRegistrar):
Both are simply views of :py:attr:`xclim.core.indicator.Indicator._all_parameters`.
Compared to their base `compute` function, indicators add the possibility of using dataset as input,
with the injected argument `ds` in the call signature. All arguments that were indicated
with the added argument `ds` in the call signature. All arguments that were indicated
by the compute function to be variables (DataArrays) through annotations will be promoted
to also accept strings that correspond to variable names in the `ds` dataset.
Expand Down Expand Up @@ -425,12 +427,13 @@ def __new__(cls, **kwds): # noqa: C901
# title, abstract, references, notes, long_name
kwds.setdefault(name, value)

# Inject parameters (subclasses can override or extend this through _injected_parameters)
for name, param in cls._injected_parameters():
# Added parameters
# Subclasses can override or extend this through the classmethod _added_parameters
for name, param in cls._added_parameters():
if name in parameters:
raise ValueError(
f"Class {cls.__name__} can't wrap indices that have a `{name}`"
" argument as it conflicts with arguments it injects."
" argument as it conflicts with an argument it adds."
)
parameters[name] = param
else: # inherit parameters from base class
Expand Down Expand Up @@ -529,8 +532,13 @@ def _parse_indice(compute, passed_parameters): # noqa: F841
return parameters, docmeta

@classmethod
def _injected_parameters(cls):
"""Create a list of tuples for arguments to inject, (name, Parameter)."""
def _added_parameters(cls):
"""Create a list of tuples for arguments to add to the call signature (name, Parameter).
These can't be in the compute function signature, the class is in charge of removing them
from the params passed to the compute function, likely through an override of
_preprocess_and_checks.
"""
return [
(
"ds",
Expand Down Expand Up @@ -949,7 +957,7 @@ def _preprocess_and_checks(self, das, params):
def _get_compute_args(self, das, params):
"""Rename variables and parameters to match the compute function's names and split VAR_KEYWORD arguments."""
# Get correct variable names for the compute function.
# Exclude param without a mapping inside the compute functions (those injected by the indicator class)
# Exclude param without a mapping inside the compute functions (those added by the indicator class)
args = {}
for key, p in self._all_parameters.items():
if p.compute_name is not _empty:
Expand Down Expand Up @@ -1534,18 +1542,24 @@ def _preprocess_and_checks(self, das, params):


class IndexingIndicator(Indicator):
"""Indicator that also injects "indexer" kwargs to subset the inputs before computation."""

def __init__(self, *args, **kwargs):
self._all_parameters["indexer"] = Parameter(
kind=InputKind.KWARGS,
description=(
"Indexing parameters to compute the indicator on a temporal "
"subset of the data. It accepts the same arguments as "
":py:func:`xclim.indices.generic.select_time`."
),
)
super().__init__(*args, **kwargs)
"""Indicator that also adds the "indexer" kwargs to subset the inputs before computation."""

@classmethod
def _added_parameters(cls):
"""Create a list of tuples for arguments to add (name, Parameter)."""
return super()._added_parameters() + [
(
"indexer",
Parameter(
kind=InputKind.KWARGS,
description=(
"Indexing parameters to compute the indicator on a temporal "
"subset of the data. It accepts the same arguments as "
":py:func:`xclim.indices.generic.select_time`."
),
),
)
]

def _preprocess_and_checks(self, das: dict[str, DataArray], params: dict[str, Any]):
"""Perform parent's checks and also check if freq is allowed."""
Expand All @@ -1559,7 +1573,7 @@ def _preprocess_and_checks(self, das: dict[str, DataArray], params: dict[str, An


class ResamplingIndicatorWithIndexing(ResamplingIndicator, IndexingIndicator):
"""Resampling indicator that also injects "indexer" kwargs to subset the inputs before computation."""
"""Resampling indicator that also adds "indexer" kwargs to subset the inputs before computation."""


class Daily(ResamplingIndicator):
Expand Down

0 comments on commit 5e7bd3d

Please sign in to comment.