Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for discharge unit compliance in streamflow statistical indicators #1225

Merged
merged 22 commits into from
Dec 19, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
ec95da5
fix: check for discharge unit compliance in streamflow statistical in…
huard Nov 8, 2022
4de562c
update history with pr #
huard Nov 8, 2022
3b33971
Merge branch 'master' into fix-1130
Zeitsperre Nov 8, 2022
7693d99
Update xclim/indicators/land/_streamflow.py
huard Nov 14, 2022
035d1d4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 14, 2022
d53a269
Created generic indicator realm. Moved generic indicators from land r…
huard Nov 14, 2022
109271d
merge
huard Nov 14, 2022
f61cd65
Merge branch 'master' into fix-1130
huard Nov 14, 2022
639fe20
reinstated streamflow indicators to avoid breaking code without a dep…
huard Nov 14, 2022
5ddf6f0
Merge branch 'fix-1130' of github.com:Ouranosinc/xclim into fix-1130
huard Nov 14, 2022
9c97ae4
Merge branch 'master' into fix-1130
huard Nov 16, 2022
e304690
Merge branch 'master' into fix-1130
aulemahal Dec 5, 2022
666e648
Merge branch 'master' into fix-1130
Zeitsperre Dec 5, 2022
1c31a75
Merge branch 'master' into fix-1130
Zeitsperre Dec 9, 2022
3a55e96
Merge branch 'master' into fix-1130
Zeitsperre Dec 9, 2022
542708f
Merge branch 'master' into fix-1130
Zeitsperre Dec 9, 2022
1e2d896
Merge branch 'master' into fix-1130
aulemahal Dec 14, 2022
8052d15
Update HISTORY.rst
aulemahal Dec 14, 2022
a2de036
Update xclim/indicators/land/_streamflow.py
huard Dec 15, 2022
b89440e
Merge branch 'master' into fix-1130
huard Dec 16, 2022
ef14fa9
french translation for discharge_distribution_fit
huard Dec 16, 2022
5f61995
Merge branch 'master' into fix-1130
aulemahal Dec 19, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions HISTORY.rst
Original file line number Diff line number Diff line change
Expand Up @@ -9,13 +9,11 @@ Contributors to this version: Trevor James Smith (:user:`Zeitsperre`), Pascal Bo
New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* Virtual modules can add variables to ``xclim.core.utils.VARIABLES`` through the new `variables` section of the yaml files. (:issue:`1129`, :pull:`1231`).
* ``xclim.core.units.convert_units_to`` can now perform automatic conversions based on the standard name of the input when needed. (:issue:`1205`, :pull:`1206`).
- Conversion from amount (thickness) to flux (rate), using ``amount2rate`` and ``rate2amount``.
- Conversion from amount to thickness for liquid water quantities, using the new ``amount2lwethickness`` and ``lwethickness2amount``. This is similar to the implicit transformations enabled by the "hydro" unit context.
* ``xclim.core.units.convert_units_to`` can now perform automatic conversions based on the standard name of the input when needed. (:issue:`1205`, :pull:`1206`).
- Conversion from amount (thickness) to flux (rate), using ``amount2rate`` and ``rate2amount``.
- Conversion from amount to thickness for liquid water quantities, using the new ``amount2lwethickness`` and ``lwethickness2amount``. This is similar to the implicit transformations enabled by the "hydro" unit context.
- Passing ``context='infer'`` will activate the "hydro" context if the source or the target are DataArrays with a standard name that is compatible, as decided by the new ``xclim.core.units.infer_context`` function.
* New `generic` indicator realm. Now holds indicators previously meant for streamflow analysis in the `land` realm: `fit`, `return_level` (previously `freq_analysis`) and `stats`.

Breaking changes
^^^^^^^^^^^^^^^^
Expand All @@ -33,6 +31,7 @@ Breaking changes
- `xclim.subset` (mock submodule) → functions migrated to `clisops.core.subset`
* The ``xclim.testing.utils.get_all_CMIP6_variables`` and ``xclim.testing.utils.update_variable_yaml`` function were removed as the former was extremely slow and unusable. (:pull:`1258`).
* The wind speed input of ``atmos.potential_evapotranspiration`` and ``atmos.water_budget`` was renamed to ``sfcWind`` (capital W) as this is the correct CMIP6 name. (:pull:`1258`).
* Indicator `land.stats`, `land.fit` and `land.freq_analysis` are now deprecated and will be removed in version 0.43. They are being phased out in favor of generic indicators `generic.stats`, `generic.fit` and `generic.return_level` respectively.

Bug fixes
^^^^^^^^^
Expand All @@ -41,6 +40,7 @@ Bug fixes
* The setup step for `pytest` needed to be addressed due to the fact that files were being accessed/modified by multiple tests at a time, causing segmentation faults in some tests. This has been resolved by splitting functions into those that fetch or generate test data (under `xclim.testing.tests.data`) and the fixtures that supply accessors to them (under `xclim.testing.tests.conftest`). (:issue:`1238`, :pull:`1254`).
* Relaxed the expected output for ``test_spatial_analogs[friedman_rafsky]`` to support expected results from `scikit-learn` 1.2.0.
* The MBCn example in documentation has been fixed to properly imitate the source. (:issue:`1249`, :pull:`1250`).
* Streamflow indicators relying on indices defined in `xclim.indices.stats` were not checking input variable units. These indicators will now raise an error if input data units are not m^3/s. (:issue:`1130`, :pull:`1225`).

Internal changes
^^^^^^^^^^^^^^^^
Expand Down
2 changes: 1 addition & 1 deletion xclim/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
from xclim.core.indicator import build_indicator_module_from_yaml
from xclim.core.locales import load_locale
from xclim.core.options import set_options # noqa
from xclim.indicators import atmos, land, seaIce # noqa
from xclim.indicators import atmos, generic, land, seaIce # noqa

__author__ = """Travis Logan"""
__email__ = "logan.travis@ouranos.ca"
Expand Down
24 changes: 21 additions & 3 deletions xclim/data/fr.json
Original file line number Diff line number Diff line change
Expand Up @@ -599,17 +599,35 @@
"abstract": "Mesure des oscillations du débit relatif au débit moyen, quantifiant la fréquence et la rapidité des changements de débits."
},
"FREQ_ANALYSIS": {
"long_name": "Période de retour du débit",
"long_name": "Valeur de retour du débit",
"description": "Analyse fréquentielle de type {dist:f} des débits moyens sur {window} jours, pour le {mode} {indexer}.",
"title": "Période de retour",
"title": "Valeur de retour",
"abstract": "Analyse fréquentielle des débits selon un mode et une distribution."
},
"STATS": {
"RETURN_LEVEL": {
"long_name": "Valeur de retour",
"description": "Analyse fréquentielle de type {dist:f} des moyennes sur {window} jours, pour le {mode} {indexer}.",
"title": "Valeur de retour",
"abstract": "Analyse fréquentielle selon un mode et une distribution."
},
"DISCHARGE_STATS": {
"long_name": "Statistique des débits quotidiens",
"description": "{op} {freq:f} des débits quotidiens ({indexer}).",
"title": "Calcul de statistiques sur des sous-périodes.",
"abstract": ""
},
"STATS": {
"long_name": "Statistique des valeurs quotidiennes",
"description": "{op} {freq:f} des valeurs quotidiennes ({indexer}).",
"title": "Calcul de statistiques sur des sous-périodes.",
"abstract": ""
},
"DISCHARGE_DISTRIBUTION_FIT": {
"long_name": "Paramètres d'une distribution {dist:f} d'une série de débits",
"description": "Paramètres d'une distribution {dist:f} d'une série de débits.",
"title": "Calcul les paramètres d'une distribution univariée pour une série de débits",
"abstract": ""
},
"FIT": {
"long_name": "Paramètres d'une distribution {dist:f}",
"description": "Paramètres d'une distribution {dist:f}.",
Expand Down
5 changes: 5 additions & 0 deletions xclim/data/variables.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,11 @@ variables:
cell_methods: "area: sum"
description: Cell area (over the ocean).
standard_name: cell_area
discharge:
canonical_units: m3 s-1
cell_methods: "time: mean"
description: The amount of water, in all phases, flowing in the river channel and flood plain.
standard_name: water_volume_transport_in_river_channel
evspsblpot:
canonical_units: kg m-2 s-1
cell_methods: "time: mean"
Expand Down
6 changes: 6 additions & 0 deletions xclim/indicators/generic/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# noqa: D205,D400
"""
Generic indicators
==================
"""
from ._stats import *
50 changes: 50 additions & 0 deletions xclim/indicators/generic/_stats.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
from xclim.core.indicator import Indicator, ResamplingIndicator
from xclim.core.units import declare_units
from xclim.indices.generic import select_resample_op
from xclim.indices.stats import fit as _fit
from xclim.indices.stats import frequency_analysis

__all__ = ["fit", "return_level", "stats"]


fit = Indicator(
title="Distribution parameters fitted over the time dimension.",
identifier="fit",
var_name="params",
units="",
standard_name="{dist} parameters",
long_name="{dist} distribution parameters",
description="Parameters of the {dist} distribution.",
cell_methods="time: fit",
compute=_fit,
realm="generic",
src_freq=None,
)


return_level = ResamplingIndicator(
title="Return level from frequency analysis",
identifier="return_level",
var_name="fa_{window}{mode:r}{indexer}",
long_name="N-year return level",
description="Frequency analysis for the {mode} {indexer} {window}-day value estimated using the {dist} "
"distribution.",
abstract="Frequency analysis on the basis of a given mode and distribution.",
compute=frequency_analysis,
realm="generic",
src_freq="D",
missing="skip",
)


stats = ResamplingIndicator(
title="Statistic of the daily values for a given period.",
identifier="stats",
var_name="stat_{indexer}{op:r}",
long_name="Daily statistics",
description="{freq} {op} of daily values ({indexer}).",
compute=select_resample_op,
missing="any",
src_freq="D",
realm="generic",
)
66 changes: 24 additions & 42 deletions xclim/indicators/land/_streamflow.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,36 +30,6 @@ def cfcheck(q):
check_valid(q, "standard_name", "water_volume_transport_in_river_channel")


class Stats(Streamflow):
missing = "any"


class FA(Streamflow):
"""Frequency analysis.

Notes
-----
FA performs three steps:
1. Compute stats over time series (min, max)
2. Fit statistical distribution parameters
3. Compute parametric quantiles for given return periods

Missing value functionality cannot be meaningfully applied here, because indicators apply missing value
operations on input and apply the mask on output. The `freq` of the input could be "YS", but this same
`freq` would then be used to compute the mask, which makes no sense.
"""

missing = "skip"


# Disable the daily checks because the inputs are period extremes.
class Fit(Indicator):
src_freq = None

def cfcheck(self, **das):
pass


base_flow_index = Streamflow(
title="Base flow index",
identifier="base_flow_index",
Expand All @@ -70,16 +40,19 @@ def cfcheck(self, **das):
compute=base_flow_index,
)

freq_analysis = FA(
title="Return period flow amount",
freq_analysis = Streamflow(
title="Return level",
identifier="freq_analysis",
var_name="q{window}{mode:r}{indexer}",
long_name="N-year return period flow amount",
long_name="N-year return level discharge",
description="Streamflow frequency analysis for the {mode} {indexer} {window}-day flow estimated using the {dist} "
"distribution.",
abstract="Streamflow frequency analysis on the basis of a given mode and distribution.",
units="m^3 s-1",
compute=declare_units(da=None)(frequency_analysis),
compute=frequency_analysis,
missing="skip",
input={"da": "discharge"},
_version_deprecated="0.40",
)

rb_flashiness_index = Streamflow(
Expand All @@ -94,26 +67,35 @@ def cfcheck(self, **das):
compute=rb_flashiness_index,
)

stats = Stats(

stats = Streamflow(
title="Statistic of the daily flow for a given period.",
identifier="stats",
identifier="discharge_stats",
var_name="q{indexer}{op:r}",
long_name="Daily flow statistics",
description="{freq} {op} of daily flow ({indexer}).",
units="m^3 s-1",
compute=declare_units(da=None)(generic.select_resample_op),
compute=generic.select_resample_op,
missing="any",
input={"da": "discharge"},
_version_deprecated="0.40",
)

fit = Fit(

fit = Indicator(
title="Distribution parameters fitted over the time dimension.",
identifier="fit",
identifier="discharge_distribution_fit",
var_name="params",
units="",
standard_name="{dist} parameters",
long_name="{dist} distribution parameters",
description="Parameters of the {dist} distribution.",
cell_methods="time: fit",
compute=declare_units(da=None)(_fit),
src_freq=None,
compute=_fit,
input={"da": "discharge"},
realm="land",
_version_deprecated="0.40",
)


Expand All @@ -124,7 +106,7 @@ def cfcheck(self, **das):
long_name="Day of the year of the maximum streamflow over {indexer}",
description="Day of the year of the maximum streamflow over {indexer}.",
units="",
compute=declare_units(da=None)(generic.select_resample_op),
compute=declare_units(da="[discharge]")(generic.select_resample_op),
parameters=dict(op=generic.doymax),
)

Expand All @@ -136,6 +118,6 @@ def cfcheck(self, **das):
long_name="Day of the year of the minimum streamflow over {indexer}",
description="Day of the year of the minimum streamflow over {indexer}.",
units="",
compute=declare_units(da=None)(generic.select_resample_op),
compute=declare_units(da="[discharge]")(generic.select_resample_op),
parameters=dict(op=generic.doymin),
)
1 change: 1 addition & 0 deletions xclim/testing/tests/test_generic.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
"""Tests for generic indices."""
from __future__ import annotations

import numpy as np
Expand Down
49 changes: 49 additions & 0 deletions xclim/testing/tests/test_generic_indicators.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
import numpy as np

from xclim import generic, set_options


class TestFit:
def test_simple(self, pr_ndseries):
pr = pr_ndseries(np.random.rand(1000, 1, 2))
ts = generic.stats(pr, freq="YS", op="max")
p = generic.fit(ts, dist="gumbel_r")
assert p.attrs["estimator"] == "Maximum likelihood"

def test_nan(self, pr_series):
r = np.random.rand(22)
r[0] = np.nan
pr = pr_series(r)

out = generic.fit(pr, dist="norm")
assert not np.isnan(out.values[0])

def test_ndim(self, pr_ndseries):
pr = pr_ndseries(np.random.rand(100, 1, 2))
out = generic.fit(pr, dist="norm")
assert out.shape == (2, 1, 2)
np.testing.assert_array_equal(out.isnull(), False)

def test_options(self, q_series):
q = q_series(np.random.rand(19))
with set_options(missing_options={"at_least_n": {"n": 10}}):
out = generic.fit(q, dist="norm")
np.testing.assert_array_equal(out.isnull(), False)


class TestFrequencyAnalysis:
"""See other tests in test_land::Test_FA"""

def test_any_variable(self, pr_series):
pr = pr_series(np.random.rand(100))
out = generic.return_level(pr, mode="max", t=2, dist="gamma")
assert out.units == pr.units


class TestStats:
"""See other tests in test_land::TestStats"""

def test_simple(self, pr_series):
pr = pr_series(np.random.rand(400))
out = generic.stats(pr, freq="YS", op="min", season="MAM")
assert out.units == pr.units
39 changes: 10 additions & 29 deletions xclim/testing/tests/test_land.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,12 @@
"""Tests for indicators in `land` realm."""
from __future__ import annotations

import numpy as np
import pytest
import xarray as xr

from xclim import land, set_options
import xclim.core.utils
from xclim import land


def test_base_flow_index(ndq_series):
Expand All @@ -23,7 +26,6 @@ def test_simple(self, ndq_series):
out = land.freq_analysis(
ndq_series, mode="max", t=[2, 5], dist="gamma", season="DJF"
)
assert out.long_name == "N-year return period flow amount"
assert out.description in [
"Streamflow frequency analysis for the maximal winter 1-day flow "
"estimated using the gamma distribution."
Expand All @@ -34,7 +36,6 @@ def test_simple(self, ndq_series):

def test_no_indexer(self, ndq_series):
out = land.freq_analysis(ndq_series, mode="max", t=[2, 5], dist="gamma")
assert out.long_name == "N-year return period flow amount"
assert out.description in [
"Streamflow frequency analysis for the maximal annual 1-day flow "
"estimated using the gamma distribution."
Expand All @@ -55,6 +56,12 @@ def test_empty(self, ndq_series):
)
assert np.isnan(out.values[:, 0, 0]).all()

def test_wrong_variable(self, pr_series):
with pytest.raises(xclim.core.utils.ValidationError):
land.freq_analysis(
pr_series(np.random.rand(100)), mode="max", t=2, dist="gamma"
)


class TestStats:
def test_simple(self, ndq_series):
Expand All @@ -71,32 +78,6 @@ def test_missing(self, ndq_series):
np.testing.assert_array_equal(out.sel(time="1902").isnull(), True)


class TestFit:
def test_simple(self, ndq_series):
ts = land.stats(ndq_series, freq="YS", op="max")
p = land.fit(ts, dist="gumbel_r")
assert p.attrs["estimator"] == "Maximum likelihood"

def test_nan(self, q_series):
r = np.random.rand(22)
r[0] = np.nan
q = q_series(r)

out = land.fit(q, dist="norm")
assert not np.isnan(out.values[0])

def test_ndim(self, ndq_series):
out = land.fit(ndq_series, dist="norm")
assert out.shape == (2, 2, 3)
np.testing.assert_array_equal(out.isnull(), False)

def test_options(self, q_series):
q = q_series(np.random.rand(19))
with set_options(missing_options={"at_least_n": {"n": 10}}):
out = land.fit(q, dist="norm")
np.testing.assert_array_equal(out.isnull(), False)


def test_qdoy_max(ndq_series, q_series):
out = land.doy_qmax(ndq_series, freq="YS", season="JJA")
assert out.attrs["units"] == ""
Expand Down
Loading