23 xarray utility #125

Merged
merged 49 commits into from
Oct 9, 2020
Commits
2a82d57
#23 initial push. Validate an xarray against a dictionary which sets…
Aug 20, 2020
d603bb2
#23 Added 3 functions xr_valid_key, xr_check_coords, xr_check_dtype.
mark141 Sep 8, 2020
cc1bc09
added a testfile to show functionality
mark141 Sep 8, 2020
2e1c676
#23 xr_valid_key now raises an error.
mark141 Sep 8, 2020
0b2ff67
add xr_check_coords testcase
CagtayFabry Sep 8, 2020
8d30617
Merge branch '23_xarray_utility' of https://github.com/BAMWelDX/weldx…
CagtayFabry Sep 8, 2020
96b4e57
#23 pytests are now working. Changed one pytest to AttributeError. ad…
mark141 Sep 8, 2020
96998ab
Merge branch 'master' into 23_xarray_utility
CagtayFabry Sep 11, 2020
7097a06
small test changes
CagtayFabry Sep 11, 2020
90ecffb
Added docstrings. Code cleanup.
Sep 28, 2020
8eaf067
Manually linebreaks due to wrong PyCharm settings.
Sep 28, 2020
36191a3
Changed PyCharm to softwrap at 89 characters.
Sep 28, 2020
3bee7eb
added a functionality: it is now allowed to test all numpy strings ag…
Sep 28, 2020
f328506
Added tests to check reachability of all lines of the code. Now all l…
Sep 28, 2020
7277917
Merge branch 'master' into 23_xarray_utility
mark141 Sep 29, 2020
0b8dc04
remove 23_test.py
CagtayFabry Sep 29, 2020
0722113
Merge remote-tracking branch 'origin/23_xarray_utility' into 23_xarra…
CagtayFabry Sep 29, 2020
35cd398
add single entry list testcase for str
CagtayFabry Sep 30, 2020
6124f5d
Added missing lines for str as subtype in a list.
Sep 30, 2020
6dbc693
test numpydoc example syntax
CagtayFabry Sep 30, 2020
58ec790
Merge remote-tracking branch 'remotes/origin/master' into 23_xarray_u…
CagtayFabry Sep 30, 2020
73a3ceb
fix example syntax in xr_check_coords
CagtayFabry Sep 30, 2020
83c9ec5
update docstring syntax
CagtayFabry Sep 30, 2020
227270b
update docstring syntax
CagtayFabry Sep 30, 2020
2df5dca
move example code to end of docstring and test xarray format in Local…
CagtayFabry Sep 30, 2020
9dd59bf
move LCS xarray coord checks to construction check stage
CagtayFabry Sep 30, 2020
d7b5754
fix codacy issue
CagtayFabry Sep 30, 2020
7f159e5
Merge branch 'master' into 23_xarray_utility
CagtayFabry Oct 1, 2020
5f6db5a
fix pydocstring issues
CagtayFabry Oct 1, 2020
70f7066
Merge branch '23_xarray_utility' of https://github.com/BAMWelDX/weldx…
CagtayFabry Oct 1, 2020
b5cabab
Merge branch 'master' into 23_xarray_utility
CagtayFabry Oct 2, 2020
01539b6
added support for general 'timedelta64' and 'datetime64' dtype.
mark141 Oct 7, 2020
3b169e8
change docstring in test functions.
mark141 Oct 7, 2020
203a6e9
now we only look at the coordinates when processing the validation.
mark141 Oct 7, 2020
8e55588
added helper function '_check_dtype' to simplify the code.
mark141 Oct 7, 2020
9cbe51a
changed the Error types to the corresponding Error types. TypeError a…
mark141 Oct 7, 2020
6770c33
Update weldx/utility.py
mark141 Oct 7, 2020
c10d4fb
typo change
mark141 Oct 7, 2020
94e20bc
pydocstyle cleanup
CagtayFabry Oct 7, 2020
70968f2
Merge branch 'master' into 23_xarray_utility
CagtayFabry Oct 7, 2020
61cc841
update CHANGELOG.md
CagtayFabry Oct 8, 2020
2532aaf
Merge remote-tracking branch 'origin/23_xarray_utility' into 23_xarra…
CagtayFabry Oct 8, 2020
4a8e8b4
removed _xr_valid_key and changed exception type
CagtayFabry Oct 8, 2020
5ee49cb
change loop signature
CagtayFabry Oct 8, 2020
45b09bc
allow different timedelta dimensions
CagtayFabry Oct 8, 2020
cc1e037
docstring changes
CagtayFabry Oct 8, 2020
680f8c0
add wrong input type exception
CagtayFabry Oct 8, 2020
0699a7c
Merge remote-tracking branch 'origin/23_xarray_utility' into 23_xarra…
CagtayFabry Oct 8, 2020
781b4fe
try fix doc & RTD builds
CagtayFabry Oct 9, 2020
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -28,10 +28,11 @@
 - add basic schema layout and `GmawProcess` class for arc welding process implementation [#104]
 - add example notebook and documentation for arc welding process [#104]
 - fix propagating the `name` attribute when reading an ndarray `TimeSeries` object back from ASDF files [#104]
-- fix `pint` regression in `TimeSeries` when mixing integer and float values
+- fix `pint` regression in `TimeSeries` when mixing integer and float values [#121]
 - add `pint` compatibility to some `geometry` classes (**experimental**)
   - when passing quantities to constructors (and some functions), values get converted to default unit `mm` and passed on as magnitude
   - old behavior is preserved
+- add `weldx.utility.xr_check_coords` function to check coordinates of xarray object against dtype and value restrictions [#125]


## 0.2.0 (30.07.2020)
6 changes: 3 additions & 3 deletions doc/conf.py
@@ -179,9 +179,9 @@
     "pandas": ("https://pandas.pydata.org/pandas-docs/stable", None),
     "xarray": ("http://xarray.pydata.org/en/stable", None),
     "scipy": ("https://docs.scipy.org/doc/scipy/reference", None),
-    "matplotlib": ("https://matplotlib.org", None),
-    "dask": ("https://docs.dask.org/en/latest", None),
-    "numba": ("https://numba.pydata.org/numba-doc/latest", None),
+    # "matplotlib": ("https://matplotlib.org", None),
+    # "dask": ("https://docs.dask.org/en/latest", None),
+    # "numba": ("https://numba.pydata.org/numba-doc/latest", None),
     "pint": ("https://pint.readthedocs.io/en/stable", None),
 }

68 changes: 68 additions & 0 deletions tests/test_utility.py
@@ -393,3 +393,71 @@ def test_xf_fill_all():

    with pytest.raises(ValueError):
        ut.xr_fill_all(da3, order="wrong")


_dax_check = xr.DataArray(
    data=np.ones((2, 2, 2, 4, 3)),
    dims=["d1", "d2", "d3", "d4", "d5"],
    coords={
        "d1": np.array([-1, 1], dtype=float),
        "d2": np.array([-1, 1], dtype=int),
        "d3": pd.DatetimeIndex(["2020-05-01", "2020-05-03"]),
        "d4": pd.TimedeltaIndex([0, 1, 2, 3], "s"),
        "d5": ["x", "y", "z"],
    },
)

_dax_ref = dict(
    d1={"values": np.array([-1, 1]), "dtype": "float"},
    d2={"values": np.array([-1, 1]), "dtype": int},
    d3={
        "values": pd.DatetimeIndex(["2020-05-01", "2020-05-03"]),
        "dtype": ["datetime64[ns]", "timedelta64[ns]"],
    },
    d4={
        "values": pd.TimedeltaIndex([0, 1, 2, 3], "s"),
        "dtype": ["datetime64[ns]", "timedelta64[ns]"],
    },
    d5={"values": ["x", "y", "z"], "dtype": "<U1"},
)


@pytest.mark.parametrize(
    "dax, ref_dict",
    [
        (_dax_check, _dax_ref),
        (_dax_check.coords, _dax_ref),
        (_dax_check, {"d1": {"dtype": ["float64", int]}}),
        (_dax_check, {"d2": {"dtype": ["float64", int]}}),
        (_dax_check, {"no_dim": {"optional": True, "dtype": float}}),
        (_dax_check, {"d5": {"dtype": str}}),
        (_dax_check, {"d5": {"dtype": [str]}}),
        (_dax_check, {"d4": {"dtype": "timedelta64"}}),
        (_dax_check, {"d3": {"dtype": ["datetime64", "timedelta64"]}}),
    ],
)
def test_xr_check_coords(dax, ref_dict):
    """Test weldx.utility.xr_check_coords function."""
    assert ut.xr_check_coords(dax, ref_dict)


@pytest.mark.parametrize(
    "dax, ref_dict, exception_type",
    [
        (_dax_check, {"d1": {"dtype": int}}, TypeError),
        (_dax_check, {"d1": {"dtype": int, "optional": True}}, TypeError),
        (_dax_check, {"no_dim": {"dtype": float}}, KeyError),
        (
            _dax_check,
            {"d5": {"values": ["x", "noty", "z"], "dtype": "str"}},
            ValueError,
        ),
        (_dax_check, {"d1": {"dtype": [int, str, bool]}}, TypeError),
        (_dax_check, {"d4": {"dtype": "datetime64"}}, TypeError),
        ({"d4": np.arange(4)}, {"d4": {"dtype": "int"}}, ValueError),
    ],
)
def test_xr_check_coords_exception(dax, ref_dict, exception_type):
    """Test weldx.utility.xr_check_coords function."""
    with pytest.raises(exception_type):
        ut.xr_check_coords(dax, ref_dict)
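
The dtype cases in these parametrized tests follow from NumPy's dtype hierarchy. A minimal sketch of the relevant comparisons, using only `np.dtype` and `np.issubdtype`; the variable names are illustrative and not part of weldx:

```python
import numpy as np

td_ns = np.dtype("timedelta64[ns]")  # a nanosecond-resolution timedelta dtype
unicode_1 = np.dtype("<U1")  # dtype of a ["x", "y", "z"] array

# A generic "timedelta64" carries no unit, so strict equality with a
# unit-specific dtype fails ...
strict_match = td_ns == np.dtype("timedelta64")
# ... while the subdtype test compares scalar types and accepts any unit.
generic_match = np.issubdtype(td_ns, np.dtype("timedelta64"))
# Timedeltas are not datetimes, so a "datetime64" reference does not match.
cross_match = np.issubdtype(td_ns, np.dtype("datetime64"))
# Fixed-width unicode dtypes are subdtypes of the generic string type.
str_match = np.issubdtype(unicode_1, np.str_)

print(strict_match, generic_match, cross_match, str_match)
```

This is why a reference of `"timedelta64"` is expected to pass for `d4` while `"datetime64"` is expected to raise `TypeError`.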
19 changes: 17 additions & 2 deletions weldx/transformations.py
@@ -414,6 +414,23 @@ def __init__(
coordinates = self._build_coordinates(coordinates, time)

if construction_checks:
ut.xr_check_coords(
coordinates,
dict(
c={"values": ["x", "y", "z"]},
time={"dtype": "timedelta64", "optional": True},
),
)

ut.xr_check_coords(
orientation,
dict(
c={"values": ["x", "y", "z"]},
v={"values": [0, 1, 2]},
time={"dtype": "timedelta64", "optional": True},
),
)

orientation = xr.apply_ufunc(
normalize,
orientation,
@@ -636,7 +653,6 @@ def _build_orientation(
"""
if isinstance(orientation, xr.DataArray):
return orientation
-# TODO: Test if xarray has correct format

time_orientation = None
if isinstance(orientation, Rot):
@@ -667,7 +683,6 @@ def _build_coordinates(coordinates, time: pd.DatetimeIndex = None):
"""
if isinstance(coordinates, xr.DataArray):
return coordinates
-# TODO: Test if xarray has correct format

time_coordinates = None
if not isinstance(coordinates, (np.ndarray, pint.Quantity)):
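
The construction checks in this file require a `c` coordinate with the values `x`, `y`, `z` and, when present, a `time` coordinate of timedelta dtype. A minimal sketch of an array that would satisfy those two conditions, checked here with plain NumPy comparisons rather than `ut.xr_check_coords`. It assumes xarray and NumPy are installed, and the array contents are made up for illustration:

```python
import numpy as np
import xarray as xr

# Hypothetical coordinates array: two time steps, three spatial components.
coordinates = xr.DataArray(
    data=np.zeros((2, 3)),
    dims=["time", "c"],
    coords={
        "time": np.array([0, 1], dtype="timedelta64[s]"),
        "c": ["x", "y", "z"],
    },
)

# Value restriction: the "c" coordinate must be exactly x, y, z.
c_ok = bool((coordinates.coords["c"].values == np.array(["x", "y", "z"])).all())
# Dtype restriction: "time" must be a timedelta of any unit.
time_ok = bool(np.issubdtype(coordinates.coords["time"].dtype, np.timedelta64))

print(c_ok, time_ok)
```

Because `time` is marked optional in the reference dictionaries, an array without a `time` coordinate would also be accepted.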
131 changes: 131 additions & 0 deletions weldx/utility.py
@@ -562,6 +562,137 @@ def xr_interp_like(
return result


def _check_dtype(var_dtype, ref_dtype) -> bool:
    """Check if dtype matches a reference dtype (or is subdtype).

    Parameters
    ----------
    var_dtype : numpy dtype
        A numpy-dtype to test against.
    ref_dtype : str or type
        Python type or string description

    Returns
    -------
    bool
        True if the dtypes match.

    """
    if var_dtype != np.dtype(ref_dtype):
        if isinstance(ref_dtype, str):
            # parentheses are needed: "and" binds tighter than "or", so without
            # them any "timedelta64" reference would match every dtype
            if (
                "timedelta64" in ref_dtype or "datetime64" in ref_dtype
            ) and np.issubdtype(var_dtype, np.dtype(ref_dtype)):
                return True

        # otherwise only string subdtypes (e.g. "<U1" against str) are accepted
        if not (
            np.issubdtype(var_dtype, np.dtype(ref_dtype)) and np.dtype(ref_dtype) == str
        ):
            return False

    return True
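
A note on conditions of this shape, which combine two string checks with an `np.issubdtype` test: Python binds `and` more tightly than `or`, so explicit parentheses decide whether the subdtype test guards both string checks or only the second one. A minimal sketch with hypothetical helper names (not part of weldx):

```python
def ungrouped(is_timedelta: bool, is_datetime: bool, is_subdtype: bool) -> bool:
    # Parsed as: is_timedelta or (is_datetime and is_subdtype)
    return is_timedelta or is_datetime and is_subdtype


def grouped(is_timedelta: bool, is_datetime: bool, is_subdtype: bool) -> bool:
    # The subdtype test applies to both string checks.
    return (is_timedelta or is_datetime) and is_subdtype


# A "timedelta64" reference compared against a dtype that is not a timedelta.
case = dict(is_timedelta=True, is_datetime=False, is_subdtype=False)
print(ungrouped(**case), grouped(**case))
```

In the ungrouped form the example case is accepted even though the subdtype test fails, while the grouped form rejects it.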


def xr_check_coords(dax: xr.DataArray, ref: dict) -> bool:
    """Validate the coordinates of the DataArray against a reference dictionary.

    The reference dictionary should have the dimensions as keys and those contain
    dictionaries with the following keywords (all optional):

    ``values``
        Specify exact coordinate values to match.

    ``dtype`` : str or type
        Ensure coordinate dtype matches at least one of the given dtypes.

    ``optional`` : boolean
        default ``False`` - if ``False``, the dimension has to be in the DataArray
        dax; if ``True``, a missing dimension is skipped (a present one is still
        validated)

    Parameters
    ----------
    dax : xarray.DataArray
        xarray object which should be validated
    ref : dict
        reference dictionary

    Returns
    -------
    bool
        True, if the test was a success, else an exception is raised

    Examples
    --------
    >>> import numpy as np
    >>> import pandas as pd
    >>> import xarray as xr
    >>> import weldx as wx
    >>> dax = xr.DataArray(
    ...     data=np.ones((3, 2, 3)),
    ...     dims=["d1", "d2", "d3"],
    ...     coords={
    ...         "d1": np.array([-1, 0, 2], dtype=int),
    ...         "d2": pd.DatetimeIndex(["2020-05-01", "2020-05-03"]),
    ...         "d3": ["x", "y", "z"],
    ...     }
    ... )
    >>> ref = dict(
    ...     d1={"optional": True, "values": np.array([-1, 0, 2], dtype=int)},
    ...     d2={
    ...         "values": pd.DatetimeIndex(["2020-05-01", "2020-05-03"]),
    ...         "dtype": ["datetime64[ns]", "timedelta64[ns]"],
    ...     },
    ...     d3={"values": ["x", "y", "z"], "dtype": "<U1"},
    ... )
    >>> wx.utility.xr_check_coords(dax, ref)
    True

    """
    # only process the coords of the xarray
    if isinstance(dax, (xr.DataArray, xr.Dataset)):
        coords = dax.coords
    elif isinstance(
        dax,
        (
            xr.core.coordinates.DataArrayCoordinates,
            xr.core.coordinates.DatasetCoordinates,
        ),
    ):
        coords = dax
    else:
        raise ValueError("Input variable is not an xarray object")

    for key, check in ref.items():
        # check if the optional key is set to true
        if "optional" in check:
            if check["optional"] and key not in coords:
                # skip this key - it is not in dax
                continue

        if key not in coords:
            # Attributes not found in coords
            raise KeyError(f"Could not find required coordinate '{key}'.")

        # only if the key "values" is given do the validation
        if "values" in check:
            if not (coords[key].values == check["values"]).all():
                raise ValueError(f"Value mismatch in DataArray and ref['{key}']")

        # only if the key "dtype" is given do the validation
        if "dtype" in check:
            dtype_list = check["dtype"]
            if not isinstance(dtype_list, list):
                dtype_list = [dtype_list]
            if not any(
                _check_dtype(coords[key].dtype, var_dtype) for var_dtype in dtype_list
            ):
                raise TypeError(
                    f"Mismatch in the dtype of the DataArray and ref['{key}']"
                )

    return True


def xr_3d_vector(data, times=None) -> xr.DataArray:
"""Create an xarray 3d vector with correctly named dimensions and coordinates.

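
The validation loop in `xr_check_coords` can be illustrated without xarray. The sketch below is a simplified, hypothetical stand-in (`check_coords_sketch` is not part of weldx): it takes a plain dict of NumPy arrays instead of xarray coordinates and uses `np.issubdtype` for every dtype comparison, which ignores time units and string widths and is therefore a simplification of the helper in the diff.

```python
import numpy as np


def check_coords_sketch(coords: dict, ref: dict) -> bool:
    """Validate a dict of arrays against a reference dict (hypothetical helper)."""
    for key, check in ref.items():
        if key not in coords:
            if check.get("optional", False):
                continue  # optional and absent: nothing to validate
            raise KeyError(f"Could not find required coordinate '{key}'.")

        values = np.asarray(coords[key])

        # exact value match; array_equal also returns False on shape mismatch
        if "values" in check and not np.array_equal(
            values, np.asarray(check["values"])
        ):
            raise ValueError(f"Value mismatch for '{key}'")

        if "dtype" in check:
            allowed = check["dtype"]
            if not isinstance(allowed, list):
                allowed = [allowed]  # accept a single dtype or a list of dtypes
            if not any(np.issubdtype(values.dtype, np.dtype(d)) for d in allowed):
                raise TypeError(f"Dtype mismatch for '{key}'")

    return True


coords = {
    "c": np.array(["x", "y", "z"]),
    "time": np.array([0, 1, 2], dtype="timedelta64[s]"),
}
ref = {
    "c": {"values": ["x", "y", "z"], "dtype": str},
    "time": {"dtype": "timedelta64", "optional": True},
    "v": {"values": [0, 1, 2], "optional": True},  # absent, but optional
}
result = check_coords_sketch(coords, ref)
print(result)
```

The three exception types mirror the ones exercised in the tests: `KeyError` for a missing required key, `ValueError` for a value mismatch, and `TypeError` for a dtype mismatch.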