Documentation improvements #3328

Merged
merged 21 commits into from
Sep 29, 2019
Changes from 10 commits
130 changes: 130 additions & 0 deletions xarray/core/alignment.py
@@ -116,6 +116,136 @@ def align(
ValueError
If any dimensions without labels on the arguments have different sizes,
or a different size than the size of the aligned dimension labels.

Examples
--------

>>> import xarray as xr
>>> x = xr.DataArray([[25, 35], [10, 24]], dims=('lat', 'lon'),
...                  coords={'lat': [35., 40.], 'lon': [100., 120.]})
>>> y = xr.DataArray([[20, 5], [7, 13]], dims=('lat', 'lon'),
...                  coords={'lat': [35., 42.], 'lon': [100., 120.]})

>>> x
<xarray.DataArray (lat: 2, lon: 2)>
array([[25, 35],
[10, 24]])
Coordinates:
* lat (lat) float64 35.0 40.0
* lon (lon) float64 100.0 120.0

>>> y
<xarray.DataArray (lat: 2, lon: 2)>
array([[20, 5],
[ 7, 13]])
Coordinates:
* lat (lat) float64 35.0 42.0
* lon (lon) float64 100.0 120.0

>>> a, b = xr.align(x, y)
>>> a
<xarray.DataArray (lat: 1, lon: 2)>
array([[25, 35]])
Coordinates:
* lat (lat) float64 35.0
* lon (lon) float64 100.0 120.0
>>> b
<xarray.DataArray (lat: 1, lon: 2)>
array([[20, 5]])
Coordinates:
* lat (lat) float64 35.0
* lon (lon) float64 100.0 120.0

>>> a, b = xr.align(x, y, join='outer')
>>> a
<xarray.DataArray (lat: 3, lon: 2)>
array([[25., 35.],
[10., 24.],
[nan, nan]])
Coordinates:
* lat (lat) float64 35.0 40.0 42.0
* lon (lon) float64 100.0 120.0
>>> b
<xarray.DataArray (lat: 3, lon: 2)>
array([[20., 5.],
[nan, nan],
[ 7., 13.]])
Coordinates:
* lat (lat) float64 35.0 40.0 42.0
* lon (lon) float64 100.0 120.0

>>> a, b = xr.align(x, y, join='outer', fill_value=-999)
>>> a
<xarray.DataArray (lat: 3, lon: 2)>
array([[ 25, 35],
[ 10, 24],
[-999, -999]])
Coordinates:
* lat (lat) float64 35.0 40.0 42.0
* lon (lon) float64 100.0 120.0
>>> b
<xarray.DataArray (lat: 3, lon: 2)>
array([[ 20, 5],
[-999, -999],
[ 7, 13]])
Coordinates:
* lat (lat) float64 35.0 40.0 42.0
* lon (lon) float64 100.0 120.0

>>> a, b = xr.align(x, y, join='left')
>>> a
<xarray.DataArray (lat: 2, lon: 2)>
array([[25, 35],
[10, 24]])
Coordinates:
* lat (lat) float64 35.0 40.0
* lon (lon) float64 100.0 120.0
>>> b
<xarray.DataArray (lat: 2, lon: 2)>
array([[20., 5.],
[nan, nan]])
Coordinates:
* lat (lat) float64 35.0 40.0
* lon (lon) float64 100.0 120.0

>>> a, b = xr.align(x, y, join='right')
>>> a
<xarray.DataArray (lat: 2, lon: 2)>
array([[25., 35.],
[nan, nan]])
Coordinates:
* lat (lat) float64 35.0 42.0
* lon (lon) float64 100.0 120.0
>>> b
<xarray.DataArray (lat: 2, lon: 2)>
array([[20, 5],
[ 7, 13]])
Coordinates:
* lat (lat) float64 35.0 42.0
* lon (lon) float64 100.0 120.0

>>> a, b = xr.align(x, y, join='exact')
Traceback (most recent call last):
...
"indexes along dimension {!r} are not equal".format(dim)
ValueError: indexes along dimension 'lat' are not equal

Review comment (Collaborator):

This looks a bit awkward to me. How about showing just "ValueError: indexes along dimension 'lat' are not equal"? Other ideas are welcome; it's not terrible as it stands.
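
For context on the suggestion above: Python's doctest module compares only the traceback header and the final exception line, and ignores whatever stack lines sit between them. Dropping the intermediate source line would therefore not change what the example checks. A minimal standard-library sketch, where `check_equal` is a hypothetical stand-in for the index comparison and not xarray code:

```python
import doctest


def check_equal(dim, left, right):
    """Raise if two index lists differ (hypothetical stand-in, not xarray code).

    >>> check_equal('lat', [35.0, 40.0], [35.0, 42.0])
    Traceback (most recent call last):
    ...
    ValueError: indexes along dimension 'lat' are not equal
    """
    if left != right:
        raise ValueError(
            "indexes along dimension {!r} are not equal".format(dim)
        )


# Parse the docstring into a doctest and run it. Only the "Traceback" header
# and the final "ValueError: ..." line are compared; the "..." stack is skipped.
test = doctest.DocTestParser().get_doctest(
    check_equal.__doc__, {"check_equal": check_equal}, "check_equal", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)  # TestResults(failed=..., attempted=...)
```

The same example passes whether or not a source line appears between the header and the exception line, so the shorter form is purely a readability choice.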


>>> a, b = xr.align(x, y, join='override')
>>> a
<xarray.DataArray (lat: 2, lon: 2)>
array([[25, 35],
[10, 24]])
Coordinates:
* lat (lat) float64 35.0 40.0
* lon (lon) float64 100.0 120.0
>>> b
<xarray.DataArray (lat: 2, lon: 2)>
array([[20, 5],
[ 7, 13]])
Coordinates:
* lat (lat) float64 35.0 40.0
* lon (lon) float64 100.0 120.0

"""
if indexes is None:
    indexes = {}
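
The join modes exercised in the docstring examples above differ mainly in which labels survive along the shared dimension. A rough pure-Python sketch of that selection, using the `lat` values from the examples. `joined_labels` is a hypothetical helper written for illustration; it is not how xarray implements `align`, and sorting the inner and outer results is a simplification.

```python
def joined_labels(left, right, join="inner"):
    """Return the labels kept along one dimension for a given join mode.

    Hypothetical illustration only; xarray itself works on pandas indexes.
    """
    if join == "inner":
        # Keep labels present in both inputs (sorted here for simplicity).
        return sorted(set(left) & set(right))
    if join == "outer":
        # Keep labels present in either input.
        return sorted(set(left) | set(right))
    if join == "left":
        return list(left)
    if join == "right":
        return list(right)
    if join == "exact":
        # No reindexing is attempted; unequal labels are an error.
        if list(left) != list(right):
            raise ValueError("indexes are not equal")
        return list(left)
    if join == "override":
        # Reuse the first object's labels; only the sizes must match.
        if len(left) != len(right):
            raise ValueError("indexes have different sizes")
        return list(left)
    raise ValueError("invalid join: {!r}".format(join))


x_lat = [35.0, 40.0]
y_lat = [35.0, 42.0]
inner = joined_labels(x_lat, y_lat, "inner")
outer = joined_labels(x_lat, y_lat, "outer")
left = joined_labels(x_lat, y_lat, "left")
right = joined_labels(x_lat, y_lat, "right")
override = joined_labels(x_lat, y_lat, "override")
```

Positions that have no source value after reindexing are filled with NaN, or with `fill_value` when one is passed, as the `join='outer', fill_value=-999` example shows.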
111 changes: 94 additions & 17 deletions xarray/core/combine.py
@@ -505,8 +505,7 @@ def combine_by_coords(
----------
datasets : sequence of xarray.Dataset
Dataset objects to combine.
- compat : {'identical', 'equals', 'broadcast_equals',
- 'no_conflicts', 'override'}, optional
+ compat : {'identical', 'equals', 'broadcast_equals', 'no_conflicts', 'override'}, optional
String indicating how to compare variables of the same name for
potential conflicts:

@@ -520,9 +519,33 @@ def combine_by_coords(
of all non-null values.
- 'override': skip comparing and pick variable from first dataset
data_vars : {'minimal', 'different', 'all' or list of str}, optional
- Details are in the documentation of concat
+ These data variables will be concatenated together:
+
+ * 'minimal': Only data variables in which the dimension already
+ appears are included.
+ * 'different': Data variables which are not equal (ignoring
+ attributes) across all datasets are also concatenated (as well as
+ all for which dimension already appears). Beware: this option may
+ load the data payload of data variables into memory if they are not
+ already loaded.
+ * 'all': All data variables will be concatenated.
+ * list of str: The listed data variables will be concatenated, in
+ addition to the 'minimal' data variables.
+ If objects are DataArrays, `data_vars` must be 'all'.
coords : {'minimal', 'different', 'all' or list of str}, optional
- Details are in the documentation of concat
+ These coordinate variables will be concatenated together:
+
+ * 'minimal': Only coordinates in which the dimension already appears
+ are included.
+ * 'different': Coordinates which are not equal (ignoring attributes)
+ across all datasets are also concatenated (as well as all for which
+ dimension already appears). Beware: this option may load the data
+ payload of coordinate variables into memory if they are not already
+ loaded.
+ * 'all': All coordinate variables will be concatenated, except
+ those corresponding to other dimensions.
+ * list of str: The listed coordinate variables will be concatenated,
+ in addition to the 'minimal' coordinates.
fill_value : scalar, optional
Value to use for newly missing values
join : {'outer', 'inner', 'left', 'right', 'exact'}, optional
@@ -556,29 +579,83 @@ def combine_by_coords(
they are concatenated based on the values in their dimension coordinates,
not on their position in the list passed to `combine_by_coords`.

>>> import pandas as pd
>>> import numpy as np
>>> import xarray as xr

>>> x1 = xr.Dataset({"temperature": (("time", "x"), 20 * np.random.rand(6).reshape(2, 3)),
...                  "precipitation": (("time", "x"), np.random.rand(6).reshape(2, 3))},
...                 coords={"time": pd.date_range(start="2000-01", periods=2, freq='M'),
...                         "x": [10, 20, 30]})
>>> x2 = xr.Dataset({"temperature": (("time", "x"), 20 * np.random.rand(6).reshape(2, 3)),
...                  "precipitation": (("time", "x"), np.random.rand(6).reshape(2, 3))},
...                 coords={"time": pd.date_range(start="2000-03", periods=2, freq='M'),
...                         "x": [10, 20, 30]})

>>> x3 = xr.Dataset({"temperature": (("time", "x"), 20 * np.random.rand(6).reshape(2, 3)),
...                  "precipitation": (("time", "x"), np.random.rand(6).reshape(2, 3))},
...                 coords={"time": pd.date_range(start="2000-03", periods=2, freq='M'),
...                         "x": [40, 50, 60]})

  >>> x1
  <xarray.Dataset>
- Dimensions: (x: 3)
- Coords:
- * position (x) int64 0 1 2
+ Dimensions: (time: 2, x: 3)
+ Coordinates:
+ * time (time) datetime64[ns] 2000-01-31 2000-02-29
+ * x (x) int64 10 20 30
  Data variables:
- temperature (x) float64 11.04 23.57 20.77 ...
+ temperature (time, x) float64 9.022 4.041 7.246 3.474 16.26 3.05
+ precipitation (time, x) float64 0.5777 0.3621 0.8043 0.5203 0.03721 0.3805

  >>> x2
  <xarray.Dataset>
- Dimensions: (x: 3)
- Coords:
- * position (x) int64 3 4 5
+ Dimensions: (time: 2, x: 3)
+ Coordinates:
+ * time (time) datetime64[ns] 2000-03-31 2000-04-30
+ * x (x) int64 10 20 30
+ Data variables:
+ temperature (time, x) float64 1.88 2.184 7.438 3.046 6.719 7.501
+ precipitation (time, x) float64 0.02079 0.8212 0.9924 0.405 0.16 0.2543

>>> x3
<xarray.Dataset>
Dimensions: (time: 2, x: 3)
Coordinates:
* time (time) datetime64[ns] 2000-03-31 2000-04-30
* x (x) int64 40 50 60
Data variables:
temperature (time, x) float64 15.31 7.169 7.927 16.01 16.58 15.33
precipitation (time, x) float64 0.2915 0.4556 0.2424 0.193 0.3184 0.6775

>>> xr.combine_by_coords([x2, x1])
<xarray.Dataset>
Dimensions: (time: 4, x: 3)
Coordinates:
* x (x) int64 10 20 30
* time (time) datetime64[ns] 2000-01-31 2000-02-29 ... 2000-04-30
Data variables:
temperature (time, x) float64 9.022 4.041 7.246 ... 3.046 6.719 7.501
precipitation (time, x) float64 0.5777 0.3621 0.8043 ... 0.405 0.16 0.2543

+ >>> xr.combine_by_coords([x3, x1])
+ <xarray.Dataset>
+ Dimensions: (time: 4, x: 6)
+ Coordinates:
+ * time (time) datetime64[ns] 2000-01-31 2000-02-29 ... 2000-04-30
+ * x (x) int64 10 20 30 40 50 60
  Data variables:
- temperature (x) float64 6.97 8.13 7.42 ...
+ temperature (time, x) float64 9.022 4.041 7.246 nan ... 16.01 16.58 15.33
+ precipitation (time, x) float64 0.5777 0.3621 0.8043 ... 0.3184 0.6775

- >>> combined = xr.combine_by_coords([x2, x1])
+ >>> xr.combine_by_coords([x3, x1], join='override')
  <xarray.Dataset>
- Dimensions: (x: 6)
- Coords:
- * position (x) int64 0 1 2 3 4 5
+ Dimensions: (time: 2, x: 6)
+ Coordinates:
+ * time (time) datetime64[ns] 2000-01-31 2000-02-29
+ * x (x) int64 10 20 30 40 50 60
  Data variables:
- temperature (x) float64 11.04 23.57 20.77 ...
+ temperature (time, x) float64 9.022 4.041 7.246 ... 16.01 16.58 15.33
+ precipitation (time, x) float64 0.5777 0.3621 0.8043 ... 0.3184 0.6775
"""

# Group by data vars
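
The docstring examples above rely on `combine_by_coords` ordering its inputs by coordinate values rather than by list position. A toy sketch of that idea on plain Python data, assuming each piece carries a single increasing coordinate. `combine_by_first_label` and the dict layout are hypothetical and far simpler than xarray's actual logic.

```python
def combine_by_first_label(pieces):
    """Concatenate pieces along one coordinate, ordered by coordinate values.

    Each piece is a dict with parallel lists "time" and "values".
    Hypothetical illustration only, not xarray's algorithm.
    """
    # Order pieces by their first coordinate label, not by list position.
    ordered = sorted(pieces, key=lambda piece: piece["time"][0])
    combined = {"time": [], "values": []}
    for piece in ordered:
        # Refuse overlapping or out-of-order coordinates.
        if combined["time"] and piece["time"][0] <= combined["time"][-1]:
            raise ValueError("coordinates are not monotonic after combining")
        combined["time"].extend(piece["time"])
        combined["values"].extend(piece["values"])
    return combined


# ISO date strings compare in chronological order, so plain sorting works.
first = {"time": ["2000-01-31", "2000-02-29"], "values": [1, 2]}
second = {"time": ["2000-03-31", "2000-04-30"], "values": [3, 4]}
result = combine_by_first_label([second, first])
```

Passing the pieces in reverse order still yields a result that starts in January, which mirrors the `xr.combine_by_coords([x2, x1])` example.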