Implement integrate #2653

Merged
merged 9 commits into from
Jan 31, 2019
Changes from 2 commits
25 changes: 12 additions & 13 deletions xarray/core/dataset.py
@@ -3867,7 +3867,7 @@ def differentiate(self, coord, edge_order=1, datetime_unit=None):
                 variables[k] = v
         return self._replace_vars_and_dims(variables)

-    def integrate(self, dim, datetime_unit=None):
+    def integrate(self, coord, datetime_unit=None):
fujiisoup marked this conversation as resolved.

Member:
Should coord=None have the default behavior of integrating over all dimensions? Or would that be confusing in some way?

Member Author:
I personally think it would be a little confusing, because the result may change depending on which coordinate is used for the integration. For example, if the DataArray has a dimension without a coordinate but with another one-dimensional coordinate, it is not clear which one should be used.

It would be somewhat convenient for 1d arrays, but as we disallow a default argument for diff, I would like to disallow a default argument here too.
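The ambiguity described above can be illustrated with a small pure-Python sketch. It is not xarray code: `trapezoid`, `index`, and `time` are hypothetical names chosen for illustration, and the sample values are made up. It only shows that the trapezoidal rule gives different answers for the same data depending on which coordinate supplies the spacing.

```python
def trapezoid(y, x):
    """Integrate samples y over coordinate values x with the trapezoidal rule."""
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    # Sum the area of each trapezoid between neighbouring sample points.
    return sum(
        (x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
        for i in range(len(y) - 1)
    )


values = [0.0, 1.0, 4.0, 9.0]

# Two candidate coordinates along the same dimension (both hypothetical):
index = [0, 1, 2, 3]          # positional index of a dimension without a coordinate
time = [0.0, 0.5, 2.0, 4.5]   # another one-dimensional coordinate on that dimension

by_index = trapezoid(values, index)
by_time = trapezoid(values, time)
print(by_index, by_time)
```

Because the two results differ, a `coord=None` default would have to pick one coordinate silently, which is the concern raised in the reply.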

""" integrate the array with the trapezoidal rule.

.. note::
@@ -3892,23 +3892,23 @@ def integrate(self, dim, datetime_unit=None):
         DataArray.integrate
         numpy.trapz: corresponding numpy function
         """
-        if not isinstance(dim, (list, tuple)):
-            dim = (dim, )
+        if not isinstance(coord, (list, tuple)):
+            coord = (coord, )
         result = self
-        for d in dim:
-            result = result._integrate_one(d, datetime_unit=datetime_unit)
+        for c in coord:
+            result = result._integrate_one(c, datetime_unit=datetime_unit)
         return result

-    def _integrate_one(self, dim, datetime_unit=None):
+    def _integrate_one(self, coord, datetime_unit=None):
         from .variable import Variable

-        if dim not in self.variables and dim not in self.dims:
+        if coord not in self.variables and coord not in self.dims:
             raise ValueError('Coordinate {} does not exist.'.format(dim))
Member:
I think splitting these checks into two would be a little clearer:

  • "cannot integrate over dimension {} because it does not exist" (for dim not in self.dims)
  • "cannot integrate over dimension {} because there is no corresponding coordinate" (for dim not in self.variables)
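A minimal sketch of the two separate checks suggested above, written as a standalone function rather than as the actual xarray method. `check_integration_target`, `describe`, `dims`, and `variables` are hypothetical names for illustration. Note that, following the reviewer's wording literally, this version would also reject a non-dimension coordinate, which the later discussion in this thread decides to support.

```python
def check_integration_target(name, dims, variables):
    """Raise a specific error explaining why `name` cannot be integrated over."""
    if name not in dims:
        raise ValueError(
            "cannot integrate over dimension {!r} because it does not "
            "exist".format(name))
    if name not in variables:
        raise ValueError(
            "cannot integrate over dimension {!r} because there is no "
            "corresponding coordinate".format(name))


# Hypothetical dataset layout: two dimensions, only "x" has a coordinate.
dims = ("x", "y")
variables = {"x": [0, 1, 2]}


def describe(name):
    """Return "ok" or the error message produced for `name`."""
    try:
        check_integration_target(name, dims, variables)
    except ValueError as err:
        return str(err)
    return "ok"


for candidate in ("x", "y", "z"):
    print(candidate, "->", describe(candidate))
```

Splitting the condition this way lets each failure mode report its own cause instead of the single "does not exist" message.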


-        coord_var = self[dim].variable
+        coord_var = self[coord].variable
         if coord_var.ndim != 1:
Member:
I think this is currently not possible due to xarray's data model, but it's a good idea to add this anyways given that we want to change this soon (e.g., see #2405).

I would recommend adjusting this to if coord_var.dims != (dim,), which is a little stricter.
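To make the difference between the two conditions concrete, here is a small sketch using a hypothetical stand-in for a coordinate variable. `CoordVar` is not an xarray class and only models the `dims` attribute. It shows that a one-dimensional coordinate named "t" that lies along dimension "x" passes the `ndim` check but fails the stricter `dims` comparison.

```python
from collections import namedtuple

# Hypothetical stand-in for a coordinate variable; only `dims` matters here.
CoordVar = namedtuple("CoordVar", ["dims", "values"])


def passes_ndim_check(var):
    """Looser check: the variable is one-dimensional."""
    return len(var.dims) == 1


def passes_dims_check(var, name):
    """Stricter check: the variable's only dimension is its own name."""
    return var.dims == (name,)


x = CoordVar(dims=("x",), values=[0, 1, 2])          # dimension coordinate "x"
t = CoordVar(dims=("x",), values=[0.0, 0.5, 2.0])    # coordinate "t" along "x"

print(passes_ndim_check(x), passes_dims_check(x, "x"))
print(passes_ndim_check(t), passes_dims_check(t, "t"))
```

Under these assumptions the stricter form accepts only dimension coordinates, so adopting it would rule out integrating along a non-dimension coordinate such as "t".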

Member Author:
I first thought it would be nice if we could integrate even along a non-dimensional (1d) coordinate (as interpolate_na and differentiate do), but that may be asking too much.
What do you think?

Member:
Yes, that seems reasonable to support

Member Author:
Then, is coord a better argument name than dim?
Or should we use dim as the argument name but still support integration along a non-dimensional coordinate, at a slight cost in correctness, since that is more consistent with other reduction methods?

Member:
I don't have a strong opinion here.

Contributor:
Well, differentiate uses coord, so maybe integrate should too?

Member:
OK, +1 for consistency with differentiate.

             raise ValueError('Coordinate {} must be 1 dimensional but is {}'
-                             ' dimensional'.format(dim, coord_var.ndim))
+                             ' dimensional'.format(coord, coord_var.ndim))

         dim = coord_var.dims[0]
         if _contains_datetime_like_objects(coord_var):
@@ -3920,12 +3920,12 @@ def _integrate_one(self, dim, datetime_unit=None):
                 coord_var, datetime_unit=datetime_unit)

         variables = OrderedDict()
-        coord_names = []
+        coord_names = set()
         for k, v in self.variables.items():
             if k in self.coords:
                 if dim not in v.dims:
                     variables[k] = v
-                    coord_names.append(k)
+                    coord_names.add(k)
             else:
                 if k in self.data_vars and dim in v.dims:
                     if _contains_datetime_like_objects(v):
@@ -3937,8 +3937,7 @@ def _integrate_one(self, dim, datetime_unit=None):
                     variables[k] = Variable(v_dims, integ)
                 else:
                     variables[k] = v
-        return self._replace_vars_and_dims(variables,
-                                           coord_names=set(coord_names))
+        return self._replace_vars_and_dims(variables, coord_names=coord_names)

     @property
     def real(self):