Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

built-in accessor documentation #3988

Merged
merged 25 commits into from
Jun 13, 2020
Merged

built-in accessor documentation #3988

merged 25 commits into from
Jun 13, 2020

Conversation

keewis
Copy link
Collaborator

@keewis keewis commented Apr 21, 2020

We currently use property to attach our built-in accessors (plot, dt and str) to the DataArray and Dataset classes. However, because property returns itself when trying to access it as a class attribute, this does not work when generating the documentation (sphinx inspects classes, not class instances).

This adds a property-like descriptor that works on both classes and instance objects (the name could be more descriptive) and uses that to document the Dataset.plot.* and DataArray.plot.* methods (see the rendered documentation). I have not been able to get sphinx to work with _PlotMethods.__slots__, though.

A few questions / comments on this:

  1. I noticed we have DataArray.plot.__call__ and xarray.plot.plot but not DataArray.plot.plot. Is that something that is worth adding?

  2. The functions decorated with the custom property define docstrings which currently are lost. Should we patch them on their return values?

  3. Right now, the error message when accidentally trying to call e.g. xr.DataArray.plot.line() is not very helpful:

AttributeError: 'NoneType' object has no attribute 'dims'
  1. Now that we can document the accessors, we need to think about how to structure api.rst. For plot and str, we could just list methods / attributes in subsections of DataArray / Dataset (or keep the Plotting / str / ... sections and add class subsections?), but dt is more complicated since it dispatches for datetime and timedelta

Edit: err, it seems I used the branch I pushed to the main repository for a documentation preview for this...

@keewis keewis mentioned this pull request Apr 21, 2020
2 tasks
Copy link
Contributor

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited to see this work at last.

xarray/core/utils.py Outdated Show resolved Hide resolved
@keewis
Copy link
Collaborator Author

keewis commented Apr 22, 2020

there are a few of my questions / comments we can resolve by using something like

class UncachedAccessor:
    def __init__(self, accessor):
        self._accessor = accessor

    def __get__(self, obj, cls):
        if obj is None:
            return self._accessor

        return self._accessor(obj)

The only disadvantage is that now pydoc (or help()) show the accessor as a class. That was already the case for str and dt, but plot was a function.

I think we could try to trick inspect.getmembers (which is used to decide which kind of output should be done), but that would require a lot of monkeypatching I'd rather not do.

pandas simply uses the same docstring for the class and __call__, which is not ideal but a lot simpler.

@@ -438,7 +438,7 @@ class _PlotMethods:
For example, DataArray.plot.imshow
"""

__slots__ = ("_da",)
# __slots__ = ("_da",)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to match the behavior of _CachedAccessor, it also accepts the
accessor class (not the object). We lose the ability for custom
docstrings, though.
@shoyer
Copy link
Member

shoyer commented Apr 23, 2020

I think pandas might do something similar to do this? It might be worth checking, though it looks like you've figured most of it out already :)

@shoyer
Copy link
Member

shoyer commented Apr 23, 2020

This is a really clever yet simple fix. I like it!

A few ideas for testing:

  • Verify that DataArray.plot returns the PlotMethods object.
  • Verify that data_array.plot.__doc__ is still there.

@shoyer
Copy link
Member

shoyer commented Apr 23, 2020

there are a few of my questions / comments we can resolve by using something like

class UncachedAccessor:
    def __init__(self, accessor):
        self._accessor = accessor

    def __get__(self, obj, cls):
        if obj is None:
            return self._accessor

        return self._accessor(obj)

The only disadvantage is that now pydoc (or help()) show the accessor as a class. That was already the case for str and dt, but plot was a function.

I think we could try to trick inspect.getmembers (which is used to decide which kind of output should be done), but that would require a lot of monkeypatching I'd rather not do.

pandas simply uses the same docstring for the class and __call__, which is not ideal but a lot simpler.

👍 for a simple solution here like pandas rather than using inspect. I think it's probably actually slightly more accurate to show the class.

@keewis
Copy link
Collaborator Author

keewis commented Apr 23, 2020

where should the docstring of the accessor come from? Right now, this is the docstring of the return value which seems reasonable to me.

I added tests verifying that cls.property returns the accessor class and obj.property the accessor instance. Do we need additional checks for __doc__?.

@keewis
Copy link
Collaborator Author

keewis commented May 7, 2020

I added the accessor templates from pandas (we should eventually outsource these to a separate package used by both pandas and xarray) and modified the API page to display the accessor version of the plotting functions.

I think we should add a accessor section to each of the main objects (Dataset and DataArray), where we either copy what pandas does – listing every method / attribute of the accessor as a subsection of the class documentation – or we could show them as attributes (just like what we do for the injected numpy functions) and generate separate pages for the accessors.

doc/api.rst Show resolved Hide resolved
doc/api.rst Show resolved Hide resolved
doc/api.rst Show resolved Hide resolved
doc/api.rst Outdated Show resolved Hide resolved
doc/api.rst Show resolved Hide resolved
doc/api.rst Outdated Show resolved Hide resolved
doc/api.rst Outdated Show resolved Hide resolved
Copy link
Contributor

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keewis. LGTM and it's a very nice improvement to our documentation.

@keewis
Copy link
Collaborator Author

keewis commented Jun 12, 2020

I think this is ready for merging; the fix of the DataArray.plot summary is going to be a new PR. So unless there are any objections I'm going to merge tomorrow.

@keewis keewis merged commit e26b80f into master Jun 13, 2020
@keewis keewis deleted the accessor-documentation branch June 13, 2020 17:52
dcherian added a commit to TomNicholas/xarray that referenced this pull request Jun 24, 2020
…o-combine

* 'master' of github.com:pydata/xarray: (81 commits)
  use builtin python types instead of the numpy alias (pydata#4170)
  Revise pull request template (pydata#4039)
  pint support for Dataset (pydata#3975)
  drop eccodes in docs (pydata#4162)
  Update issue templates inspired/based on dask (pydata#4154)
  Fix failing upstream-dev build & remove docs build (pydata#4160)
  Improve typehints of xr.Dataset.__getitem__ (pydata#4144)
  provide a error summary for assert_allclose (pydata#3847)
  built-in accessor documentation (pydata#3988)
  Recommend installing cftime when time decoding fails. (pydata#4134)
  parameter documentation for DataArray.sel (pydata#4150)
  speed up map_blocks (pydata#4149)
  Remove outdated note from datetime accessor docstring (pydata#4148)
  Fix the upstream-dev pandas build failure (pydata#4138)
  map_blocks: Allow passing dask-backed objects in args (pydata#3818)
  keep attrs in reset_index (pydata#4103)
  Fix open_rasterio() for WarpedVRT with specified src_crs (pydata#4104)
  Allow non-unique and non-monotonic coordinates in get_clean_interp_index and polyfit (pydata#4099)
  update numpy's intersphinx url (pydata#4117)
  xr.infer_freq (pydata#4033)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Documentation for injected methods
3 participants