Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement integer array add/sub for datetimelike indexes #19959

Merged
merged 16 commits into from
May 29, 2018

Conversation

jbrockmendel
Copy link
Member

@jbrockmendel jbrockmendel commented Mar 2, 2018

AFAICT this is the next-to-last non-raising case for these operations. After this is just object-dtype, and for that we already have the correct implementation in addsub_offset_array, just need to make the condition less strict and add appropriate tests.

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@codecov
Copy link

codecov bot commented Mar 2, 2018

Codecov Report

Merging #19959 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19959      +/-   ##
==========================================
+ Coverage   91.83%   91.84%   +<.01%     
==========================================
  Files         153      153              
  Lines       49498    49515      +17     
==========================================
+ Hits        45458    45475      +17     
  Misses       4040     4040
Flag Coverage Δ
#multiple 90.23% <100%> (ø) ⬆️
#single 41.87% <8.69%> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/datetimelike.py 96.89% <100%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cc8d33e...6cb1b43. Read the comment docs.

td = Timedelta(self.freq)
return op(self, td * other)

else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for an else

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you address this comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fell through the cracks; just pushed an update.

@@ -810,6 +851,11 @@ def __rsub__(self, other):
# we need to wrap in DatetimeIndex and flip the operation
from pandas import DatetimeIndex
return DatetimeIndex(other) - self
elif (is_datetime64_any_dtype(self) and hasattr(other, 'dtype') and
not is_datetime64_any_dtype(other)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment on why this is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed, but GitHub UI not surfacing.


@pytest.mark.parametrize('freq', ['H', 'D'])
@pytest.mark.parametrize('box', [np.array, pd.Index])
def test_dti_add_intarray_tick(self, box, freq):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we must have some tests that do index arithmetic with integers, either move them here if they are not redundant or remove them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are tests for integers, but nothing ATM for integer-arrays.

# ---------------------------------------------------------------
# __add__/__sub__ with integer arrays

@pytest.mark.parametrize('box', [np.array, pd.Index])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

@jreback jreback added Datetime Datetime data dtype Timedelta Timedelta data type Period Period data type Compat pandas objects compatability with Numpy or Python functions labels Mar 7, 2018
@jbrockmendel
Copy link
Member Author

Comments above acknowledged.

Let's stick a pin in this for a little bit. I think there may be a problem with integer ops, need to double-check before moving forward on this.

@jbrockmendel
Copy link
Member Author

Let's stick a pin in this for a little bit. I think there may be a problem with integer ops, need to double-check before moving forward on this.

I've convinced myself this is OK. Or more accurately, that weird behavior (not affected by this PR, but in integer add/sub) can only come up if a user specifically passes verify_integrity=False to DatetimeIndex or TimedeltaIndex.

@jbrockmendel
Copy link
Member Author

Needs rebasing since its gotten a bit stale, but I think this is otherwise ready. Will push shortly.

@jbrockmendel
Copy link
Member Author

@jreback gentle ping

@jreback
Copy link
Contributor

jreback commented Mar 28, 2018

rebase as well

@jbrockmendel
Copy link
Member Author

@jreback gentle ping. I've got some time this weekend I can put towards finishing this off.

@jbrockmendel
Copy link
Member Author

@gfyoung any idea if everything/everyone is OK?

@gfyoung
Copy link
Member

gfyoung commented Apr 16, 2018

@jbrockmendel : I see a bunch of comments from @jreback that you haven't explicitly addressed. That might be why no one is saying anything.

@jbrockmendel
Copy link
Member Author

@gfyoung Thanks. I think there's just the one where I haven't un-indented an else: clause as requested; might as well do that now. I think the others are addressed, though some are indirect.

@gfyoung
Copy link
Member

gfyoung commented Apr 16, 2018

I think the others are addressed, though some are indirect.

Did you address the commenting request above?

EDIT: I see the comment now! 😄

@gfyoung
Copy link
Member

gfyoung commented Apr 16, 2018

@jreback : ping. I think all comments have been addressed.

@jbrockmendel
Copy link
Member Author

@jreback gentle ping. I'd like to finish this off and get back to unifying Index/Series/DataFrame arithmetic implementations in core.ops.

@jreback
Copy link
Contributor

jreback commented Apr 26, 2018

will have to wait until after the rc

@jbrockmendel
Copy link
Member Author

@jreback gentle ping

@@ -951,7 +951,7 @@ Datetimelike API Changes
rather than ``NotImplementedError`` when index is not a :class:`DatetimeIndex` (:issue:`20725`).
- Restricted ``DateOffset`` keyword arguments. Previously, ``DateOffset`` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior. Now, only valid arguments will be accepted. (:issue:`17176`, :issue:`18226`).
- :func:`pandas.merge` provides a more informative error message when trying to merge on timezone-aware and timezone-naive columns (:issue:`15800`)
- For :class:`DatetimeIndex` and :class:`TimedeltaIndex` with ``freq=None``, addition or subtraction of integer-dtyped array or ``Index`` will raise ``NullFrequencyError`` instead of ``TypeError`` (:issue:`19895`)
- For :class:`DatetimeIndex` and :class:`TimedeltaIndex` with ``freq=None``, addition or subtraction of integer-dtyped array or ``Index`` will raise ``NullFrequencyError`` instead of ``TypeError`` if the index ``freq`` attribute is ``None``, and will return an object of the same class otherwise (:issue:`19895`, :issue:`19959`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.24 (this is an api change right?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, just changed.

@jbrockmendel
Copy link
Member Author

@jreback if we can get this through then DatetimeIndex + IntArray will be meaningful and we can flesh out the tests in #21160.

@jreback jreback added this to the 0.24.0 milestone May 29, 2018
@jreback jreback merged commit b64e9d5 into pandas-dev:master May 29, 2018
@jreback
Copy link
Contributor

jreback commented May 29, 2018

thanks @jbrockmendel

it seems there might be some code you can remove from PeriodIndex though? (obviously in a followup)

@jbrockmendel
Copy link
Member Author

it seems there might be some code you can remove from PeriodIndex though?

Sounds plausible; I'll have to refresh my memory. IIRC the next dtype to handle is Period/PeriodIndex subtraction, with the complication that we need to fix the current scalar subtraction behavior.

@jbrockmendel jbrockmendel deleted the intarrays4 branch May 29, 2018 03:47
david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this pull request Jun 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions Datetime Datetime data dtype Period Period data type Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants