BUG/Perf: Support ExtensionArrays in where #24114

TomAugspurger · 2018-12-05T17:27:22Z

We need some way to do .where on EA object for DatetimeArray. Adding it
to the interface is, I think, the easiest way.

Initially I started to write a version on ExtensionBlock, but it proved
unwieldy to write a version that performed well for all types.
It may be possible to do using _ndarray_values but we'd need a few more
things around that (missing values, converting an arbitrary array to the
"same" ndarray_values, error handling, re-constructing). It seemed easier
to push this down to the array.

The implementation on ExtensionArray is readable, but likely slow since
it'll involve a conversion to object-dtype.

Closes #24077
Closes #24142
Closes #16983

pep8speaks · 2018-12-05T17:27:38Z

Hello @TomAugspurger! Thanks for submitting the PR.

There are no PEP8 issues in the file pandas/core/arrays/base.py !
There are no PEP8 issues in the file pandas/core/arrays/categorical.py !
There are no PEP8 issues in the file pandas/core/arrays/interval.py !
There are no PEP8 issues in the file pandas/core/arrays/period.py !
There are no PEP8 issues in the file pandas/core/arrays/sparse.py !
There are no PEP8 issues in the file pandas/core/dtypes/base.py !
There are no PEP8 issues in the file pandas/core/indexes/category.py !
There are no PEP8 issues in the file pandas/core/internals/blocks.py !
There are no PEP8 issues in the file pandas/tests/arrays/categorical/test_indexing.py !
There are no PEP8 issues in the file pandas/tests/arrays/interval/test_interval.py !
There are no PEP8 issues in the file pandas/tests/arrays/test_period.py !
There are no PEP8 issues in the file pandas/tests/extension/base/methods.py !
There are no PEP8 issues in the file pandas/tests/extension/json/test_json.py !
There are no PEP8 issues in the file pandas/tests/extension/test_sparse.py !

TomAugspurger · 2018-12-05T17:28:17Z

categorical perf:

master

In [2]: import pandas as pd

In [3]: cser = pd.Series(pd.Categorical(['a', 'b', 'c'] * 10000))

In [4]: %timeit cser.where(cser == 'a', 'c')
1.18 ms ± 37.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

PR:

In [3]: %timeit cser.where(cser == 'a', 'c')
755 µs ± 23.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

codecov · 2018-12-05T18:34:17Z

Codecov Report

Merging #24114 into master will increase coverage by <.01%.
The diff coverage is 94.44%.

@@            Coverage Diff             @@
##           master   #24114      +/-   ##
==========================================
+ Coverage    92.2%    92.2%   +<.01%     
==========================================
  Files         162      162              
  Lines       51714    51782      +68     
==========================================
+ Hits        47682    47747      +65     
- Misses       4032     4035       +3

Flag	Coverage Δ
#multiple	`90.6% <94.44%> (ø)`	⬆️
#single	`43% <12.5%> (-0.02%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/dtypes/base.py	`100% <100%> (ø)`	⬆️
pandas/core/arrays/base.py	`96.81% <100%> (+0.04%)`	⬆️
pandas/core/indexes/category.py	`97.88% <100%> (-0.02%)`	⬇️
pandas/core/arrays/interval.py	`93.16% <100%> (+0.17%)`	⬆️
pandas/core/arrays/sparse.py	`91.92% <88.88%> (-0.04%)`	⬇️
pandas/core/internals/blocks.py	`93.71% <93.33%> (-0.01%)`	⬇️
pandas/core/arrays/categorical.py	`95.37% <94.11%> (-0.03%)`	⬇️
pandas/core/arrays/period.py	`98.31% <94.44%> (-0.16%)`	⬇️
pandas/util/testing.py	`87.51% <0%> (+0.09%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f2c716...56470c3. Read the comment docs.

codecov · 2018-12-05T18:34:24Z

Codecov Report

Merging #24114 into master will increase coverage by <.01%.
The diff coverage is 97.43%.

@@            Coverage Diff             @@
##           master   #24114      +/-   ##
==========================================
+ Coverage   92.21%   92.21%   +<.01%     
==========================================
  Files         162      162              
  Lines       51723    51761      +38     
==========================================
+ Hits        47694    47731      +37     
- Misses       4029     4030       +1

Flag	Coverage Δ
#multiple	`90.61% <97.43%> (ø)`	⬆️
#single	`43% <7.69%> (-0.03%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/arrays/base.py	`97.41% <ø> (ø)`	⬆️
pandas/core/indexes/category.py	`97.9% <ø> (ø)`	⬆️
pandas/core/arrays/sparse.py	`92.08% <ø> (ø)`	⬆️
pandas/core/internals/blocks.py	`93.81% <100%> (+0.13%)`	⬆️
pandas/core/arrays/categorical.py	`95.3% <83.33%> (-0.1%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 029cde2...539d3cb. Read the comment docs.

TomAugspurger · 2018-12-05T19:38:35Z

Hmm I don't like the return dtype depending on the values. Perhaps we do this with a deprecation warning?

On Wed, Dec 5, 2018 at 1:30 PM Jeff Reback wrote, in doc/source/whatsnew/v0.24.0.rst:

+- Bug in :meth:`Series.where` losing the categorical dtype for categorical data (:issue:`24077`)

if it's in the categories then this should work and return categorical; if it's not, then I think coercing to object is OK
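
For the in-categories case described in the quoted comment, a small check might look like the following. This is a sketch against a recent pandas release, not code from this PR. Only the case where `other` is already one of the categories is shown; the behavior for a value outside the categories has differed between pandas versions and is deliberately not asserted here.

```python
import pandas as pd

cser = pd.Series(pd.Categorical(["a", "b", "c"]))

# Keep entries equal to "a"; replace the rest with "c",
# which is already one of the categories.
result = cser.where(cser == "a", "c")

print(result.dtype)     # expected to remain categorical
print(result.tolist())
```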

jreback · 2018-12-05T19:31:08Z

pandas/core/arrays/base.py

+        Series.where : Similar method for Series.
+        DataFrame.where : Similar method for DataFrame.
+        """
+        return type(self)._from_sequence(np.where(cond, self, other),


hmm this turns it into an array. we have much special handling for this (e.g. see .where for DTI). i think this needs to dispatch somehow.

oh I see you override things. ok then.

TomAugspurger · 2018-12-06T15:20:45Z

@jorisvandenbossche do you have any objections to adding ExtensionArray.where to the interface? Or to the behavior change on Series[category].where?

jorisvandenbossche · 2018-12-06T15:43:50Z

Just for context: how is this different from eaarray[cond] = other ?

The behaviour change to keep the categorical dtype is certainly fine.

TomAugspurger · 2018-12-06T15:47:26Z

I suppose that `np.where` would work on items that don't implement `__setitem__`. But other than that they should be identical for 1-d arrays, right?
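
The equivalence being discussed can be illustrated with plain NumPy arrays. This is only a sketch of the idea for the 1-d case; the names `values`, `cond`, and `other` are illustrative. Note that `where` keeps values where `cond` is True, so the setitem form assigns into the positions where `cond` is False.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])
cond = np.array([True, False, True, False])
other = -1.0

# where: keep `values` where cond is True, use `other` elsewhere.
via_where = np.where(cond, values, other)

# setitem on a copy: overwrite the positions where cond is False.
via_setitem = values.copy()
via_setitem[~cond] = other

print(np.array_equal(via_where, via_setitem))
```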


jorisvandenbossche · 2018-12-06T15:52:53Z

But other than that they should be identical for 1-d arrays, right?

And since EAs are 1D, and our internal EAs support setitem, why is the new code needed? Or what in setitem is not working as it should right now? (maybe I am missing some context)

TomAugspurger · 2018-12-06T15:55:38Z

This came out of the DatetimeArray refactor. I'll have to take another look at exactly what the failures were. They were pretty deep in the internals.


jschendel · 2018-12-06T21:05:23Z

pandas/core/arrays/base.py

@@ -661,6 +662,42 @@ def take(self, indices, allow_fill=False, fill_value=None):
        # pandas.api.extensions.take
        raise AbstractMethodError(self)

+    def where(self, cond, other):


The other implementations of where (DataFrame.where, Index.where, etc.) have other default to NA. Do we want to maintain that convention here too?

jschendel · 2018-12-06T21:31:09Z

pandas/core/arrays/interval.py

+            lother = other.left
+            rother = other.right
+        left = np.where(cond, self.left, lother)
+        right = np.where(cond, self.right, rother)


left/right should have a where method, so might be a bit safer to do something like:

left = self.left.where(cond, lother)
right = self.right.where(cond, rother)

np.where looks like it can cause some problems depending on what left/right are:

In [2]: left = pd.date_range('2018', periods=3); left
Out[2]: DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03'], dtype='datetime64[ns]', freq='D')

In [3]: np.where([True, False, True], left, pd.NaT)
Out[3]: array([1514764800000000000, NaT, 1514937600000000000], dtype=object)
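
For comparison, the index's own `where` method keeps the datetime dtype. The snippet below is a sketch against a recent pandas release and only shows the `Index.where` path suggested above; it does not reproduce the `np.where` output, which may vary with the NumPy and pandas versions in use.

```python
import numpy as np
import pandas as pd

left = pd.date_range("2018", periods=3)
cond = np.array([True, False, True])

# Index.where fills the positions where cond is False with NaT
# and returns a DatetimeIndex rather than an object array.
result = left.where(cond, pd.NaT)

print(type(result).__name__, result.dtype)
print(result.isna().tolist())
```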

jschendel · 2018-12-06T21:38:25Z

pandas/core/arrays/interval.py

@@ -777,6 +777,17 @@ def take(self, indices, allow_fill=False, fill_value=None, axis=None,

        return self._shallow_copy(left_take, right_take)

+    def where(self, cond, other):


Would be nice to have IntervalIndex use this implementation instead of the naive object array based implementation that it currently uses. Can certainly leave that for a follow-up PR though, and I'd be happy to do it.

jschendel · 2018-12-06T21:38:45Z

pandas/core/arrays/interval.py

@@ -777,6 +777,17 @@ def take(self, indices, allow_fill=False, fill_value=None, axis=None,

        return self._shallow_copy(left_take, right_take)

+    def where(self, cond, other):
+        if is_scalar(other) and isna(other):
+            lother = rother = other


To be safe, I think this should be lother = rother = self.left._na_value to ensure that we're filling left/right with the correct NA value. If we use left/right.where instead of np.where this should be handled automatically iirc, so could maybe just do that instead.

jschendel · 2018-12-06T21:49:45Z

pandas/core/arrays/interval.py

+    def where(self, cond, other):
+        if is_scalar(other) and isna(other):
+            lother = rother = other
+        else:


Can you make this an elif that checks that other is interval-like (something like isinstance(other, Interval) or is_interval_dtype(other)), then have an else clause that raises a ValueError saying other must be interval-like?

As written I think this would raise a somewhat unclear AttributeError in self._check_closed_matches since it assumes other.closed exists.
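
The suggested branching could be sketched roughly as below. `split_other` is a hypothetical standalone helper written for illustration, not the method in this PR, and it uses an `isinstance` check as one possible reading of "interval-like".

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_scalar


def split_other(other, na_value=np.nan):
    """Hypothetical helper: return (left, right) fill values for `other`."""
    if is_scalar(other) and pd.isna(other):
        # A scalar NA fills both sides with the given NA value.
        return na_value, na_value
    elif isinstance(other, (pd.Interval, pd.IntervalIndex,
                            pd.arrays.IntervalArray)):
        return other.left, other.right
    else:
        # Fail early with a clear message instead of an AttributeError later.
        raise ValueError(
            "other must be interval-like or NA, got {!r}".format(
                type(other).__name__)
        )


print(split_other(pd.Interval(0, 1)))
```

With this shape, a non-interval value such as a string is rejected with a `ValueError` before any attribute like `other.closed` is accessed.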

TomAugspurger · 2018-12-07T03:40:25Z

On further reflection, I realize that ndarrays don't have a where method, so I don't think we should add ExtensionArray.where.

I'll see if setitem on a copy is sufficient.

TomAugspurger · 2018-12-07T17:21:09Z

pandas/core/indexes/category.py

@@ -501,10 +501,13 @@ def _can_reindex(self, indexer):

    @Appender(_index_shared_docs['where'])
    def where(self, cond, other=None):
+        # TODO: Investigate an alternative implementation with


TomAugspurger · 2018-12-07T17:22:19Z

pandas/core/internals/blocks.py

+            # for the type
+            other = self.dtype.na_value
+
+        if is_sparse(self.values):


Without this, we fail in the

result = self._holder._from_sequence(
    np.where(cond, self.values, other),
    dtype=dtype,

since the where may change the dtype, if NaN is introduced.

Implementing SparseArray.__setitem__ would allow us to remove this block.

this should be an overriding method in Sparse then, not here

We don't have a SparseBlock anymore. I can add one back if you want, but I figured it'd be easier not to since implementing SparseArray.__setitem__ will remove the need for this, and we'd just have to remove SparseBlock again.

this is pretty hacky. This was why we had originally a .get_values() method on Sparse to do things like this. We need something to give back the underlying type of the object, which is useful for Categorical as well. Would rather create a generalized soln than hack it like this.

Actually, we don't need this. I think we can just re-infer the dtype from the output of np.where.

so is this changing?

Changing from master? Yes, in the sense that it'll return a SparseArray. But it still densifies when np.where is called.

If you mean "is this changing in the future", yes it'll be removed when SparseArray.__setitem__ is implemented.

oh ok, can you add a TODO comment
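
The dtype change mentioned earlier in this thread (NaN being introduced by `where`) can be seen with plain NumPy. This is only an illustration of why the result dtype may need to be re-inferred; it does not involve SparseArray itself.

```python
import numpy as np

values = np.array([1, 2, 3])
cond = np.array([True, False, True])

# Filling with NaN forces the integer values to be upcast to float.
result = np.where(cond, values, np.nan)

print(values.dtype.kind, result.dtype.kind)
```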

TomAugspurger · 2018-12-07T17:56:49Z

Or what in setitem is not working as it should right now? (maybe I am missing some context)

@jorisvandenbossche OK, here's some context :) The most immediate failure is mismatched block dimensions / shapes in DataFrame.where for EAs (not just DatetimeArray, but that was tested). This example uses Categorical:

In [8]: df = pd.DataFrame({"A": pd.Categorical([1, 2, 3])})

In [9]: df.where(pd.DataFrame({"A": [True, False, True]}))

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-56dcebf7e672> in <module>
----> 1 df.where(pd.DataFrame({"A": [True, False, True]}))

... 

~/sandbox/pandas/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
     84             raise ValueError(
     85                 'Wrong number of items passed {val}, placement implies '
---> 86                 '{mgr}'.format(val=len(self.values), mgr=len(self.mgr_locs)))
     87
     88     def _check_ndim(self, values, ndim):

ValueError: Wrong number of items passed 3, placement implies 1

The broadcasting is all messed up since the shapes aren't right (we're using Block.where).

ipdb> cond
array([[ True],
       [False],
       [ True]])
ipdb> values
[1, 2, 3]
Categories (3, int64): [1, 2, 3]
ipdb> other
nan

A hacky, but shorter fix is to use the following (this is in Block.where)

diff --git a/pandas/core/internals/blocks.py b/pandas/core/internals/blocks.py
index 618b9eb12..2356a226d 100644
--- a/pandas/core/internals/blocks.py
+++ b/pandas/core/internals/blocks.py
@@ -1319,12 +1319,20 @@ class Block(PandasObject):
 
         values = self.values
         orig_other = other
+        if not self._can_consolidate:
+            transpose = False
+
         if transpose:
             values = values.T
 
         other = getattr(other, '_values', getattr(other, 'values', other))
         cond = getattr(cond, 'values', cond)
 
+        if not self._can_consolidate:
+            if cond.ndim == 2:
+                assert cond.shape[-1] == 1
+                cond = cond.ravel()
+
         # If the default broadcasting would go in the wrong direction, then
         # explicitly reshape other instead
         if getattr(other, 'ndim', 0) >= 1:

That fixes most of the issues I was having on the DTA branch. Still running the tests to see if any were re-broken.
(edit: it's slightly more complicated, have to handle reshaping other as well, so ~15 more LOC).

So, in summary:

* We need to support EAs in DataFrame.where.
* The diff just above is a hacky way of achieving just that (no more).
* This PR will still be useful for avoiding the conversion to object (DatetimeTZBlock will avoid it even without this PR, thanks to _try_coerce_args coercing datetimes to ints).
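
One way a shape mismatch of this kind can arise is shown below with plain NumPy arrays. This is a sketch of the broadcasting issue only, under the assumption that `cond` arrives as a single-column 2-D array while the values are 1-d; it is not the Block.where code.

```python
import numpy as np

values = np.array([1, 2, 3])                # 1-d, like an ExtensionArray
cond = np.array([[True], [False], [True]])  # 2-d with a single column

# A (3, 1) condition broadcasts against (3,) values to a (3, 3) result.
broadcast = np.where(cond, values, -1)

# Raveling the single-column condition restores the 1-d result.
flat = np.where(cond.ravel(), values, -1)

print(broadcast.shape, flat.shape)
```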

pandas/core/internals/blocks.py

jreback · 2018-12-07T21:09:39Z

pandas/core/internals/blocks.py

+            # for the type
+            other = self.dtype.na_value
+
+        if is_sparse(self.values):


this should be an overriding method in Sparse then, not here

jreback · 2018-12-07T21:10:04Z

pandas/core/internals/blocks.py

+        else:
+            dtype = self.dtype
+
+        # rough heuristic to see if the other array implements setitem


again you don't actually need to do this here, rather override in the appropriate class

We will still need the check for extension, even if we create SparseBlock again.

jreback · 2018-12-07T21:11:08Z

pandas/tests/arrays/categorical/test_indexing.py

@@ -122,6 +162,60 @@ def test_get_indexer_non_unique(self, idx_values, key_values, key_class):
            tm.assert_numpy_array_equal(expected, result)
            tm.assert_numpy_array_equal(exp_miss, res_miss)

+    def test_where_unobserved_nan(self):


is this where all of the where tests are?

There weren't any previously (we used to fall back to object).

TomAugspurger · 2018-12-07T22:17:19Z

Updated. Main outstanding point is whether or not we should create a SparseBlock just for this. I don't have a preference.

jreback · 2018-12-09T14:13:43Z

pandas/core/internals/blocks.py

+        if isinstance(other, (ABCIndexClass, ABCSeries)):
+            other = other.array
+
+        elif isinstance(other, ABCDataFrame):


can you add some comments here

jreback · 2018-12-09T14:15:02Z

pandas/core/internals/blocks.py

+            # for the type
+            other = self.dtype.na_value
+
+        if is_sparse(self.values):


this is pretty hacky. This was why we had originally a .get_values() method on Sparse to do things like this. We need something to give back the underlying type of the object, which is useful for Categorical as well. Would rather create a generalized soln than hack it like this.

jreback · 2018-12-09T14:15:12Z

pandas/core/internals/blocks.py

+
+        # rough heuristic to see if the other array implements setitem
+        if (self._holder.__setitem__ == ExtensionArray.__setitem__
+                or self._holder.__setitem__ == SparseArray.__setitem__):


what the heck is this?

The general block is to check if the block implements __setitem__. That specific line is backwards compat for SparseArray, which implements __setitem__ to raise a TypeError instead of a NotImplementedError.

I suppose it'd be cleaner to do this in a try / except block...
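
The try / except idea could look roughly like the sketch below. `ReadOnlyArray` and the free function `where` are hypothetical stand-ins written for illustration; they are not pandas classes or the code in this PR.

```python
import numpy as np


class ReadOnlyArray:
    """Hypothetical array-like that does not support item assignment."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def copy(self):
        return ReadOnlyArray(self._data.copy())

    def __setitem__(self, key, value):
        raise TypeError("item assignment is not supported")

    def __array__(self, dtype=None, copy=None):
        return self._data if dtype is None else self._data.astype(dtype)


def where(values, cond, other):
    """Keep `values` where cond is True, use `other` elsewhere."""
    cond = np.asarray(cond, dtype=bool)
    result = values.copy()
    try:
        # Preferred path: setitem on a copy preserves the array type.
        result[~cond] = other
    except (NotImplementedError, TypeError):
        # Fallback for arrays without a working __setitem__.
        result = np.where(cond, np.asarray(values), other)
    return result


print(where(np.array([1, 2, 3]), [True, False, True], 0))
print(where(ReadOnlyArray([1, 2, 3]), [True, False, True], 0))
```

One trade-off of this shape is that catching `TypeError` may also hide a genuine type mismatch in `other`, so a real implementation would likely want a narrower check.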

TomAugspurger · 2018-12-09T20:45:00Z

Cleaned things up a bit I think.

TomAugspurger · 2018-12-10T12:31:03Z

All green.

jreback

looks pretty reasonable. question about the sparse checks.

jreback · 2018-12-10T13:07:19Z

pandas/core/internals/blocks.py

+            # for the type
+            other = self.dtype.na_value
+
+        if is_sparse(self.values):


so is this changing?

jreback · 2018-12-10T13:08:08Z

pandas/core/internals/blocks.py

+            if lib.is_scalar(other):
+                msg = object_msg.format(other)
+            else:
+                msg = compat.reprlib.repr(other)


why is this needed?

So we don't blow up with a long message for large categoricals. I messed it up though, one sec.

I've removed all this stuff and just print out the text of the message.

With a bit of effort we could figure out exactly which of the new values is causing the fallback to object, but that'd take some work (we don't know the exact type / dtype of other here, so there will be a lot of conditions). Not a high priority.

jreback

small additional comments, lgtm otherwise. ping on green.

jreback · 2018-12-10T14:03:03Z

pandas/tests/extension/test_categorical.py

-    return np.random.choice(list(string.ascii_letters), size=100)
+    while True:
+        values = np.random.choice(list(string.ascii_letters), size=100)
+        # ensure we meet the requirement


no repeated values but duplicates allowed?

Just that the first two are distinct, since the where test requires that data[0] != data[1].
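
A retry loop of this kind might be sketched as follows. `make_data` is a hypothetical name and this is not the exact fixture from the PR; it only shows the requirement that the first two values differ while later duplicates are allowed.

```python
import string

import numpy as np


def make_data(size=100):
    """Hypothetical generator: random letters with data[0] != data[1]."""
    while True:
        values = np.random.choice(list(string.ascii_letters), size=size)
        # Ensure we meet the requirement; duplicates elsewhere are fine.
        if values[0] != values[1]:
            return values


data = make_data()
print(len(data), data[0] != data[1])
```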

jreback · 2018-12-10T14:03:34Z

pandas/tests/extension/conftest.py

@@ -11,7 +11,11 @@ def dtype():

 @pytest.fixture
 def data():
-    """Length-100 array for this type."""
+    """Length-100 array for this type.


can you copy this doc-string to the categorical one

jreback · 2018-12-10T14:04:09Z

pandas/core/internals/blocks.py

@@ -2658,6 +2708,32 @@ def concat_same_type(self, to_concat, placement=None):
            values, placement=placement or slice(0, len(values), 1),
            ndim=self.ndim)

+    def where(self, other, cond, align=True, errors='raise',
+              try_cast=False, axis=0, transpose=False):
+        # This can all be deleted in favor of ExtensionBlock.where once


can you add TODO(EA) or something here so we know to remove this

jreback · 2018-12-10T14:04:26Z

pandas/core/internals/blocks.py

+            # for the type
+            other = self.dtype.na_value
+
+        if is_sparse(self.values):


oh ok, can you add a TODO comment

TomAugspurger · 2018-12-10T14:57:47Z

All green.

jreback · 2018-12-10T15:21:12Z

thanks!

Closes pandas-dev#24077

TomAugspurger added Indexing Related to indexing on series/frames, not to indexes themselves ExtensionArray Extending pandas with custom dtypes or arrays. labels Dec 5, 2018

TomAugspurger added this to the 0.24.0 milestone Dec 5, 2018

TomAugspurger commented Dec 5, 2018

Fixups:

56470c3

* Ensure data generated OK. * Remove erroneous comments about alignment. That was user error.

32-bit compat

6f79282

jreback requested changes Dec 5, 2018

TomAugspurger added 2 commits December 5, 2018 15:49

warn for categorical

a69dbb3

debug 32-bit issue

911a2da

TomAugspurger mentioned this pull request Dec 5, 2018

REF: DatetimeLikeArray #24024

Merged

12 tasks

TomAugspurger added 5 commits December 6, 2018 06:21

compat, revert

badb5be

32-bit compat

edff47e

Merge remote-tracking branch 'upstream/master' into ea-where

4715ef6

deprecation note for categorical

d90f384

where versionadded

5e14414

jschendel reviewed Dec 6, 2018

Merge remote-tracking branch 'upstream/master' into ea-where

e9665b8

TomAugspurger mentioned this pull request Dec 7, 2018

Alternative implementation for .where on EA-backed Indexes #24144

Closed

TomAugspurger commented Dec 7, 2018

jsexauer mentioned this pull request Dec 7, 2018

DEPR: Clean up list of deprecations from prior versions #6581

Closed

1 task

py2 compat

6edd286

TomAugspurger changed the title ~~API: Added ExtensionArray.where~~ API/BUG/Perf: Support ExtensionArrays in where Dec 7, 2018

TomAugspurger changed the title ~~API/BUG/Perf: Support ExtensionArrays in where~~ BUG/Perf: Support ExtensionArrays in where Dec 7, 2018

jreback requested changes Dec 7, 2018

TomAugspurger added 2 commits December 7, 2018 16:05

Merge remote-tracking branch 'upstream/master' into ea-where

30775f0

Updated

4de8bb5

* Unxfail (most) of the new combine_first tests * Removed stale comment * group conditions

TomAugspurger mentioned this pull request Dec 9, 2018

BUG GH16983 fix df.where with extension dtypes #24169

Closed

4 tasks

jreback requested changes Dec 9, 2018

TomAugspurger added 2 commits December 9, 2018 14:29

Merge remote-tracking branch 'upstream/master' into ea-where

ce04a75

Clarify

f98a82c

Merge remote-tracking branch 'upstream/master' into ea-where

bcfb8f8

jreback requested changes Dec 10, 2018

TomAugspurger added 2 commits December 10, 2018 07:22

Simplify error message

8d9b20b

sparse whatsnew

c0351fd

jreback approved these changes Dec 10, 2018

updates

539d3cb

jreback merged commit baad046 into pandas-dev:master Dec 10, 2018

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

BUG/Perf: Support ExtensionArrays in where (pandas-dev#24114)

0239802

Closes pandas-dev#24077

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

BUG/Perf: Support ExtensionArrays in where (pandas-dev#24114)

1256ac1

Closes pandas-dev#24077

jreback mentioned this pull request Nov 29, 2019

DEPR: deprecations log for removed issues #13777

Closed
