Commit

BUG: DataFrame.select_dtypes(include=number) still included BooleanDt…
mroeschke authored and noatamir committed Nov 9, 2022
1 parent 5af8b2d commit dac5168
Showing 3 changed files with 20 additions and 1 deletion.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.4.4.rst
@@ -17,6 +17,7 @@ Fixed regressions
 - Fixed regression in taking NULL :class:`objects` from a :class:`DataFrame` causing a segmentation violation. These NULL values are created by :meth:`numpy.empty_like` (:issue:`46848`)
 - Fixed regression in :func:`concat` materializing :class:`Index` during sorting even if :class:`Index` was already sorted (:issue:`47501`)
 - Fixed regression in :func:`cut` using a ``datetime64`` IntervalIndex as bins (:issue:`46218`)
+- Fixed regression in :meth:`DataFrame.select_dtypes` where ``include="number"`` included :class:`BooleanDtype` (:issue:`46870`)
 - Fixed regression in :meth:`DataFrame.loc` not updating the cache correctly after values were set (:issue:`47867`)
 - Fixed regression in :meth:`DataFrame.loc` not aligning index in some cases when setting a :class:`DataFrame` (:issue:`47578`)
 - Fixed regression in :meth:`DataFrame.loc` setting a length-1 array like value to a single value in the DataFrame (:issue:`46268`)
5 changes: 4 additions & 1 deletion pandas/core/frame.py
@@ -4687,8 +4687,11 @@ def check_int_infer_dtype(dtypes):
             raise ValueError(f"include and exclude overlap on {(include & exclude)}")
 
         def dtype_predicate(dtype: DtypeObj, dtypes_set) -> bool:
+            # GH 46870: BooleanDtype._is_numeric == True but should be excluded
             return issubclass(dtype.type, tuple(dtypes_set)) or (
-                np.number in dtypes_set and getattr(dtype, "_is_numeric", False)
+                np.number in dtypes_set
+                and getattr(dtype, "_is_numeric", False)
+                and not is_bool_dtype(dtype)
             )
 
         def predicate(arr: ArrayLike) -> bool:
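
The effect of the extra `is_bool_dtype` guard can be illustrated with a small pure-Python model of `dtype_predicate`. This is only a sketch under stated assumptions, not the pandas implementation: `Number`, `Integer`, `Bool`, and `FakeDtype` are hypothetical stand-ins for `np.number`, an integer scalar type, `np.bool_`, and a pandas dtype object, the local `is_bool_dtype` is a simplified substitute for the pandas helper, and the `exclude_bool` flag exists only to compare the old and new behavior side by side.

```python
# Hypothetical stand-ins; none of these are pandas or NumPy classes.
class Number:            # plays the role of np.number
    pass


class Integer(Number):   # plays the role of an integer scalar type
    pass


class Bool:              # plays the role of np.bool_ (not a Number subclass)
    pass


class FakeDtype:
    """Minimal dtype model: a scalar type plus the `_is_numeric` flag."""

    def __init__(self, scalar_type, is_numeric):
        self.type = scalar_type
        self._is_numeric = is_numeric


def is_bool_dtype(dtype) -> bool:
    # Simplified substitute for the pandas helper of the same name.
    return issubclass(dtype.type, Bool)


def dtype_predicate(dtype, dtypes_set, exclude_bool=True) -> bool:
    # Same shape as the patched predicate; exclude_bool=False reproduces
    # the pre-fix condition, which lacked the is_bool_dtype check.
    return issubclass(dtype.type, tuple(dtypes_set)) or (
        Number in dtypes_set
        and getattr(dtype, "_is_numeric", False)
        and not (exclude_bool and is_bool_dtype(dtype))
    )


int64 = FakeDtype(Integer, is_numeric=True)
masked_bool = FakeDtype(Bool, is_numeric=True)   # like BooleanDtype: flag is True
object_dtype = FakeDtype(str, is_numeric=False)

for name, dt in [("int64", int64), ("boolean", masked_bool), ("object", object_dtype)]:
    old = dtype_predicate(dt, {Number}, exclude_bool=False)
    new = dtype_predicate(dt, {Number})
    print(f"{name}: before fix={old}, after fix={new}")
```

In this model only the boolean dtype changes its answer between the two variants. Explicitly requesting the boolean type still matches it through the `issubclass` branch.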
15 changes: 15 additions & 0 deletions pandas/tests/frame/methods/test_select_dtypes.py
@@ -441,3 +441,18 @@ def test_select_dtypes_float_dtype(self, expected, float_dtypes):
         df = df.astype(dtype_dict)
         result = df.select_dtypes(include=float_dtypes)
         tm.assert_frame_equal(result, expected)
+
+    def test_np_bool_ea_boolean_include_number(self):
+        # GH 46870
+        df = DataFrame(
+            {
+                "a": [1, 2, 3],
+                "b": pd.Series([True, False, True], dtype="boolean"),
+                "c": np.array([True, False, True]),
+                "d": pd.Categorical([True, False, True]),
+                "e": pd.arrays.SparseArray([True, False, True]),
+            }
+        )
+        result = df.select_dtypes(include="number")
+        expected = DataFrame({"a": [1, 2, 3]})
+        tm.assert_frame_equal(result, expected)
