Skip to content

Commit

Permalink
Merge remote-tracking branch 'upstream/master' into GH_37544
Browse files Browse the repository at this point in the history
  • Loading branch information
ma3da committed Nov 4, 2020
2 parents eb6905c + a057135 commit 54c0465
Show file tree
Hide file tree
Showing 102 changed files with 1,689 additions and 1,331 deletions.
30 changes: 30 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,36 @@ repos:
entry: python scripts/validate_unwanted_patterns.py --validation-type="private_function_across_module"
types: [python]
exclude: ^(asv_bench|pandas/tests|doc)/
- id: FrameOrSeriesUnion
name: Check for use of Union[Series, DataFrame] instead of FrameOrSeriesUnion alias
entry: Union\[.*(Series.*DataFrame|DataFrame.*Series).*\]
language: pygrep
types: [python]
exclude: ^pandas/_typing\.py$
- id: type-not-class
name: Check for use of foo.__class__ instead of type(foo)
entry: \.__class__
language: pygrep
files: \.(py|pyx)$
- id: unwanted-typing
name: Check for use of comment-based annotation syntax and missing error codes
entry: |
(?x)
\#\ type:\ (?!ignore)|
\#\ type:\s?ignore(?!\[)
language: pygrep
types: [python]
- id: no-os-remove
name: Check code for instances of os.remove
entry: os\.remove
language: pygrep
types: [python]
files: ^pandas/tests/
exclude: |
(?x)^
pandas/tests/io/excel/test_writers\.py|
pandas/tests/io/pytables/common\.py|
pandas/tests/io/pytables/test_store\.py$
- repo: https://github.com/asottile/yesqa
rev: v1.2.2
hooks:
Expand Down
23 changes: 0 additions & 23 deletions ci/code_checks.sh
Original file line number Diff line number Diff line change
Expand Up @@ -122,29 +122,6 @@ if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then
RET=$(($RET + $?)) ; echo $MSG "DONE"

# -------------------------------------------------------------------------
# Type annotations

MSG='Check for use of comment-based annotation syntax' ; echo $MSG
invgrep -R --include="*.py" -P '# type: (?!ignore)' pandas
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Check for missing error codes with # type: ignore' ; echo $MSG
invgrep -R --include="*.py" -P '# type:\s?ignore(?!\[)' pandas
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Check for use of Union[Series, DataFrame] instead of FrameOrSeriesUnion alias' ; echo $MSG
invgrep -R --include="*.py" --exclude=_typing.py -E 'Union\[.*(Series.*DataFrame|DataFrame.*Series).*\]' pandas
RET=$(($RET + $?)) ; echo $MSG "DONE"

# -------------------------------------------------------------------------
MSG='Check for use of foo.__class__ instead of type(foo)' ; echo $MSG
invgrep -R --include=*.{py,pyx} '\.__class__' pandas
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Check code for instances of os.remove' ; echo $MSG
invgrep -R --include="*.py*" --exclude "common.py" --exclude "test_writers.py" --exclude "test_store.py" -E "os\.remove" pandas/tests/
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Check for inconsistent use of pandas namespace in tests' ; echo $MSG
for class in "Series" "DataFrame" "Index" "MultiIndex" "Timestamp" "Timedelta" "TimedeltaIndex" "DatetimeIndex" "Categorical"; do
check_namespace ${class}
Expand Down
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.1.5.rst
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ Fixed regressions

Bug fixes
~~~~~~~~~
-
- Bug in metadata propagation for ``groupby`` iterator (:issue:`37343`)
-

.. ---------------------------------------------------------------------------
Expand Down
11 changes: 11 additions & 0 deletions doc/source/whatsnew/v1.2.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,7 @@ Other enhancements
- ``Styler`` now allows direct CSS class name addition to individual data cells (:issue:`36159`)
- :meth:`Rolling.mean()` and :meth:`Rolling.sum()` use Kahan summation to calculate the mean to avoid numerical problems (:issue:`10319`, :issue:`11645`, :issue:`13254`, :issue:`32761`, :issue:`36031`)
- :meth:`DatetimeIndex.searchsorted`, :meth:`TimedeltaIndex.searchsorted`, :meth:`PeriodIndex.searchsorted`, and :meth:`Series.searchsorted` with datetimelike dtypes will now try to cast string arguments (listlike and scalar) to the matching datetimelike type (:issue:`36346`)
-
- Added methods :meth:`IntegerArray.prod`, :meth:`IntegerArray.min`, and :meth:`IntegerArray.max` (:issue:`33790`)
- Where possible :meth:`RangeIndex.difference` and :meth:`RangeIndex.symmetric_difference` will return :class:`RangeIndex` instead of :class:`Int64Index` (:issue:`36564`)
- Added :meth:`Rolling.sem()` and :meth:`Expanding.sem()` to compute the standard error of mean (:issue:`26476`).
Expand Down Expand Up @@ -341,6 +342,8 @@ Deprecations
- Deprecate use of strings denoting units with 'M', 'Y' or 'y' in :func:`~pandas.to_timedelta` (:issue:`36666`)
- :class:`Index` methods ``&``, ``|``, and ``^`` behaving as the set operations :meth:`Index.intersection`, :meth:`Index.union`, and :meth:`Index.symmetric_difference`, respectively, are deprecated and in the future will behave as pointwise boolean operations matching :class:`Series` behavior. Use the named set methods instead (:issue:`36758`)
- :meth:`Categorical.is_dtype_equal` and :meth:`CategoricalIndex.is_dtype_equal` are deprecated, will be removed in a future version (:issue:`37545`)
- :meth:`Series.slice_shift` and :meth:`DataFrame.slice_shift` are deprecated, use :meth:`Series.shift` or :meth:`DataFrame.shift` instead (:issue:`37601`)


.. ---------------------------------------------------------------------------
Expand Down Expand Up @@ -391,14 +394,18 @@ Datetimelike
- Bug in :class:`DatetimeIndex.shift` incorrectly raising when shifting empty indexes (:issue:`14811`)
- :class:`Timestamp` and :class:`DatetimeIndex` comparisons between timezone-aware and timezone-naive objects now follow the standard library ``datetime`` behavior, returning ``True``/``False`` for ``!=``/``==`` and raising for inequality comparisons (:issue:`28507`)
- Bug in :meth:`DatetimeIndex.equals` and :meth:`TimedeltaIndex.equals` incorrectly considering ``int64`` indexes as equal (:issue:`36744`)
- :meth:`to_json` and :meth:`read_json` now implements timezones parsing when orient structure is 'table'.
- :meth:`astype` now attempts to convert to 'datetime64[ns, tz]' directly from 'object' with inferred timezone from string (:issue:`35973`).
- Bug in :meth:`TimedeltaIndex.sum` and :meth:`Series.sum` with ``timedelta64`` dtype on an empty index or series returning ``NaT`` instead of ``Timedelta(0)`` (:issue:`31751`)
- Bug in :meth:`DatetimeArray.shift` incorrectly allowing ``fill_value`` with a mismatched timezone (:issue:`37299`)
- Bug in adding a :class:`BusinessDay` with nonzero ``offset`` to a non-scalar other (:issue:`37457`)
- Bug in :func:`to_datetime` with a read-only array incorrectly raising (:issue:`34857`)

Timedelta
^^^^^^^^^
- Bug in :class:`TimedeltaIndex`, :class:`Series`, and :class:`DataFrame` floor-division with ``timedelta64`` dtypes and ``NaT`` in the denominator (:issue:`35529`)
- Bug in parsing of ISO 8601 durations in :class:`Timedelta`, :meth:`pd.to_datetime` (:issue:`37159`, fixes :issue:`29773` and :issue:`36204`)
- Bug in :func:`to_timedelta` with a read-only array incorrectly raising (:issue:`34857`)

Timezones
^^^^^^^^^
Expand Down Expand Up @@ -469,6 +476,7 @@ MultiIndex

- Bug in :meth:`DataFrame.xs` when used with :class:`IndexSlice` raises ``TypeError`` with message ``"Expected label or tuple of labels"`` (:issue:`35301`)
- Bug in :meth:`DataFrame.reset_index` with ``NaT`` values in index raises ``ValueError`` with message ``"cannot convert float NaN to integer"`` (:issue:`36541`)
- Bug in :meth:`DataFrame.combine_first` when used with :class:`MultiIndex` containing string and ``NaN`` values raises ``TypeError`` (:issue:`36562`)

I/O
^^^
Expand All @@ -492,6 +500,8 @@ I/O
- Bug in output rendering of complex numbers showing too many trailing zeros (:issue:`36799`)
- Bug in :class:`HDFStore` threw a ``TypeError`` when exporting an empty :class:`DataFrame` with ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`20594`)
- Bug in :class:`HDFStore` was dropping timezone information when exporting :class:`Series` with ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`20594`)
- :func:`read_csv` was closing user-provided binary file handles when ``engine="c"`` and an ``encoding`` was requested (:issue:`36980`)
- Bug in :meth:`DataFrame.to_hdf` was not dropping missing rows with ``dropna=True`` (:issue:`35719`)

Plotting
^^^^^^^^
Expand Down Expand Up @@ -523,6 +533,7 @@ Groupby/resample/rolling
- Using :meth:`Rolling.var()` instead of :meth:`Rolling.std()` avoids numerical issues for :meth:`Rolling.corr()` when :meth:`Rolling.var()` is still within floating point precision while :meth:`Rolling.std()` is not (:issue:`31286`)
- Bug in :meth:`df.groupby(..).quantile() <pandas.core.groupby.DataFrameGroupBy.quantile>` and :meth:`df.resample(..).quantile() <pandas.core.resample.Resampler.quantile>` raised ``TypeError`` when values were of type ``Timedelta`` (:issue:`29485`)
- Bug in :meth:`Rolling.median` and :meth:`Rolling.quantile` returned wrong values for :class:`BaseIndexer` subclasses with non-monotonic starting or ending points for windows (:issue:`37153`)
- Bug in :meth:`DataFrame.groupby` dropped ``nan`` groups from result with ``dropna=False`` when grouping over a single column (:issue:`35646`, :issue:`35542`)

Reshaping
^^^^^^^^^
Expand Down
2 changes: 1 addition & 1 deletion pandas/_libs/join.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -268,7 +268,7 @@ ctypedef fused join_t:

@cython.wraparound(False)
@cython.boundscheck(False)
def left_join_indexer_unique(join_t[:] left, join_t[:] right):
def left_join_indexer_unique(ndarray[join_t] left, ndarray[join_t] right):
cdef:
Py_ssize_t i, j, nleft, nright
ndarray[int64_t] indexer
Expand Down
29 changes: 18 additions & 11 deletions pandas/_libs/lib.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -896,21 +896,28 @@ def indices_fast(ndarray index, const int64_t[:] labels, list keys,

if lab != cur:
if lab != -1:
tup = PyTuple_New(k)
for j in range(k):
val = keys[j][sorted_labels[j][i - 1]]
PyTuple_SET_ITEM(tup, j, val)
Py_INCREF(val)

if k == 1:
# When k = 1 we do not want to return a tuple as key
tup = keys[0][sorted_labels[0][i - 1]]
else:
tup = PyTuple_New(k)
for j in range(k):
val = keys[j][sorted_labels[j][i - 1]]
PyTuple_SET_ITEM(tup, j, val)
Py_INCREF(val)
result[tup] = index[start:i]
start = i
cur = lab

tup = PyTuple_New(k)
for j in range(k):
val = keys[j][sorted_labels[j][n - 1]]
PyTuple_SET_ITEM(tup, j, val)
Py_INCREF(val)
if k == 1:
# When k = 1 we do not want to return a tuple as key
tup = keys[0][sorted_labels[0][n - 1]]
else:
tup = PyTuple_New(k)
for j in range(k):
val = keys[j][sorted_labels[j][n - 1]]
PyTuple_SET_ITEM(tup, j, val)
Py_INCREF(val)
result[tup] = index[start:]

return result
Expand Down
Loading

0 comments on commit 54c0465

Please sign in to comment.