BUG GH23282 calling min on series of NaT returns NaT #23289
Hello @JustinZhengBC! Thanks for updating the PR.
Comment last updated on October 25, 2018 at 23:07 Hours UTC
ab9313b to 44eb8d8
Please be sure to always add tests first and foremost
pandas/core/nanops.py
Outdated
@@ -718,6 +718,8 @@ def reduction(values, axis=None, skipna=True, mask=None):
             result = np.nan
     else:
         result = getattr(values, meth)(axis)
+        if is_integer(result) and result == _int64_max:
It looks like this also affects integer dtypes, e.g. pd.Series([_int64_max]).min() / max()?
Fixed, now it only applies the conversion from _int64_max to NaT if given an appropriate dtype.
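For readers following the thread, the dtype guard described here can be sketched roughly as below. This is only an illustration, not the code in the PR: `wrap_min_result`, the `NAT` stand-in, and the use of NumPy-style dtype kind codes (`"M"` for datetime64, `"m"` for timedelta64, `"i"` for integers) are assumptions made for the example.

```python
INT64_MAX = 2**63 - 1   # fill value used for NaT when computing a minimum
NAT = None              # stand-in for pandas.NaT in this sketch


def wrap_min_result(result, dtype_kind):
    """Turn the fill value back into NAT, but only for datetime-like dtypes.

    dtype_kind is a NumPy-style kind code: "M" (datetime64),
    "m" (timedelta64) or "i" (signed integer).
    """
    if dtype_kind in ("M", "m") and result == INT64_MAX:
        return NAT
    # Plain integer data keeps a genuine INT64_MAX untouched.
    return result


print(wrap_min_result(INT64_MAX, "M"))  # datetime-like: converted to NAT
print(wrap_min_result(INT64_MAX, "i"))  # integer: value preserved
```

With this guard, an integer series that really contains the largest int64 value is no longer affected by the conversion.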
@@ -509,3 +509,8 @@ def test_dt_timetz_accessor(self, tz_naive_fixture):
                           time(22, 14, tzinfo=tz)])
         result = s.dt.timetz
         tm.assert_series_equal(result, expected)
+
+    def test_minmax_nat(self):
Can you add tests for timedelta dtype and DataFrame (#10390)?
Added more tests, but this PR does not fix #10390
pandas/core/nanops.py
Outdated
@@ -718,6 +718,9 @@ def reduction(values, axis=None, skipna=True, mask=None):
             result = np.nan
     else:
         result = getattr(values, meth)(axis)
+        if (is_integer(result) and is_datetime_or_timedelta_dtype(dtype)
This needs handling not here but in _wrap_results, where a scalar should be turned into NaT if it’s null and of the correct dtype.
Codecov Report
@@            Coverage Diff             @@
##           master   #23289      +/-   ##
==========================================
- Coverage   92.22%   92.22%   -0.01%
==========================================
  Files         169      169
  Lines       51258    51266       +8
==========================================
+ Hits        47274    47281       +7
- Misses       3984     3985       +1
3f35609 to 95f3bf6
95f3bf6 to ce0a7ee
I pushed a commit. Have a look.
Probably needs a release note.
pandas/core/nanops.py
Outdated
@@ -346,7 +350,7 @@ def nanany(values, axis=None, skipna=True, mask=None):
     >>> nanops.nanany(s)
     False
     """
-    values, mask, dtype, _ = _get_values(values, skipna, False, copy=skipna,
+    values, mask, dtype, _, _ = _get_values(values, skipna, False, copy=skipna,
                                          mask=mask)
Linting error, I would guess.
    """ wrap our results if needed """

    if is_datetime64_dtype(dtype):
        if not isinstance(result, np.ndarray):
            if result == fill_value:
Is it assumed that fill_value is not NA? If it can be NA, this will be wrong, since an NA fill_value will never compare equal to itself. Should we assert that it's not NA?
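To make the concern concrete, here is a small sketch of why an equality check against an NA fill value can never succeed, and what an assertion guarding against it might look like. The helper `result_is_fill` is hypothetical and not part of pandas; a float NaN stands in for "NA" here.

```python
import math


def result_is_fill(result, fill_value):
    """Hypothetical helper: check whether a reduction returned the fill value."""
    # NaN never compares equal to anything, including itself, so the
    # comparison below would silently be False for an NA fill value.
    assert not (isinstance(fill_value, float) and math.isnan(fill_value)), \
        "fill_value must not be NA"
    return result == fill_value


nan = float("nan")
print(nan == nan)                               # False
print(result_is_fill(2**63 - 1, 2**63 - 1))     # True
```

For an int64-backed fill value such as the largest int64, the assertion always passes, which matches the reply that follows.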
So this can't be NA for an i8 type by definition, but yes, we can assert it.
@JustinZhengBC can you add a whatsnew note & some asserts. ping on green.
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -1020,6 +1020,7 @@ Datetimelike
 - Bug in :func:`to_datetime` with an :class:`Index` argument that would drop the ``name`` from the result (:issue:`21697`)
 - Bug in :class:`PeriodIndex` where adding or subtracting a :class:`timedelta` or :class:`Tick` object produced incorrect results (:issue:`22988`)
 - Bug in :func:`date_range` when decrementing a start date to a past end date by a negative frequency (:issue:`23270`)
+- Bug in :func:`min` which would return ``NaN`` instead of ``NaT`` when called on a series of ``NaT`` (:issue:`23282`)
Was the bug in the builtin min from the standard library, or Series.min? Right now, you're linking to the builtin.
It was in Series.min. I think it's fixed now.
@jreback I added a whatsnew note and an assert in the datetime64 case (adding the assert to the timedelta64 case causes tests to fail)
lgtm. @WillAyd over to you.
Thanks @JustinZhengBC !
…y_tests
* repo_org/master: (52 commits)
  ENH: Allow rename_axis to specify index and columns arguments (pandas-dev#20046)
  STY: proposed isort settings [ci skip] [skip ci] [ciskip] [skipci] (pandas-dev#23366)
  MAINT: Remove extraneous test.parquet file
  CLN: Follow-up comments to pandas-devgh-23392 (pandas-dev#23401)
  BUG GH23282 calling min on series of NaT returns NaT (pandas-dev#23289)
  unpin openpyxl (pandas-dev#23361)
  REF: collect ops dispatch functions in one place, try to de-duplicate SparseDataFrame methods (pandas-dev#23060)
  CLN: Remove pandas.tools module (pandas-dev#23376)
  CLN: Remove some dtype methods from API (pandas-dev#23390)
  CLN: Cleanup toplevel namespace shims (pandas-dev#23386)
  DOC: fixup whatsnew note for GH21394 (pandas-dev#23355)
  Fix import format at pandas/tests/extension directory (pandas-dev#23365)
  DOC: Remove Series.sortlevel from api.rst (pandas-dev#23395)
  API: Disallow dtypes w/o frequency when casting (pandas-dev#23392)
  BUG/TST/REF: Datetimelike Arithmetic Methods (pandas-dev#23215)
  STYLE: lint add np.nan* funcs to cython_table (pandas-dev#22109)
  Run Isort on tests/util single PR (pandas-dev#23347)
  BUG: Fix date_range overflow (pandas-dev#23345)
  Run Isort on tests/arrays single PR (pandas-dev#23346)
  ...
git diff upstream/master -u -- "*.py" | flake8 --diff
For max, NaT values are filled with the lowest possible value. For min, they are filled with the highest possible value. The problem is that only the lowest possible value is recognized as NaT. Since nanops.py is responsible for assigning the highest value to NaT when min is called, it should also be responsible for translating it back to NaT when appropriate.
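The fill-then-reduce behaviour described above can be modelled in a few lines of plain Python. This is a simplified sketch, not the nanops.py implementation: `nat_min`, `nat_max`, and the `NAT` stand-in are hypothetical names, and datetimes are represented directly by their int64 (i8) values, with the smallest int64 acting as the NaT sentinel.

```python
INT64_MIN = -2**63      # i8 value that represents NaT
INT64_MAX = 2**63 - 1   # fill value substituted for NaT when taking a minimum
NAT = None              # stand-in for pandas.NaT in this sketch


def nat_max(i8_values):
    """Max over i8 values; NaT is filled with the lowest value."""
    if not i8_values:
        return NAT
    result = max(i8_values)  # NaT already is the lowest possible value
    # An all-NaT input yields INT64_MIN, which is recognized as NaT.
    return NAT if result == INT64_MIN else result


def nat_min(i8_values):
    """Min over i8 values; NaT is filled with the highest value."""
    if not i8_values:
        return NAT
    filled = [INT64_MAX if v == INT64_MIN else v for v in i8_values]
    result = min(filled)
    # Without this translation step, an all-NaT input would leak the
    # fill value instead of being reported as NaT.
    return NAT if result == INT64_MAX else result


print(nat_min([INT64_MIN, INT64_MIN]))  # all NaT: reported as NAT
print(nat_min([10, INT64_MIN, 7]))      # NaT skipped
```

The asymmetry is visible in the two functions: `nat_max` gets the right answer almost for free because its fill value coincides with the NaT sentinel, while `nat_min` needs the explicit translation from the fill value.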