-
-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Merge timezone aware data with DST #22825
Conversation
Hello @mroeschke! Thanks for submitting the PR.
|
" scalar") | ||
|
||
@pytest.mark.xfail(reason="ToDo: do not ignore timezone, must be object") | ||
def test_where_index_datetimetz(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was bizarrely structured above. It was given the wrong arguments for everything to pass before it hit the xfail...
Good news is that one of the cases is passing, so I am splitting the tz-aware and tz-naive case and properly xfailing the tz-aware case (which was already failing on master and is orthogonal to this change)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there an issue for this case? if not can you create one and reference it here
Codecov Report
@@ Coverage Diff @@
## master #22825 +/- ##
==========================================
+ Coverage 92.18% 92.19% +<.01%
==========================================
Files 169 169
Lines 50830 50833 +3
==========================================
+ Hits 46860 46863 +3
Misses 3970 3970
Continue to review full report at Codecov.
|
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -681,6 +681,7 @@ Timezones | |||
- Bug in :meth:`DatetimeIndex.unique` that did not re-localize tz-aware dates correctly (:issue:`21737`) | |||
- Bug when indexing a :class:`Series` with a DST transition (:issue:`21846`) | |||
- Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` where an ``AmbiguousTimeError`` or ``NonExistentTimeError`` would raise if a timezone aware timeseries ended on a DST transition (:issue:`19375`, :issue:`10117`) | |||
- Bug in :func:`merge` when merging ``datetime64[ns, tz]`` data that contained a DST transition (:issue:`18885`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move to reshaping
pandas/core/indexes/datetimelike.py
Outdated
@@ -289,6 +289,7 @@ def _ensure_localized(self, result, ambiguous='raise'): | |||
result : DatetimeIndex / i8 ndarray | |||
ambiguous : str, bool, or bool-ndarray | |||
default 'raise' | |||
from_utc : bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the default here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add what this actually means (and similarly for to_utc below).
also maybe rename 'result' to array or input or something, I think these are holdover names and are confusing.
pandas/core/indexes/datetimelike.py
Outdated
@@ -695,14 +699,17 @@ def astype(self, dtype, copy=True): | |||
return super(DatetimeIndexOpsMixin, self).astype(dtype, copy=copy) | |||
|
|||
|
|||
def _ensure_datetimelike_to_i8(other): | |||
def _ensure_datetimelike_to_i8(other, to_utc=False): | |||
""" helper for coercing an input scalar or array to i8 """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a doc-string here
pandas/core/indexes/datetimelike.py
Outdated
""" helper for coercing an input scalar or array to i8 """ | ||
if is_scalar(other) and isna(other): | ||
other = iNaT | ||
elif isinstance(other, ABCIndexClass): | ||
# convert tz if needed | ||
if getattr(other, 'tz', None) is not None: | ||
other = other.tz_localize(None).asi8 | ||
if to_utc: | ||
other = other.tz_convert('UTC').asi8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do the .asi8 lower? (and just do it on other) to avoid repeating code
Addressed your comments @jreback and all green. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very small comments. ping on green.
pandas/core/indexes/datetimelike.py
Outdated
ambiguous : str, bool, or bool-ndarray | ||
default 'raise' | ||
from_utc : bool | ||
default False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default goes on the line before.
pandas/core/indexes/datetimelike.py
Outdated
---------- | ||
other : 1d array | ||
to_utc : bool | ||
default False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
" scalar") | ||
|
||
@pytest.mark.xfail(reason="ToDo: do not ignore timezone, must be object") | ||
def test_where_index_datetimetz(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there an issue for this case? if not can you create one and reference it here
All green besides a unrelated, failed connection to a s3 parquet test. |
df2 = pd.DataFrame([ | ||
pd.to_datetime('2017-10-29 03:00:00'), | ||
pd.to_datetime('2017-10-29 04:00:00'), | ||
pd.to_datetime('2017-10-29 05:00:00') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you create this one as well with date_range to shorten it a bit? (and if not, I think it is more typical to pass a list of strings to to_datetime
instead of calling it on each string)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can maybe also directly include the 'value' column in the constructor
All green besides a unrelated, failed connection to a s3 parquet test. |
thanks @mroeschke |
git diff upstream/master -u -- "*.py" | flake8 --diff
When merging timezone aware data using
where
, make the comparison in UTC to avoid relocalizing on an ambiguous time.