Review/Overhaul __array__ and view methods #23569
Comments
So question is what do we want to support in
But do we also want to support |
For datetime/timedelta, this will happen automatically if we do nothing. So the only other option is to raise an error, but that does not seem necessary I think. |
Sorry I missed this earlier. I'm concerned about the default behavior on
In [7]: idx = pd.date_range('2000', periods=4, tz='US/Central')
In [8]: ser = pd.Series(idx)
In [9]: np.asarray(ser)
Out[9]:
array(['2000-01-01T06:00:00.000000000', '2000-01-02T06:00:00.000000000',
'2000-01-03T06:00:00.000000000', '2000-01-04T06:00:00.000000000'],
dtype='datetime64[ns]')
In [10]: np.asarray(idx)
Out[10]:
array(['2000-01-01T06:00:00.000000000', '2000-01-02T06:00:00.000000000',
'2000-01-03T06:00:00.000000000', '2000-01-04T06:00:00.000000000'],
dtype='datetime64[ns]')
This unfortunately conflicts with
In [11]: list(ser)
Out[11]:
[Timestamp('2000-01-01 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-02 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-03 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-04 00:00:00-0600', tz='US/Central', freq='D')]
In [12]: list(idx)
Out[12]:
[Timestamp('2000-01-01 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-02 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-03 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-04 00:00:00-0600', tz='US/Central', freq='D')]
Do we care about that (I do)? Should we deprecate the current behavior?
diff --git a/pandas/core/arrays/datetimes.py b/pandas/core/arrays/datetimes.py
index e731dd33f..0f0577b1a 100644
--- a/pandas/core/arrays/datetimes.py
+++ b/pandas/core/arrays/datetimes.py
@@ -404,6 +404,9 @@ class DatetimeArrayMixin(dtl.DatetimeLikeArrayMixin,
     # Array-Like / EA-Interface Methods
 
     def __array__(self, dtype=None):
+        if dtype is None:
+            warnings.warn("bad", FutureWarning)
+            dtype = object
         if is_object_dtype(dtype):
             return np.array(list(self), dtype=object)
         elif is_int64_dtype(dtype):
usage:
In [6]: np.asarray(idx)
/Users/taugspurger/sandbox/pandas/pandas/core/arrays/datetimes.py:408: FutureWarning: bad
warnings.warn("bad", FutureWarning)
Out[6]:
array([Timestamp('2000-01-01 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-02 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-03 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-04 00:00:00-0600', tz='US/Central', freq='D')],
dtype=object)
In [7]: np.asarray(ser)
/Users/taugspurger/sandbox/pandas/pandas/core/arrays/datetimes.py:408: FutureWarning: bad
warnings.warn("bad", FutureWarning)
Out[7]:
array([Timestamp('2000-01-01 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-02 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-03 00:00:00-0600', tz='US/Central', freq='D'),
Timestamp('2000-01-04 00:00:00-0600', tz='US/Central', freq='D')],
dtype=object) |
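The deprecation pattern in the diff above can be tried outside pandas. The following is a minimal sketch, not the pandas implementation: `TzAwareValues` is a hypothetical container, the warning text is invented for illustration, and it keeps the old naive-UTC result when no dtype is given (the diff above switches straight to `object` instead).

```python
import warnings
from datetime import datetime, timedelta, timezone

import numpy as np


class TzAwareValues:
    """Hypothetical container of tz-aware datetimes (not a pandas class)."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # `copy` is accepted only so newer NumPy versions can pass it.
        if dtype is None:
            # Assumption: warn, but keep returning the old naive result.
            warnings.warn(
                "Default conversion will change; pass dtype=object or "
                "dtype='datetime64[ns]' explicitly.",
                FutureWarning,
                stacklevel=2,
            )
            dtype = "datetime64[ns]"
        dtype = np.dtype(dtype)
        if dtype == object:
            # Keep the tz-aware Python objects as they are.
            return np.array(self._values, dtype=object)
        # Drop the timezone by converting to naive UTC wall times.
        utc_naive = [
            v.astimezone(timezone.utc).replace(tzinfo=None)
            for v in self._values
        ]
        return np.array(utc_naive, dtype=dtype)


central = timezone(timedelta(hours=-6))
vals = TzAwareValues([datetime(2000, 1, 1, tzinfo=central)])

as_objects = np.asarray(vals, dtype=object)         # no warning, tz kept
as_naive = np.asarray(vals, dtype="datetime64[ns]")  # no warning, UTC values
```

Calling `np.asarray(vals)` with no dtype is the only path that warns, which is how callers find out they need to choose one of the two explicit conversions.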
As an example of the kind of strange knock-on effects this has, see https://github.com/pandas-dev/pandas/pull/23601/files#diff-425b4da47d01dc33d86c5c697e196b70R1128. Ideally we'd just write |
I agree here. |
Is changing I have a branch started. Just need to go through and fix up all the warnings. |
Picking this up now. It's an annoyingly large diff to get things through. I'm going with @jorisvandenbossche's suggestion to
This proved to be much easier than doing the deprecation that Unfortunately, |
In light of deprecating |
I didn't realize Series.get_values was being deprecated. But that's a decent idea I'll try exploring. IIUC the main purpose of |
I don't think there is a PR, but it certainly is on the wishlist :) |
See #19617. I think the discussion there is mainly resolved as we now have the explicit .array and np.array/.to_numpy() to cover the two use cases. |
Thanks. FYI, your suggestion seems to be working quite well. |
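As a rough illustration of the two use cases mentioned in this exchange, the sketch below contrasts `.array` with `.to_numpy()` on a tz-aware Series. It assumes a pandas version that has both accessors (0.24 or later); the dtype is passed explicitly to `.to_numpy()` so the result does not depend on the version's default.

```python
import numpy as np
import pandas as pd

ser = pd.Series(pd.date_range("2000", periods=2, tz="US/Central"))

# Use case 1: the pandas-native array backing the Series.
# The timezone lives in the dtype; no conversion to NumPy happens.
ea = ser.array

# Use case 2: an actual NumPy ndarray. With dtype=object each element
# is a pandas.Timestamp that still carries its timezone.
arr = ser.to_numpy(dtype=object)

print(type(ea).__name__, ea.dtype)
print(type(arr).__name__, arr.dtype)
```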
This deprecates the current behavior when converting tz-aware Series or Index to an ndarray. Previously, we converted to M8[ns], throwing away the timezone information. In the future, we will return an object-dtype array filled with Timestamps, each of which has the correct tz.

```python
In [1]: import pandas as pd; import numpy as np

In [2]: ser = pd.Series(pd.date_range('2000', periods=2, tz="CET"))

In [3]: np.asarray(ser)
/bin/ipython:1: FutureWarning: Converting timezone-aware DatetimeArray to timezone-naive ndarray with 'datetime64[ns]' dtype. In the future, this will return an ndarray with 'object' dtype where each element is a 'pandas.Timestamp' with the correct 'tz'. To accept the future behavior, pass 'dtype=object'. To keep the old behavior, pass 'dtype="datetime64[ns]"'.
  #!/Users/taugspurger/Envs/pandas-dev/bin/python3
Out[3]:
array(['1999-12-31T23:00:00.000000000', '2000-01-01T23:00:00.000000000'],
      dtype='datetime64[ns]')
```

xref pandas-dev#23569
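The two options named in the warning message can be written out as follows. This is a sketch only; it assumes the explicit-dtype paths behave as the message describes, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

ser = pd.Series(pd.date_range("2000", periods=2, tz="CET"))

# Accept the future behavior: object dtype, each element a
# pandas.Timestamp that keeps its timezone.
aware = np.asarray(ser, dtype=object)

# Keep the old behavior: timezone-naive datetime64 values in UTC.
naive = np.asarray(ser, dtype="datetime64[ns]")

print(aware)
print(naive)
```

Passing the dtype explicitly avoids relying on whichever default the installed pandas version uses.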
#23524 fixes a problem with DatetimeIndex.__array__; we should do a thorough check of the __array__ methods on other classes to make sure (and test) that they make sense.