inconsistent behavior between DataFrame and Series when underlying objects are datetime #3320
Comments
This all works in 0.11; please try that.
I downloaded the zip of master and installed the package via python setup.py install.
0.11.dev1.6.2
AttributeError Traceback (most recent call last)
AttributeError: 'numpy.datetime64' object has no attribute 'date'
the issue is your column is type
You are trying to do something like this?
Yes. The issue is that I understood the .values property to be a numpy.array of the original data. In the code above, dtSeries.values gives me back datetime64[ns] in 0.11, whereas np.array(dt1d) gives me an array of objects. You can imagine these being used profusely, and probably interchangeably, in a lot of places in the code when we don't care about the indices and want to operate on the array directly, so the inconsistency with dates causes some errors.
Across versions, when the values were stored as objects, the two were one and the same, but from the code and printout above it looks like the dtype after creation changes as versions increase:
in 0.10.0, both dtSeries and dtDf create the column as object, so the operations are consistent whether using apply or iterating over .values.
in 0.10.1, Series still keeps them as object whereas DataFrame converts them into datetime64[ns], so it breaks when not using apply.
in 0.11.0, it seems the fix was to have Series also convert the dtype to datetime64[ns], but this works:
So how should I now think of .values?
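The difference described above can be reproduced with a small sketch. It is not the reporter's original script: the sample data is assumed, and it is written for a recent pandas on Python 3 instead of the versions in this thread. It only checks the dtype kind, since the exact datetime64 unit can vary between pandas releases.

```python
import datetime

import numpy as np
import pandas as pd

# Illustrative data (an assumption; the original test script is not shown here).
dt1d = [datetime.datetime(2013, 1, day) for day in range(1, 6)]

as_numpy = np.array(dt1d)    # plain NumPy keeps the Python datetime objects
as_series = pd.Series(dt1d)  # pandas converts them to a datetime64 dtype

numpy_kind = as_numpy.dtype.kind           # "O" means object
series_kind = as_series.values.dtype.kind  # "M" means datetime64

# Elements of the raw values array are numpy.datetime64 scalars,
# which have no .date() method, unlike datetime.datetime.
first_raw = as_series.values[0]
has_date_method = hasattr(first_raw, "date")

print(numpy_kind, series_kind, type(first_raw).__name__, has_date_method)
```

So np.array(dt1d) and pd.Series(dt1d).values are not interchangeable once pandas has converted the column, which is why a list comprehension calling x.date() on the raw values fails.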
The type conversions (started in 0.10.1) on datetime-like objects are meant to hide a lot of buggy numpy code (a lot has been fixed in 1.7.0, but pandas still accepts 1.6.2). Storing data as Trying to access types like So unless you have a really good reason, no reason to use
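One way to read this advice is to stay at the pandas level, where elements come back as Timestamp objects. The sketch below assumes a recent pandas: the .dt accessor was added in releases later than the ones discussed here, and the sample data is hypothetical.

```python
import datetime

import pandas as pd

# Hypothetical sample data standing in for the reporter's column.
dt_series = pd.Series(pd.date_range("2013-01-01", periods=5))

# apply() hands each element over as a pandas Timestamp, which has .date().
dates_via_apply = dt_series.apply(lambda ts: ts.date())

# In later pandas releases the .dt accessor does the same without a lambda.
dates_via_dt = dt_series.dt.date

print(list(dates_via_dt))
```

Both results are object-dtype Series holding datetime.date values, so neither route touches the raw datetime64 array.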
Following up: internally, different dtypes are stored separately, so when you ask for This is how numpy misbehaves (it's really just a printing issue, though)
Close this?
Yea please, thanks for the clarification, not online at the moment if you
I just upgraded from 0.10 to 0.10.1 to fix a previous issue I had, and I think I'm getting regressions or inconsistent behavior related to issue #2627.
The test below is a comparison of 0.10.0 and 0.10.1 under Windows 32-bit Python 2.7.2.
The output for 0.10.0 in ipython:
1.6.2
0.10.0
object
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
4 2013-01-05
[datetime.date(2013, 1, 1), datetime.date(2013, 1, 2), datetime.date(2013, 1, 3)
, datetime.date(2013, 1, 4), datetime.date(2013, 1, 5)]
0 object
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
4 2013-01-05
Name: 0
[datetime.date(2013, 1, 1), datetime.date(2013, 1, 2), datetime.date(2013, 1, 3)
, datetime.date(2013, 1, 4), datetime.date(2013, 1, 5)]
[datetime.date(2013, 1, 1), datetime.date(2013, 1, 2), datetime.date(2013, 1, 3)
, datetime.date(2013, 1, 4), datetime.date(2013, 1, 5)]
The output in 0.10.1:
1.6.2
0.10.1
object
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
4 2013-01-05
[datetime.date(2013, 1, 1), datetime.date(2013, 1, 2), datetime.date(2013, 1, 3)
, datetime.date(2013, 1, 4), datetime.date(2013, 1, 5)]
0 datetime64[ns]
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
4 2013-01-05
Name: 0
AttributeError Traceback (most recent call last)
in ()
18 print dtDf.icol(0).apply(lambda x: x.date())
19 #the following do not work on 0.10.1 but work on 0.10.0
---> 20 print [x.date() for x in dtDf.values[:,0]]
21 print [x.date() for x in dtDf.icol(0).values]
AttributeError: 'numpy.datetime64' object has no attribute 'date'
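If code really does need to iterate over the raw .values array, each numpy.datetime64 element can be wrapped back into a pandas object before calling .date(). This is a minimal sketch for a recent pandas, not a fix taken from this thread, and the frame below is assumed sample data.

```python
import datetime

import pandas as pd

# Assumed sample frame with a single datetime64 column labelled 0.
dt_df = pd.DataFrame({0: pd.date_range("2013-01-01", periods=5)})

raw = dt_df.values[:, 0]  # elements are numpy.datetime64, so x.date() would fail

# Option 1: wrap each element in a Timestamp, which does have .date().
dates_scalar = [pd.Timestamp(x).date() for x in raw]

# Option 2: wrap the whole array in a DatetimeIndex and use its .date attribute.
dates_vector = list(pd.DatetimeIndex(raw).date)

print(dates_scalar)
```

Both options yield a list of datetime.date objects, matching what the 0.10.0 list comprehension produced.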