applymap and misc/float columns #2909

Closed
jankatins opened this issue Feb 21, 2013 · 7 comments

@jankatins
Contributor

This works as expected in latest pandas:

> df = pandas.DataFrame(data=[1,"a"])
> print(df)
   0
0  1
1  a
> print(df.applymap(lambda x: x))
   0
0  1
1  a

This does not (note the float "1." instead of "1"):

> df = pandas.DataFrame(data=[1.,"a"])
> print(df)
   0
0  1
1  a
> print(df.applymap(lambda x: x))
     0
0   1
1 NaN

This is a problem in statsmodels/statsmodels#636

pandas.__version__ -> '0.11.0.dev-14a04dd'

@jreback
Contributor

jreback commented Feb 21, 2013

Print the resulting dtypes of both: in the second case you are seeing a conversion to float64.
(The first is not converted because it holds an int, so the entire column stays object.)

In [13]: df.applymap(lambda x: x).dtypes
Out[13]: 
0    float64
Dtype: object

In [14]: df1.applymap(lambda x: x).dtypes
Out[14]: 
0    object
Dtype: object

Here is what applymap does (with conversion turned off, convert=0):

z = df1.apply(lambda x: pd.lib.map_infer(x, lambda y: y, convert=0))

Out[18]: 
   0
0  1
1  a

z.dtypes
Out[19]: 
0    object
Dtype: object

@jankatins
Contributor Author

Is that an intended API change? This used to work (i.e. the dtype was not changed) in earlier versions, since the statsmodels branch worked back then.

@jreback
Contributor

jreback commented Feb 21, 2013

do you know what version did it work on?

@jankatins
Contributor Author

I remember using this branch around Christmas/New Year. This is a test with commit 1c32ebf:

In [1]: import pandas

In [2]: pandas.__version__
Out[2]: '0.10.0'

In [3]: df = pandas.DataFrame(data=[1.,"a"])

In [4]: print(df)
   0
0  1
1  a

In [5]: print(df.dtypes)
0    object

In [6]: print(df.applymap(lambda x: x))
   0
0  1
1  a

In [7]: print(df.applymap(lambda x: x).dtypes)
0    object

@jreback
Contributor

jreback commented Feb 21, 2013

This was a bug.

FYI: convert_objects, when passed convert_numeric=True, does exactly this: it tries to convert anything number-like to an appropriate dtype and sets everything else to NaN, a kind of forced conversion.

The default is now False (I had accidentally left it as True); this is meant to be a user-initiated action.

This is new in 0.11-dev.
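
For readers following along, here is a rough plain-Python sketch of the forced conversion described above, for the float case from the report. It is only an illustration and not the pandas implementation: `coerce_numeric` and `map_column` are hypothetical helper names, and the `convert_numeric` flag merely mimics the keyword mentioned in this comment.

```python
import math


def coerce_numeric(values):
    """Force conversion: number-like values become floats, the rest NaN."""
    out = []
    for v in values:
        try:
            out.append(float(v))
        except (TypeError, ValueError):
            # Anything that cannot be read as a number is replaced by NaN.
            out.append(math.nan)
    return out


def map_column(values, func, convert_numeric=False):
    """Apply func to every element; coerce only if explicitly requested.

    Hypothetical stand-in for one column of an elementwise map.
    """
    mapped = [func(v) for v in values]
    return coerce_numeric(mapped) if convert_numeric else mapped


column = [1.0, "a"]  # the mixed float/string column from the report

# Default (no coercion): the identity map leaves the values untouched.
kept = map_column(column, lambda x: x)

# Coercion switched on: the string is lost and becomes NaN.
forced = map_column(column, lambda x: x, convert_numeric=True)

print(kept)
print(forced)
```

With the flag left at its default the identity map returns the column unchanged, which matches the 0.10.0 behaviour shown earlier in the thread. Turning it on reproduces the NaN seen in the report.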

@jankatins
Contributor Author

Thanks! I can confirm that applying this commit fixes the problem in statsmodels!

@jreback
Contributor

jreback commented Feb 21, 2013

great.....will get around to merging soon

jreback added a commit that referenced this issue Feb 22, 2013
jreback added a commit that referenced this issue Feb 23, 2013
BUG: incorrect default in df.convert_objects was converting object types (#2909)