I came across a case where setitem with a boolean mask raises an exception
if the frame has mixed dtypes.
This is easily relaxed in the int/float case (the int columns are left alone or upcast as needed).
In [68]: df
Out[68]:
0 1 2 3 4 y
35 NaN NaN NaN NaN 0.342153 0
40 NaN 0.326323 NaN NaN NaN 0
43 NaN NaN 0.290126 NaN NaN 0
49 NaN 0.326323 NaN NaN NaN 0
50 NaN 0.391147 NaN NaN NaN 1
In [75]: df.dtypes
Out[75]:
0 float64
1 float64
2 float64
3 float64
4 float64
y int64
This currently raises because the frame is mixed_type (this is easily fixed, and I think it should be,
since the IntBlock will upcast if needed).
In [72]: df[df>0.3] = 1
In [73]: df
Out[73]:
0 1 2 3 4 y
35 NaN NaN NaN NaN 1 0
40 NaN 1 NaN NaN NaN 0
43 NaN NaN 0.290126 NaN NaN 0
49 NaN 1 NaN NaN NaN 0
50 NaN 1 NaN NaN NaN 1
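For reference, a minimal sketch of the int/float behaviour described above, written against current pandas. It uses `DataFrame.mask` rather than the `df[mask] = value` setitem path this issue is about, and the small frame is made up for illustration, so treat it as an approximation of the intended result rather than the actual fix.

```python
import numpy as np
import pandas as pd

# Illustrative frame (not the one from the session above):
# one float column and one int column.
df = pd.DataFrame(
    {"a": [np.nan, 0.326323, 0.290126], "y": [0, 0, 1]},
    index=[35, 40, 43],
)

# NaN > 0.3 evaluates to False, so NaN cells are never selected.
mask = df > 0.3

# An integer replacement can be stored in both columns.
as_int = df.mask(mask, 1)

# A fractional replacement cannot be stored in int64, so "y" is upcast.
as_float = df.mask(mask, 1.5)

print(as_int)
print(as_float.dtypes)
```

The second call is the "upcast if needed" case: the int column only changes dtype when the replacement value does not fit.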
What about a mixed-type frame that involves non-numerics, though?
In [77]: df
Out[77]:
0 1 2 3 4 y foo
35 NaN NaN NaN NaN 0.342153 0 test
40 NaN 0.326323 NaN NaN NaN 0 test
43 NaN NaN 0.290126 NaN NaN 0 test
49 NaN 0.326323 NaN NaN NaN 0 test
50 NaN 0.391147 NaN NaN NaN 1 test
In [78]: df.get_dtype_counts()
Out[78]:
float64 5
int64 1
object 1
Should this raise, or just allow the non-numerics to 'work'?
I am leaning toward allowing a purely numeric frame to work (e.g. mixed int/float)
but raising in this last case (then it is explicit that you did something 'wrong').
Any opinions?
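The policy I am leaning toward could be sketched roughly as below. `masked_set_numeric` is a hypothetical helper name, not an existing pandas function, and it returns a new frame via `DataFrame.mask` instead of assigning in place; the frames are made up for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


def masked_set_numeric(df, mask, value):
    """Hypothetical helper: masked assignment for all-numeric frames only.

    Returns a copy of df with value placed where mask is True, and raises
    TypeError if any column is non-numeric.
    """
    non_numeric = [
        col for col, dtype in df.dtypes.items() if not is_numeric_dtype(dtype)
    ]
    if non_numeric:
        raise TypeError(f"non-numeric columns not supported: {non_numeric}")
    return df.mask(mask, value)


# Mixed int/float frame: allowed.
numeric = pd.DataFrame({"a": [np.nan, 0.326323], "y": [0, 1]})
print(masked_set_numeric(numeric, numeric > 0.3, 1))

# Adding a string column makes the call raise explicitly.
mixed = numeric.assign(foo="test")
try:
    masked_set_numeric(mixed, numeric > 0.3, 1)
except TypeError as exc:
    print("raised:", exc)
```

Raising up front keeps the failure explicit instead of silently skipping the object column.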
Note that the getitem case works on mixed frames:
In [80]: df[df>0.3]
Out[80]:
0 1 2 3 4 y foo
35 NaN NaN NaN NaN 0.3421533 NaN test
40 NaN 0.3263232 NaN NaN NaN NaN test
43 NaN NaN NaN NaN NaN NaN test
49 NaN 0.3263232 NaN NaN NaN NaN test
50 NaN 0.3911472 NaN NaN NaN 1 test
This would also preclude a pathological case (maybe this is another bug) where
you can fillna this and it doesn't convert to float64, so it 'looks' numeric but actually isn't.
Note: I am letting this go through (i.e. only trying the numeric case if the mixed-type path fails), more
for backward compatibility than anything else.