-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Various issues with maybe_convert_objects #2845
Comments
Please provide any feedback on commit 0491971 if you can; will create a PR (with appropriate tests for the new behavior) if there are no objections to the changes. All tests pass (at least on linux 32-bit) |
i would agree with all of these, except for the mixed datetimes-bools...admitedly that is weird, but its essentially a mixed case, so should just be object (and not try to guess), in any event the user could always coerce this case via last one (casting with None) is an issue I think, maybe search and see if you can find it? essentially, I think the philosophy (someone correct me if this is not right!), is that a homogenous array should be typed correctly (and not object), while mixed should be left (be it as float or object or whatever). The user then can coerce explicity via |
Found another one: Booleans and datetimes lost when mixed with
|
@jreback what do you mean w.r.t to mixed datetimes-bools? right now it's taking bools (and reading uninitialized memory by doing so)--I've fixed it to keep the original objects instead |
no...i mean your last 2 exaamples are inherently mixed (e.g. floats and bools)....so result should be the same as input, e.g. it should not change from object type only if ALL are that type (or special cases of ahh...just reread your comments...yes if it actually tries to do something on a mixed-type that is wrong |
ok, unless i'm confused, i think we agree then? in all the mixed cases presented my new code just preserves the existing objects. |
yes! essentially |
somewhat related is #2798
|
Actually I noticed that and made a change to address it locally but it broke a test so I reverted it. It was HTML serialization/deserialization round-trip: basically, it made "nan" show up instead of a blank string in the output where it was expected before. I suspect it probably affects a lot of other unserialization/serialization, too, but it's just not tested thoroughly in the test suite. I don't know what the proper thing to do here is: if you had a bunch of string objects, for example, except that some of them were EDIT: I guess you could change the behavior so that the output is presumed to be numeric unless given something non-numeric and non- |
yep....not really sure on these edge cases.... what should definitly do though: if it is all numeric, None -> np.nan bool is a harder case....I don't think None makes sense, but the concept of missing values.....so its tricky now... essentially these other cases are now left as object.... maybe_convert_objects should't be too much more complicated I think.....can make more handling on a higher level (in _possibily_convert_objects).....as can have more options on what to do |
BUG: Various issues with maybe_convert_objects (GH #2845) thanks!
closed via #2846 |
Found some issues with
lib.maybe_convert_objects
(which is an internal library function, but the same problems as below occur when callingDataFrame({'a': [...]})
, for example)(Also, mostly unrelated to the below, I noticed the inference loop continues for no reason after seeing an object, when it can safely break because the output is a forgone conclusion: added the breaks to make it slightly more efficient.)
Fails on python long integer outside
np.int64
range:Probably should leave the integer unchanged as a long integer object
Datetimes and complexes lost completely when mixed with booleans:
It's not even doing a cast: it's reading uninitialized memory as boolean
None
converts tonp.nan
with floats, but not complexes:safe
option preserves integer types withnp.nan
but notNone
:The comment for
safe
says "don't cast int to float, etc." but that is not true withNone
Not entirely sure if this last one is intended behavior or not; let me know if there is any feedback.
The text was updated successfully, but these errors were encountered: