str.contains - returns series of zeroes instead of series of bools when all values are NaNs. #9184
Comments
INSTALLED VERSIONS: commit: None, pandas: 0.15.0 |
hmm. The behavior of Thoughts? |
I agree, |
IMO cat and str shouldn't even be attributes on a series unless it has the proper dtype. For example a float64 column should throw an attribute error on str access. This can be achieved by overriding getattr behavior. I also don't think an all nan column of float64 is ambiguous. @shoyer can you elaborate on why you think that's ambiguous? |
.cat does this now but .str is still the original code |
What I'm saying is that it shouldn't even show up in tab completion, just like if you did s.blarg |
hmm it should be taken out of local_dir then |
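A minimal sketch of the idea discussed above (exposing a string namespace only when the data is string-like, and hiding it from tab completion otherwise). It uses plain Python rather than pandas internals: `ToySeries`, `ToyStringMethods` and `_holds_strings` are hypothetical names invented for illustration, and this is not how pandas implements `.str`.

```python
class ToyStringMethods:
    """Hypothetical stand-in for a string-method namespace."""

    def __init__(self, values):
        self._values = values

    def contains(self, pat):
        # Plain substring test on each element.
        return [pat in v for v in self._values]


class ToySeries:
    """Hypothetical container; not the pandas Series class."""

    def __init__(self, values):
        self.values = list(values)

    def _holds_strings(self):
        # Assumption: "string-like" means non-empty and every element is a str.
        return bool(self.values) and all(isinstance(v, str) for v in self.values)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name == "str" and self._holds_strings():
            return ToyStringMethods(self.values)
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )

    def __dir__(self):
        # Advertise "str" to dir() and tab completion only when it is usable.
        names = list(super().__dir__())
        if self._holds_strings():
            names.append("str")
        return names


text = ToySeries(["ab", "cd"])
floats = ToySeries([1.0, 2.0])

print(text.str.contains("a"))   # works: the data is string-like
print(hasattr(floats, "str"))   # the accessor is not available here
print("str" in dir(floats))     # and it is not listed by dir()
```

With this approach, accessing `floats.str` fails the same way `floats.blarg` would, which is the behavior asked for in the comments above.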
@cpcloud the all nan column of float64 is somewhat ambiguous only because pandas presumes that all NA lists, for example, should be floats. For example, consider the original example here:
We don't have any clues for the type of column That said... this is an edge case that probably isn't worth worrying about. Likely only expert users even realize that we use |
We should patch pandas to send anonymous usage statistics. Would help answer the frequency of edge cases :/ |
@shoyer presuming an all-NA list is float-like has been long-standing, and is the most likely case. Agreed that the user would have to explicitly specify another dtype. Ok, so this issue is one of fixing the visibility of |
I fully agree that But I don't know if it is worth the effort to also have it not visible in the Series namespace. Still seeing it also on non-string series can make people more aware of its existence, and maybe remind them they can make it strings to use that function. |
Is it feasible to convert series to string type (i.e. apply "astype(str)") internally when calling |
Applying to string column - produces correct result:
Applying to float column - returns zeroes instead of bools and return type is float64:
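The two cases can be sketched as follows. This is an illustrative reconstruction, not the reporter's original snippet: the sample values and variable names are assumptions, and it targets a recent pandas release, where `.str` on a float64 column is rejected rather than returning floats as reported for 0.15.0. Passing `na=False` to `str.contains` is one way to get a boolean result even when values are missing.

```python
import numpy as np
import pandas as pd

# String column with a missing value: na=False maps the NaN to False,
# so the result contains only booleans.
strings = pd.Series(["a", np.nan, "b"])
string_result = strings.str.contains("a", na=False)
print(string_result.tolist())

# An all-NaN list carries no type information, so pandas infers float64.
all_nan = pd.Series([np.nan, np.nan, np.nan])
print(all_nan.dtype)

# Declaring the column as object tells pandas it may hold strings;
# with na=False every missing entry becomes False.
all_nan_obj = all_nan.astype(object)
nan_result = all_nan_obj.str.contains("a", na=False)
print(nan_result.tolist())
```

The explicit `astype(object)` reflects the point made in the comments: an all-NaN column defaults to float, so the user has to state another dtype for it to be treated as text.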