Series.unique converts uint64 to int64 (with overflow) #14721
Comments
this is a symptom of #4471 in general |
@jreback : IIUC, your statement is incorrect. The bug traces back as follows: When you call The correct patch I think is to change the check to The easy way out but less desirable way I think is to modify this branch here and convert the resulting array back to the original dtype. Thoughts? |
@gfyoung I was pointing to a 'generic' issue with uint64. it doesn't work in lots of places! |
@jreback : Ah, fair enough. But thoughts about this? |
hmm |
@jreback : If someone knows how to navigate through that swamp of hashtable code, then by all means, patch it! 😄 But yes, it's just a matter of adding another hashtable class. |
you can use python hashing, no? |
@jreback : Perhaps, but then what's going on with this entire hashtable class? Why was it implemented that way if Python hashing is feasible? |
prob not that hard to write some klib type stuff in cython |
klib is >> faster than python for the types of things we do (and doesn't have as much overhead for pyobjects) |
or just add to the klib stuff (might be easier) |
@jreback : I think adding to the klib is the best solution, but I don't really understand how all of it works. |
me neither :< |
it's just template stuff in a header-only library |
you can prob copy int64 and just do a replace. (maybe) |
@jreback : True that file is, but I have no idea how it gets populated. |
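For readers following the thread, here is a minimal pure-Python sketch of why routing `uint64` values through an `int64`-keyed table loses information, and why a separate unsigned table avoids it. The function names are hypothetical and the dict-based tables only stand in for the real klib-backed classes; this is not the pandas implementation.

```python
import struct


def as_int64(u):
    """Reinterpret the 8 bytes of an unsigned 64-bit value as a signed one."""
    return struct.unpack("<q", struct.pack("<Q", u))[0]


def unique_via_int64_table(values):
    # Hypothetical stand-in for an int64-keyed hash table: every key is
    # forced into the signed range before it is stored.
    seen = {}
    for v in values:
        seen.setdefault(as_int64(v), None)
    return list(seen)


def unique_via_uint64_table(values):
    # Hypothetical stand-in for a uint64-keyed hash table: keys are stored
    # unchanged, so large values survive.
    seen = {}
    for v in values:
        seen.setdefault(v, None)
    return list(seen)


values = [2**63, 1, 2**63, 2**64 - 1]
print(unique_via_int64_table(values))   # [-9223372036854775808, 1, -1]
print(unique_via_uint64_table(values))  # [9223372036854775808, 1, 18446744073709551615]
```

The signed path still deduplicates correctly, but any value at or above 2**63 comes back negative, which matches the overflow described in the issue title.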
Introduces a UInt64HashTable class to hash uint64 elements and prevent overflow. Closes gh-14721.
Introduces a UInt64HashTable class to hash uint64 elements and prevent overflow in functions like Series.unique. Closes gh-14721.
Introduces a `UInt64HashTable` class to hash `uint64` elements and prevent overflow in functions like `Series.unique`. Closes pandas-dev#14721.
Author: gfyoung <gfyoung17@gmail.com>
Closes pandas-dev#14915 from gfyoung/uint64-hashtable-patch and squashes the following commits:
380c580 [gfyoung] BUG: Prevent uint64 overflow in Series.unique
Series.unique should preserve whatever type the values are. But instead it changes from uint64 to int64.
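The reporter's original reproduction snippet is not preserved in this text, so the following is only a sketch of the behaviour described, with illustrative values chosen to sit just above the `int64` range. On the affected version (0.17.1 per the report) `unique()` is said to come back as `int64`; on versions that include the fix it should stay `uint64`.

```python
import numpy as np
import pandas as pd

big = 2**63  # smallest value that fits in uint64 but not in int64

s = pd.Series([big, 1, big], dtype=np.uint64)
uniques = s.unique()

# Expected behaviour: the dtype of the result matches the dtype of the input.
print(uniques.dtype)
print(uniques)

# What the reported bug amounts to: the same bits read as a signed integer.
wrapped = np.array([big], dtype=np.uint64).astype(np.int64)
print(wrapped)
```

The last line shows the wraparound explicitly with NumPy's own cast, independent of any pandas version.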
INSTALLED VERSIONS
...
python: 3.4.3.final.0
python-bits: 64
...
...
pandas: 0.17.1
numpy: 1.10.4
...