Nullable compatibility and coordination #93
Comments
This sounds great. I'll keep an eye on this issue, and I'll take a look at the
Hey @tshort, have you seen https://github.com/nalimilan/CategoricalArrays.jl? Just wondering how the PooledElements.jl package compares in approach (since it sounds like @nalimilan has taken a very similar approach).
@quinnj, I have seen @nalimilan's package. We both used code from @johnmyleswhite as a starting point. My package is on hold until the dust settles with the integration of NullableArrays into DataFrames. I'm hopeful that CategoricalArrays will meet my needs, and I won't need PooledElements. One area of difference is that my
AFAIK the major differences between our packages are (@tshort, correct me if I'm wrong):
That's pretty accurate, @nalimilan. PE.jl does support pooling items other than strings, but that part isn't well tested or fleshed out.
I've drafted a package for pooled elements at the following link. The main purpose of this package is to speed up grouping and joining in DataFrames. If this is used in DataFrames, it will also reduce the use of PooledDataArrays in DataFrames.
https://github.com/tshort/PooledElements.jl
Pooled elements and arrays use an integer or integer array to reference a pool of values. This is similar to categorical data. In PooledElements.jl, I've used an integer reference of zero as a null value. Like NullableArrays, each element is type stable. I've tried to replicate the API from NullableArrays and Base.
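To make the idea concrete, here is a minimal sketch of an element that stores an integer reference into a shared pool, with zero reserved as the null reference. It is written for current Julia and is not the PooledElements.jl API: the names `Pool`, `getref!`, `PooledElement`, `isnullref`, and `value` are all hypothetical and chosen only for illustration.

```julia
# Hypothetical sketch: none of these names come from PooledElements.jl.

# A pool of unique values plus a reverse lookup from value to reference.
struct Pool{T}
    values::Vector{T}        # unique values; reference i points at values[i]
    index::Dict{T,UInt32}    # value => reference
end
Pool{T}() where {T} = Pool{T}(T[], Dict{T,UInt32}())

# Return the reference for `x`, adding `x` to the pool if it is new.
function getref!(pool::Pool{T}, x::T) where {T}
    get!(pool.index, x) do
        push!(pool.values, x)
        UInt32(length(pool.values))
    end
end

# A single element: always the same concrete type, null or not.
struct PooledElement{T}
    ref::UInt32              # 0 is reserved for null
    pool::Pool{T}
end

isnullref(e::PooledElement) = e.ref == 0

# Look up the value, refusing to dereference a null.
function value(e::PooledElement)
    isnullref(e) && throw(ArgumentError("element is null"))
    return e.pool.values[e.ref]
end

pool = Pool{String}()
a = PooledElement(getref!(pool, "x"), pool)
b = PooledElement(getref!(pool, "x"), pool)
c = PooledElement(getref!(pool, "y"), pool)
n = PooledElement(UInt32(0), pool)
```

In this sketch, `a` and `b` share a reference, so comparing them within one pool reduces to an integer comparison. That is the property that would help grouping and joining.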
I'm starting this issue to make sure we coordinate. Some areas of coordination include:
anynull method here. The way it was written, it wouldn't work with AbstractArrays filled with PooledElements.
I'm not sure exactly how to do it, but it might be good to have a trait that indicates whether an AbstractArray supports nulls. Then, it might be easier to support operations on Nullables in arrays for multiple array types.
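One possible shape for such a trait, sketched in current Julia, is shown below. Nothing here is taken from NullableArrays or PooledElements.jl: `NullSupport`, `SupportsNulls`, `NoNulls`, `RefArray`, `elementisnull`, and `anynull_sketch` are hypothetical names, and `RefArray` is only a toy array that treats a zero reference as null.

```julia
# Hypothetical trait sketch: these names do not exist in any package.
abstract type NullSupport end
struct SupportsNulls <: NullSupport end
struct NoNulls <: NullSupport end

# Default: assume an arbitrary AbstractArray cannot hold nulls.
NullSupport(::Type{<:AbstractArray}) = NoNulls()

# Toy array of integer references in which 0 stands for null.
struct RefArray <: AbstractVector{Int}
    refs::Vector{Int}
end
Base.size(a::RefArray) = size(a.refs)
Base.getindex(a::RefArray, i::Int) = a.refs[i]
Base.IndexStyle(::Type{RefArray}) = IndexLinear()

# Opt in to the trait and say how to test a single position.
NullSupport(::Type{RefArray}) = SupportsNulls()
elementisnull(a::RefArray, i) = a.refs[i] == 0

# Generic entry point: dispatch on the trait, not on the array type.
anynull_sketch(a::AbstractArray) = anynull_sketch(NullSupport(typeof(a)), a)
anynull_sketch(::NoNulls, a) = false
anynull_sketch(::SupportsNulls, a) = any(i -> elementisnull(a, i), eachindex(a))

with_null = RefArray([1, 0, 2])
without_null = RefArray([1, 2])
plain = [1, 0, 2]    # an ordinary Vector: the zero is just a number here
```

With this layout, a new array type would only need to declare `SupportsNulls()` and define `elementisnull` to be picked up by the generic method, which is the kind of coordination the trait is meant to enable.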