This repository has been archived by the owner on May 4, 2019. It is now read-only.
I'm not thrilled about the current implementation of find(X::NullableArray{Bool}); it seems inefficient to loop through everything twice:
function Base.find(X::NullableArray{Bool}) # -> Array{Int}
    ntrue = 0
    @inbounds for (i, isnull) in enumerate(X.isnull)
        ntrue += !isnull && X.values[i]
    end
    target = Array(Int, ntrue)
    ind = 1
    @inbounds for (i, isnull) in enumerate(X.isnull)
        if !isnull && X.values[i]
            target[ind] = i
            ind += 1
        end
    end
    return target
end
I coded an example alternative, which is faster but uses more memory -- I assume due to the allocation of a larger Array:
function f(X)
    res = Array(Int, length(X))
    ind = 1
    @inbounds for i in eachindex(X)
        !X.isnull[i] && X.values[i] ? (res[ind] = i; ind += 1) : nothing
    end
    resize!(res, ind-1)
    return res
end
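For anyone who wants to try the two strategies side by side on a current Julia version, here is a rough sketch. It is not the package code: `vals` and `nulls` are plain vectors standing in for the `values` and `isnull` fields of a `NullableArray{Bool}`, the names `find_twopass` and `find_onepass` are made up for illustration, and `Vector{Int}(undef, n)` replaces the older `Array(Int, n)` constructor.

```julia
# Stand-ins for the `values` and `isnull` fields of a NullableArray{Bool};
# using plain vectors is an assumption so the sketch runs without the package.

# Two passes: count the matches first, then fill an exactly sized result.
function find_twopass(vals::Vector{Bool}, nulls::Vector{Bool})
    ntrue = 0
    @inbounds for i in eachindex(vals, nulls)
        ntrue += !nulls[i] && vals[i]
    end
    target = Vector{Int}(undef, ntrue)
    ind = 1
    @inbounds for i in eachindex(vals, nulls)
        if !nulls[i] && vals[i]
            target[ind] = i
            ind += 1
        end
    end
    return target
end

# One pass: allocate for the worst case, then shrink to the number of matches.
function find_onepass(vals::Vector{Bool}, nulls::Vector{Bool})
    res = Vector{Int}(undef, length(vals))
    ind = 1
    @inbounds for i in eachindex(vals, nulls)
        if !nulls[i] && vals[i]
            res[ind] = i
            ind += 1
        end
    end
    resize!(res, ind - 1)
    return res
end

vals  = [true, false, true, true, false]
nulls = [false, false, true, false, false]

find_twopass(vals, nulls) == find_onepass(vals, nulls)  # both give [1, 4]
```

Wrapping each call in `@time` or `@allocated` (after one warm-up call, so compilation is not counted) should show the trade-off on larger inputs; the actual numbers will depend on the machine and on how many entries are true.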
Actually, I think it's worth discussing, if briefly, whether or not this is the desired default behavior for find(::NullableArray). The extension of isnan that we provide returns a NullableArray{Bool} and reflects the positions of null entries in the argument NullableArray. I wonder if that should be default behavior for find methods as well and if we should then provide a skipnull kwarg.
It looks like the implementation in Base for StridedArray also goes over the data twice. It sounds reasonable to do the same for NullableArray, as allocating a vector of the same size as the input can be really wasteful.
Regarding the behaviour of find, I don't really see what you're suggesting. isnan can return NULL for missing elements. But it wouldn't make much sense to insert a NULL for each missing element in the array passed to find. Or would you simply want to raise an error? This would be consistent with how other functions (like sum) currently behave.
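To make the two options concrete, here is one possible shape for "raise an error by default, skip nulls on request". This is only a sketch: `find_checked` is a hypothetical helper, not anything in NullableArrays, and `vals` and `nulls` are again plain vectors standing in for the `values` and `isnull` fields. The `skipnull` keyword is the one floated earlier in the thread.

```julia
# Hypothetical helper, not part of NullableArrays: `vals` and `nulls` stand in
# for the `values` and `isnull` fields of a NullableArray{Bool}.
function find_checked(vals::Vector{Bool}, nulls::Vector{Bool}; skipnull::Bool=false)
    # Without skipnull, refuse to guess what a null entry should mean.
    if !skipnull && any(nulls)
        throw(ArgumentError("input contains null entries; pass skipnull=true to ignore them"))
    end
    # With skipnull (or no nulls at all), return the indices of non-null true entries.
    return [i for i in eachindex(vals, nulls) if !nulls[i] && vals[i]]
end

vals  = [true, false, true, true]
nulls = [false, false, true, false]

find_checked(vals, nulls; skipnull=true)   # [1, 4]
# find_checked(vals, nulls) throws an ArgumentError because entry 3 is null
```

Whether the default should be to throw or to skip is exactly the open question here; the sketch just shows that either default is a one-line change.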
(After warm-up)
Thoughts anybody?