Make StructArrays broadcast aware #136
While allowing broadcasting to return a StructArray, this limits it to cases where:
- no other arrays in the broadcast operation, including those wrapped by the StructArray, have non-default BroadcastStyle
- the eltype returned from the function is a struct type

It should be straightforward to define precedence rules to handle other cases, e.g., StructArrays of CuArrays.
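As a rough illustration of the second condition, here is a Base-only sketch of which element types would qualify. It assumes the check is Base's `isstructtype` predicate (the one used in the `similar` suggestion further down this thread); the `Point` type is hypothetical and exists only for this example.

```julia
# Hypothetical struct, for illustration only.
struct Point
    x::Float64
    y::Float64
end

# Struct types: candidates for a StructArray result under the rule above.
complex_ok = isstructtype(ComplexF32)
point_ok = isstructtype(Point)

# Primitive types such as Float32 are not struct types.
float_ok = isstructtype(Float32)
```

Under that assumption, a broadcast whose function returns `Float32` would fall outside the StructArray case, while one returning `ComplexF32` or `Point` would fall inside it.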
CC @Keno |
Thank you! Does this address #90 (comment)? With the previous PR there were some concerns in case the individual arrays composing the
From the point of view of semver, is it breaking to change the return type (but not the content) of broadcast? |
I added another test, plus a fallback that seems like a reasonable default for most customizations. |
I never know what to say about that... I would say better safe than sorry, and make this 0.5 (or 1.0 if you think most other things are solid). |
That's not quite the issue that I had in mind. What I'm thinking is that it could be useful, if you are working on the GPU with say an array of complex numbers, to store it as a
Given that the CPU case behaves like this (with this PR):

julia> using StructArrays
julia> c = rand(Float32, 100);
julia> s = StructArray{ComplexF32}((c, c));
julia> s .+ s
100-element StructArray(::Array{Float32,1}, ::Array{Float32,1}) with eltype Complex{Float32}:

I imagine that the CUDA case should behave in a similar way:

julia> using StructArrays, CUDA
julia> c = CUDA.rand(100);
julia> s = StructArray{ComplexF32}((c, c));
julia> s .+ s
100-element StructArray(::CuArray{Float32,1,Nothing}, ::CuArray{Float32,1,Nothing}) with eltype Complex{Float32}:

Instead, with this PR

julia> s .+ s
ERROR: MethodError: no method matching similar(::Base.Broadcast.Broadcasted{StructArrays.StructArrayStyle{CUDA.CuArrayStyle{1}},Tuple{Base.OneTo{Int64}},typeof(+),Tuple{StructArray{Complex{Float32},1,NamedTuple{(:re, :im),Tuple{CuArray{Float32,1,Nothing},CuArray{Float32,1,Nothing}}},Int64},StructArray{Complex{Float32},1,NamedTuple{(:re, :im),Tuple{CuArray{Float32,1,Nothing},CuArray{Float32,1,Nothing}}},Int64}}}, ::Type{Complex{Float32}}, ::Tuple{Base.OneTo{Int64}})
Closest candidates are:
similar(::AbstractArray, ::Type{T}, ::Tuple{Union{Integer, Base.OneTo},Vararg{Union{Integer, Base.OneTo},N} where N}) where T at abstractarray.jl:634
similar(::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{N},Axes,F,Args} where Args<:Tuple where F where Axes, ::Type{ElType}, ::Any) where {N, ElType} at broadcast.jl:197
similar(::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayConflict,Axes,F,Args} where Args<:Tuple where F where Axes, ::Type{ElType}, ::Any) where ElType at broadcast.jl:202
...
Stacktrace:
[1] similar(::Base.Broadcast.Broadcasted{StructArrays.StructArrayStyle{CUDA.CuArrayStyle{1}},Tuple{Base.OneTo{Int64}},typeof(+),Tuple{StructArray{Complex{Float32},1,NamedTuple{(:re, :im),Tuple{CuArray{Float32,1,Nothing},CuArray{Float32,1,Nothing}}},Int64},StructArray{Complex{Float32},1,NamedTuple{(:re, :im),Tuple{CuArray{Float32,1,Nothing},CuArray{Float32,1,Nothing}}},Int64}}}, ::Type{Complex{Float32}}) at ./broadcast.jl:196
[2] copy at ./broadcast.jl:862 [inlined]
[3] materialize(::Base.Broadcast.Broadcasted{StructArrays.StructArrayStyle{CUDA.CuArrayStyle{1}},Nothing,typeof(+),Tuple{StructArray{Complex{Float32},1,NamedTuple{(:re, :im),Tuple{CuArray{Float32,1,Nothing},CuArray{Float32,1,Nothing}}},Int64},StructArray{Complex{Float32},1,NamedTuple{(:re, :im),Tuple{CuArray{Float32,1,Nothing},CuArray{Float32,1,Nothing}}},Int64}}}) at ./broadcast.jl:837
[4] top-level scope at REPL[17]:1

I was wondering if there was a smart way to get it to do the right thing, but I guess that's a bit complex in general. The case where the columns are
Pinging @vchuravy, as he originally pointed out the issue. |
The more I think about it, the less I'm sure we should try to come up with good fallbacks. So I just pushed another commit that doubles down on that MethodError. In many ways a MethodError is good: the error message tells you exactly what method is needed, and at that point it's up to you to decide what result type that method should produce. In contrast, something that silently succeeds (but returns the wrong type) can be confusing. But you have to decide if that brittleness is worth it. If you don't like the brittleness, you probably want the first two commits, but you won't get a StructArray output.
You mention the OffsetArrays package, but it has no custom broadcasting. Everything gets communicated by the type of the
It's somewhere between challenging and impossible to solve correctly in general, given the current interface and the technology available in this package. Most of your
To fix the specific

Base.similar(bc::Broadcasted{StructArrayStyle{S}}, ::Type{ElType}) where {S<:CuArrayStyle,N,ElType} =
    isstructtype(ElType) ? _structarray(CuArray, ElType, axes(bc)) : similar(CuArray{ElType}, axes(bc))

However, unless I'm missing something that

_structarray(::Type{ArrayConstructor}, ::Type{ElType}, axs) where {ArrayConstructor, ElType} = ?

It should build a |
Let me know whether you want just the first, the first two, or all three of the commits here. With commit 1 or 1-3 it's deliberately brittle. With commits 1-2, it has "sensible" fallbacks but these do the wrong thing in the case of the |
Ah, I'm starting to understand - most of this broadcast machinery is above my pay grade :) Then, I would agree that it's better to not have the fallback and figure out in the future where / how to add the code to make the GPU case work. That is, I'm happy to merge all three commits, possibly squashing them into one when merging. I think the
I have a final concern, but I guess it could be OK in practice. For some structs, getting them into a

Problem 1: they use inner constructors. This way, StructArrays has no way to guess how to build the instance from the type and the field values (that is to say, maybe there is a way, but I don't know it).

julia> c = rand(100);
julia> s = StructArray{ComplexF64}((c, c));
julia> struct B
x::Float64
y::Float64
B(x) = new(x, x)
end
julia> broadcast(z -> B(abs(z)), s) # here we fail on `getindex` trying to create instances
100-element StructArray(::Array{Float64,1}, ::Array{Float64,1}) with eltype B:
Error showing value of type StructArray{B,1,NamedTuple{(:x, :y),Tuple{Array{Float64,1},Array{Float64,1}}},Int64}:
ERROR: MethodError: no method matching B(::Float64, ::Float64)

Would you know, given a struct type

Problem 2: they customize |
Oh, good. Then this is very close to feasible, and a one-line addition to some CUDA package would solve @vchuravy's issue. For the other problems you mention...without digging into detail I'm not certain I know what it is you need, but it seems likely that bypassing the constructors is the way to go. You can see an example in the Serialization stdlib: |
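The Serialization example referred to above is not reproduced here. As a rough sketch of what "bypassing the constructors" can look like, the snippet below builds an instance of a struct with only an inner constructor directly from its field values. It assumes the internal runtime function `jl_new_structv`, which is not public API and may change between Julia versions; the helper name `construct_from_fields` is hypothetical.

```julia
# Same shape as the struct from Problem 1: only a one-argument inner constructor.
struct B
    x::Float64
    y::Float64
    B(x) = new(x, x)
end

# Hypothetical helper: build a `T` from its field values without calling any
# constructor. Relies on the internal runtime function `jl_new_structv`
# (an assumption: not public API, use with care).
function construct_from_fields(::Type{T}, fields...) where {T}
    vals = Any[fields...]
    return ccall(:jl_new_structv, Any, (Any, Ptr{Any}, UInt32),
                 T, vals, length(vals))::T
end

# `B(1.0, 2.0)` would be a MethodError, but this goes around the constructor.
b = construct_from_fields(B, 1.0, 2.0)
```

Note that skipping the constructor also skips any invariants it enforces, so whether this is acceptable depends on the struct.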
I'm about to tag a release with this new feature. The news entry links to this PR, so I'll just comment here that to fix the method error for a specific array type, one should do things like:

julia> using Base.Broadcast: ArrayStyle, Broadcasted
julia> using StructArrays: StructArrayStyle
julia> Base.similar(bc::Broadcasted{StructArrayStyle{S}}, ::Type{ElType}) where {S<:ArrayStyle{MyArray},N,ElType} =
           similar(MyArray{ElType}, axes(bc))

where of course
I have a fuzzy intuition that there could be a general solution based on Adapt.jl (which is the common CUDA / StructArrays dependency), possibly by creating new "stubs" there that custom array packages (with custom broadcast) could extend. As adding fallbacks is non-breaking, we can do that after the release (when we have more practical use cases). |
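For readers less familiar with the pieces used in that snippet, `ArrayStyle{MyArray}` and the two-argument `similar` on a `Broadcasted` both come from Base's broadcast interface. Below is a minimal Base-only sketch, without StructArrays, of what such a `similar` method does for a plain custom array. The `MyArray` wrapper here is hypothetical and only illustrates the mechanism; it is not the StructArrays-specific method above.

```julia
using Base.Broadcast: ArrayStyle, Broadcasted

# Hypothetical wrapper array, used only for illustration.
struct MyArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end

# Minimal AbstractArray interface.
Base.size(A::MyArray) = size(A.data)
Base.getindex(A::MyArray{T,N}, inds::Vararg{Int,N}) where {T,N} = A.data[inds...]
Base.setindex!(A::MyArray{T,N}, v, inds::Vararg{Int,N}) where {T,N} = (A.data[inds...] = v)

# Opt in to a custom broadcast style tagged by the array type.
Base.BroadcastStyle(::Type{<:MyArray}) = ArrayStyle{MyArray}()

# Allocate the broadcast result as a MyArray instead of a plain Array.
Base.similar(bc::Broadcasted{ArrayStyle{MyArray}}, ::Type{ElType}) where {ElType} =
    MyArray(similar(Array{ElType}, axes(bc)))

a = MyArray([1.0, 2.0, 3.0])
b = a .+ 1   # the output container comes from the `similar` method above
```

The method in the comment above plays the same role, but for the case where the broadcast style is a `StructArrayStyle` wrapping the custom array's style.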
Continues #90. This is more conservative, returning a StructArray only if none of the other participating arrays (including the ones wrapped in the StructArray) have specialized broadcast behavior. One could add binary BroadcastStyle rules to control behavior in other cases, but it seems best to wait until there's a real-world use case.