Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optimized implementations of reduce([hv]cat, A) #27188

Merged
merged 2 commits into from
Jun 28, 2018
Merged

Conversation

nalimilan
Copy link
Member

Fixes #21672.

This implements the minimal fix for AbstractVector{<:Union{AbstractVector,AbstractMatrix}}. It doesn't cover the case of Vector{Any} list of vectors or matrices, which could be useful to support given that it is not uncommon to store expensive objects like arrays in Vector{Any} lists. For example, it appears that JSON.jl returns such objects. This could be done by passing mapreduce(typeof, typejoin, A) to an internal function which can then dispatch on the argument type, with a fallback on the normal reduce method. But it's more complex and can introduce ambiguities.

@nalimilan nalimilan added the arrays [a, r, r, a, y, s] label May 21, 2018
@@ -1188,7 +1188,9 @@ typed_hcat(::Type{T}, X::Number...) where {T} = hvcat_fill(Matrix{T}(undef, 1,le
vcat(V::AbstractVector...) = typed_vcat(promote_eltype(V...), V...)
vcat(V::AbstractVector{T}...) where {T} = typed_vcat(T, V...)

function typed_vcat(::Type{T}, V::AbstractVector...) where T
AbstractVecOrTuple{T} = Union{AbstractVector{<:T}, Tuple{Vararg{T}}}
Copy link
Member

@mbauman mbauman May 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoa, crazy — this works? You can make a type covariant through an alias?

In any case, I think I'd prefer doing this the other way around — I saw A::AbstractVecOrTuple{AbstractVecOrMat} and assumed it was an invariance bug. That is:

AbstractVecOrTuple{T} = Union{AbstractVector{T}, Tuple{Vararg{T}}}
# and
_typed_hcat(::Type{T}, A::AbstractVecOrTuple{<:AbstractVecOrMat}) where T

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Unfortunately, I get a bootstrap error if I add <: in the method signatures instead (whether I also change <:T to T in the type alias or not). That's weird since I cannot reproduce at the REPL. Maybe that's a dispatch bug.

REPL  ────────────────── 17.461338 seconds
error during bootstrap:
LoadError("sysimg.jl", 213, LoadError("/home/milan/Dev/julia/usr/share/julia/stdlib/v0.7/Pkg3/src/Pkg3.jl", 32, LoadError("/home/milan/Dev/julia/usr/share/julia/stdlib/v0.7/Pkg3/src/Types.jl", 45, MethodError(Base._typed_vcat, (UInt8, (UInt8[0xc8, 0x30, 0xd4, 0x4f, 0xc0, 0x00, 0xb4, 0x80, 0xd1, 0x11, 0xad, 0x9d, 0x10, 0xb8, 0xa7, 0x6b], UInt8[0x6a, 0x75, 0x6c, 0x69, 0x61, 0x6c, 0x61, 0x6e, 0x67, 0x2e, 0x6f, 0x72, 0x67])), 0x00000000000067bb))))
rec_backtrace at /home/milan/Dev/julia/src/stackwalk.c:94
record_backtrace at /home/milan/Dev/julia/src/task.c:246
jl_throw at /home/milan/Dev/julia/src/task.c:577
jl_method_error_bare at /home/milan/Dev/julia/src/gf.c:1608
jl_method_error at /home/milan/Dev/julia/src/gf.c:1626
jl_apply_generic at /home/milan/Dev/julia/src/gf.c:2117
typed_vcat at ./abstractarray.jl:1271
vcat at ./abstractarray.jl:1189 [inlined]
uuid5 at /home/milan/Dev/julia/usr/share/julia/stdlib/v0.7/Pkg3/src/Types.jl:36

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hunch would be an ambiguity error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but as I said I can't reproduce when pasting the definitions directly at the REPL.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed as #27224.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should

AbstractVecOrTuple{T} = Union{AbstractVector{<:T}, Tuple{Vararg{T}}}

be declared const ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a parameter it doesn't matter. Personally I always put const for consistent.

@oxinabox
Copy link
Contributor

oxinabox commented Jun 5, 2018

These shouldn't really be being done to reduce since vcat (etc) are order sensitive.
They should be being done to foldl instead.
No?

@nalimilan
Copy link
Member Author

AFAIK these functions are associative and that's the only difference between reduce and foldl (order refers to commutativity).

@oxinabox
Copy link
Contributor

oxinabox commented Jun 5, 2018

Oh you are right, associativity is all that is required.
Ignore me.

With that said applying these to foldl and is still a good thing to do, maybe? Since the optimization is just as valid there.

@nalimilan nalimilan added the triage This should be discussed on a triage call label Jun 24, 2018
@nalimilan
Copy link
Member Author

For triage: I'd like to get this into 0.7 since it's a very frequent issue, but I can't remove the weird type alias until #27224 is fixed. Is that such a big problem, given that it's internal?

@StefanKarpinski
Copy link
Member

Seems fine to me. @JeffBezanson?

@JeffBezanson
Copy link
Member

The type alias doesn't bother me. But it seems it could also just be removed, since it's only used to restrict the argument type of an internal function.

@mbauman
Copy link
Member

mbauman commented Jun 27, 2018

Yeah, Milan tried that but ran into #27188 (comment). I'm fine with this if you are — the only reason I've not merged this earlier is because I was unsure about the behaviors of that typealias.

@StefanKarpinski
Copy link
Member

It seems like using a sketchy type alias for the signature of a purely internal function to avoid ambiguity is fine... if it breaks in the future, we can fix it then. Maybe leave a FIXME/TODO note?

@nalimilan
Copy link
Member Author

Fair enough, I've added a FIXME.

@JeffBezanson
Copy link
Member

Yeah, Milan tried that but ran into #27188 (comment).

That's not really what I meant; there are other solutions too. But it doesn't matter.

@kbarros
Copy link

kbarros commented Jun 4, 2022

It's great that this feature exists, but it's not very discoverable. The docs currently recommend hcat(c...). Could we instead suggest reduce(hcat, c)? Reference: https://docs.julialang.org/en/v1/base/arrays/#Base.hcat

@stevengj
Copy link
Member

stevengj commented Jun 4, 2022

Yes, the docs for hcat and vcat should definitely mention the option of using reduce if you have a collection of arrays to concatenate. Feel free to submit a PR.

The documentation is in https://github.com/JuliaLang/julia/blob/master/base/abstractarray.jl, so you can just click the edit button at upper right to add in an example and submit a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stack(vec_of_vecs) for vcat(vec_of_vecs...)
9 participants