-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make AbstractArray{T}(...) constructor in LinearAlgebra more consistent #40831
Conversation
As suggested before in #35165, I've added an ImmutableArray type to the testhelpers. The tests really only need a single case of an immutable array that can be converted to a different eltype. Perhaps the new InfiniteArray in testhelpers could also be adapted for that purpose. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The distinction between "preserve structure" and "don't preserve structure", is that original or was that an implicit copy_oftype
-contract? Is this by any means "breaking"?
copy_oftype(A::AbstractArray{T}, ::Type{T}) where {T} = copy(A) | ||
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = convert(AbstractArray{S,N}, A) | ||
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = copyto!(similar(A,S), A) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't seem to need the type parameters T
and N
:
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = copyto!(similar(A,S), A) | |
copy_oftype(A::AbstractArray, ::Type{S}) where {S} = copyto!(similar(A, S), A) |
# copy_similar_oftype: like copy_oftype(A, T), but discarding structure of `A`, | ||
# comparable to how the three-argument function similar(A, S, size(A)) behaves. | ||
# Frequently useful as the second argument to ldiv! or the first argument to rdiv! | ||
copy_similar_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here:
copy_similar_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = | |
copy_similar_oftype(A::AbstractArray, ::Type{S}) where {S} = |
Thanks for looking at this! Where does "preserving structure" come from, that's a good question. I'll try to be concise but I'm not good at that... There are three things combined here. It may serve as documentation later so I'll write it out.
The name
I've removed the middlemen, precisely in order to preserve what
A curious exception was
In conclusion: (i) if the small chance of breaking something is not considered to be worth it, it is best to discard this pull request. I'm fine with that, I've updated it so that it could be closed regardless of whether it is merged or not, (ii) if I got anything wrong then I'd like to hear about it :-) and (iii) the |
Hmm, now that I put it this way, maybe removing middlemen is not the best idea here. If something somewhere does break, intercepting the right function call can always be used to fix it. But there are now fewer hooks to do so. On the other hand, if the conversion is not optimal, having those same hooks also provides means to fix it. I have experience with StaticArrays, but to be honest I don't know how this might interact with GPU arrays, or distributed arrays... Maybe it is not worth it at this point. |
I still need to digest your comments, but as a quick comment: we could have a PkgEval run to see how/whether this affects other packages at some point. |
Looking closer at a few cases, I don't think I got the semantics of A solution might be to go even further and replace the original definition copy_oftype(A::AbstractArray{T}, ::Type{T}) where {T} = copy(A)
copy_oftype(A::AbstractArray{T,N}, ::Type{S}) where {T,N,S} = convert(AbstractArray{S,N}, A) by copy_oftype(A::AbstractArray, ::Type{T}) where {T} = copyto!(similar(A, T), T) But that could have more side effects. I'll think about some examples to show what I mean here. |
Here is an example on the use of (*)(A::AbstractTriangular, D::Diagonal) =
rmul!(copyto!(similar(A, promote_op(*, eltype(A), eltype(D.diag))), A), D)
(*)(D::Diagonal, B::AbstractTriangular) =
lmul!(D, copyto!(similar(B, promote_op(*, eltype(B), eltype(D.diag))), B))
(*)(A::AbstractMatrix, D::Diagonal) =
rmul!(copyto!(similar(A, promote_op(*, eltype(A), eltype(D.diag)), size(A)), A), D)
(*)(D::Diagonal, A::AbstractMatrix) =
lmul!(D, copyto!(similar(A, promote_op(*, eltype(A), eltype(D.diag)), size(A)), A)) The first two lines can use the |
I came across an unrelated issue: julia> using LinearAlgebra; D = Diagonal(1:3)
3×3 Diagonal{Int64, UnitRange{Int64}}:
1 ⋅ ⋅
⋅ 2 ⋅
⋅ ⋅ 3
julia> D^2
ERROR: MethodError: no method matching Vector{Int64}(::Diagonal{Int64, UnitRange{Int64}})
Closest candidates are:
Array{T, N}(::AbstractArray{S, N}) where {T, N, S} at array.jl:540
Vector{T}() where T at boot.jl:467
Vector{T}(::Core.Compiler.AbstractRange{T}) where T at range.jl:1042
...
Stacktrace:
[1] convert(#unused#::Type{Vector{Int64}}, a::Diagonal{Int64, UnitRange{Int64}})
@ Base ./array.jl:532
[2] Diagonal{Int64, Vector{Int64}}(diag::Diagonal{Int64, UnitRange{Int64}})
@ LinearAlgebra /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/diagonal.jl:10
[3] convert(T::Type{Diagonal{Int64, Vector{Int64}}}, m::Diagonal{Int64, UnitRange{Int64}})
@ LinearAlgebra /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/special.jl:53
[4] to_power_type(x::Diagonal{Int64, UnitRange{Int64}})
@ Base ./intfuncs.jl:240
[5] power_by_squaring(x_::Diagonal{Int64, UnitRange{Int64}}, p::Int64)
@ Base ./intfuncs.jl:255
[6] ^(A::Diagonal{Int64, UnitRange{Int64}}, p::Int64)
@ LinearAlgebra /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/dense.jl:447
[7] macro expansion
@ ./none:0 [inlined]
[8] literal_pow(f::typeof(^), x::Diagonal{Int64, UnitRange{Int64}}, #unused#::Val{2})
@ Base ./none:0
[9] top-level scope
@ REPL[41]:1 It fails because the |
Yes, we should catch that by providing a method (^)(D::Diagonal, a::Number) = Diagonal(D.diag.^a) |
This quickly leads to lots of errors, bad idea :-) |
After some more study of the code I'm reaching new conclusions:
These considerations leave one possible implementation, namely copy_oftype(A, T) = copyto!(similar(A, T), A) The result is mutable and equals A. In spite of what I said above, this actually works - I had made unrelated changes that led to errors before. |
It still leaves the question in my mind what As an example, julia> using LinearAlgebra; lu(Diagonal(rand(3)))
ERROR: MethodError: no method matching lu!(::Diagonal{Float64, Vector{Float64}}, ::Val{true}; check=true) In fact, almost all standard matrix decompositions fail for diagonal matrices, and for several other structured matrices of LinearAlgebra too. Sometimes for different reasons. E.g. The only one that reliably works is |
I've tried it out for Of course it would be even better to look at all the structured matrices in LinearAlgebra and consider their factorization. The fallback for diagonal matrices is of course not efficient when you treat them as a dense array... It seems beneficial to try to find a common way to go from |
Yes, I think we have an issue for that, and I think I started to work on it. IIRC, the main issue was with |
Ok, thanks. In any case I'll remove the changes to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Very clear. To help users, I think you should add a few comments in the main file to explain the rationale behind the three copy*
functions. And we really need to add lu
methods for every special matrix type soon, now that lu
starts working for them.
copyto!(dest::SymTridiagonal, src::SymTridiagonal) = | ||
(copyto!(dest.dv, src.dv); copyto!(dest.ev, src.ev); dest) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this better be moved to tridiag.jl
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes :-)
I moved it
I had to do a rebase, because the interface for |
Okay, I've made a new attempt at documenting the copy functions. Unfortunately it is a bit fuzzy, because the behaviour of I have yet to add some tests for |
I think this already in very good shape. The few extra |
Alright, does that mean we're good here? Apart from a new apparent conflict in diagonal.jl. |
I think we should make sure the new methods are tested, but otherwise we're good. |
In diagonal, it's probably the removal of many methods by my very recent commit. |
Do you mean adding tests for lu, or for the copy_* functions (or both)? |
Ideally for just everything. 🤣 but one could argue that the |
It turns out that Tridiagonal was the only type for which the return type of the LU factorization differs. For Symmetric and Hermitian matrices, I could only indirectly check that a different method of |
Yes, actually the |
So just keep the tests for lu itself (I selected a diagonal and bidiagonal matrix) and the one for lu of tridiagonal matrix perhaps? |
Yes, sounds reasonable. I realized that the |
I have applied these changes, but had also turned the comments for the copy_* functions into docstrings in the meantime, so I combined it into a separate commit -> hence the commit. Locally, all tests passed - just now the online build seems to give an error about a repository key. I'll give it some time to finish. |
One last thing: I have just merged #41248. Could you please rebase and adjust those few lines to your new @nanosoldier |
Your package evaluation job has completed - possible new issues were detected. A full report can be found here. cc @maleadt |
Aha, there seems to be only one relevant issue: https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/ada4020_vs_3279e20/TaylorSeries.1.8.0-DEV-7bddc7efdc.log The corresponding code is here: Replacing the |
Ok, let's fix that issue once this is in master. |
@daanhb Only now, I'm reading through this and I have a historical comment and few questions. The original motivation for Originally, we didn't really consider anything but Now the questions, why didn't you use an |
@andreasnoack Thanks for your comments. I could find a lot of information in old issues and pull requests, a very valid resource, but I was not aware of the original motivation for The current implementation of Regarding Also, the interface for |
Regarding the issue that came up in TaylorSeries: this is likely to appear in other places as well. There is always a possibility for ambiguity if I think it may be correct that we can leave out the An alternative that is perhaps cleaner would be something like this: function lu(A::AbstractMatrix, pivot::Union{RowMaximum,NoPivot} = RowMaximum(); check::Bool = true)
B = lucopy(A, lutype(eltype(A))
lu!(B, pivot; check = check)
end
lucopy(A::AbstractMatrix, T) = copy_to_array(A, T)
lucopy(A::StridedMatrix, T) = copy_oftype(A, T)
lucopy(A::HermOrSym, T) = copy_oftype(A, T)
lucopy(A::Tridiagonal, T) = copy_oftype(A, T) That should be functionally equivalent to the current state, and avoid the ambiguity because only one method for |
|
||
See also: `copy_similar`, `copy_to_array`. | ||
""" | ||
copy_oftype(A::AbstractArray, ::Type{T}) where {T} = copyto!(similar(A,T), A) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be renamed copymutable_oftype
as it behaves very different than copy
(which is not in general mutable)
This is an update to the old pull request #35165 (with some delay - apologies). The implementation has not changed compared to before, but the underlying issue hasn't changed either, hence the update.
Summary: The main change is to make sure that
AbstractArray{T}(A::SomeStructuredMatrix)
retains the structure of the structured matrixA
for the types defined in LinearAlgebra. If possible, it will only change the eltype toT
. This is the case currently for some arrays but not all. In cases where it isn't, arrays that are defined in terms of immutable arrays end up being mutable. That's not the end of the world, but it is avoidable. The constructor may be called by writingconvert(AbstractArray{T}, A::SomeStructuredMatrix)
.One example:
Here, the wrapped
SVector
has turned into anMVector
(which is mutable).Implementation: a relevant function used frequently in LinearAlgebra is
copy_oftype
, which was defined in terms ofconvert(AbstractArray{T}, A)
. It was assumed that this conversion returns a mutable array. I've replaced it bycopyto!(similar(A, T), A)
in this pull request, which is more explicitly mutable. (And it is close to what it did anyway due to the fallback in array.jl.)I think there are three kinds of matrices similar to
A
that are often used in LinearAlgebra:copy_oftype(A, T)
copy_similar_oftype(T, A)
(because it usessimilar(A, T, size(A))
to make the copy). This is often used to pass tordiv!
orldiv!
, where the result in general does not have the same structure as A.T
, not necessarily mutable and no need to copy: this isconvert(AbstractArray{T}, A)
Status: I have ported the changes from #35165 to the current master and the tests pass locally. However, last year I went through all the code in LinearAlgebra looking at all occurrences of
convert(AbstractArray....)
andcopy_oftype
syntax and I did not (yet) repeat the exercise. Also, the fallback forAbstractArray{T}(...)
in Base performs an axes check before invokingcopyto!
, so thecopyto!
s of all matrices in LinearAlgebra should check axes too. Finally,copy_similar_oftype
is not a good name, and it really comes down tocopyto!(similar(A, T, size(A)), A)
which is short and informative too, so this may not even have to be a function.