Fixed the algorithm for powers of a matrix. #21184
Conversation
base/linalg/linalg.jl (outdated)
@@ -113,6 +113,7 @@ export
ordschur,
peakflops,
pinv,
powm,
this is an internal helper function, it shouldn't be exported
fixed it.
I changed some and added a few more tests based on: …
Is this good to go?
I don't know yet. I haven't looked carefully. What is the easiest way to resume? Should I open a PR from my branch or should this PR be against master?
Let's change this, and close/reopen to trigger CI; apparently changing the target branch doesn't do that.
base/linalg/dense.jl (outdated)
function ^(A::AbstractMatrix, p::Number)
# Matrix power
^(A::AbstractMatrix, p::Integer) = p < 0 ? Base.power_by_squaring(inv(A),-p) : Base.power_by_squaring(A,p)
function ^{T}(A::AbstractMatrix{T}, p::Real)
what's the right approach for complex p? is there a different method for that?
Not sure. Maybe we can ask @higham what the right approach is.
The only approach I found is by definition: expm(p * logm(A)). But @stevengj correctly pointed out that it's very computationally expensive, so I didn't add it.
I think that's all there is for complex p. If you do an initial Schur decomp of A then expm and logm are applied to triangular matrices, which reduces the cost.
It would be reasonable to have ^(A::AbstractMatrix, p::Number) = expm(p * logm(A)) as a fallback. We can optimize it later if needed.
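(For reference, a minimal sketch of that fallback — not code from this PR; the comment just restates @higham's Schur remark above:)

^(A::AbstractMatrix, p::Number) = expm(p * logm(A))
# cheaper variant per @higham: Schur-decompose A first, so that expm and logm
# act on a triangular factor instead of a full dense matrix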
base/intfuncs.jl (outdated)
@@ -193,7 +193,6 @@ end

^{T<:Integer}(x::T, p::T) = power_by_squaring(x,p)
^(x::Number, p::Integer) = power_by_squaring(x,p)
^(x, p::Integer) = power_by_squaring(x,p)
not sure if this should be here.
This fallback method shouldn't be removed by this PR.
base/linalg/dense.jl (outdated)
if isinteger(p)
    return A^Integer(real(p))
    return p < 0 ? Base.power_by_squaring(inv(A),-Integer(p)) : Base.power_by_squaring(A,Integer(p))
The promotion here is still wrong if p is not an integer, as explained in #21143. A^2.0 should produce a floating-point matrix even if A is integer.
The return type should be something like Matrix{Base.promote_op(^, eltype(A), typeof(p))}.
Also, note that power_by_squaring is pretty inefficient for matrices because it generates a lot of temporary matrices. We should really have a specialized _power_by_squaring!(Aᵖ::Matrix, A::Matrix, p::Integer) function that minimizes the number of temporary arrays and uses a pre-allocated output array. But that can be a separate PR, since it is just a performance optimization.
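(A rough sketch of what such a helper might look like — hypothetical code, not from this PR; it uses one output plus two scratch matrices regardless of p:)

function _power_by_squaring!{T}(P::Matrix{T}, A::Matrix{T}, p::Integer)
    # P := A^p for p ≥ 0, reusing two scratch buffers instead of
    # allocating a fresh matrix for every multiplication
    p >= 0 || throw(DomainError())
    fill!(P, zero(T))
    for i = 1:size(P, 1); P[i, i] = one(T); end    # P = I
    B = copy(A)          # running square A^(2^k)
    C = similar(A)       # multiplication scratch
    while p > 0
        if isodd(p)
            A_mul_B!(C, P, B); copy!(P, C)         # P = P * B
        end
        p >>= 1
        p > 0 && (A_mul_B!(C, B, B); copy!(B, C))  # B = B^2
    end
    return P
end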
If A is an integer matrix, A^2 should be an integer matrix, but A^2.0 should be a floating-point matrix.
We definitely should not use diagonalization for integer p, because it will be way slower even than what we are doing now, despite the extra temporary matrices (which mostly waste memory, not time, for large A). e.g.
julia> A = randn(1000,1000); A = A'*A; # random SPD matrix
julia> @time A^4;
0.428707 seconds (114.28 k allocations: 20.806 MiB, 20.50% gc time)
julia> @time A^4.5;
3.669913 seconds (2.70 M allocations: 172.916 MiB, 2.36% gc time)
base/linalg/dense.jl (outdated)
@@ -310,15 +310,23 @@ kron(a::AbstractMatrix, b::AbstractVector)=kron(a,reshape(b,length(b),1))
kron(a::AbstractVector, b::AbstractMatrix)=kron(reshape(a,length(a),1),b)

# Matrix power
function ^{T}(A::AbstractMatrix{T}, p::Integer)
    # non-inversible matrix to negative power is undefined.
    if (p < 0) && (det(A) == 0)
This check does not seem like a good idea. First, inv(A) will already throw an error if A is singular, and just as quickly as det will. Second, computing det(A) means that you compute the LU factorization of A twice. Third, you could get false positives because of underflow (e.g. det([1e-300 0; 0 1e-300]) == 0 even though the matrix is not singular).
(Basically, you should forget everything you learned in high school about the determinant as a computational tool.)
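(For instance, the underflow trap — det collapses to zero while inv is perfectly happy:)

julia> det([1e-300 0; 0 1e-300])    # determinant underflows to zero...
0.0

julia> inv([1e-300 0; 0 1e-300])    # ...yet the matrix inverts fine
2×2 Array{Float64,2}:
 1.0e300  0.0
 0.0      1.0e300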
base/linalg/dense.jl (outdated)
if isinteger(p)
    return p < 0 ? Base.power_by_squaring(inv(A),-Integer(p)) : Base.power_by_squaring(A,Integer(p))
if isinteger(p) && !(typeof(p) <: Int64)
    return float.(A^Integer(p))
This is not right. First, Matrix{Integer}^Integer should be an integer matrix (even if p is not Int64). Second, for Matrix{Int}^Float64 we need to convert to floating point and then exponentiate in order to avoid overflow problems. Third, calling float.(X) will make an extra copy even if the matrix is already floating point.
We want something like:
convert(Matrix{promote_op(^, eltype(A), typeof(p))}, A)^Integer(p)
although that isn't quite right either: if A is not a Matrix, but is some other AbstractMatrix type, then we probably want the same type. Maybe
T = Base.promote_op(^, eltype(A), typeof(p))
return (T == eltype(A) ? A : copy!(similar(A, T), A))^Integer(p)
One thing I'm not sure about is what should happen …
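(For reference, the promotion rule behaves as one would hope here:)

julia> Base.promote_op(^, Int, Int)      # integer matrix, integer power: stays Int
Int64

julia> Base.promote_op(^, Int, Float64)  # non-integer power: promotes to float
Float64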
base/linalg/diagonal.jl (outdated)
@@ -285,7 +285,6 @@ logm(D::Diagonal) = Diagonal(log.(D.diag))
logm(D::Diagonal{<:AbstractMatrix}) = Diagonal(logm.(D.diag))
sqrtm(D::Diagonal) = Diagonal(sqrt.(D.diag))
sqrtm(D::Diagonal{<:AbstractMatrix}) = Diagonal(sqrtm.(D.diag))
^(D::Diagonal, p::Real) = Diagonal((D.diag).^p)
Why was this method removed?
base/linalg/dense.jl (outdated)
end
end
if np_real_eigs
    warn("Matrix with nonpositive real eigenvalues, a nonprincipal matrix power will be returned.")
Maybe pass once=true here so that people don't see repeated warnings?
you mean in something like (A^0.5) * (A^0.5)? If so, I'm not sure how to come up with a right key for the warning.
The default key is the warning message, which seems fine here.
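(i.e., assuming 0.6's keyword for warn, presumably just:)

warn("Matrix with nonpositive real eigenvalues, a nonprincipal matrix power will be returned.", once=true)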
base/linalg/dense.jl (outdated)
warn("Matrix with nonpositive real eigenvalues, a nonprincipal matrix power will be returned.") | ||
end | ||
|
||
if isreal(A) && ~np_real_eigs |
!np_real_eigs
base/linalg/triangular.jl (outdated)
sqrt_diag!(A0, A, s)

# Compute the Gauss-Legendre quadrature formula
x, w = Base.QuadGK.gauss(Float64, m)
This won't work now that QuadGK was removed from Base, no?
Since it looks like m ≤ length(theta), we can just precompute these and store them in a const array.
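(A sketch of that const table — values hand-written for m = 1:3 here, and the table name is mine; the real table would cover every m the algorithm can select:)

# standard Gauss–Legendre nodes/weights on (-1, 1)
const GAUSS_LEGENDRE = Dict(
    1 => ([0.0], [2.0]),
    2 => ([-1/√3, 1/√3], [1.0, 1.0]),
    3 => ([-√(3/5), 0.0, √(3/5)], [5/9, 8/9, 5/9]),
)
# then in powm: x, w = GAUSS_LEGENDRE[m]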
@stevengj
test/linalg/dense.jl (outdated)
@@ -512,6 +512,49 @@ end
@test_throws ArgumentError diag(zeros(0,1),2)
end

@testset "New matrix ^ implementation" for elty in (Float64, Complex{Float64})
    A11 = convert(Matrix{elty}, rand(10, 10))
You might want to just have rand(elty,10,10) here so that you actually make complex matrices.
base/linalg/dense.jl (outdated)
# Check whether the matrix has nonpositive real eigs
np_real_eigs = false
for i = 1:n
    if imag(d[i]) < eps() && real(d[i]) <= 0
should this be the eps of the element type? not sure how likely it would be to matter
There's also typemin() for example, but I thought the default eps() is generally a safe condition to produce a warning. You think I should change it nevertheless?
This is not a good way to do the test because it depends on overall scaling of the matrix (if you multiply the whole matrix by 1e±100 the result will change).
Also, shouldn't you be checking abs(imag), not imag? And shouldn't the warning be for real < 0, not real ≤ 0?
I'm not sure what a robust test would be here. Maybe abs(imag(d[i])) ≤ ɛ && real(d[i]) < 0, where ɛ = eps(maxabs(d))?
It evidently isn't tested, but tightening …
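(In loop form that suggestion would read roughly as below — a sketch, with maxabs spelled as 0.6's maximum(abs, d):)

ɛ = eps(maximum(abs, d))   # tolerance scaled to the largest eigenvalue magnitude
np_real_eigs = false
for i = 1:n
    if abs(imag(d[i])) <= ɛ && real(d[i]) < 0
        np_real_eigs = true
        break
    end
end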
    retmat = Base.power_by_squaring(A, p)
end
for i = 1:n
    retmat[i,i] = real(retmat[i,i])
I know there are separate issues where this has been discussed, but just checking - any imaginary components here should only ever be round-off, right? Does blas hemm touch the diagonal imaginary components or wouldn't it leave them alone? Is the issue round-off for julia-generic or non-blas element types?
Yes, the imaginary part is guaranteed to be zero, so any nonzero imaginary part is purely due to roundoff errors.
The errors arise for any values, including BLAS floats, because multiplying by the matrix of eigenvectors uses the general matmul routine. I don't think there is a specialized BLAS routine for matrix * real diagonal * matrix', which is the operation here.
Sorry, for Hermitian^p the operation is hermitian * hermitian, but again there is no specialized BLAS method for this because in general the result would not be Hermitian. (The result is Hermitian here because the matrices being multiplied are A^p and A^q, i.e. different powers of the same Hermitian matrix.) So the roundoff errors still arise for any values, including BLAS floats.
For example, note the nonzero imaginary parts on the diagonal of:
julia> A = rand(Complex128,10,10); A = Hermitian(A + A')
10×10 Hermitian{Complex{Float64},Array{Complex{Float64},2}}:
0.0861346-0.0im 0.837547-0.743436im … 1.09493-0.163298im
0.837547+0.743436im 1.96064-0.0im 0.491744-0.792954im
0.869364-0.131544im 1.06965-0.252339im 0.370136+0.0155668im
0.83727+0.303993im 1.18724+0.108695im 1.38671+0.191601im
1.50475+0.357721im 1.00663-0.421604im 1.02963-0.13435im
1.1999+0.0748608im 0.370293+0.503285im … 1.13011-0.0552782im
1.56458+0.546159im 0.512528-0.0406777im 0.721986+0.445042im
0.394002+0.645965im 0.935305-0.473984im 0.971132+0.493203im
0.741598-0.175156im 1.91631+0.111909im 0.868631+0.834707im
1.09493+0.163298im 0.491744+0.792954im 0.00985939-0.0im
julia> imag(diag(A^2 * A^3))
10-element Array{Float64,1}:
0.0
5.68434e-14
0.0
0.0
0.0
4.54747e-13
0.0
5.68434e-14
0.0
0.0
base/linalg/symmetric.jl (outdated)
end
end

function expm{T<:Real}(A::Symmetric{T})
would this, logm, and sqrtm for Symmetric of non-Real element types become no-method errors here?
No, these are covered by Hermitian methods
what about complex symmetric, non hermitian?
oooh I'm sorry, for some reason I thought symmetric matrices are always hermitian
it's a bit of a niche type, a fair number of routines even in blas or lapack aren't always implemented for c/z symmetric
LGTM. The Travis failure seems unrelated.
base/linalg/symmetric.jl (outdated)
end
end
function ^{T<:Real}(A::Symmetric{T}, p::Real)
    F = eigfact(full(A))
why is it we need the full call here?
(In general, a lot of the full calls seem odd to me; we seem better off adding additional methods elsewhere, but I thought that could be left for another PR. This particular one is especially egregious though because eigfact should be especially efficient for real-symmetric matrices. What happens if you just delete all of the full calls?)
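(e.g., a sketch of the full-free version — assuming positive-semidefinite A so the fractional eigenvalue powers stay real:)

function ^{T<:Real}(A::Symmetric{T}, p::Real)
    F = eigfact(A)    # symmetric eigensolver accepts Symmetric directly, no full()
    return Symmetric(F.vectors * Diagonal(F.values.^p) * F.vectors')
end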
Thanks for pushing with this. It would be really great to have it for 0.6.0.
base/linalg/dense.jl (outdated)
function ^(A::AbstractMatrix, p::Number)
# Matrix power
^{T}(A::AbstractMatrix{T}, p::Integer) = p < 0 ? Base.power_by_squaring(inv(A),-Integer(p)) : Base.power_by_squaring(A,Integer(p))
Is Integer(p) needed here?
The tightening of the allowed power type here from Number to Real would currently break complex powers like A^im.
@tkelman I'm not sure where you are referring to.
it's collapsed right now, but the line at #21184 (comment)
Currently that errors. (I plan on working on it after this PR.)
On master it gives a consistent answer with http://www.wolframalpha.com/input/?i=N%5B%7B%7B1,+2%7D,+%7B3,+4%7D%7D%5EI%5D. I don't know of a use case for this off the top of my head, but that doesn't seem like a good reason to change it from consistent with Wolfram to an error.
@iamnapo, we should just add the ^(A::AbstractMatrix, p::Number) = expm(p * logm(A)) fallback, then.
base/linalg/dense.jl (outdated)
    retmat = retmat * powm!(UpperTriangular(float.(A)), real(p - floor(p)))
end
else
    S,Q,d = schur(complex(full(A)))
another full that may have been missed, or was this one needed?
This one is needed
good to know - what was happening without it?
if A is symmetric with complex values, but not hermitian
@iamnapo, do you mean that there was a MethodError for schur(::Symmetric{<:Complex})? It would be better to solve that by adding a method to schur or schurfact that calls full.
Looks like we just need:
schur(A::Symmetric{<:Complex}) = schur(full(A))
Yes. I'll add it.
@eval begin

function ($funm)(A::Symmetric)
function ($funm){T<:Real}(A::Symmetric{T})
is the problem for Symmetric{<:Complex} that isposdef doesn't work?
No, the issue is that Symmetric{<:Complex} may have ill-conditioned eigenvectors F.vectors, so it is not reliable to use diagonalization to compute matrix functions. Whereas for real-symmetric matrices the eigenvectors are unitary, which we are exploiting here: F.vectors' == inv(F.vectors). The complex analogue is a Hermitian matrix, not complex-symmetric.
Cool. Useful to write that down somewhere, I wouldn't have remembered that detail from a linear algebra textbook.
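(A quick illustration of the distinction, in 0.6 syntax — sketch only:)

# real-symmetric: the eigenvector matrix is orthogonal, so V' recovers inv(V)
A = Symmetric(randn(5,5)); V = eigfact(A)[:vectors]
norm(V'V - I)                      # ~1e-15, pure roundoff

# complex-symmetric (not Hermitian): eigenvectors need not be well conditioned,
# so V * Diagonal(f.(λ)) * inv(V) can lose accuracy
B = complex.(randn(5,5), randn(5,5)); B = (B + B.')/2
cond(eigfact(B)[:vectors])         # can be arbitrarily large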
base/linalg/schur.jl (outdated)
@@ -107,6 +107,7 @@ function schur(A::StridedMatrix)
SchurF = schurfact(A)
SchurF[:T], SchurF[:Z], SchurF[:values]
end
schur(A::AbstractMatrix) = schur(full(A))
The problem with this definition is that it will produce an infinite loop for any AbstractMatrix type where full(A) === A and there is no more-specific schur method.
I would just add schur(A::Triangular{<:Complex}) = schur(full(A)) (and anything else as needed).
Isn't full(A) always of type StridedMatrix?
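(Not necessarily — to illustrate the hazard, here is a hypothetical AbstractMatrix whose full is a no-op; names invented for the example:)

struct MyMat <: AbstractMatrix{Float64}
    data::Matrix{Float64}
end
Base.size(A::MyMat) = size(A.data)
Base.getindex(A::MyMat, i::Int, j::Int) = A.data[i, j]
Base.full(A::MyMat) = A    # a no-op full...
# ...makes schur(A::AbstractMatrix) = schur(full(A)) call itself forever,
# since full(A) === A and no more-specific schur method exists.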
base/linalg/schur.jl (outdated)
@@ -111,8 +111,7 @@ schur(A::Symmetric) = schur(full(A))
schur(A::Hermitian) = schur(full(A))
schur(A::UpperTriangular) = schur(full(A))
schur(A::LowerTriangular) = schur(full(A))

schur(A::Tridiagonal) = schur(full(A))
These methods can obviously be made vastly more efficient. For example, an upper-triangular matrix is already in Schur form. But that can be left for another PR.
Of course. I just added these definitions to make it work for now. I plan to work on schur() after this PR is closed.
Do the original 2 commits by a different author pass on their own? Any failing commits should be squashed here, but git is bad at attribution if you squash together commits by different authors. Otherwise this lgtm now, thanks for the epic persistence @iamnapo !
I don't think there will be a problem, but in case there is, you can use the command line to squash them manually.
Squashed commits:
fixed Matrix^Real_Number functionality
reverted logm changes
remove ambiguous Diagonal^Real method
…ntegers. added a few more test for ^; now uses random 10x10 matrix as well
fixed some ambiguities, corrected the algorithm a bit
correctly promotes Matrix{Int}^Float64
reverted some wrong changes
reverted logm changes
speedup if A is diagonal, fixed some tests
small changes based on feedback
powm is now scale-invariant
powm scales in-place, fixed testing a bit, corrected some bugs
removed full() calls
added rollback for Matrix^Complex
added schur() support for Symmetric, Hermitian, Triangular, Tridiagonal matrices
and be uniform about space after # in comments
is this good to go?
Fixed JuliaLang/LinearAlgebra.jl#414
I did a few small bug fixes and also changed the ^(A::AbstractMatrix, p::Integer) algorithm a bit to make use of powm(A0::UpperTriangular{T}, p::Real).