RowVector, pinv, and quasi-division #23067
Conversation
For consistency, we'd then probably also want […]. The rationale for the proposed change is […]. Leaving […]. CC @andyferris as the father of `RowVector`.
I do like consistency. One thing I'm wondering about is what […]. So for instance, should […]?
If […]. Relatedly, just noticed this:

```julia
# \(A::StridedMatrix,x::Number) = inv(A)*x Should be added at some point when the old elementwise version has been deprecated long enough
# /(x::Number,A::StridedMatrix) = x*inv(A)
```

which dates back to February 2014 - maybe now is the time.
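For reference, a sketch of what enabling those commented-out methods would mean (hypothetical definitions, not in Base; the usage lines are my illustration):

```julia
# Hypothetical: what the commented-out methods would do once enabled.
import Base: \, /

\(A::StridedMatrix, x::Number) = inv(A)*x   # the inverse, scaled by x
/(x::Number, A::StridedMatrix) = x*inv(A)

A = [1.0 2.0; 3.0 4.0]
A \ 2.0   # 2×2 matrix, equal to inv(A) * 2.0
2.0 / A   # 2×2 matrix, equal to 2.0 * inv(A)
```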
Right, I don't disagree, and the method would be next-to-useless if it were strict because of numerical rounding errors, etc. What I was asking was the (heretical?) question of whether non-strict […]:

```julia
julia> [0.0 1.0 0.0; 0.0 1.0 0.0] \ [0.0, 1.0] # rectangular - works
3-element Array{Float64,1}:
 0.0
 0.5
 0.0

julia> [0.0 1.0; 0.0 1.0] \ [0.0, 1.0] # square - doesn't work
ERROR: Base.LinAlg.LAPACKException(1)
Stacktrace:
 [1] chklapackerror(::Int64) at .\linalg\lapack.jl:34
 [2] trtrs!(::Char, ::Char, ::Char, ::Array{Float64,2}, ::Array{Float64,1}) at .\linalg\lapack.jl:3291
 [3] \(::Array{Float64,2}, ::Array{Float64,1}) at .\linalg\generic.jl:815

julia> reshape([1.0, 1.0], (2,1)) \ [0.0, 1.0] # rectangular - works
1-element Array{Float64,1}:
 0.5
```

So the reason I'm discussing this is that I just want to check that we aren't propagating badness in the name of consistency.
An example of a non-strict […]. Just pondering out loud here. Maybe we need a […].
Regarding the last issue, integer division produces floats, so division of integer matrices should produce float matrices, rendering the InexactError concern moot. Division of integer matrices should produce a float matrix whose values are as close as possible to the true division result.
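To illustrate the point (a made-up example, not from the thread):

```julia
julia> [1 2; 3 4] \ [1, 1]   # Int matrix and vector, Float64 result
2-element Array{Float64,1}:
 -1.0
  1.0
```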
Right, this example was a bit of a distraction, sorry. The precise way of saying what I mean is this: […]

And apart from overconstrained problems involving rectangular matrices I haven't seen this violated elsewhere (but someone please correct me if I'm wrong). Underconstrained rectangular matrix problems aren't an issue - they satisfy the above. My speculation is that perhaps overconstrained matrix (residual L2 minimization) problems could instead be solved with […].
I think it is fair to ask if the current polyalgorithm for […].

@andyferris The potential runtime errors from square […]. We use the LU in the square case because it is quite a bit faster than the QR. You could argue, though, that users who want speed could just do […]. Finally, if we decide that […].
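To make the LU-versus-QR trade-off concrete, here is a rough sketch of what a non-strict square backslash could look like: try the fast LU first, and fall back to a pivoted-QR least-squares solve when the matrix turns out to be singular (a hypothetical helper, not the actual Base polyalgorithm; the `info` check is an assumption):

```julia
# Hypothetical sketch of a non-strict square \ (not the Base implementation).
function nonstrict_ldiv(A::StridedMatrix, b::AbstractVecOrMat)
    F = lufact(A)
    if F.info == 0                      # LU succeeded: fast path
        return F \ b
    end
    return qrfact(A, Val(true)) \ b     # singular: minimum-norm least squares
end
```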
So there are two questions here: […]

I'd prefer to discuss only the second question here. Care to open a dedicated issue for the first question, @andyferris?
This is getting more and more involved... If we allow `/` for rectangular matrices, I'd expect both of these to work:

```julia
julia> zeros(3,4) / zeros(3,4) * zeros(3,4) # OK
3×4 Array{Float64,2}:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

julia> zeros(3,2) / zeros(3,2) * zeros(3,2) # not OK
ERROR: DimensionMismatch("A has dimensions (3,2) but B has dimensions (3,2)")
```

Before I start changing that... is there a particular reason for the current behavior I'm not aware of?
This is on 0.6, right? There was a recent bug fix for zero matrices, #22831.
It was on an apparently-not-recent-enough master, so yeah, at least that seems to be resolved, phew.
Just to be clear: I think we should go ahead with this PR and continue the general discussion of […].
Force-pushed from 65a7f3b to 4a46304.
Fixed some loose ends so that the following passes:

```julia
let c = [1.0, 2.0], r = [3.0, 4.0]', cm = reshape(c, :, 1), rm = reshape(r', 1, :), m = [1.0 2.0; 3.0 4.0]
    for a in (r, rm), b in (c, cm) # inner products
        @test (a*b)/b ≈ a*(b/b) ≈ (a*b)*pinv(b) ≈ a*(b*pinv(b))
        @test typeof((a*b)/b) == typeof(a*(b/b)) == typeof((a*b)*pinv(b)) == typeof(a*(b*pinv(b)))
        @test a\(a*b) ≈ (a\a)*b ≈ (pinv(a)*a)*b ≈ pinv(a)*(a*b)
        @test typeof(a\(a*b)) == typeof((a\a)*b) == typeof((pinv(a)*a)*b) == typeof(pinv(a)*(a*b))
    end
    for (a, b) in ((c, r), (cm, rm)) # outer products
        @test (a*b)/b ≈ a*(b/b) ≈ (a*b)*pinv(b) ≈ a*(b*pinv(b))
        @test typeof((a*b)/b) == typeof(a*(b/b)) == typeof((a*b)*pinv(b)) == typeof(a*(b*pinv(b)))
        @test a\(a*b) ≈ (a\a)*b ≈ (pinv(a)*a)*b ≈ pinv(a)*(a*b)
        @test typeof(a\(a*b)) == typeof((a\a)*b) == typeof((pinv(a)*a)*b) == typeof(pinv(a)*(a*b))
    end
    @test (m*c)/c ≈ m*(c/c) ≈ (m*c)*pinv(c) ≈ m*(c*pinv(c))
    @test typeof((m*c)/c) == typeof(m*(c/c)) == typeof((m*c)*pinv(c)) == typeof(m*(c*pinv(c)))
    @test r\(r*m) ≈ (r\r)*m ≈ (pinv(r)*r)*m ≈ pinv(r)*(r*m)
    @test typeof(r\(r*m)) == typeof((r\r)*m) == typeof((pinv(r)*r)*m) == typeof(pinv(r)*(r*m))
end
```

This verifies the consistency I was aiming for. I'll add those tests and some others that directly verify the results (so that they are not consistently wrong...) in the next days.
```diff
@@ -795,6 +795,19 @@ function inv(A::AbstractMatrix{T}) where T
     A_ldiv_B!(factorize(convert(AbstractMatrix{S}, A)), eye(S0, checksquare(A)))
 end
 
+function pinv(v::AbstractVector{T}, tol::Real=real(zero(T))) where T
```
Should probably be `tol::Real=eps(real(float(one(T))))*length(v)` for consistency with `pinv(::StridedMatrix{T})` (and likewise below).
Hm, I was under the impression that `pinv(::StridedMatrix{T}, tol)` ignores all singular values below `tol`, but the threshold is actually `tol * maximum(singular_values)`, so for the vector case, it only matters whether `tol < 1`. So with the default `tol` computed consistently with the matrix case, an all-zero vector would be returned if the input is all-zero, which makes sense, or if `length(v) > 1/eps(real(float(one(T))))`, which strikes me as pretty bizarre.

I think I'd rather keep the `tol=0` default than aim for maximum consistency here. Opinions?
`pinv(::Diagonal)` also uses `tol=0` by default, so there is precedent for inconsistency here. Users striving for consistency across types should probably give `tol` explicitly; otherwise, exploiting information the type conveys to choose a sensible default seems pretty reasonable.
> it only matters whether `tol < 1`

I don't follow the reasoning here.
A vector has one singular value `s`, which obviously is the maximum one. That singular value will be ignored (resulting in a zero vector) if `s <= s*tol`, i.e. if `tol >= 1` or `s == 0`.
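A quick numerical check of that statement against the existing matrix `pinv` (illustrative values; a vector reshaped to an n×1 matrix has the single singular value `s = norm(v)`):

```julia
julia> pinv(reshape([3.0, 4.0], 2, 1))        # s = 5.0 is kept: v' / norm(v)^2
1×2 Array{Float64,2}:
 0.12  0.16

julia> pinv(reshape([3.0, 4.0], 2, 1), 1.0)   # tol >= 1: s <= s*tol, so s is dropped
1×2 Array{Float64,2}:
 0.0  0.0
```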
> A vector has one singular value s, which obviously is the maximum one.

Of course. I can see that the `length(v) > 1/eps(real(float(one(T))))` condition is a bit weird, although it might not be a big practical concern since such a vector would be 32 PB for `Float64`. However, I did some simulations and I'm wondering if using the maximum dimension is the right thing to do. It looks like it is the minimum that determines the error, and using the minimum would fix the bizarre case for vectors.
```diff
@@ -842,10 +855,11 @@ function (\)(A::AbstractMatrix, B::AbstractVecOrMat)
     return qrfact(A,Val(true)) \ B
 end
 
-(\)(a::AbstractVector, b::AbstractArray) = reshape(a, length(a), 1) \ b
+(\)(a::AbstractVector, b::AbstractArray) = pinv(a) * b
```
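The user-visible consequence of this change (my illustration; it matches the commit message below about `\` returning a `RowVector` or a scalar):

```julia
julia> [1.0, 2.0] \ [1.0 2.0; 3.0 4.0]   # vector \ matrix: now a RowVector
1×2 RowVector{Float64,Array{Float64,1}}:
 1.4  2.0

julia> [1.0, 2.0] \ [1.0, 2.0]           # vector \ vector: now a scalar
1.0
```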
But should this use `tol=0` or the default? (Likewise below.)
base/linalg/generic.jl (Outdated)

```julia
function pinv(v::AbstractVector{T}, tol::Real=real(zero(T))) where T
    res = similar(v, typeof(zero(T) / (abs2(one(T)) + abs2(one(T)))))'
    den = sum(abs2, v)
    if den <= tol^2
```
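For context, a self-contained version of the method under review might look like the following. This is a sketch: the truncated hunk above only shows the first lines, so the zero and non-zero branches are my reconstruction, and it is renamed to avoid clashing with `Base.pinv`:

```julia
# Sketch of the vector pseudoinverse under discussion (reconstruction).
function pinv_sketch(v::AbstractVector{T}, tol::Real=real(zero(T))) where T
    # RowVector result; element type is what dividing T by a sum of |T|^2 yields
    res = similar(v, typeof(zero(T) / (abs2(one(T)) + abs2(one(T)))))'
    den = sum(abs2, v)                  # squared 2-norm of v
    if den <= tol^2                     # numerically zero: pinv is the zero row
        fill!(res, zero(eltype(res)))
    else
        for i in eachindex(v)           # pinv(v) = v' / norm(v)^2
            res[i] = conj(v[i]) / den
        end
    end
    return res
end
```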
For consistency with `pinv(::StridedMatrix, tol)`, this should be `iszero(den) || tol >= one(tol)`. Not sure how much sense that makes, though. Opinions?
Did you decide against consistency with the matrix case then? And to stick with the default tolerance above?
Commits:
- Also broaden from `StridedVector` to `AbstractVector` while at it and don't employ full matrix SVD.
- Also start testing consistency between division and multiplication with pseudo-inverse involving vectors.
- Let `\(::AbstractVector, ::AbstractMatrix)` return a `RowVector` and `\(::AbstractVector, ::AbstractVector)` return a scalar.
Force-pushed from 4a46304 to 703f942.
Good from my side. The Travis failure is the good old arnoldi one.
```diff
 (/)(A::AbstractVecOrMat, B::AbstractVecOrMat) = (B' \ A')'
 # \(A::StridedMatrix,x::Number) = inv(A)*x Should be added at some point when the old elementwise version has been deprecated long enough
 # /(x::Number,A::StridedMatrix) = x*inv(A)
+/(x::Number, v::AbstractVector) = x*pinv(v)
```
This changes the result between two versions without any deprecation and has just caught me completely by surprise.

One more piece of evidence for why giving a completely different meaning to operations that used to have elementwise meanings makes no sense.
Which versions? It is an error on 0.6.
Ah, there used to be `/(x::Number, r::Range)`:

```julia
julia> VERSION
v"0.6.0"

julia> 1 / (1:4)
4-element Array{Float64,1}:
 1.0
 0.5
 0.333333
 0.25
```

```julia
julia> VERSION
v"0.7.0-DEV.2158"

julia> 1 / (1:4)
1×4 RowVector{Float64,Array{Float64,1}}:
 0.0333333  0.0666667  0.1  0.133333
```

The old method was removed in #22932 without adding an appropriate deprecation. So I guess we could add a deprecated `/(x::Number, r::Range)` to still do the element-wise operation, which would be more specific than the method defined here. @yuyichao is that the case that was bothering you?
If I may add: when dividing a scalar by a vector, taking the pseudo-inverse of the "vector" and then multiplying it by a constant seems very far from intuitive behavior to me. I think if one is doing this operation and is aware it is taking the pseudoinverse, then explicitly calling `pinv` seems more appropriate, rather than relying on this (in)convenience.
No tests yet; I'd first like feedback on whether we actually want (all of) this.

Demo: […]

Closes JuliaLang/LinearAlgebra.jl#451.