RowVector, pinv, and quasi-division #23067

Merged · 5 commits · Aug 22, 2017
1 change: 0 additions & 1 deletion base/linalg/dense.jl
@@ -895,7 +895,6 @@ function pinv(A::StridedMatrix{T}) where T
tol = eps(real(float(one(T))))*maximum(size(A))
return pinv(A, tol)
end
pinv(a::StridedVector) = pinv(reshape(a, length(a), 1))
function pinv(x::Number)
xi = inv(x)
return ifelse(isfinite(xi), xi, zero(xi))
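For comparison, the scalar method kept in this hunk maps a zero (whose reciprocal is not finite) back to zero. A rough Python sketch of the same rule — the helper name is mine, not part of the PR:

```python
import math

def pinv_scalar(x):
    """Scalar pseudoinverse: 1/x, except that a non-finite reciprocal
    (x == 0, or an infinite x yielding 0 which stays finite) collapses
    to zero, mirroring the ifelse(isfinite(xi), xi, zero(xi)) above."""
    xi = math.inf if x == 0 else 1 / x
    return xi if math.isfinite(xi) else 0.0

print(pinv_scalar(4.0))  # 0.25
print(pinv_scalar(0.0))  # 0.0
```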
16 changes: 15 additions & 1 deletion base/linalg/generic.jl
@@ -795,6 +795,19 @@ function inv(A::AbstractMatrix{T}) where T
A_ldiv_B!(factorize(convert(AbstractMatrix{S}, A)), eye(S0, checksquare(A)))
end

function pinv(v::AbstractVector{T}, tol::Real=real(zero(T))) where T
Member Author:

Should probably be tol::Real=eps(real(float(one(T))))*length(v) for consistency with pinv(::StridedMatrix{T}) (and likewise below)

Member Author:

Hm, I was under the impression that pinv(::StridedMatrix{T}, tol) ignores all singular values below tol, but the threshold is actually tol * maximum(singular_values), so for the vector case, it only matters whether tol < 1. With the default tol computed consistently with the matrix case, an all-zero vector would be returned if the input is all-zero, which makes sense, or if length(v) > 1/eps(real(float(one(T)))), which strikes me as pretty bizarre.

I think I'd rather keep the tol=0 default than aim for maximum consistency here. Opinions?

Member Author:

pinv(::Diagonal) also uses tol=0 by default, so there is precedent for inconsistency here. Users striving for consistency across types should probably give tol explicitly; otherwise, exploiting the information the type conveys to choose a sensible default seems pretty reasonable.

Member:

> it only matters whether tol < 1

I don't follow the reasoning here

Member Author:

A vector has one singular value s, which obviously is the maximum one. That singular value will be ignored (resulting in a zero vector) if s <= s*tol, i.e. tol >= 1 or s = 0.
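This is easy to check numerically; in NumPy's pinv the rcond parameter plays the role of the relative tol (an illustration only, not the Julia code under review):

```python
import numpy as np

v = np.array([[3.0], [4.0]])                 # one singular value, sigma = ||v|| = 5
sigma = np.linalg.svd(v, compute_uv=False)
assert np.isclose(sigma[0], 5.0)

# For any relative cutoff below 1 the singular value survives, and
# pinv(v) is v' / ||v||^2 ...
assert np.allclose(np.linalg.pinv(v, rcond=0.99), v.T / 25.0)
# ... while at rcond >= 1 the lone singular value is dropped, giving zero.
assert np.allclose(np.linalg.pinv(v, rcond=1.0), 0.0)
```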

Member:

> A vector has one singular value s, which obviously is the maximum one.

Of course.

I can see that the length(v) > 1/eps(real(float(one(T)))) condition is a bit weird although it might not be a big practical concern since it would be 32 PB for Float64. However, I did some simulations and I'm wondering if using the maximum length is the right thing to do. It looks like it is the minimum that determines the error and using the minimum would fix the bizarre case for vectors.

res = similar(v, typeof(zero(T) / (abs2(one(T)) + abs2(one(T)))))'
den = sum(abs2, v)
# as tol is the threshold relative to the maximum singular value, for a vector with
# single singular value σ=√den, σ ≦ tol*σ is equivalent to den=0 ∨ tol≥1
if iszero(den) || tol >= one(tol)
fill!(res, zero(eltype(res)))
else
res .= v' ./ den
end
return res
end
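A rough NumPy transcription of the branch above — the helper name and the simplified result type are mine, not part of the PR:

```python
import numpy as np

def pinv_vector(v, tol=0.0):
    """Pseudoinverse of a 1-D vector v, returned as a 1xN row, mirroring
    the Julia code above: zero if v is all-zero or tol >= 1 (the single
    singular value sqrt(den) is then below the relative cutoff), else
    conj(v)' / ||v||^2."""
    den = np.sum(np.abs(v) ** 2)
    if den == 0 or tol >= 1:
        return np.zeros((1, v.size))
    return np.conj(v).reshape(1, -1) / den

v = np.array([1.0, 2.0, 2.0])                # ||v||^2 = 9
p = pinv_vector(v)
assert np.allclose(p, [[1 / 9, 2 / 9, 2 / 9]])
assert np.isclose(p @ v, 1.0)                # pinv(v) * v is the 1x1 identity
assert np.allclose(pinv_vector(np.zeros(3)), 0.0)
```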

"""
\\(A, B)

@@ -842,10 +855,11 @@ function (\)(A::AbstractMatrix, B::AbstractVecOrMat)
return qrfact(A,Val(true)) \ B
end

(\)(a::AbstractVector, b::AbstractArray) = reshape(a, length(a), 1) \ b
(\)(a::AbstractVector, b::AbstractArray) = pinv(a) * b
Member Author:

But should this use tol=0 or the default? (Likewise below)

(/)(A::AbstractVecOrMat, B::AbstractVecOrMat) = (B' \ A')'
# \(A::StridedMatrix,x::Number) = inv(A)*x Should be added at some point when the old elementwise version has been deprecated long enough
# /(x::Number,A::StridedMatrix) = x*inv(A)
/(x::Number, v::AbstractVector) = x*pinv(v)
Contributor:

This changes the result between two versions without any deprecation and has just caught me completely by surprise.

One more piece of evidence that giving a completely different meaning to operations that used to have elementwise meanings makes no sense.

Member:

Which versions? It is an error on 0.6.

Member Author:

Ah, there used to be /(x::Number, r::Range):

julia> VERSION
v"0.6.0"

julia> 1 / (1:4)
4-element Array{Float64,1}:
 1.0     
 0.5     
 0.333333
 0.25
julia> VERSION
v"0.7.0-DEV.2158"

julia> 1 / (1:4)
1×4 RowVector{Float64,Array{Float64,1}}:
 0.0333333  0.0666667  0.1  0.133333

The old method was removed in #22932 without adding an appropriate deprecation. So I guess we could add a deprecated /(x::Number, r::Range) to still do the element-wise operation, which would be more specific than the method defined here. @yuyichao is that the case that was bothering you?

Contributor:

If I may add: when dividing a scalar by a vector, taking the pseudoinverse of the "vector" and then multiplying it by a constant seems very far from intuitive behavior to me. If one is doing this operation and is aware it takes the pseudoinverse, then explicitly calling pinv seems more appropriate than relying on this (in)convenience.
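For reference, the semantics being debated: x / v = x * pinv(v) yields the least-squares row y with y * v = x, quite different from the old elementwise Range behavior. In NumPy terms (an illustration only):

```python
import numpy as np

v = np.array([[1.0], [2.0], [3.0]])   # column vector, ||v||^2 = 14
x = 7.0
y = x * np.linalg.pinv(v)             # the proposed x / v: a 1x3 row

assert np.allclose(y, [[0.5, 1.0, 1.5]])
assert np.isclose((y @ v).item(), x)  # y solves y * v = x
# Contrast with the removed elementwise meaning, x ./ v:
assert np.allclose(x / v.ravel(), [7.0, 3.5, 7 / 3])
```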


cond(x::Number) = x == 0 ? Inf : 1.0
cond(x::Number, p) = cond(x)
4 changes: 4 additions & 0 deletions base/linalg/rowvector.jl
@@ -227,8 +227,12 @@
Ac_mul_B(vec::AbstractVector, rowvec::RowVector) = throw(DimensionMismatch("Cannot multiply two transposed vectors"))
@inline Ac_mul_B(vec1::AbstractVector, vec2::AbstractVector) = ctranspose(vec1)*vec2

# Pseudo-inverse
pinv(v::RowVector, tol::Real=0) = pinv(v', tol)'

# Left Division #

\(rowvec1::RowVector, rowvec2::RowVector) = pinv(rowvec1) * rowvec2
\(mat::AbstractMatrix, rowvec::RowVector) = throw(DimensionMismatch("Cannot left-divide transposed vector by matrix"))
At_ldiv_B(mat::AbstractMatrix, rowvec::RowVector) = throw(DimensionMismatch("Cannot left-divide transposed vector by matrix"))
Ac_ldiv_B(mat::AbstractMatrix, rowvec::RowVector) = throw(DimensionMismatch("Cannot left-divide transposed vector by matrix"))
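The adjoint rule added for RowVector above says the pseudoinverse of a row is a column. Numerically (NumPy illustration, not the Julia method itself):

```python
import numpy as np

r = np.array([[1.0, 2.0, 2.0]])       # 1x3 row, ||r||^2 = 9
pr = np.linalg.pinv(r)                # 3x1 column

# pinv(r) == pinv(r')': the conjugate transpose over the squared norm ...
assert np.allclose(pr, r.conj().T / 9.0)
# ... so r * pinv(r) is the 1x1 identity (r has full row rank 1).
assert np.allclose(r @ pr, 1.0)
```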
39 changes: 38 additions & 1 deletion test/linalg/dense.jl
@@ -76,7 +76,7 @@ bimg = randn(n,2)/2
@test_throws DimensionMismatch b'\b
@test_throws DimensionMismatch b\b'
@test norm(a*x - b, 1)/norm(b) < ε*κ*n*2 # Ad hoc, revisit!
@test zeros(eltya,n)\ones(eltya,n) ≈ zeros(eltya,n,1)\ones(eltya,n,1)
@test zeros(eltya,n)\ones(eltya,n) ≈ (zeros(eltya,n,1)\ones(eltya,n,1))[1,1]
end

@testset "Test nullspace" begin
@@ -629,6 +629,43 @@ end
end
end

function test_rdiv_pinv_consistency(a, b)
@test (a*b)/b ≈ a*(b/b) ≈ (a*b)*pinv(b) ≈ a*(b*pinv(b))
@test typeof((a*b)/b) == typeof(a*(b/b)) == typeof((a*b)*pinv(b)) == typeof(a*(b*pinv(b)))
end
function test_ldiv_pinv_consistency(a, b)
@test a\(a*b) ≈ (a\a)*b ≈ (pinv(a)*a)*b ≈ pinv(a)*(a*b)
@test typeof(a\(a*b)) == typeof((a\a)*b) == typeof((pinv(a)*a)*b) == typeof(pinv(a)*(a*b))
end
function test_div_pinv_consistency(a, b)
test_rdiv_pinv_consistency(a, b)
test_ldiv_pinv_consistency(a, b)
end
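What these helpers check, shown in NumPy form for the inner-product case of a row times a column (an illustration of the underlying identities, not the Julia tests):

```python
import numpy as np

rng = np.random.default_rng(42)
r = rng.standard_normal((1, 5))       # "a": row vector
c = rng.standard_normal((5, 1))       # "b": column vector
pc = np.linalg.pinv(c)

# (a*b)/b == a*(b/b): the two bracketings of (r @ c) @ pinv(c) agree ...
assert np.allclose((r @ c) @ pc, r @ (c @ pc))
# ... because b/b = c @ pinv(c) is the orthogonal projector onto span(c):
P = c @ pc
assert np.allclose(P @ P, P)          # idempotent
assert np.allclose(P @ c, c)          # fixes c itself
```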

@testset "/ and \\ consistency with pinv for vectors" begin
@testset "Tests for type $elty" for elty in (Float32, Float64, Complex64, Complex128)
c = rand(elty, 5)
r = rand(elty, 5)'
cm = rand(elty, 5, 1)
rm = rand(elty, 1, 5)
@testset "inner products" begin
test_div_pinv_consistency(r, c)
test_div_pinv_consistency(rm, c)
test_div_pinv_consistency(r, cm)
test_div_pinv_consistency(rm, cm)
end
@testset "outer products" begin
test_div_pinv_consistency(c, r)
test_div_pinv_consistency(cm, rm)
end
@testset "matrix/vector" begin
m = rand(5, 5)
test_ldiv_pinv_consistency(m, c)
test_rdiv_pinv_consistency(r, m)
end
end
end

@testset "test ops on Numbers for $elty" for elty in [Float32,Float64,Complex64,Complex128]
a = rand(elty)
@test expm(a) == exp(a)
16 changes: 16 additions & 0 deletions test/linalg/pinv.jl
@@ -129,6 +129,18 @@ end
a = onediag_sparse(eltya, m)
test_pinv(a, m, m, default_tol, default_tol, default_tol)
end
@testset "Vector" begin
a = rand(eltya, m)
apinv = @inferred pinv(a)
@test pinv(hcat(a)) ≈ apinv
@test apinv isa RowVector{eltya}
end
@testset "RowVector" begin
a = rand(eltya, m)'
apinv = @inferred pinv(a)
@test pinv(vcat(a)) ≈ apinv
@test apinv isa Vector{eltya}
end
end
end

@@ -141,6 +153,10 @@ end
@test a[1] ≈ 0.0
@test a[2] ≈ 0.0

a = pinv([zero(eltya); zero(eltya)]')
@test a[1] ≈ 0.0
@test a[2] ≈ 0.0

a = pinv(Diagonal([zero(eltya); zero(eltya)]))
@test a.diag[1] ≈ 0.0
@test a.diag[2] ≈ 0.0