More efficient and less intrusive quadform
#444
Conversation
Codecov Report
```diff
@@            Coverage Diff            @@
##           master     #444   +/-   ##
=======================================
  Coverage   92.24%   92.24%
=======================================
  Files          83       83
  Lines        5092     5096       +4
=======================================
+ Hits         4697     4701       +4
  Misses        395      395
```
Continue to review full report at Codecov.
The `isposdef` changes mean we attempt a Cholesky factorization of `A`, then if it fails we attempt a Cholesky factorization of `-A`. So in the negative-definite case, I wonder if this is slower than calculating the eigenvalues once.
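For reference, a minimal sketch of the pattern being described (a hypothetical helper, not the PR's exact code):

```julia
# Sketch only: isposdef attempts a Cholesky factorization under the hood,
# so the sign of a definite matrix can be determined without eigenvalues.
using LinearAlgebra

function definiteness_sign(A::AbstractMatrix)
    isposdef(A) && return 1    # Cholesky of A succeeded: positive definite
    isposdef(-A) && return -1  # Cholesky of -A succeeded: negative definite
    error("matrix is neither positive nor negative definite")
end
```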
For the `Hermitian` change, I would expect it to be `Symmetric`, because we do the `issymmetric` check above (and the old code did use `Symmetric`). But from the Discourse thread I see that using `Hermitian` instead avoids a missing method in GenericLinearAlgebra. Ideally that should be fixed there, but I think it's fine to have this workaround here as long as it behaves the same.
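As a quick sanity check of the "behaves the same" point (my own illustration, not from the PR): for real matrices the two wrappers describe the same matrix.

```julia
using LinearAlgebra

A = [1.0 2.0; 2.0 50.0]
Hermitian(A) == Symmetric(A)  # true: for real entries the wrappers agree
```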
Looking at this, I think it was written for real matrices, but the reformulation here should actually hold for complex matrices too, since it is just `||sqrt(A)*x||^2 = <sqrt(A)*x, sqrt(A)*x> = <x, A*x>`, which holds when `sqrt(A)` is self-adjoint (i.e. Hermitian), which is what we get from `sqrt` with a Hermitian input.
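Spelling that identity out (just a restatement of the line above in LaTeX, with B denoting the square root):

```latex
% With B = \sqrt{A} self-adjoint (B^* = B) and B^2 = A:
\[
  \|Bx\|^2
    = \langle Bx,\, Bx\rangle
    = \langle x,\, B^{*}Bx\rangle
    = \langle x,\, B^{2}x\rangle
    = \langle x,\, Ax\rangle .
\]
```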
We need to drop the `real` for that to work, but I think the `real` is not needed anyway; e.g.

```julia
julia> A = [1 2; 2 50]
2×2 Matrix{Int64}:
 1   2
 2  50

julia> isposdef(A)
true

julia> sqrt(Hermitian(A))
2×2 Symmetric{Float64, Matrix{Float64}}:
 0.968528  0.248904
 0.248904  7.06669
```
gives us a real matrix anyway (even on Julia 1.0). So I've suggested changes to that effect. We should also add a test for `quadform` with complex Hermitian `A`, but I can do that in a separate PR if you don't want to do it here.
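A quick numerical check of that identity for a complex Hermitian matrix (a sketch of mine; the values are illustrative, not from the PR):

```julia
using LinearAlgebra

A = Hermitian([2.0 1.0+1.0im; 1.0-1.0im 3.0])  # complex Hermitian, posdef
x = randn(ComplexF64, 2)
B = sqrt(A)                                    # self-adjoint square root
# <x, A*x> should equal ||sqrt(A)*x||^2 up to rounding
@assert isapprox(real(dot(x, A * x)), norm(B * x)^2; rtol = 1e-10)
```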
Co-authored-by: Eric Hanson <5846501+ericphanson@users.noreply.github.com>
Let me try to add the test, if you do not mind! For complex inputs, a Hermitian and positive semidefinite matrix will be enough to ensure convexity. For real inputs, a Hermitian matrix is a symmetric matrix. In fact, what we did is not equivalent to what was there previously: for complex inputs, Hermitian matrices are not forced to be symmetric, so this new case is now allowed. We should make a test out of it. See you Monday for more ;)
Ok, great! Check out the ProblemDepot docs if you haven't already, to get an idea of how the tests work for Convex.jl, and let me know if you run into any difficulties or have any questions.
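For what it's worth, a hedged sketch of the kind of test under discussion (illustrative only: the solver choice and exact calls here are assumptions, and the real test should go through ProblemDepot):

```julia
using Convex, SCS, LinearAlgebra

A = Hermitian([2.0 1.0im; -1.0im 2.0])  # complex Hermitian, positive definite
x = ComplexVariable(2)
# Fix x so the optimal value is just x' * A * x, which we can check by hand.
problem = minimize(quadform(x, A), [x == [1.0 + 0im, 1.0 + 0im]])
solve!(problem, SCS.Optimizer)
@assert isapprox(problem.optval, 4.0; atol = 1e-3)  # [1,1]' * A * [1,1] == 4
```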
So according to https://discourse.julialang.org/t/regresion-eigvals-of-bigfloat-symmetric-matrices-does-not-work-anymore/63947/2?u=lrnv the problem is getting fixed at the source, in GenericLinearAlgebra. Do you still want to make this change, as the extension of the current method to complex Hermitian matrices?
And a test was added. Tell me what you think.
I think this is a good improvement still :). The code was incorrect in the complex case before, though luckily it seems you would hit a `MethodError` if you tried to use it with a complex matrix:

```
ERROR: MethodError: no method matching eigvals!(::LinearAlgebra.Symmetric{ComplexF64, Matrix{ComplexF64}})
```

Now it is correct in the complex case. One thing I wondered is if we should have a separate code path for real + symmetric vs complex + hermitian, but it seems this code works correctly without any overhead in the real-symmetric case (since `Hermitian` and `Symmetric` agree for real matrices).

The last concern I had was the possible cost of calling two `isposdef`s (up to two Cholesky factorizations) instead of one `eigvals` call, but `isposdef` turns out to be much cheaper here:

```julia
julia> A = randn(10,10);
julia> A = transpose(A)*A;
julia> @benchmark isposdef($A)
BenchmarkTools.Trial: 10000 samples with 182 evaluations.
Range (min … max): 580.126 ns … 7.092 μs ┊ GC (min … max): 0.00% … 89.98%
Time (median): 595.698 ns ┊ GC (median): 0.00%
Time (mean ± σ): 620.802 ns ± 378.640 ns ┊ GC (mean ± σ): 3.65% ± 5.44%
▄▅▇█▄▃▁▁
▂▂▂▂▂▂▂▄▆████████▅▅▄▃▃▃▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂ ▃
580 ns Histogram: frequency by time 654 ns <
Memory estimate: 944 bytes, allocs estimate: 3.
julia> @benchmark eigvals($A)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 17.583 μs … 2.496 ms ┊ GC (min … max): 0.00% … 97.86%
Time (median): 20.792 μs ┊ GC (median): 0.00%
Time (mean ± σ): 23.355 μs ± 25.816 μs ┊ GC (mean ± σ): 1.05% ± 0.98%
▂▅█▇▄▁▂▃▁ ▁▁▁▁▂▁▁▁▁ ▂
▇███████████▇▇▇▇▇▇▇▆▆▆▅▆▅▄▄▄▅▄▆▄▁▅▅▄▅▆▇▇██████████▇▇▇▆▅▅▆▅▆ █
17.6 μs Histogram: log(frequency) by time 49.5 μs <
Memory estimate: 5.25 KiB, allocs estimate: 10.
julia> A = randn(100,100);
julia> A = transpose(A)*A;
julia> @benchmark isposdef($A)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 47.459 μs … 1.550 ms ┊ GC (min … max): 0.00% … 86.47%
Time (median): 100.167 μs ┊ GC (median): 0.00%
Time (mean ± σ): 109.967 μs ± 51.644 μs ┊ GC (mean ± σ): 1.34% ± 3.62%
▃▄▃ ▁▆█▄
▁▁▂▃▂▅█████▄▃▃▄████▇▅▅▆▅▅▆▆▆▇▆▇▅▆▅▅▄▄▄▃▄▄▄▆▅▅▅▅▄▄▃▃▂▂▂▁▁▁▁▁▁ ▃
47.5 μs Histogram: frequency by time 197 μs <
Memory estimate: 78.25 KiB, allocs estimate: 4.
julia> @benchmark eigvals($A)
BenchmarkTools.Trial: 2522 samples with 1 evaluation.
Range (min … max): 1.137 ms … 28.356 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.734 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.978 ms ± 822.023 μs ┊ GC (mean ± σ): 0.12% ± 1.73%
▃ ▂█
▆█▄▃▃▂▃▃▄▅██▇▅▄▄▄▃▄▃▄▅▄▄▃▃▃▃▃▃▃▃▃▃▄▄▄▃▃▂▂▂▂▃▂▂▂▂▂▂▂▂▂▂▁▂▁▂▂ ▃
1.14 ms Histogram: frequency by time 3.98 ms <
Memory estimate: 115.81 KiB, allocs estimate: 11.
julia> A = randn(100,100);
julia> A = transpose(A)*A;
julia> @benchmark isposdef($(-A))
BenchmarkTools.Trial: 10000 samples with 3 evaluations.
Range (min … max): 8.194 μs … 331.472 μs ┊ GC (min … max): 0.00% … 76.09%
Time (median): 9.514 μs ┊ GC (median): 0.00%
Time (mean ± σ): 10.915 μs ± 19.037 μs ┊ GC (mean ± σ): 12.25% ± 6.76%
▂▃▄▅▄█▇█▆▄▂▁
▂▁▁▁▁▁▁▁▂▁▁▁▂▁▂▂▂▂▂▂▃▃▄▅▆▇█████████████▆▆▆▅▄▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂ ▄
8.19 μs Histogram: frequency by time 10.6 μs <
Memory estimate: 78.25 KiB, allocs estimate: 4.
julia> @benchmark eigvals($(-A))
BenchmarkTools.Trial: 2853 samples with 1 evaluation.
Range (min … max): 1.144 ms … 7.324 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.661 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.749 ms ± 519.241 μs ┊ GC (mean ± σ): 0.14% ± 1.98%
█▁ ▄▄
▅██▃▂▁▁▁▁▁▂▃▅▄▄▅██▆▃▃▃▂▂▂▂▂▂▂▂▂▂▃▃▂▂▁▁▂▂▁▁▁▂▁▁▁▂▂▂▂▂▂▃▂▂▂▂▂ ▂
1.14 ms Histogram: frequency by time 2.96 ms <
Memory estimate: 115.81 KiB, allocs estimate: 11.
```
Co-authored-by: Eric Hanson <5846501+ericphanson@users.noreply.github.com>
The gain of time from `isposdef` is impressive. Thanks for fixing the tests! Let's see if the CI passes.
Unrelated side-note: when did BenchmarkTools start printing these fancy histograms?
The new printing came from JuliaCI/BenchmarkTools.jl#217, which is in BenchmarkTools v1.1. Super cool, right?
The nightly failures are weird but unrelated, so I’ll merge this. Thanks @lrnv!
This follows this thread on Discourse: https://discourse.julialang.org/t/regresion-eigvals-of-bigfloat-symmetric-matrices-does-not-work-anymore/63947

After understanding that the algorithm did not need to call `eigvals`, I tweaked it here. Since `A` is supposed to be a matrix of values, this should be equivalent while still working for weird types.
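To illustrate the "weird types" point (my own example, based on the linked thread): the generic Cholesky factorization works for `BigFloat` out of the box, whereas dense `eigvals` needs GenericLinearAlgebra.

```julia
using LinearAlgebra

A = BigFloat[2 1; 1 2]
isposdef(A)              # true; generic Cholesky handles BigFloat fine
# eigvals(Symmetric(A))  # by contrast, this needs GenericLinearAlgebra loaded
```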