
*syevr eigenvalues depend on whether eigenvectors are computed or not #997

Closed
ev-br opened this issue Mar 19, 2024 · 11 comments


ev-br commented Mar 19, 2024

Naively, I'd expect that the computed eigenvalues are exactly the same whether eigenvectors are computed or not. However, for the symmetric dsyevr solver this does not seem to be the case. The difference is "small", but it is very visible, e.g., when an eigenvalue is zero in exact arithmetic. Consider the following matrix

 1   -1    0
-1    2   -1
 0   -1    1

with the exact eigenvalues 0, 1, and 3.

Using dsyevr (the driver is below the fold) produces

 jobz = "V" info =            0
 w =    2.6645352591003757E-015   1.0000000000000018        3.0000000000000000     
 jobz = "N" info =            0
 w =    3.9250536344271737E-017  0.99999999999999978        3.0000000000000000  

Note the difference in the "zero" eigenvalue.

The question is whether my expectation (jobz='N' is exactly equivalent to "run with jobz='V', discard eigenvectors") is too strong, or whether this is a potentially fixable defect.

The origin of this question is scipy/scipy#12515

$ cat syevr.f90 
implicit none

integer, parameter :: n = 3
integer, parameter :: lda = n
real*8 :: a(lda, n)
real*8 :: abstol
real*8 :: vl, vu
integer :: il, iu
character*1 :: jobz

! outputs
integer :: m
integer :: ldz = n
real*8 :: w(1:n)
real*8 :: z(n, n)
integer :: isuppz(1:2*n)

integer :: lwork = 100
real*8 :: work(1:100)

integer :: liwork = 100
integer :: iwork(1:100)
integer :: info

external :: dsyevr

a(1, :) = (/ 1.0d0, -1.0d0, 0.0d0 /)
a(2, :) = (/ -1.0d0, 2.0d0, -1.0d0 /)
a(3, :) = (/ 0.0d0, -1.0d0, 1.0d0 /)

! not referenced when range='A'
vl = 0.d0
vu = 3.d0

il = 1
iu = n

abstol = 0.0d0

jobz = 'V'
call dsyevr(jobz, 'A', 'U', n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work, lwork, iwork, liwork, info) 
print*, "jobz = ", jobz, "info = ", info
print*, "w = ", w

jobz = 'N'
call dsyevr(jobz, 'A', 'U', n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work, lwork, iwork, liwork, info) 
print*, "jobz = ", jobz, "info = ", info
print*, "w = ", w

end

$ gfortran syevr.f90 -lblas -llapack
$ ./a.out 
 jobz = Vinfo =            0
 w =    2.6645352591003757E-015   1.0000000000000018        3.0000000000000000     
 jobz = Ninfo =            0
 w =    3.9250536344271737E-017  0.99999999999999978        3.0000000000000000 

with

$ gfortran --version
GNU Fortran (conda-forge gcc 12.3.0-5) 12.3.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

and

$ mamba list |grep lapack
lapack                    3.9.0                    netlib    conda-forge
liblapack                 3.9.0           5_h92ddd45_netlib    conda-forge
liblapacke                3.9.0           5_h92ddd45_netlib    conda-forge

langou commented Mar 19, 2024

SYEVR calls STERF when eigenvalues only are requested (jobz='N')

CALL DSTERF( N, W, WORK( INDEE ), INFO )

SYEVR calls STEMR when eigenvalues and eigenvectors are requested (jobz='V')

CALL DSTEMR( JOBZ, 'A', N, WORK( INDDD ), WORK( INDEE ),

So the fact that SYEVR does not return exactly the same eigenvalues whether jobz is 'V' or 'N' is to be expected.

Maybe use a slightly higher rcond in pinvh so that you filter eigenvalues more consistently, independent of small variations in the small eigenvalues.

I can ask around for opinions on whether we can make SYEVR return the same eigenvalues whether jobz is 'V' or 'N'. I understand that this would be a desirable feature.

STEMR has been designed to compute eigenvectors. STERF is a perfectly good and fast algorithm for computing eigenvalues; if you then want eigenvectors, you might want to switch to STEMR. STEMR should be faster than STERF (when eigenvectors are desired), and STEMR is also much more parallelizable than STERF. When eigenvectors are sought, STEMR uses a different algorithm than STERF to compute the eigenvalues, so the eigenvalues returned by STEMR are slightly different from those returned by STERF.
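
For illustration (not part of the original report): the 3x3 matrix above is already tridiagonal, so its diagonal and off-diagonal can be fed straight to DSTERF, the routine used on the jobz='N' path (the jobz='V' path hands the same data to DSTEMR instead). A minimal sketch, assuming a reference LAPACK is linked; the exact digits will depend on the build:

implicit none
integer, parameter :: n = 3
real*8 :: d(n), e(n-1)
integer :: info
external :: dsterf

! tridiagonal data of the matrix from the report
d = (/ 1.0d0, 2.0d0, 1.0d0 /)   ! diagonal of T
e = (/ -1.0d0, -1.0d0 /)        ! off-diagonal of T

! eigenvalues only; on exit d holds them in ascending order
call dsterf(n, d, e, info)
print*, "info = ", info
print*, "w = ", d
end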


thijssteel commented Mar 20, 2024

This is not the first time I have heard of confusion around different results. The SVD can exhibit similar behavior. A professor of mine once even told me about a "bug" in MATLAB, where cond(A) could give a different result than [U, S, V] = svd(A); S(1,1)/S(n,n).

I think a good fix would be to say explicitly in the documentation that there may be differences. Even if this documentation is not propagated downstream to SciPy and MATLAB (which I think is OK), it could help library maintainers explain weird behavior.

It could also be useful to explicitly state that the result should be the same in the routines where this is the case. I think all linear factorizations (LU, QR, ...) and the nonsymmetric eigenvalue factorizations should satisfy that requirement.


christoph-conrads commented Mar 20, 2024

I can ask around for opinions on whether we can make SYEVR return the same eigenvalues whether jobz is 'V' or 'N'. I understand that this would be a desirable feature.

I suggest keeping the current behavior and making it obvious to developers without a numerical linear algebra background that this may be the case. Their expectation of computing the same eigenvalues for the same matrix is perfectly reasonable.

I think a good fix would be to explicitly say that there may be differences in the documentation.

+1

Here is a draft:

diff --git a/SRC/dsyevr.f b/SRC/dsyevr.f
index 8647b0162..7cb0d0229 100644
--- a/SRC/dsyevr.f
+++ b/SRC/dsyevr.f
@@ -39,9 +39,15 @@
 *> \verbatim
 *>
 *> DSYEVR computes selected eigenvalues and, optionally, eigenvectors
-*> of a real symmetric matrix A.  Eigenvalues and eigenvectors can be
+*> of a real symmetric matrix A. Eigenvalues and eigenvectors can be
 *> selected by specifying either a range of values or a range of
-*> indices for the desired eigenvalues.
+*> indices for the desired eigenvalues. Invokations with different
+*> choices for these parameters may result in the computation of
+*> slightly different eigenvalues and eigenvectors for the same matrix.
+*> The reason for this behavior is that there exists a variety of
+*> algorithms -- each performing best for a particular set of options --
+*> with DSYEVR selecting the best. In all cases, the computed values are
+*> accurate within the limits of finite precision arithmetic. 
 *>
 *> DSYEVR first reduces the matrix A to tridiagonal form T with a call
 *> to DSYTRD.  Then, whenever possible, DSYEVR calls DSTEMR to compute
@@ -105,6 +111,9 @@
 *>          JOBZ is CHARACTER*1
 *>          = 'N':  Compute eigenvalues only;
 *>          = 'V':  Compute eigenvalues and eigenvectors.
+*>
+*>          This parameter influences the chosen algorithm and may
+*>          influence the computed values.
 *> \endverbatim
 *>
 *> \param[in] RANGE
@@ -116,6 +125,9 @@
 *>          = 'I': the IL-th through IU-th eigenvalues will be found.
 *>          For RANGE = 'V' or 'I' and IU - IL < N - 1, DSTEBZ and
 *>          DSTEIN are called
+*>
+*>          This parameter influences the chosen algorithm and may
+*>          change the computed values.
 *> \endverbatim
 *>
 *> \param[in] UPLO


ev-br commented Mar 20, 2024

Maybe use a slightly higher rcond in pinvh so that you filter eigenvalues more consistently, independent of small variations in the small eigenvalues.

Of course, scipy's pinvh issue can and will be worked around in SciPy.

An update to the LAPACK documentation would indeed be great! This fact is not widely known even among people who are otherwise not complete novices in numerical analysis.

While at it, it may be worth expanding a somewhat cryptic note --- https://netlib.org/lapack/explore-html-3.6.1/d2/d8a/group__double_s_yeigen_ga2ad9f4a91cddbf67fe41b621bd158f5c.html:

    If the eigenvectors are desired, the algorithm attains full
    accuracy of the computed eigenvalues only right before
    the corresponding vectors have to be computed, see steps c) and d).

Either way, I'll see to adding a note to the SciPy docs.

One possibly non-doc issue is that the SYEVR result for the zero eigenvalue of the OP matrix is ever so slightly larger than the machine epsilon (2.66E-015 vs 2.22e-15). While the very fact that eigenvalues are computed to almost machine epsilon is remarkable indeed, and it may not be realistic to ask them to be strictly below the machine epsilon, I wonder if there's a relatively simple tweak to the iteration to push it down below epsilon.

@christoph-conrads

While the very fact that eigenvalues are computed to almost machine epsilon is remarkable indeed,

Just to avoid potential miscommunication here: in general the accuracy is bounded by the machine epsilon times the largest singular value (for symmetric matrices, the singular values are the moduli of the eigenvalues). This means, for example, that multiplying the matrix by 1000 will cause the error bounds to grow by a factor of 1000.
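
For the matrix in the original report, a back-of-the-envelope version of this bound (assuming the usual form c * eps * ||A||_2 with a modest constant c depending on n):

  ||A||_2 = 3 (the largest eigenvalue), eps ~ 2.22e-16
  absolute error bound ~ c * 3 * 2.22e-16 ~ c * 6.7e-16

so the computed "zero" eigenvalue 2.66e-15 is within a small constant factor (about 4) of eps * ||A||_2, and there is no particular reason to expect it to land below eps itself.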


langou commented Mar 21, 2024

Thanks @christoph-conrads.

Below are some slight edits.

*> DSYEVR computes selected eigenvalues and, optionally, eigenvectors
*> of a real symmetric matrix A. Eigenvalues and eigenvectors can be
*> selected by specifying either a range of values or a range of
*> indices for the desired eigenvalues. Invocations with different
*> choices for these parameters may result in the computation of
*> slightly different eigenvalues and/or eigenvectors for the same 
*> matrix. The reason for this behavior is that there exists a variety
*> of algorithms -- each performing best for a particular set of options
*> -- with DSYEVR attempting to select the best based on the various
*> parameters. In all cases, the computed values are accurate within the
*> limits of finite precision arithmetic. 
*>          This parameter influences the choice of the algorithm and 
*>          may influence the computed values.

@christoph-conrads: It would be great if you could submit a PR with these suggested changes.


langou commented Mar 21, 2024

Side comment: in retrospect, I wish that SYEVR/HEEVR only offered to compute "eigenvalues and eigenvectors". I think it is not a good idea to offer an "eigenvalues only" option and then call a completely different algorithm because that algorithm is better for that case.


langou commented Mar 21, 2024

Following up on @christoph-conrads' comment: "This means, for example, that multiplying the matrix by 1000 will cause the error bounds to grow by a factor of 1000." This is absolutely correct. We are speaking here about absolute error bounds.

Expanding on this, it also means that "multiplying the largest eigenvalue of the matrix by 1000 and leaving the others constant" will cause the relative error bounds of the n-1 smallest eigenvalues to grow by a factor of 1000. (Whereas the relative error bound of the largest eigenvalue would be left unchanged.)

And also "multiplying the matrix by 1000" should not change the relative error bounds.
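
A rough numeric illustration of these statements, using the eigenvalue near 1 of the matrix from the original report and assuming an absolute error of about eps * ||A||_2 per eigenvalue:

  original matrix:                   ||A||_2 = 3,    abs. error ~ 3*eps,    rel. error of the eigenvalue 1 ~ 3*eps
  largest eigenvalue scaled to 3000: ||A||_2 = 3000, abs. error ~ 3000*eps, rel. error of the eigenvalue 1 ~ 3000*eps (grows by 1000)
  whole matrix scaled by 1000:       ||A||_2 = 3000, abs. error ~ 3000*eps, the eigenvalue is now 1000, so rel. error ~ 3*eps (unchanged)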


langou commented Mar 21, 2024

While at it, maybe worth expanding a somewhat cryptic note --- "If the eigenvectors are desired, the algorithm attains full accuracy of the computed eigenvalues only right before the corresponding vectors have to be computed, see steps c) and d)."

"While the very fact that eigenvalues are computed to almost machine epsilon is remarkable indeed,"

Maybe it would be best to remove these comments from the source files. Let me know your opinion.

I agree that the statement "the algorithm attains full accuracy of the computed eigenvalues" is misleading.

@christoph-conrads is entirely correct on what to expect from SYEVR. See his comment above.

The comment "the algorithm attains full accuracy of the computed eigenvalues" do not relate to the accuracy of the computed eigenvalues with respect to the user's problem, (initial matrix A,) but to the local representation of the problem. The algorithm used here is the MRRR algorithm. The MRRR algorithm uses T - sigma I = L D L^T factorization (also known as Relatively Robust Representations) to represent the eigenproblem. In this representation, L and D define all the wanted eigenvalues to high relative accuracy. This means that small relative changes in the entries of D and L cause only small relative changes in the eigenvalues and eigenvectors. The standard (unfactored) representation of the tridiagonal matrix T does not have this property in general. However, while we have high relative accuracy with the L and D matrix eigenvalue problem, the L and D matrix eigenvalue problem is not related to the initial eigenvalue problem with high relative accuracy.


langou commented Mar 21, 2024

For SYEV, I think that for both MRRR (SYEVR) and DD (SYEVD) it would be possible to run the eigenvalue computation without computing the eigenvectors. Similarly for the SVD with DD (GESDD).

This would be a mess to do, but I think it would be possible. It would be similar to what the QR algorithm does, where you compute the eigenvalues and then decide whether or not you want the eigenvectors.

I am not a fan of MRRR and DD having an option "do you want eigenvalues only?" and then, if the answer is "yes, eigenvalues only", bailing out to QR.

My preference would be to
(a) either not have the option "eigenvalues only" for MRRR and DD: MRRR and DD would only compute "eigenvalues and eigenvectors", and there would be no "eigenvalues only" option;
(b) or have a true "eigenvalues only" option that guarantees bit-for-bit reproducibility of the eigenvalue computation in the two cases "eigenvalues only" and "eigenvalues and eigenvectors".

Option (a) does not feel like an option any longer because of backward compatibility.

Option (b) is quite complicated to implement and I am not even sure whether it is possible. For MRRR, there are some ramifications of Gram-Schmidt and re-orthogonalization that come in, and I do not know whether these change the eigenvalues when they kick in. Etc.

There are other flags (in addition to JOBZ) that can change the eigenvalues, such as RANGE in MRRR.

All in all, I am fine with the current design.

langou closed this as completed in 88f15c2 on Mar 21, 2024

ev-br commented Mar 21, 2024

Side comment: in retrospect, I wish that SYEVR/HEEVR only offered to compute "eigenvalues and eigenvectors". I think it is not a good idea to offer an "eigenvalues only" option and then call a completely different algorithm because that algorithm is better for that case.

Very much +146 to this. The ship has sailed indeed, though.

Maybe it would be best to remove these comments from the source files. Let me know your opinion.

If you're asking me :-), to a user this comment is not all that helpful. Even with your explanations above, the only way to make sense of it is to go to the references listed in the SYEVR docs (which I have not done yet). So if this comment relates to peculiarities of how STEMR is implemented w.r.t. the references, it would be great to expand the note. If it only restates one of the points of those references, the note is indeed best removed IMO.
