*syevr eigenvalues depend on whether eigenvectors are computed or not #997
SYEVR calls STERF when eigenvalues only are requested (jobz='N'); see line 573 in 5426147.
SYEVR calls STEMR when eigenvalues and eigenvectors are requested (jobz='V'); see line 583 in 5426147.
So the fact that SYEVR does not return exactly the same eigenvalues whether jobz is 'V' or 'N' is to be expected. Maybe I can ask around for opinions on whether we can make SYEVR return the same eigenvalues whether jobz is 'V' or 'N'. I understand that this would be a desirable feature. STEMR has been designed to compute eigenvectors. STERF is a perfectly good and fast algorithm to compute eigenvalues. Then if you want eigenvectors, you might want to switch to STEMR. STEMR should be faster than STERF (when eigenvectors are desired). STEMR is also much more parallelizable than STERF. When seeking eigenvectors, STEMR uses a different algorithm than STERF to compute the eigenvalues, so the eigenvalues returned by STEMR are slightly different from the ones returned by STERF.
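To make the divergence concrete, here is a minimal sketch in Python/SciPy (my addition, not from the thread): scipy.linalg.eigh dispatches to *SYEVR when driver='evr', and eigvals_only controls JOBZ.

```python
# Minimal sketch: the same symmetric matrix, eigenvalues computed with and
# without eigenvectors through the *SYEVR driver ('evr' in SciPy).
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 200))
A = (X + X.T) / 2                                # random symmetric matrix

w_n = eigh(A, eigvals_only=True, driver='evr')   # JOBZ='N' -> STERF path
w_v, _ = eigh(A, driver='evr')                   # JOBZ='V' -> STEMR path

# Agreement to a few ulps of ||A||_2, but usually not bitwise identical.
print(np.max(np.abs(w_n - w_v)))
```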
This is not the first time I have heard of confusion around different results. The SVD can exhibit similar behavior; a professor of mine even once spoke to me about a "bug" in MATLAB. I think a good fix would be to explicitly say in the documentation that there may be differences. Even if this documentation is not propagated downstream to SciPy and MATLAB (which I think is ok), it could help library maintainers explain weird behavior. It could also be useful to explicitly state that the result should be the same in the routines where this is the case. I think all linear factorizations (LU, QR, ...) and the nonsymmetric eigenvalue factorizations should satisfy that requirement.
I suggest keeping the current behavior and making it obvious to developers without a numerical linear algebra background that this may be the case. Their expectation of computing the same eigenvalues for the same matrix is perfectly reasonable.
+1 Here is a draft:

diff --git a/SRC/dsyevr.f b/SRC/dsyevr.f
index 8647b0162..7cb0d0229 100644
--- a/SRC/dsyevr.f
+++ b/SRC/dsyevr.f
@@ -39,9 +39,15 @@
*> \verbatim
*>
*> DSYEVR computes selected eigenvalues and, optionally, eigenvectors
-*> of a real symmetric matrix A. Eigenvalues and eigenvectors can be
+*> of a real symmetric matrix A. Eigenvalues and eigenvectors can be
*> selected by specifying either a range of values or a range of
-*> indices for the desired eigenvalues.
+*> indices for the desired eigenvalues. Invocations with different
+*> choices for these parameters may result in the computation of
+*> slightly different eigenvalues and eigenvectors for the same matrix.
+*> The reason for this behavior is that there exists a variety of
+*> algorithms -- each performing best for a particular set of options --
+*> with DSYEVR selecting the best. In all cases, the computed values are
+*> accurate within the limits of finite precision arithmetic.
*>
*> DSYEVR first reduces the matrix A to tridiagonal form T with a call
*> to DSYTRD. Then, whenever possible, DSYEVR calls DSTEMR to compute
@@ -105,6 +111,9 @@
*> JOBZ is CHARACTER*1
*> = 'N': Compute eigenvalues only;
*> = 'V': Compute eigenvalues and eigenvectors.
+*>
+*> This parameter influences the chosen algorithm and may
+*> change the computed values.
*> \endverbatim
*>
*> \param[in] RANGE
@@ -116,6 +125,9 @@
*> = 'I': the IL-th through IU-th eigenvalues will be found.
*> For RANGE = 'V' or 'I' and IU - IL < N - 1, DSTEBZ and
*> DSTEIN are called
+*>
+*> This parameter influences the chosen algorithm and may
+*> change the computed values.
*> \endverbatim
*>
*> \param[in] UPLO
Of course, SciPy's pinvh issue can and will be worked around in SciPy. A LAPACK documentation update would indeed be great! This fact is not widely known, even among people who are otherwise not complete novices in numerical analysis. While at it, it may be worth expanding the somewhat cryptic note at https://netlib.org/lapack/explore-html-3.6.1/d2/d8a/group__double_s_yeigen_ga2ad9f4a91cddbf67fe41b621bd158f5c.html
Either way, I'll see to adding a note to the SciPy docs. One possibly non-doc issue is that in the SYEVR result the zero eigenvalue of the OP matrix is ever so slightly larger than the machine epsilon (2.66e-15 vs 2.22e-15). While the very fact that eigenvalues are computed to almost machine epsilon is remarkable indeed, and it may not be realistic to ask them to be strictly below the machine epsilon, I wonder if there's a relatively simple tweak to the iteration to push it down below epsilon.
Just to avoid potential miscommunication here: in general the accuracy is bounded by the machine epsilon times the largest singular value (for symmetric matrices, the singular values are the moduli of the eigenvalues). This means, for example, that multiplying the matrix by 1000 will cause the error bounds to grow by a factor of 1000.
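A quick numerical illustration of this scaling (a sketch, not from the thread; the test matrix is mine, chosen to have a known zero eigenvalue):

```python
# Sketch: the absolute error of a "zero" eigenvalue grows with the matrix
# scale, consistent with an eps * ||A||_2 bound.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal Q
A = Q @ np.diag([0.0, 1.0, 3.0]) @ Q.T            # exact eigenvalues 0, 1, 3

for scale in (1.0, 1000.0):
    w = eigh(scale * A, eigvals_only=True, driver='evr')
    # The exact smallest eigenvalue is 0, so |w[0]| is the absolute error;
    # the bound is roughly eps * ||scale * A||_2 = eps * 3 * scale.
    print(scale, abs(w[0]))
```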
Thanks @christoph-conrads. Below are some slight edits.
@christoph-conrads: It would be great if you could submit a PR with these suggested changes.
Side comment: in retrospect, I wish that SYEVR/HEEVR would only offer to compute "eigenvalues and eigenvectors". I think it is not a good idea to offer an "eigenvalues only" option and then call a completely different algorithm because that algorithm is better.
Following up on @christoph-conrads' comment: "This means, for example, that multiplying the matrix by 1000 will cause the error bounds to grow by a factor of 1000." This is absolutely correct. We are speaking here about absolute error bounds. Expanding on this, it also means that multiplying the largest eigenvalue of the matrix by 1000 and leaving the others constant will cause the relative error bounds of the n-1 smallest eigenvalues to grow by a factor of 1000 (whereas the relative error bound of the largest eigenvalue would be left unchanged). Also, multiplying the matrix by 1000 should not change the relative error bounds.
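In formulas (my paraphrase of the bounds just discussed, with $c$ a modest routine-dependent constant and $\varepsilon$ the machine epsilon):

$$|\hat{\lambda}_i - \lambda_i| \le c\,\varepsilon\,\|A\|_2, \qquad \frac{|\hat{\lambda}_i - \lambda_i|}{|\lambda_i|} \le c\,\varepsilon\,\frac{\|A\|_2}{|\lambda_i|}.$$

Replacing $A$ by $1000A$ multiplies $\|A\|_2$, and with it every absolute bound, by 1000, while each ratio $\|A\|_2 / |\lambda_i|$ (and hence each relative bound) stays the same; growing only the largest eigenvalue by 1000 grows $\|A\|_2$ without growing the smaller $|\lambda_i|$, which inflates their relative bounds as described above.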
Maybe it would be best to remove these comments from the source files. Let me know your opinion. I agree that the statement "the algorithm attains full accuracy of the computed eigenvalues" is misleading. @christoph-conrads is entirely correct on what to expect from SYEVR; see his comment above. The comment "the algorithm attains full accuracy of the computed eigenvalues" does not relate to the accuracy of the computed eigenvalues with respect to the user's problem (the initial matrix A), but to the local representation of the problem. The algorithm used here is the MRRR algorithm. The MRRR algorithm uses a T - sigma I = L D L^T factorization (also known as a Relatively Robust Representation) to represent the eigenproblem. In this representation, L and D define all the wanted eigenvalues to high relative accuracy. This means that small relative changes in the entries of D and L cause only small relative changes in the eigenvalues and eigenvectors. The standard (unfactored) representation of the tridiagonal matrix T does not have this property in general. However, while we have high relative accuracy for the L and D matrix eigenvalue problem, the L and D matrix eigenvalue problem is not related to the initial eigenvalue problem with high relative accuracy.
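Schematically, in my notation (not taken from the LAPACK sources), the representation and its relative-accuracy property read:

$$T - \sigma I = L D L^{T}, \qquad d_i \mapsto d_i(1+\delta_i),\; \ell_i \mapsto \ell_i(1+\eta_i),\; |\delta_i|, |\eta_i| \le \varepsilon \;\Longrightarrow\; |\hat{\lambda}_j - \lambda_j| \le O(n\varepsilon)\,|\lambda_j|,$$

where the $\lambda_j$ are eigenvalues of the shifted, factored problem. No analogous componentwise statement holds for perturbations of the entries of $T$ itself, and the relative accuracy does not carry back to the original matrix $A$.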
For SYEV, I think for MRRR (SYEVR) and DD (SYEVD), it would be possible to run the eigenvalue computation without computing the eigenvectors. Similarly for the SVD with DD (GESDD). This would be a mess to do, but I think it would be possible. This should be similar to what the QR algorithm does, where you compute the eigenvalues and then decide whether or not you want the eigenvectors. I am not a fan of MRRR and DD having an option "do you want eigenvalues only?" and then, if "yes, eigenvalues only", MRRR and DD bail out to QR. My preference would be to either (a) drop the eigenvalues-only option or (b) have the eigenvalues-only path run the same algorithm without the eigenvectors. Option (a) does not feel like an option any longer because of backward compatibility. Option (b) is quite complicated to implement, and I am not even sure whether it is possible. For MRRR, there is some ramification of Gram-Schmidt and re-orthogonalization that comes in, and I do not know whether eigenvalues change when these kick in. Etc. There are other flags (in addition to JOBZ) that can change eigenvalues, such as RANGE in MRRR. All in all, I am fine with the current design.
Very much +146 to this. The ship has sailed indeed, though.
If you're asking me :-), to a user this comment is not all that helpful. Even with your explanations above, the only way to make sense of it is to go to the references listed in the SYEVR docs (I have not yet). So if this comment relates to peculiarities of how STEMR is implemented w.r.t. the references, it'd be great to expand the note. If it stresses one of the points of those references, the note is best removed indeed IMO.
Naively, I'd expect that the computed eigenvalues are exactly the same whether or not eigenvectors are computed. However, for the symmetric dsyevr solver, this does not seem to be the case. The difference is "small", but it is very visible, e.g., when an eigenvalue is zero in exact arithmetic. Consider the following matrix with the exact eigenvalues 0, 1, and 3.
Using dsyevr (the driver is below the fold) produces slightly different results; note the difference in the "zero" eigenvalue.
The question is whether my expectation (jobz='N' is exactly equivalent to "run with jobz='V' and discard the eigenvectors") is too strong, or whether this is a potentially fixable defect.
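The matrix itself did not survive above; as a hypothetical reproduction (my construction, not the original poster's matrix), any symmetric matrix with exact eigenvalues 0, 1, and 3 should show the effect:

```python
# Hypothetical reproduction: compare *SYEVR eigenvalues with and without
# eigenvectors on a matrix whose exact eigenvalues are 0, 1, and 3.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(42)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal basis
A = Q @ np.diag([0.0, 1.0, 3.0]) @ Q.T            # exact eigenvalues 0, 1, 3

w_n = eigh(A, eigvals_only=True, driver='evr')    # JOBZ='N'
w_v, _ = eigh(A, driver='evr')                    # JOBZ='V'

print(w_n)                   # "zero" eigenvalue is O(eps), not exactly 0
print(w_v)                   # typically differs from w_n in the last bits
print(np.finfo(float).eps)   # float64 machine epsilon, for comparison
```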
The origin of this question is scipy/scipy#12515