Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xUNCSD2BY1 does not handle submatrices with empty rows properly #549

Open
2 tasks done
christoph-conrads opened this issue May 8, 2021 · 6 comments
Open
2 tasks done

Comments

@christoph-conrads
Copy link
Contributor

christoph-conrads commented May 8, 2021

Description

The 2x1 CS decomposition decomposes a matrix Q = [Q1; Q2], where Q^* Q = I. There are two possibilities for the CS decomposition UΣV = Q1 in the special case where Q2 has no rows:

  • U = Q1, Σ = I, V = I,
  • U = I, Σ = I, V = Q1.

That is, the only thing to do by the algorithm is copying the input matrix into U or V and set the other matrix to the identity matrix.

If

  • one of the matrices Q1 or Q2 has no rows and
  • the other matrix is square,

then SORCSD2BY1 performs out-of-bounds array accesses. I assume the same behavior occurs also for DORCSD2BY1, CUNCSD2BY1, and ZUNCSD2BY1. My expectation is that xORCSD2BY1 does not perform out-of-bounds accesses and returns one of the two possible decompositions shown above.

      PROGRAM testCSD2BY1
      IMPLICIT NONE

      INTEGER            INFO, LDA, LDU1, LDU2, LDV, M, N, P, LWORK, I
      PARAMETER        ( LDA = 1, LDU1 = 5, LDU2 = 5, LDV = 5,
     $                   M = 0, P = 1, N = 1, LWORK = 1024)

      INTEGER            IWORK(M + N + P)
      REAL               THETA( N ), A( LDA, N ),
     $                   U1( LDU1, M ), U2( LDU2, P ), V( LDV, N ),
     $                   WORK( LWORK )

      EXTERNAL           SGGQRCS

      A = 0.0
      DO I = 1, MIN( P, N )
         A( I, I ) = 1.0E0
      END DO

      CALL SORCSD2BY1( 'Y', 'Y', 'Y', M + P, M, N, 
     $                 A( 1, 1 ), LDA, A( M + 1, 1 ), LDA,
     $                 THETA,
     $                 U1, LDU1, U2, LDU2, V, LDV,
     $                 WORK, LWORK, IWORK, INFO )
      END PROGRAM testCSD2BY1

Output:

$ make
gfortran  -frecursive -fcheck=all -Wextra -Wall -pedantic bug-csd2by1.f -llapack -L/tmp/build.lapack/lib
$ env LD_LIBRARY_PATH=/tmp/build.lapack/lib ./a.out
At line 318 of file /home/christoph/lapack/SRC/sorbdb2.f
Fortran runtime error: Index '2' of dimension 1 of array 'x21' above upper bound of 1

Error termination. Backtrace:
#0  0x7fb2529d0c3b in sorbdb2_
	at /home/christoph/lapack/SRC/sorbdb2.f:318
#1  0x7fb2529dd2ce in sorcsd2by1_
	at /home/christoph/lapack/SRC/sorcsd2by1.f:549
#2  0x4009cd in ???
#3  0x400a0b in ???
#4  0x7fb251655bf6 in ???
#5  0x400789 in ???
#6  0xffffffffffffffff in ???

Checklist

  • I've included a minimal example to reproduce the issue
  • I'd be willing to make a PR to solve this issue
@christoph-conrads
Copy link
Contributor Author

Fix

For simultaneous bidiagonalization of the input matrices, xORCSD2BY1 calls xORBDB2 and xORBDB3 and the out-of-bounds array accesses occur within these bidiagonalization subroutines. According the xORBDB{2,3} documentation, the input causing the misbehavior is allowed. Consequently, a fix must modify xORBDB{2,3}.

I see two possibilities for fixing this issue:

  • xORCSD2BY1 checks for the special case where one submatrix has no rows and the other submatrix is square, xORBDB{2,3} are updated (documentation and code) to disallow such input;
  • xORBDB{2,3} are fixed to handle any valid input.

I am in favor of the latter approach because to the best of my knowledge, all LAPACK routines can handle input of dimension zero (whether this is along one axis or several axes). Hence making xORBDB{2,3} reject certain inputs of dimension zero would violate the principle of least astonishment.

@christoph-conrads
Copy link
Contributor Author

christoph-conrads commented May 9, 2021

I see two possibilities for fixing this issue:

* xORCSD2BY1 checks for the special case where one submatrix has no rows and the other submatrix is square, xORBDB{2,3} are updated (documentation and code) to disallow such input;
* xORBDB{2,3} are fixed to handle any valid input.

I am in favor of the latter approach because to the best of my knowledge, all LAPACK routines can handle input of dimension zero (whether this is along one axis or several axes). Hence making xORBDB{2,3} reject certain inputs of dimension zero would violate the principle of least astonishment.

Consider the constraints on the input dimensions, see sorbdb2.f starting at line 64:

  • m, n, p integers
  • m ≥ 0 is the number of rows of [X1; X2]
  • q is the number of columns of [X1; X2] subject to 0 ≤ q ≤ m
  • p is the number of rows of X1 subject to 0 ≤ p ≤ min(m - p, q, m - q).

([X1; X2] is Matlab-like notation and means the rows of X1 on top of the rows of X2 and X1, X2 having the same number of columns.)

Have a look at the loop with the out-of-bounds access:

lapack/SRC/sorbdb2.f

Lines 317 to 322 in 2dafa3d

DO I = P + 1, Q
CALL SLARFGP( M-P-I+1, X21(I,I), X21(I+1,I), 1, TAUP2(I) )
X21(I,I) = ONE
CALL SLARF( 'L', M-P-I+1, Q-I, X21(I,I), 1, TAUP2(I),
$ X21(I,I+1), LDX21, WORK(ILARF) )
END DO

If M = Q and I = Q in the last iteration, then X21(I+1,I) = X21(M + 1, 1) in the call to SLARFGP:

          CALL SLARFGP( 1, X21(M, M), X21(M+1, M), 1, TAUP2(M) )

SLARFGP then compute in line 144 the Euclidean norm of the vector of dimension zero starting at X21(M + 1, M). snrm2 correctly infers the norm to be zero (line 125) without referencing X21(M + 1, 1). Based on the norm, slarfgp will then continue to compute the elementary reflector without referencing X21(M + 1, 1) in lines 144 to 164:

lapack/SRC/slarfgp.f

Lines 144 to 164 in 2dafa3d

XNORM = SNRM2( N-1, X, INCX )
*
IF( XNORM.EQ.ZERO ) THEN
*
* H = [+/-1, 0; I], sign chosen so ALPHA >= 0.
*
IF( ALPHA.GE.ZERO ) THEN
* When TAU.eq.ZERO, the vector is special-cased to be
* all zeros in the application routines. We do not need
* to clear it.
TAU = ZERO
ELSE
* However, the application routines rely on explicit
* zero checks when TAU.ne.ZERO, and we must clear X.
TAU = TWO
DO J = 1, N-1
X( 1 + (J-1)*INCX ) = 0
END DO
ALPHA = -ALPHA
END IF
ELSE

@weslleyspereira Like I wrote #406, this is another -fcheck=all false positive.

edit link to my comment on May 6, 2021

@weslleyspereira
Copy link
Collaborator

Hi @christoph-conrads. Thanks for the complete description! This is very helpful!

I understood this is a false positive, and that SLARFGP doesn't use X21(I+1,I) when N=1. Thanks! So, in practice, nothing unexpected happens when we turn off the flag. (all the tests pass when we turn off -fcheck=all). Equally, SLARF will not use X21(I,I+1), since tau=0. However, I am in favor of continue using some flags, like -fcheck=all, so as to avoid true out-of-bounds (and other) errors.

I propose the following change.
Replace:

lapack/SRC/sorbdb2.f

Lines 317 to 322 in 2dafa3d

DO I = P + 1, Q
CALL SLARFGP( M-P-I+1, X21(I,I), X21(I+1,I), 1, TAUP2(I) )
X21(I,I) = ONE
CALL SLARF( 'L', M-P-I+1, Q-I, X21(I,I), 1, TAUP2(I),
$ X21(I,I+1), LDX21, WORK(ILARF) )
END DO

by:

      DO I = P + 1, MIN( Q, M - P - 1 )
         CALL SLARFGP( M-P-I+1, X21(I,I), X21(I+1,I), 1, TAUP2(I) )
         X21(I,I) = ONE
         CALL SLARF( 'L', M-P-I+1, Q-I, X21(I,I), 1, TAUP2(I),
     $               X21(I,I+1), LDX21, WORK(ILARF) )
      END DO
*     
      DO I = M - P, Q
         TAUP2(I) = ZERO
         X21(I,I) = ONE
      END DO

With this change, all tests pass with -fcheck=all.

I can submit a PR with the fix if everything is ok.

@christoph-conrads
Copy link
Contributor Author

I propose the following change.
[snip]

      DO I = P + 1, MIN( Q, M - P - 1 )
         CALL SLARFGP( M-P-I+1, X21(I,I), X21(I+1,I), 1, TAUP2(I) )
         X21(I,I) = ONE
         CALL SLARF( 'L', M-P-I+1, Q-I, X21(I,I), 1, TAUP2(I),
     $               X21(I,I+1), LDX21, WORK(ILARF) )
      END DO
*     
      DO I = M - P, Q
         TAUP2(I) = ZERO
         X21(I,I) = ONE
      END DO
  1. The do-loop at the end assumes that X21(I, I) is nonnegative for otherwise must TAUP2(I, I) = 2, cf. SRC/slarfgp.f, line 146+. I do not see why this should be the case.
  2. The same fix must be applied to SORBDB3, i.e., for the case where X11 is square and X21 has no rows.

@christoph-conrads
Copy link
Contributor Author

christoph-conrads commented May 10, 2021

  1. The loop starting in SRC/sorbdb2.f, line 279, will produce out-of-bounds accesses because of the references X11(I+1, I), X11(I, I+1) (if P = Q), and X11(I+1, I+1).
  2. Case 3) applies to SORBDB2 with M - P = Q.

lapack/SRC/sorbdb2.f

Lines 279 to 313 in 2dafa3d

DO I = 1, P
*
IF( I .GT. 1 ) THEN
CALL SROT( Q-I+1, X11(I,I), LDX11, X21(I-1,I), LDX21, C, S )
END IF
CALL SLARFGP( Q-I+1, X11(I,I), X11(I,I+1), LDX11, TAUQ1(I) )
C = X11(I,I)
X11(I,I) = ONE
CALL SLARF( 'R', P-I, Q-I+1, X11(I,I), LDX11, TAUQ1(I),
$ X11(I+1,I), LDX11, WORK(ILARF) )
CALL SLARF( 'R', M-P-I+1, Q-I+1, X11(I,I), LDX11, TAUQ1(I),
$ X21(I,I), LDX21, WORK(ILARF) )
S = SQRT( SNRM2( P-I, X11(I+1,I), 1 )**2
$ + SNRM2( M-P-I+1, X21(I,I), 1 )**2 )
THETA(I) = ATAN2( S, C )
*
CALL SORBDB5( P-I, M-P-I+1, Q-I, X11(I+1,I), 1, X21(I,I), 1,
$ X11(I+1,I+1), LDX11, X21(I,I+1), LDX21,
$ WORK(IORBDB5), LORBDB5, CHILDINFO )
CALL SSCAL( P-I, NEGONE, X11(I+1,I), 1 )
CALL SLARFGP( M-P-I+1, X21(I,I), X21(I+1,I), 1, TAUP2(I) )
IF( I .LT. P ) THEN
CALL SLARFGP( P-I, X11(I+1,I), X11(I+2,I), 1, TAUP1(I) )
PHI(I) = ATAN2( X11(I+1,I), X21(I,I) )
C = COS( PHI(I) )
S = SIN( PHI(I) )
X11(I+1,I) = ONE
CALL SLARF( 'L', P-I, Q-I, X11(I+1,I), 1, TAUP1(I),
$ X11(I+1,I+1), LDX11, WORK(ILARF) )
END IF
X21(I,I) = ONE
CALL SLARF( 'L', M-P-I+1, Q-I, X21(I,I), 1, TAUP2(I),
$ X21(I,I+1), LDX21, WORK(ILARF) )
*
END DO

@weslleyspereira
Copy link
Collaborator

Thanks, @christoph-conrads!

I agree with (1). I will prepare a PR including your correction for negative X21(I, I). I will also look at SORBDB2 and SORBDB3, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants