lapack/netlib: add Dpbtrf and Dpbtrs #63
It might be worth filing an issue with OpenBLAS; @martin-frbg is vastly more responsive than the NETLIB mob. |
Thanks for the pointer but I do not even see an issue filed with the "NETLIB mob" yet so please cut them some slack ... looks to me that project is reduced to a husband-and-wife team nowadays so I kind of feel their pain. (And at least I would probably need a c/fortran reproducer anyway to understand the issue, not sure about them) |
This is a (minimal) reproducer:
Valgrind reports use of uninitialized memory allocated by But I think a reproducer is not really necessary, it doesn't demonstrate much anyway. For a 1x1 matrix the storage order doesn't matter, yet here row major gives an error. The only non-trivial operation that LAPACKE does for row major is the row->col->row conversion. And just from looking at For sure, I'll report it also to the netlib reference. |
@martin-frbg "mob" as used in the Australian context (i.e. by me) is a non-derogatory word meaning a group or association of people related by family, work or interest ties - I forget that others don't necessarily know the meanings of our common words. The comment on responsiveness is a reflection of past filed issues with NETLIB. Your point about the size of the team is well taken though. |
Agreed - I do not see how the lapacke_?gb_trans is expected to work (and it seems the code does not know either). What is a bit strange is that this has survived unchanged from the original 2010 LAPACKE code (except that matrix_order got renamed to matrix_layout at some point) without anybody noticing. |
The persistence of incorrect transpose handling code in LAPACKE is not unprecedented. We have found at least one other case where the transpose was incorrectly handled leading to a segfault. |
Can anyone please take a look at https://software.intel.com/en-us/node/520871#BAND_STORAGE and confidently explain to me how a band matrix is laid out in memory? I find the information there contradictory. It says:
but later:
That means that according to the mapping definition the diagonals are stored contiguously in memory, that is, they become rows of AB, contradicting what is written above! An alternative view is that the matrix is packed row-wise and the resulting full matrix is stored column-wise but that does not make any sense at all and is not how BLAS stores band matrices (https://software.intel.com/en-us/mkl-developer-reference-c-matrix-storage-schemes-for-blas-routines):
Apparently, |
The senseless mapping definition view is apparently accepted and used by others, see for example https://stackoverflow.com/questions/42750671/writing-a-banded-matrix-in-a-row-major-layout-for-lapack-solver-dgbsv and the accepted answer there (and the confusion about different storage order between CBLAS and LAPACKE expressed in comments by OP). |
I've been pondering this further and I have an unverifiable hypothesis. The Intel documentation with all its row-wise packing and diagonals-becoming-columns should be ignored except for the mapping It's either this or something else. I wonder whether anyone actually knows for sure. |
The only other (related) documentation I found is from the ESSL at |
@vladimir-ch We have formal tests for what we consider to be the correct band storage. These are in blas/blas64/conv.go. I implemented them from the docs in blas/gonum/doc.go and (from memory) this document. |
Yes, we use the BLAS storage. The fact that the |
I've created Reference-LAPACK/lapack#348 |
I've spent some time today playing with doing various conversions and testing them with OpenBLAS and MKL. My conclusion based on the behavior of MKL and the LAPACKE source is that the band storage for LAPACKE is indeed the FORTRAN variant (diagonals in rows, columns in columns), only stored in row-major layout. This means that it is not the CBLAS format that we use in Gonum. Anyone using LAPACKE and CBLAS from C for band matrices in row-major should be aware of this. I'm not able to reproduce the valgrind warning about uninitialized memory. One possibility is that the binary was using OpenBLAS 0.3.6 from the system which was probably not compiled with the I consider this resolved, so now this PR has to wait until the |
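To make the difference between the two row-major layouts concrete, here is a minimal sketch in plain Go. It assumes the reading reached above: CBLAS/Gonum keeps row i of the matrix in row i of the band array with the diagonal in column kl, while LAPACKE row-major stores the Fortran band array AB, with AB[ku+i-j][j] = A[i][j], row by row. The helper names `inBand`, `packGonum` and `packLAPACKE` are made up for illustration and are not part of Gonum or LAPACKE; the leading dimensions are the smallest valid ones.

```go
package main

import "fmt"

// inBand reports whether A[i][j] lies within kl sub- and ku super-diagonals.
func inBand(i, j, kl, ku int) bool {
	return i-j <= kl && j-i <= ku
}

// packGonum packs the band of the dense row-major n×n matrix a into
// CBLAS/Gonum row-major band storage: A[i][j] goes to b[i*stride+kl+j-i],
// so each matrix row stays a row and the diagonal sits in column kl.
func packGonum(a []float64, n, kl, ku int) []float64 {
	stride := kl + ku + 1
	b := make([]float64, n*stride)
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			if inBand(i, j, kl, ku) {
				b[i*stride+kl+j-i] = a[i*n+j]
			}
		}
	}
	return b
}

// packLAPACKE packs the same band into the layout assumed here for LAPACKE
// row-major input: the Fortran band array AB[ku+i-j][j] = A[i][j], stored
// row by row with ldab = n, so each diagonal becomes a row.
func packLAPACKE(a []float64, n, kl, ku int) []float64 {
	ldab := n
	ab := make([]float64, (kl+ku+1)*ldab)
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			if inBand(i, j, kl, ku) {
				ab[(ku+i-j)*ldab+j] = a[i*n+j]
			}
		}
	}
	return ab
}

func main() {
	// A 4×4 tridiagonal matrix in dense row-major form.
	a := []float64{
		1, 2, 0, 0,
		3, 4, 5, 0,
		0, 6, 7, 8,
		0, 0, 9, 10,
	}
	fmt.Println(packGonum(a, 4, 1, 1))
	fmt.Println(packLAPACKE(a, 4, 1, 1))
}
```

For this tridiagonal example both arrays hold 12 values, but the Gonum layout reads matrix rows (`[0 1 2 3 4 5 ...]`) while the LAPACKE layout reads diagonals (`[0 2 5 8 1 4 7 10 ...]`), which is why the same buffer cannot be handed to both APIs unchanged.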
I've rebased this but need some time to recall the status or what it still needs. |
I'm afraid we will have to convert in |
That seems OK. |
Also move the banded matrix conversion code to the lapack/netlib package because that's where any matrix conversion should be done. The lapacke package should accept exactly what LAPACKE accepts which means that unfortunately for banded matrices there will be the inevitable overhead of two conversions: one from BLAS (Gonum) row-major format to LAPACKE row-major format for banded matrices and one inside LAPACKE to the FORTRAN column-major banded format. In case of Dpbtrf the inverse conversion must be performed for the factored matrix. lapack/netlib: add Dpbtrs conv
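As a rough illustration of the first of those two conversions, the sketch below copies a symmetric band matrix between the Gonum row-major band format and the row-major layout assumed here for LAPACKE (the Fortran band array stored row by row). This is not the code in this PR: `gonumIndex`, `lapackeIndex`, `bandCols`, `toLAPACKE` and `toGonum` are hypothetical names, and the index formulas are written from the storage conventions discussed in this thread. Since the Cholesky factor has the same band shape as the input, the same mapping run in reverse would serve for the factored matrix after Dpbtrf.

```go
package main

import "fmt"

// gonumIndex returns the position of A[i][j] in Gonum (CBLAS-style)
// symmetric band storage with row stride lda >= kd+1.
func gonumIndex(upper bool, kd, lda, i, j int) int {
	if upper {
		return i*lda + j - i
	}
	return i*lda + kd + j - i
}

// lapackeIndex returns the position of A[i][j] in the layout assumed for
// LAPACKE row-major input: Fortran AB stored by rows with ldab >= n.
func lapackeIndex(upper bool, kd, ldab, i, j int) int {
	if upper {
		return (kd+i-j)*ldab + j
	}
	return (i-j)*ldab + j
}

// bandCols returns the inclusive range of columns of row i that are stored
// for the chosen triangle.
func bandCols(upper bool, n, kd, i int) (lo, hi int) {
	if upper {
		hi = i + kd
		if hi > n-1 {
			hi = n - 1
		}
		return i, hi
	}
	lo = i - kd
	if lo < 0 {
		lo = 0
	}
	return lo, i
}

// toLAPACKE copies the stored triangle from Gonum layout a to LAPACKE layout ab.
func toLAPACKE(upper bool, n, kd int, a []float64, lda int, ab []float64, ldab int) {
	for i := 0; i < n; i++ {
		lo, hi := bandCols(upper, n, kd, i)
		for j := lo; j <= hi; j++ {
			ab[lapackeIndex(upper, kd, ldab, i, j)] = a[gonumIndex(upper, kd, lda, i, j)]
		}
	}
}

// toGonum is the inverse copy, from LAPACKE layout ab back to Gonum layout a.
func toGonum(upper bool, n, kd int, ab []float64, ldab int, a []float64, lda int) {
	for i := 0; i < n; i++ {
		lo, hi := bandCols(upper, n, kd, i)
		for j := lo; j <= hi; j++ {
			a[gonumIndex(upper, kd, lda, i, j)] = ab[lapackeIndex(upper, kd, ldab, i, j)]
		}
	}
}

func main() {
	// Upper triangle of a 4×4 symmetric matrix with one super-diagonal:
	// diagonal 4,5,6,7 and super-diagonal 1,2,3, in Gonum layout.
	n, kd := 4, 1
	a := []float64{4, 1, 5, 2, 6, 3, 7, 0}
	ab := make([]float64, (kd+1)*n)
	toLAPACKE(true, n, kd, a, kd+1, ab, n)
	fmt.Println(ab)

	back := make([]float64, n*(kd+1))
	toGonum(true, n, kd, ab, n, back, kd+1)
	fmt.Println(back)
}
```

Both copies touch only the kd+1 stored diagonals, so the extra cost on the Go side is O(n·kd) per call, on top of the transpose LAPACKE performs internally.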
PTAL |
Codecov Report
@@ Coverage Diff @@
## master #63 +/- ##
==========================================
- Coverage 30.42% 30.40% -0.02%
==========================================
Files 4 3 -1
Lines 6420 6456 +36
==========================================
+ Hits 1953 1963 +10
- Misses 4037 4060 +23
- Partials 430 433 +3
The tests fail already for a 1x1 matrix with a positive element because LAPACKE says the matrix is not positive definite. I think it is because the column-major conversion routine LAPACKE_dgb_trans is completely wrong. You can take a look but it makes no sense to merge this until LAPACKE is fixed, so this PR is here only as a reminder.