argument mismatch for call to gemm from ssids #4
Comments
Nick, yes, it seems like a C/Fortran interfacing error. I guess you were using the reference BLAS in this case; which version did you use? Also, are you using the master or release version of SSIDS? Florent
Hi Florent, I was using the BLAS from NETLIB from a year or two ago. My version of SSIDS is dated October 19th 2016. Nick
I noticed something probably related when compiling both SSIDS and HSL_MA97 into the same binary with link-time optimization. There are multiple spral/src/ssids/cpu/kernels/wrappers.cxx Lines 10 to 18 in c8c3fcd
For example, for dpotrf:
The issue is that the Fortran function requires the length of the
Some digging reveals that this issue is present in many codes that call BLAS/LAPACK functions from C and has caused real problems since 2019. Here are some references:
Not quite sure what the best solution would be for this project. Implementing this correctly cross-platform does not seem trivial to me. What do you think?
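To make the hidden length argument concrete, here is a minimal C++ sketch. It does not call a real BLAS routine: `demo_check_` and `call_demo_check` are hypothetical names, and `demo_check_` is a mock of what a Fortran compiler such as gfortran would generate for a subroutine with a `character(len=1)` dummy argument. The assumption is gfortran's convention, where each string length is appended by value after the regular arguments (`size_t` since GCC 8, `int` in older versions); other compilers may differ.

```cpp
#include <cassert>
#include <cstddef>

// Mock of the compiled form of a Fortran routine like
//   subroutine demo_check(trans, n, info)
//     character(len=1) :: trans
// All names are hypothetical. Every argument is passed by reference, and the
// length of the CHARACTER argument is appended by value as a hidden trailing
// parameter (assumed std::size_t here, as in gfortran since GCC 8).
extern "C" void demo_check_(const char* trans, const int* n, int* info,
                            std::size_t trans_len) {
    assert(trans != nullptr && n != nullptr && info != nullptr);
    if (trans_len < 1) {
        // A bounds-checking Fortran runtime reports this situation as
        // "Actual string length is shorter than the declared one".
        *info = -1;
        return;
    }
    *info = (*trans == 'N' || *trans == 'T') ? 0 : 1;
}

// Caller side: the hidden length is passed explicitly, so the callee never
// reads an argument slot that the caller left unset.
inline int call_demo_check(char trans, int n) {
    int info = 0;
    demo_check_(&trans, &n, &info, static_cast<std::size_t>(1));
    return info;
}
```

A C or C++ declaration that leaves out the trailing `trans_len` parameter still links, but the callee then reads whatever happens to be in that argument slot, which would be consistent with the `(0/1)` length reported in the runtime error of this issue.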
@mjacobse many thanks for looking into this in detail. This really is a case of CBLAS/LAPACKE incorrectly depending on undefined compiler behaviour for years (namely omitting the hidden length argument for single-character strings), as nicely explained in the LWN.net article above. My suggestion would be to follow the R developers on this and, until CBLAS/LAPACKE gets fixed (a fix was merged 3 days ago but has yet to make it into a release), add either
@jfowkes my understanding is/was that the
@martin-frbg many thanks for clarifying this; then we should also go for
CBLAS/LAPACKE has since fixed this with the 3.10 release:
Coming back to this, I am no longer sure how this changes anything for this issue though. I do see uses of CBLAS in some example and test files for SSMFE, but no more than that. This issue for SSIDS in particular does not come from CBLAS/LAPACKE, but instead from the manual declaration of the BLAS functions in the SSIDS code itself: spral/src/ssids/cpu/kernels/wrappers.cxx Lines 10 to 18 in c8c3fcd
It's basically what CBLAS/LAPACKE did in the past, so the same issue was replicated here in SSIDS. That's why I don't see how a CBLAS/LAPACKE update would fix this here; I believe this should be fixed locally. Unless the idea was to switch to using CBLAS to replace these manual declarations? Or was the idea to copy their fix here locally?
Personally, neither the compiler flag workaround nor the fix of manually adding size parameters to the C declarations that CBLAS/LAPACKE did seems great to me. If there is interest, I can look into adding properly C interoperable wrapper functions in Fortran for the few BLAS routines that are used from
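As a rough illustration of the C-interoperable wrapper idea, the sketch below shows what the C++ side could look like. Everything in it is an assumption for illustration: `demo_c_dgemm`, `Op`, `op_char` and `demo_host_gemm` are hypothetical names, not SPRAL's API. In a real build `demo_c_dgemm` would be a Fortran routine declared with `bind(C)` that forwards to DGEMM; here a naive C++ stand-in takes its place so the example runs without a BLAS library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical C-interoperable entry point. Characters and scalars are passed
// by value, so there are no hidden string-length arguments to get wrong.
// Naive stand-in body: C = alpha * op(A) * op(B) + beta * C, column-major.
extern "C" void demo_c_dgemm(char transa, char transb, int m, int n, int k,
                             double alpha, const double* a, int lda,
                             const double* b, int ldb,
                             double beta, double* c, int ldc) {
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p) {
                // op(X) is X for 'N' and the transpose of X otherwise.
                double aip = (transa == 'N') ? a[i + p * lda] : a[p + i * lda];
                double bpj = (transb == 'N') ? b[p + j * ldb] : b[j + p * ldb];
                sum += aip * bpj;
            }
            c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
        }
    }
}

// Hypothetical enum standing in for the operation argument of the C++ wrapper.
enum class Op { N, T };

inline char op_char(Op op) { return op == Op::N ? 'N' : 'T'; }

// C++ wrapper: translates the enum to a character and calls the by-value
// interface, with no reliance on compiler-specific hidden arguments.
inline void demo_host_gemm(Op ta, Op tb, int m, int n, int k, double alpha,
                           const double* a, int lda, const double* b, int ldb,
                           double beta, double* c, int ldc) {
    demo_c_dgemm(op_char(ta), op_char(tb), m, n, k,
                 alpha, a, lda, b, ldb, beta, c, ldc);
}
```

The trade-off of such a design is one extra call layer per BLAS routine, in exchange for an interface whose argument passing is fixed by the Fortran standard's C interoperability rules rather than by each compiler's convention.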
When running an application compiled with gfortran 4.9, I obtained the following runtime error:
At line 630 of file blas.f90
Fortran runtime error: Actual string length is shorter than the declared one for dummy argument 'transa' (0/1)
Error termination. Backtrace:
#0 0x2ac124d25f07 in ???
#1 0x2ac124d26a45 in ???
#2 0x2ac124d26dfa in ???
#3 0x81f05c in dgemm_
at blas.f90:630
#4 0x6fc646 in _ZN5spral5ssids3cpu9host_gemmIdEEvNS1_9operationES3_iiiT_PKS4_iS6_iS4_PS4_i
at wrappers.cxx:27
#5 0x6fbef4 in _ZN5spral5ssids3cpu15ldlt_tpp_factorEiiPiPdiS3_S3_ibddiS3_i
at ldlt_tpp.cxx:219
#6 0x6f5db9 in _ZN5spral5ssids3cpu17ldlt_app_internal5BlockIdLi32ENS1_14BuddyAllocatorIiSaIdEEEE6factorINS4_IdS5_EEEEiiPiPdRKNS1_18cpu_factor_optionsERSt6vectorINS1_9WorkspaceESaISG_EERKT_
at ldlt_app.cxx:990
#7 0x6f6334 in _ZN5spral5ssids3cpu17ldlt_app_internal4LDLTIdLi32ENS2_10CopyBackupIdNS1_14BuddyAllocatorIdSaIdEEEEELb0ELb0ES7_E24run_elim_pivoted_notasksEiiPiPdiSB_RNS2_10ColumnDataIdNS5_IiS6_EEEERS8_RKNS1_18cpu_factor_optionsEidSB_iRSt6vectorINS1_9WorkspaceESaISL_EERKS7_i
at ldlt_app.cxx:1570
#8 0x6f4b55 in _ZN5spral5ssids3cpu17ldlt_app_internal4LDLTIdLi32ENS2_10CopyBackupIdNS1_14BuddyAllocatorIdSaIdEEEEELb0ELb0ES7_E6factorEiiPiPdiSB_RS8_RKNS1_18cpu_factor_optionsENS1_11PivotMethodEidSB_iRSt6vectorINS1_9WorkspaceESaISI_EERKS7_
at ldlt_app.cxx:2297
#9 0x6f5af2 in _ZN5spral5ssids3cpu17ldlt_app_internal5BlockIdLi32ENS1_14BuddyAllocatorIiSaIdEEEE6factorINS4_IdS5_EEEEiiPiPdRKNS1_18cpu_factor_optionsERSt6vectorINS1_9WorkspaceESaISG_EERKT_
at ldlt_app.cxx:974
#10 0x6f820e in _ZN5spral5ssids3cpu17ldlt_app_internal4LDLTIdLi32ENS2_10CopyBackupIdNS1_14BuddyAllocatorIdSaIdEEEEELb1ELb0ES7_E24run_elim_pivoted_notasksEiiPiPdiSB_RNS2_10ColumnDataIdNS5_IiS6_EEEERS8_RKNS1_18cpu_factor_optionsEidSB_iRSt6vectorINS1_9WorkspaceESaISL_EERKS7_i
at ldlt_app.cxx:1570
#11 0x6fa3ac in _ZN5spral5ssids3cpu17ldlt_app_internal4LDLTIdLi32ENS2_10CopyBackupIdNS1_14BuddyAllocatorIdSaIdEEEEELb1ELb0ES7_E6factorEiiPiPdiSB_RS8_RKNS1_18cpu_factor_optionsENS1_11PivotMethodEidSB_iRSt6vectorINS1_9WorkspaceESaISI_EERKS7_
at ldlt_app.cxx:2297
#12 0x6fb260 in _ZN5spral5ssids3cpu15ldlt_app_factorIdNS1_14BuddyAllocatorIdSaIdEEEEEiiiPiPT_iS8_S7_S8_iRKNS1_18cpu_factor_optionsERSt6vectorINS1_9WorkspaceESaISD_EERKT0_
at ldlt_app.cxx:2453
#13 0x6e475b in ???
#14 0x6e299a in factor_node<false, double, spral::ssids::cpu::BuddyAllocator<double, std::allocator<double> > >
at ./ssids/cpu/factor.hxx:175
#15 0x6e299a in _ZN5spral5ssids3cpu14NumericSubtreeILb0EdLm8388608ENS1_11AppendAllocIdEEEC2ERKNS1_15SymbolicSubtreeEPKdSA_PPvRKNS1_18cpu_factor_optionsERNS1_11ThreadStatsE._omp_fn.6
at ssids/cpu/NumericSubtree.hxx:172
#16 0x2ac12534760e in ???
#17 0x2ac1253501db in ???
#18 0x6d01a8 in __spral_ssids_fkeep_MOD_inner_factor_cpu._omp_fn.1
at fkeep.f90:110
Lines 630-633 of the local file blas.f90 are
so maybe there is some confusion between a C and a Fortran character? The problem seems to vanish when linking against a system BLAS via -lblas.
Any ideas?
Nick