-
Notifications
You must be signed in to change notification settings - Fork 705
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enable running of LAPACK tests for recent OpenBLAS easyconfigs + add patch to fix failing LAPACK tests due to use of -ftree-vectorize #16406
Conversation
…max. number of failing tests due to numerical errors to 150
0bff8b7
to
8694579
Compare
@boegelbot please test @ generoso |
@boegel: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1278747855 processed Message to humans: this is just bookkeeping information for me, |
Test report by @boegelbot edit: failed due to:
|
@boegelbot please test @ generoso |
@boegel: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1279082958 processed Message to humans: this is just bookkeeping information for me, |
LAPACK test results from Intel Skylake X:
|
Test report by @boegelbot |
Test report by @boegel |
Why not use a patch based on OpenMathLib/OpenBLAS#3786 instead? That will only disable the vectorization for the lapack part, meaning that the OpenBLAS code itself will still be vectorized. Also, thinking about the future: are we going to backport the GCC patch to the GCCcore EasyConfigs? Because to be fair, Lapack might not be the only Fortran code that has been miscompiled due to this compiler bug. If we decide to backport the GCC patch, are we then changing the OpenBLAS EasyConfigs again to turn the vectorization back on (also for the Lapack part)? Don't get me wrong: I'm all for making sure that at least a fix makes it into the next EB release, and probably turning off vectorization is the most quick and 'sure' way to do that for now. Just wondering how we'll treat this in the longer run, and just want to stress that if this happens to Lapack, it might have happened to other codes. |
#16411 |
I justed tested OpenBLAS-0.3.20-GCC-11.3.0.eb with vectorization enabled, with patched GCC 11.3 and it gives me this:
(avx2 target, most are close eigenvalues). Will check what it gives with vect disabled as well. |
I could bring it down to
with -ftree-vectorize enabled, patching lapack as follows:
what's going on here? In the above loop I1 and I2 have different values depending on whether eigenvectors are computed or not. Hence the vectorization acts differently and its use of FMA instructions as well. The parentheses force a certain order that is compatible with non-FMA instructions here, and eigenvalues end up with the exact same binary value in the tests. @boegel: Let me know what you think, the hard work of figuring out why is done, now it's just admin and implementing patches in the right places etc. |
@bartoldeman Should that last patch be contributed to OpenBLAS, and maybe also to https://github.com/Reference-LAPACK/lapack/? Looks good though, we should apply this to these easyconfigs for sure, that seems like a better approach than disabling the use of |
Yes it should be going to reference lapack in some shape or form (or making the tests loser). Mathematically it is of course super innocent but it also relies on brackets disabling FMA, I'm not at all sure that is guaranteed in the future. Something for later this week. We can certainly have it in EB I think. |
@martin-frbg Any thoughts on the LAPACK patch that @bartoldeman put together? |
Yes please create upstream PRs for this if it is not too much bother @bartoldeman (weird that it should need the extra parentheses, wonder if that is another, more subtle gfortran bug) |
@martin-frbg now with parentheses you can force one of the above (e.g. the first one for However: this change only applies to smaller matrices, with can you actually double check your own OpenBLAS installation? I have failing tests here (even without any special optimizations):
do you get those too or do you have 0 failures? I think in the end is it a false expectation of the tests to expect exactly the same eigenvalues with or without eigenvectors, and the tests need adjusting, not random tricks avoiding FMA? |
Note I must stress that those failures look large but aren't in reality, it's just the test setting it to for
|
… max. number of failing LAPACK tests due to numerical errors to 10 + don't disable vectorize toolchain option
yes it's always the same eigenvalue test.
|
…100 for recent OpenBLAS easyconfigs, due to eigenvalue test being a bit too strict
@boegelbot please test @ generoso |
I want to get this PR merged soon now, so I'll go with option 2 for now (see c474ce7), we can follow up with a separatePR that adds an extra patch + lowers the max. failing test count back to |
I don't see c474ce7 yet but ok with me :) |
Sorry, forgot to actually push it 🤦♂️ |
@boegel: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1281127449 processed Message to humans: this is just bookkeeping information for me, |
Test report by @boegelbot |
Test report by @boegel |
@bartoldeman right, the lapack testsuite is (unfortunately) very much centered on itself although Reference-LAPACK supports building with an alternate BLAS implementation (as already evidenced by the STFSM test - I see you noticed that issue ticket). |
Test report by @boegel |
@boegel seems your build on cns1 is still not good, maybe raise to 200, or the original 150 again? |
…it more to 150 for recent OpenBLAS easyconfigs
@boegelbot please test @ generoso |
@boegel: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1281456729 processed Message to humans: this is just bookkeeping information for me, |
Test report by @boegel |
Test report by @boegelbot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Going to merge this. I'll do some further investigations tomorrow but this looks good to go now. |
Test report by @boegel |
Note: on KNL I get 243 errors in the LAPACK tests of OpenBLAS |
@akesandgren can you also attach |
@bartoldeman Here is the KNL results for |
Thanks @akesandgren. This seems to be caused by (probably FMAs in) asm code in the core OpenBLAS, as for TARGET=HASWELL I also get 134 failures even with None of them are worrying though, it's all the same strict eigenvalue mismatches. |
I submitted a patch OpenMathLib/OpenBLAS#3795 that improves the situation on Haswell (using FMA throughout, instead of only in the C code because our use of This should also improve the situation on KNL, though of course OpenBLAS isn't optimal there since it won't use AVX512 on KNL (you'll need to use MKL or BLIS there). For Skylake, perhaps accidentally, OpenBLAS uses a generic C kernel for |
oops yes the skx cscal/zscal looks accidental - change slipped in with an unrelated PR. thx. KNL is treated like haswell as in its prime time there was no competent avx512 dev/contributor around. the SKX kernels use avx512 extensions not present on KNL |
(created using
eb --new-pr
)cfr. #16380
requires:
By disabling the use of-ftree-vectorize
, failing LAPACK tests are dropping from:to
(results collected on an AMD Rome system)
edit: thanks to patch from @bartoldeman (added in bb0269a) + patched GCC (see #16411), the number of failing LAPACK tests is now down to a handful...