Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to OpenBLAS 0.3.18 and make Sage fat binaries portable #32424

Closed
mkoeppe opened this issue Aug 26, 2021 · 98 comments
Closed

Upgrade to OpenBLAS 0.3.18 and make Sage fat binaries portable #32424

mkoeppe opened this issue Aug 26, 2021 · 98 comments

Comments

@mkoeppe
Copy link
Contributor

mkoeppe commented Aug 26, 2021

as reported in:

Related:

CC: @williamstein @kliem @dimpase @NathanDunfield @culler @slel @orlitzky @vbraun @maxale @sagetrac-tmonteil @jhpalmieri

Component: porting

Keywords: upgrade, openblas

Author: Isuru Fernando, Matthias Koeppe, Volker Braun

Branch/Commit: 2f5f195

Reviewer: Matthias Koeppe, Dima Pasechnik

Issue created by migration from https://trac.sagemath.org/ticket/32424

@mkoeppe mkoeppe added this to the sage-9.5 milestone Aug 26, 2021
@kliem
Copy link
Contributor

kliem commented Aug 26, 2021

comment:1

The signal error is caused by nullspaceMP, which is in the iml library.

Iml itself doesn't use architecture specific instructions, but the dependencies are.
But then again, OPENBLAS and gmp are already built with make DYNAMIC_ARCH=1 resp. --enable-fat. Both of them got updated between sage 9.0 and sage 9.3.

@mkoeppe

This comment has been minimized.

@NathanDunfield
Copy link
Contributor

comment:3

I suspect the cause is OpenBLAS. While this library is built with DYNAMIC_ARCH=1, there is still the non-performance critical code in it which will use whatever instructions are available on the machine at compile time unless you also set TARGET. See

OpenMathLib/OpenBLAS#3056

For the macOS app, Marc Culler had to set TARGET=CORE2 in addition to DYNAMIC_ARCH=1 so that it would run on a 2013 cylindrical Mac Pro.

@isuruf
Copy link
Member

isuruf commented Aug 26, 2021

Commit: 343d6a1

@isuruf
Copy link
Member

isuruf commented Aug 26, 2021

Author: Isuru Fernando

@isuruf
Copy link
Member

isuruf commented Aug 26, 2021

New commits:

cff0e2bset target for dynamic_arch
343d6a1AVX512 support is now much better

@isuruf
Copy link
Member

isuruf commented Aug 26, 2021

Branch: u/isuruf/openblas-target

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Aug 26, 2021

comment:6

should we also throw in an openblas upgrade? https://repology.org/projects/o/?inrepo=sagemath_stable

@isuruf
Copy link
Member

isuruf commented Aug 26, 2021

comment:7

Yes, that would be good to have. I didn't realize openblas was way behind.
Feel free to push to this branch

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Aug 26, 2021

comment:8

on macOS:

[openblas-0.3.17] ./spkg-install: line 55: conditional binary operator expected
[openblas-0.3.17] ./spkg-install: line 55: syntax error near `~='
[openblas-0.3.17] ./spkg-install: line 55: `    if [[ $os-$machine ~= darwin-x86_64 ]]; then'

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Aug 26, 2021

Changed branch from u/isuruf/openblas-target to u/mkoeppe/openblas-target

@slel
Copy link
Member

slel commented Aug 26, 2021

Changed commit from 343d6a1 to c8cbf1f

@slel
Copy link
Member

slel commented Aug 26, 2021

New commits:

c8cbf1fbuild/pkgs/openblas: Update to 0.3.17

@slel
Copy link
Member

slel commented Aug 26, 2021

Changed keywords from none to upgrade, openblas

@slel slel changed the title Sage 9.4 still creates non-portable binaries with SAGE_FAT_BINARY="yes" Upgrade to OpenBLAS 0.3.17 and make Sage fat binaries portable Aug 26, 2021
@isuruf
Copy link
Member

isuruf commented Aug 27, 2021

Changed branch from u/mkoeppe/openblas-target to u/isuruf/openblas-target

@isuruf
Copy link
Member

isuruf commented Aug 27, 2021

Changed commit from c8cbf1f to 1b03f2b

@isuruf
Copy link
Member

isuruf commented Aug 27, 2021

New commits:

bed527eset target for dynamic_arch
825f704AVX512 support is now much better
1b03f2bbuild/pkgs/openblas: Update to 0.3.17

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Aug 27, 2021

Changed author from Isuru Fernando to Isuru Fernando, Matthias Koeppe

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Aug 27, 2021

@vbraun
Copy link
Member

vbraun commented Oct 31, 2021

comment:71

The build log is identical to one with OPENBLAS_CONFIGURE="TARGET=PENRYN HAVE_AVX=1" except that the latter doesn't segfault, so it must be in the config.h value

@vbraun
Copy link
Member

vbraun commented Oct 31, 2021

comment:72

config.h:

#define OS_DARWIN	1
#define ARCH_X86_64	1
#define C_CLANG	1
#define __64BIT__	1
#define FUNDERSCORE	_
#define HAVE_C11	1
#define PTHREAD_CREATE_FUNC	pthread_create
#define BUNDERSCORE	_
#define NEEDBUNDERSCORE	1
#define PENRYN
#define L1_CODE_SIZE 32768
#define L1_CODE_ASSOCIATIVE 8
#define L1_CODE_LINESIZE 64
#define L1_DATA_SIZE 32768
#define L1_DATA_ASSOCIATIVE 8
#define L1_DATA_LINESIZE 64
#define L2_SIZE 2097152
#define L2_ASSOCIATIVE 8
#define L2_LINESIZE 64
#define L3_SIZE 16777216
#define L3_ASSOCIATIVE 16
#define L3_LINESIZE 64
#define DTB_DEFAULT_ENTRIES 32
#define HAVE_CMOV
#define HAVE_MMX
#define HAVE_SSE
#define HAVE_SSE2
#define HAVE_SSE3
#define HAVE_SSSE3
#define HAVE_SSE4_1
#define HAVE_SSE4_2
#define HAVE_AVX
#define HAVE_CFLUSH
#define NUM_SHAREDCACHE 1
#define NUM_CORES 1
#define CORE_PENRYN
#define CHAR_CORENAME "PENRYN"
#define SLOCAL_BUFFER_SIZE	32768
#define DLOCAL_BUFFER_SIZE	16384
#define CLOCAL_BUFFER_SIZE	32768
#define ZLOCAL_BUFFER_SIZE	16384
#define GEMM_MULTITHREAD_THRESHOLD	4

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Oct 31, 2021

comment:73

My configuration (works):

Makefile.conf:

OSNAME=Darwin
ARCH=x86_64
C_COMPILER=CLANG
BINARY32=
BINARY64=1
FU=_
CEXTRALIB=-L/Users/mkoeppe/s/sage/sage-rebasing/.tox/local-macos-nohomebrew/local/lib  -lto_library -lSystem  /Library/Developer/CommandLineTools/usr/lib/clang/13.0.0/lib/darwin/libclang_rt.osx.a 
F_COMPILER=GFORTRAN
FC=gfortran
BU=_
FEXTRALIB=-L/Users/mkoeppe/s/sage/sage-rebasing/.tox/local-macos-nohomebrew/local/lib -L/Users/mkoeppe/s/sage/sage-rebasing/.tox/local-macos-nohomebrew/local/lib/gcc/x86_64-apple-darwin20.6.0/10.3.0 -L/Users/mkoeppe/s/sage/sage-rebasing/.tox/local-macos-nohomebrew/local/lib/gcc/x86_64-apple-darwin20.6.0/10.3.0/../../.. -L/Users/mkoeppe/s/sage/sage-rebasing/.tox/local-macos-nohomebrew/local/lib -L/Users/mkoeppe/s/sage/sage-rebasing/.tox/local-macos-nohomebrew/local/lib/gcc/x86_64-apple-darwin20.6.0/10.3.0 -L/Users/mkoeppe/s/sage/sage-rebasing/.tox/local-macos-nohomebrew/local/lib/gcc/x86_64-apple-darwin20.6.0/10.3.0/../../..  -lgfortran -lSystem -lquadmath -lm -lSystem -lgfortran -lSystem -lquadmath -lm -lSystem  
CORE=PENRYN
LIBCORE=penryn
NUM_CORES=12
PENRYN=1
L1_DATA_SIZE=32768
L1_DATA_LINESIZE=64
L2_SIZE=1048576
L2_LINESIZE=64
DTB_DEFAULT_ENTRIES=256
DTB_SIZE=4096
HAVE_CMOV=1
HAVE_MMX=1
HAVE_SSE=1
HAVE_SSE2=1
HAVE_SSE3=1
HAVE_SSSE3=1
HAVE_SSE4_1=1
MAKE += -j 12
SBGEMM_UNROLL_M=8
SBGEMM_UNROLL_N=4
SGEMM_UNROLL_M=8
SGEMM_UNROLL_N=4
DGEMM_UNROLL_M=4
DGEMM_UNROLL_N=4
QGEMM_UNROLL_M=2
QGEMM_UNROLL_N=2
CGEMM_UNROLL_M=4
CGEMM_UNROLL_N=2
ZGEMM_UNROLL_M=2
ZGEMM_UNROLL_N=2
XGEMM_UNROLL_M=1
XGEMM_UNROLL_N=1
CGEMM3M_UNROLL_M=8
CGEMM3M_UNROLL_N=4
ZGEMM3M_UNROLL_M=4
ZGEMM3M_UNROLL_N=4
XGEMM3M_UNROLL_M=2
XGEMM3M_UNROLL_N=2

config.h:

#define OS_DARWIN	1
#define ARCH_X86_64	1
#define C_CLANG	1
#define __64BIT__	1
#define FUNDERSCORE	_
#define HAVE_C11	1
#define PTHREAD_CREATE_FUNC	pthread_create
#define BUNDERSCORE	_
#define NEEDBUNDERSCORE	1
#define PENRYN
#define L1_DATA_SIZE 32768
#define L1_DATA_LINESIZE 64
#define L2_SIZE 1048576
#define L2_LINESIZE 64
#define DTB_DEFAULT_ENTRIES 256
#define DTB_SIZE 4096
#define HAVE_CMOV
#define HAVE_MMX
#define HAVE_SSE
#define HAVE_SSE2
#define HAVE_SSE3
#define HAVE_SSSE3
#define HAVE_SSE4_1
#define CORE_PENRYN
#define CHAR_CORENAME "PENRYN"
#define SLOCAL_BUFFER_SIZE	32768
#define DLOCAL_BUFFER_SIZE	16384
#define CLOCAL_BUFFER_SIZE	32768
#define ZLOCAL_BUFFER_SIZE	16384
#define GEMM_MULTITHREAD_THRESHOLD	4

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Oct 31, 2021

comment:74

Replying to @vbraun:

The build log is identical to one with OPENBLAS_CONFIGURE="TARGET=PENRYN HAVE_AVX=1" except that the latter doesn't segfault, so it must be in the config.h value

The files that you showed in comment:70, comment:72 have HAVE_AVX, so what's going on?

@vbraun
Copy link
Member

vbraun commented Oct 31, 2021

comment:75

PENRYN=1 is apparently not the same as CORE=PENRYN

I do have AVX (and building with TARGET=PENRYN HAVE_AVX=1 works fine):

  <qemu:commandline>
    <qemu:arg value="-device"/>
    <qemu:arg value="isa-applesmc,osk=ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc"/>
    <qemu:arg value="-smbios"/>
    <qemu:arg value="type=2"/>
    <qemu:arg value="-device"/>
    <qemu:arg value="vmware-svga"/>
    <qemu:arg value="-cpu"/>
    <qemu:arg value="Penryn,kvm=on,vendor=GenuineIntel,+invtsc,vmware-cpuid-freq=on,+pcid,+ssse3,+sse4.2,+popcnt,+avx,+aes,+xsave,+xsaveopt,check"/>
  </qemu:commandline>

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 1, 2021

comment:76

Replying to @mkoeppe:

Replying to @vbraun:

The build log is identical to one with OPENBLAS_CONFIGURE="TARGET=PENRYN HAVE_AVX=1" except that the latter doesn't segfault, so it must be in the config.h value

The files that you showed in comment:70, comment:72 have HAVE_AVX, so what's going on?

So when you pass TARGET=PENRYN HAVE_AVX=1, how do config.h and Makefile.conf change compared to what you posted in comment:70, comment:72?

@vbraun
Copy link
Member

vbraun commented Nov 1, 2021

comment:77
  • src-segfault is from running make
  • src-working is from running make TARGET=PENRYN HAVE_AVX=1

config.h

vbraun@catalina-osx openblas-test % diff -u src-segfault/config.h src-working/config.h
--- src-segfault/config.h	2021-11-01 15:01:57.000000000 -0700
+++ src-working/config.h	2021-11-01 15:02:16.000000000 -0700
@@ -8,19 +8,12 @@
 #define BUNDERSCORE	_
 #define NEEDBUNDERSCORE	1
 #define PENRYN
-#define L1_CODE_SIZE 32768
-#define L1_CODE_ASSOCIATIVE 8
-#define L1_CODE_LINESIZE 64
 #define L1_DATA_SIZE 32768
-#define L1_DATA_ASSOCIATIVE 8
 #define L1_DATA_LINESIZE 64
-#define L2_SIZE 2097152
-#define L2_ASSOCIATIVE 8
+#define L2_SIZE 1048576
 #define L2_LINESIZE 64
-#define L3_SIZE 16777216
-#define L3_ASSOCIATIVE 16
-#define L3_LINESIZE 64
-#define DTB_DEFAULT_ENTRIES 32
+#define DTB_DEFAULT_ENTRIES 256
+#define DTB_SIZE 4096
 #define HAVE_CMOV
 #define HAVE_MMX
 #define HAVE_SSE
@@ -28,11 +21,6 @@
 #define HAVE_SSE3
 #define HAVE_SSSE3
 #define HAVE_SSE4_1
-#define HAVE_SSE4_2
-#define HAVE_AVX
-#define HAVE_CFLUSH
-#define NUM_SHAREDCACHE 1
-#define NUM_CORES 1
 #define CORE_PENRYN
 #define CHAR_CORENAME "PENRYN"
 #define SLOCAL_BUFFER_SIZE	32768

Makefile.conf

vbraun@catalina-osx openblas-test % diff -u src-segfault/Makefile.conf src-working/Makefile.conf 
--- src-segfault/Makefile.conf	2021-11-01 15:01:57.000000000 -0700
+++ src-working/Makefile.conf	2021-11-01 15:02:16.000000000 -0700
@@ -12,14 +12,20 @@
 CORE=PENRYN
 LIBCORE=penryn
 NUM_CORES=8
+PENRYN=1
+L1_DATA_SIZE=32768
+L1_DATA_LINESIZE=64
+L2_SIZE=1048576
+L2_LINESIZE=64
+DTB_DEFAULT_ENTRIES=256
+DTB_SIZE=4096
+HAVE_CMOV=1
 HAVE_MMX=1
 HAVE_SSE=1
 HAVE_SSE2=1
 HAVE_SSE3=1
 HAVE_SSSE3=1
 HAVE_SSE4_1=1
-HAVE_SSE4_2=1
-HAVE_AVX=1
 MAKE += -j 8
 SBGEMM_UNROLL_M=8
 SBGEMM_UNROLL_N=4

@vbraun
Copy link
Member

vbraun commented Nov 1, 2021

comment:78

Its weird that HAVE_AVX isn't set in the working case, but I'm pretty sure AVX isn't the reason. In particular, building with OPENBLAS_CONFIGURE="NO_AVX=1" segfaults as well, and building with OPENBLAS_CONFIGURE="TARGET=PENRYN HAVE_AVX=1" works.

@vbraun
Copy link
Member

vbraun commented Nov 4, 2021

comment:79

The culprit seems to be

#define DTB_DEFAULT_ENTRIES 32

increasing it to 64, say, fixes the segfault

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 4, 2021

comment:80

Report upstream?

@vbraun
Copy link
Member

vbraun commented Nov 6, 2021

comment:81

I've made a branch at u/vbraun/packages/openblas/0.3.18-test that patches DTB_DEFAULT_ENTRIES=32 when forcing TARGET=PENRYN, with that I can reproduce the same segfault on linux when running

OPENBLAS_CONFIGURE="TARGET=PENRYN" ./sage -p openblas
./sage -t --long  src/sage/tests/books/computational-mathematics-with-sagemath/linsolve_doctest.py

@vbraun
Copy link
Member

vbraun commented Nov 6, 2021

comment:82
$ valgrind ./local/bin/python
[...]
>>> solve = factorized(Acsc) # LU factorization
==727003== Invalid write of size 8
==727003==    at 0x143AC900: dgemv_n (dgemv_n.S:233)
==727003==    by 0xFFFFFFFFFFFFFFFE: ???
==727003==    by 0x1F: ???
==727003==    by 0x1E: ???
==727003==    by 0xA75C8947: ???
==727003==    by 0x9F0E438F: ???
==727003==    by 0x4CF: ???
==727003==    by 0xBFEFFFFFFFFFFFFF: ???
==727003==    by 0xFFFFFFFFFFE00009: ???
==727003==    by 0x1F: ???
==727003==    by 0x9F0DAA87: ???
==727003==    by 0x98: ???
==727003==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Though dgemv_n.S hasn't been touched in 8 years ;)

@vbraun
Copy link
Member

vbraun commented Nov 6, 2021

comment:83

This is OpenMathLib/OpenBLAS#3421, patch is in the linked PR

@vbraun
Copy link
Member

vbraun commented Nov 6, 2021

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 6, 2021

Changed reviewer from Matthias Koeppe, Dima Pasechnik to Matthias Koeppe, Dima Pasechnik, https://github.com/mkoeppe/sage/actions/runs/1429845185, https://github.com/mkoeppe/sage/actions/runs/1429845182

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 6, 2021

comment:85

Testing with a bunch of other upgrade tickets


New commits:

2f5f195Add upstream fix for segfault in factorize

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 6, 2021

Changed commit from 18044f3 to 2f5f195

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 7, 2021

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 8, 2021

comment:87

Tests look good

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 8, 2021

Changed reviewer from Matthias Koeppe, Dima Pasechnik, https://github.com/mkoeppe/sage/actions/runs/1430376304, https://github.com/mkoeppe/sage/actions/runs/1430376300 to Matthias Koeppe, Dima Pasechnik

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 8, 2021

Changed author from Isuru Fernando, Matthias Koeppe to Isuru Fernando, Matthias Koeppe, Volker Braun

@vbraun
Copy link
Member

vbraun commented Nov 12, 2021

Changed branch from u/vbraun/packages/openblas/0.3.18 to 2f5f195

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants