OpenBLAS Extensions
Martin Kroeker edited this page Aug 11, 2020 · 10 revisions
- BLAS-like extensions
Routine | Data Types | Description |
---|---|---|
?gemm3m | c,z | complex matrix-matrix multiply (gemm) using the 3M algorithm |
?imatcopy | s,d,c,z | in-place transposition/copying |
?omatcopy | s,d,c,z | out-of-place transposition/copying |
?geadd | s,d,c,z | matrix add |
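
To make the table entries more concrete, here is a minimal plain-C sketch of what an out-of-place scaled transpose and a scaled matrix add compute. It assumes the usual conventions for such routines, B := alpha * A^T and C := alpha * A + beta * C, with column-major storage. The names `ref_omatcopy_trans` and `ref_geadd` are hypothetical helpers for illustration only; they are not OpenBLAS symbols and say nothing about how the library implements these operations.

```c
#include <assert.h>
#include <stddef.h>

/* Reference semantics of an out-of-place scaled transpose (assumed convention):
 * B := alpha * A^T.
 * A is rows x cols, column-major, leading dimension lda >= rows.
 * B is cols x rows, column-major, leading dimension ldb >= cols.
 * Hypothetical helper, not an OpenBLAS function. */
static void ref_omatcopy_trans(size_t rows, size_t cols, double alpha,
                               const double *a, size_t lda,
                               double *b, size_t ldb)
{
    assert(lda >= rows && ldb >= cols);
    for (size_t j = 0; j < cols; ++j) {
        for (size_t i = 0; i < rows; ++i) {
            /* B(j, i) = alpha * A(i, j) */
            b[j + i * ldb] = alpha * a[i + j * lda];
        }
    }
}

/* Reference semantics of a scaled matrix add (assumed convention):
 * C := alpha * A + beta * C, both rows x cols, column-major.
 * Hypothetical helper, not an OpenBLAS function. */
static void ref_geadd(size_t rows, size_t cols, double alpha,
                      const double *a, size_t lda,
                      double beta, double *c, size_t ldc)
{
    assert(lda >= rows && ldc >= rows);
    for (size_t j = 0; j < cols; ++j) {
        for (size_t i = 0; i < rows; ++i) {
            c[i + j * ldc] = alpha * a[i + j * lda] + beta * c[i + j * ldc];
        }
    }
}
```

An in-place variant (the ?imatcopy row) would perform the same transformation but overwrite the input buffer rather than writing to a separate B.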
- LAPACK-like extensions
  - SHGEMM: half-precision GEMM taking bfloat16 arguments.
    NOTE: the SH... naming scheme has been agreed on with Jack Dongarra of netlib.
- Utility functions
  - openblas_get_num_threads
  - openblas_set_num_threads
  - int openblas_get_num_procs(void)
    returns the number of processors available on the system (may include "hyperthreading cores")
  - int openblas_get_parallel(void)
    returns 0 for sequential use, 1 for platform-based threading and 2 for OpenMP-based threading
  - char * openblas_get_config()
    returns a build-configuration string, something like NO_LAPACKE DYNAMIC_ARCH NO_AFFINITY Haswell
  - int openblas_set_affinity(int thread_index, size_t cpusetsize, cpu_set_t *cpuset)
    sets the CPU affinity mask of the given thread to the provided cpuset. (Only available under Linux, with semantics identical to pthread_setaffinity_np)
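
As a rough illustration of how the values returned by the query functions above might be interpreted, the following sketch maps the documented `openblas_get_parallel` codes to labels and checks a configuration string for a given token. Both `parallel_mode_name` and `config_has_token` are hypothetical helpers introduced here, not part of OpenBLAS, and the sketch assumes the configuration string is a space-separated list of tokens as in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map the return value documented for openblas_get_parallel() to a label.
 * Hypothetical helper, not part of OpenBLAS. */
static const char *parallel_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "sequential";
    case 1:  return "platform threads";
    case 2:  return "OpenMP";
    default: return "unknown";
    }
}

/* Return 1 if `token` appears as a whole space-separated word in `config`
 * (for example the string returned by openblas_get_config()), else 0.
 * Hypothetical helper; assumes tokens are separated by single spaces. */
static int config_has_token(const char *config, const char *token)
{
    assert(config != NULL && token != NULL);
    size_t len = strlen(token);
    if (len == 0) {
        return 0;
    }
    const char *p = config;
    while ((p = strstr(p, token)) != NULL) {
        int starts_word = (p == config) || (p[-1] == ' ');
        int ends_word = (p[len] == '\0') || (p[len] == ' ');
        if (starts_word && ends_word) {
            return 1;
        }
        p += 1; /* keep searching after a partial match */
    }
    return 0;
}
```

In a program linked against OpenBLAS, these helpers would typically be fed the library's own results, for instance `parallel_mode_name(openblas_get_parallel())` or `config_has_token(openblas_get_config(), "DYNAMIC_ARCH")`.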