OpenBLAS Extensions

Martin Kroeker edited this page Aug 11, 2020 · 10 revisions
  • BLAS-like extensions
    Routine     Data Types   Description
    ?gemm3m     c,z          complex GEMM using the 3M algorithm (three real matrix multiplications instead of four)
    ?imatcopy   s,d,c,z      in-place transposition/copying
    ?omatcopy   s,d,c,z      out-of-place transposition/copying
    ?geadd      s,d,c,z      matrix add
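For orientation, the semantics of the out-of-place copy and the matrix add can be sketched in plain C. This is a reference sketch only, not OpenBLAS's implementation: the names `ref_domatcopy_rowmajor` and `ref_dgeadd_rowmajor` are hypothetical, only the double-precision, row-major case is covered, and the transpose option is reduced to an int flag. The assumed operations are B := alpha * op(A) for ?omatcopy and C := alpha * A + beta * C for ?geadd.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not part of OpenBLAS):
 * B := alpha * op(A) for a row-major rows x cols matrix A.
 * transpose == 0: op(A) = A,   B is rows x cols, ldb >= cols.
 * transpose != 0: op(A) = A^T, B is cols x rows, ldb >= rows. */
void ref_domatcopy_rowmajor(int transpose, size_t rows, size_t cols,
                            double alpha, const double *a, size_t lda,
                            double *b, size_t ldb)
{
    assert(a != NULL && b != NULL && a != b); /* out-of-place only */
    assert(lda >= cols);
    assert(ldb >= (transpose ? rows : cols));
    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            if (transpose)
                b[j * ldb + i] = alpha * a[i * lda + j];
            else
                b[i * ldb + j] = alpha * a[i * lda + j];
        }
    }
}

/* Hypothetical reference routine (not part of OpenBLAS):
 * C := alpha * A + beta * C for row-major rows x cols matrices. */
void ref_dgeadd_rowmajor(size_t rows, size_t cols,
                         double alpha, const double *a, size_t lda,
                         double beta, double *c, size_t ldc)
{
    assert(a != NULL && c != NULL);
    assert(lda >= cols && ldc >= cols);
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            c[i * ldc + j] = alpha * a[i * lda + j] + beta * c[i * ldc + j];
}
```

The corresponding CBLAS entry points in OpenBLAS should be `cblas_domatcopy` and `cblas_dgeadd`; check the `cblas.h` of your installed version for the exact argument lists.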
  • LAPACK-like extensions

    • SHGEMM half-precision GEMM taking bfloat16 arguments

    NOTE: the SH... naming scheme was agreed on with Jack Dongarra of Netlib.
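Since SHGEMM takes bfloat16 arguments, callers usually need to convert from single precision first. The sketch below shows one common way to do that, assuming IEEE-754 binary32 floats and bfloat16 values stored as raw 16-bit patterns (OpenBLAS's headers appear to define `bfloat16` as a 16-bit unsigned integer, but check your version). The helper names are hypothetical, and round-to-nearest-even is an assumption of this sketch, not something the page specifies for SHGEMM.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a float to its bfloat16 bit pattern.
 * bfloat16 keeps the sign, the 8 exponent bits and the top 7 mantissa
 * bits of a binary32 float, i.e. its upper 16 bits. */
uint16_t float_to_bf16_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);                 /* well-defined type pun */
    if ((u & 0x7fffffffu) > 0x7f800000u)      /* NaN: keep it a quiet NaN */
        return (uint16_t)((u >> 16) | 0x0040u);
    /* Round to nearest, ties to even, on the 16 discarded bits. */
    uint32_t rounding = 0x7fffu + ((u >> 16) & 1u);
    return (uint16_t)((u + rounding) >> 16);
}

/* Hypothetical helper: widen a bfloat16 bit pattern back to float.
 * This direction is exact, the low 16 mantissa bits are zero-filled. */
float bf16_bits_to_float(uint16_t h)
{
    uint32_t u = (uint32_t)h << 16;
    float x;
    memcpy(&x, &u, sizeof x);
    return x;
}
```

Values that need more than 8 significant bits lose precision on the way in, so inputs to a bfloat16 GEMM are typically scaled or chosen with that in mind.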

  • Utility functions

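    • int openblas_get_num_threads(void) returns the number of threads OpenBLAS currently uses
    • void openblas_set_num_threads(int num_threads) sets the number of threads OpenBLAS uses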
    • int openblas_get_num_procs(void) returns the number of processors available on the system (may include "hyperthreading cores")
    • int openblas_get_parallel(void) returns 0 for sequential use, 1 for platform-based threading and 2 for OpenMP-based threading
    • char * openblas_get_config(void) returns a string describing the build configuration, for example NO_LAPACKE DYNAMIC_ARCH NO_AFFINITY Haswell
    • int openblas_set_affinity(int thread_index, size_t cpusetsize, cpu_set_t *cpuset) sets the CPU affinity mask of the given thread to the provided cpuset (only available on Linux, with semantics identical to pthread_setaffinity_np)
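The return values described above are easy to misread in logs, so a small pair of helpers can make them more explicit. This is a minimal sketch: `describe_parallel_mode` and `config_has_token` are hypothetical names, not OpenBLAS functions, and the second helper assumes the configuration string is a list of space-separated tokens, as in the example shown for openblas_get_config.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map the value documented for
 * openblas_get_parallel() to a human-readable label. */
const char *describe_parallel_mode(int mode)
{
    switch (mode) {
    case 0:  return "sequential";
    case 1:  return "platform threads";
    case 2:  return "OpenMP";
    default: return "unknown";
    }
}

/* Hypothetical helper: return 1 if `token` occurs as a whole
 * space-separated word in `config`, 0 otherwise. */
int config_has_token(const char *config, const char *token)
{
    size_t len = strlen(token);
    if (len == 0)
        return 0;
    const char *p = config;
    while ((p = strstr(p, token)) != NULL) {
        int starts_word = (p == config) || (p[-1] == ' ');
        int ends_word = (p[len] == '\0') || (p[len] == ' ');
        if (starts_word && ends_word)
            return 1;
        p += 1; /* keep searching after a partial match */
    }
    return 0;
}
```

With OpenBLAS linked in, these would be called as `describe_parallel_mode(openblas_get_parallel())` and `config_has_token(openblas_get_config(), "DYNAMIC_ARCH")`. Matching whole tokens avoids false positives such as finding LAPACKE inside NO_LAPACKE.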