Releases
rocm-5.7.0
rocBLAS 3.1.0 for ROCm 5.7.0
Added
YAML lock-step argument scanning for the rocblas-bench and rocblas-test clients. See the Programmers Guide for details.
rocblas-gemm-tune is used to find the best performing GEMM kernel for each of a given set of GEMM problems.
Fixed
make offset calculations for rocBLAS functions 64-bit safe. Fixes potential overflow caused by very large leading dimensions or increments in:
Level 1: axpy, copy, rot, rotm, scal, swap, asum, dot, iamax, iamin, nrm2
Level 2: gemv, symv, hemv, trmv, ger, syr, her, syr2, her2, trsv
Level 3: gemm, symm, hemm, trmm, syrk, herk, syr2k, her2k, syrkx, herkx, trsm, trtri, dgmm, geam
General: set_vector, get_vector, set_matrix, get_matrix
Related fixes: internal scalar loads with offsets larger than 32 bits
fix in-place functionality for all trtri sizes
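The 64-bit offset fix above addresses a general class of bug that can be sketched in a few lines. The following is a minimal illustration, not rocBLAS source: `column_offset_32` and `column_offset_64` are hypothetical helpers that compute the element offset of a column in a column-major matrix with leading dimension `lda`, once in 32-bit arithmetic and once after widening to 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of rocBLAS): offset of column `col` computed
// in 32-bit arithmetic. Unsigned math is used so the wraparound is well
// defined for this demo; the result is the true product modulo 2^32.
constexpr std::int64_t column_offset_32(std::int32_t col, std::int32_t lda)
{
    return static_cast<std::int64_t>(
        static_cast<std::uint32_t>(col) * static_cast<std::uint32_t>(lda));
}

// Hypothetical helper (not part of rocBLAS): the same offset, with both
// operands widened to 64 bits before multiplying, so the product is exact.
constexpr std::int64_t column_offset_64(std::int32_t col, std::int32_t lda)
{
    return static_cast<std::int64_t>(col) * static_cast<std::int64_t>(lda);
}

// Small self-check comparing the two calculations.
inline void run_overflow_demo()
{
    const std::int32_t lda = 1 << 30; // a very large leading dimension

    // 5 * 2^30 does not fit in 32 bits, so the narrow version wraps.
    assert(column_offset_64(5, lda) == 5368709120LL);
    assert(column_offset_32(5, lda) == 1073741824LL);

    // For small inputs both versions agree.
    assert(column_offset_32(2, 100) == column_offset_64(2, 100));
}
```

Calling `run_overflow_demo()` exercises both paths. The wrapped offset would address the wrong element, which is the kind of failure that widening the offset arithmetic is meant to avoid.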
Changed
dot when using rocblas_pointer_mode_host is now synchronous, matching legacy BLAS, because it stores its result in host memory
enhanced reporting of installation issues caused by runtime libraries (Tensile)
standardized internal rocblas C++ interface across most functions
Deprecated
Removal of the __STDC_WANT_IEC_60559_TYPES_EXT__ define in a future release
Dependencies
optional use of AOCL BLIS 4.0 on Linux for clients
optional build-tool-only dependency on Python psutil