rocBLAS-2.22.0 for ROCm 3.5.0

@amdkila amdkila released this 10 Jul 22:50

Changelist

  • Added geam complex, geam_batched, and geam_strided_batched
  • Added dgmm, dgmm_batched, and dgmm_strided_batched

Optimized performance

  • ger

    • rocblas_sger, rocblas_dger
    • rocblas_sger_batched, rocblas_dger_batched
    • rocblas_sger_strided_batched, rocblas_dger_strided_batched
  • geru

    • rocblas_cgeru, rocblas_zgeru
    • rocblas_cgeru_batched, rocblas_zgeru_batched
    • rocblas_cgeru_strided_batched, rocblas_zgeru_strided_batched
  • gerc

    • rocblas_cgerc, rocblas_zgerc
    • rocblas_cgerc_batched, rocblas_zgerc_batched
    • rocblas_cgerc_strided_batched, rocblas_zgerc_strided_batched
  • symv

    • rocblas_ssymv, rocblas_dsymv, rocblas_csymv, rocblas_zsymv
    • rocblas_ssymv_batched, rocblas_dsymv_batched, rocblas_csymv_batched, rocblas_zsymv_batched
    • rocblas_ssymv_strided_batched, rocblas_dsymv_strided_batched, rocblas_csymv_strided_batched, rocblas_zsymv_strided_batched
  • sbmv

    • rocblas_ssbmv, rocblas_dsbmv
    • rocblas_ssbmv_batched, rocblas_dsbmv_batched
    • rocblas_ssbmv_strided_batched, rocblas_dsbmv_strided_batched
  • spmv

    • rocblas_sspmv, rocblas_dspmv
    • rocblas_sspmv_batched, rocblas_dspmv_batched
    • rocblas_sspmv_strided_batched, rocblas_dspmv_strided_batched
  • Improved documentation

  • Fixed argument checking in functions to match legacy BLAS

  • Fixed conjugate-transpose version of geam

Known failures

  • Compilation for GPU Targets
    • When using the install.sh script for "all" GPU targets, which is the default, you must first set the environment variable HCC_AMDGPU_TARGET to list the GPU targets, e.g. HCC_AMDGPU_TARGET=gfx803,gfx900,gfx906,gfx908
    • If building for one or more specific architectures using the -a | --architecture flag, you should also set the environment variable HCC_AMDGPU_TARGET to match.
    • If the environment variable does not match the architectures passed to the -a flag, the resulting build may cause segmentation faults when run on GPUs that were not specified.
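
The workaround above can be sketched as a small wrapper script. This is only an illustration, under these assumptions: targets_match is a hypothetical helper and not part of rocBLAS; a single target (gfx906) is used because the list format accepted by -a is not stated in these notes; and the install.sh call is left commented out so the sketch runs outside a rocBLAS checkout.

```shell
#!/bin/sh
# Hypothetical helper (not part of rocBLAS): succeeds only when the
# architecture string given as $1 is non-empty and identical to the
# current value of HCC_AMDGPU_TARGET.
targets_match() {
    [ -n "$1" ] && [ "$1" = "${HCC_AMDGPU_TARGET:-}" ]
}

# Example: build for one architecture taken from the list above.
ARCH=gfx906
HCC_AMDGPU_TARGET="$ARCH"
export HCC_AMDGPU_TARGET

if targets_match "$ARCH"; then
    echo "ok: would run ./install.sh -a $ARCH"
    # ./install.sh -a "$ARCH"    # uncomment inside a rocBLAS checkout
else
    echo "mismatch: HCC_AMDGPU_TARGET=${HCC_AMDGPU_TARGET:-unset}, -a $ARCH" >&2
    exit 1
fi
```

For the default "all" build, the same idea reduces to exporting the full list first, for example HCC_AMDGPU_TARGET=gfx803,gfx900,gfx906,gfx908, and then running install.sh without -a.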