hipBLASLt is a library that provides general matrix-matrix operations. Its flexible API extends functionality beyond that of a traditional BLAS library, adding flexibility in matrix data layouts, input types, compute types, and algorithmic implementations and heuristics.
Note
The published hipBLASLt documentation is available at hipBLASLt in an organized, easy-to-read format, with search and a table of contents. The documentation source files reside in the hipBLASLt/docs folder of this repository. As with all ROCm projects, the documentation is open source. For more information, see Contribute to ROCm documentation.
hipBLASLt uses the HIP programming language with an underlying optimized generator as its backend kernel provider.
After you specify a set of options for a matrix-matrix operation, you can reuse these for different inputs. The general matrix-multiply (GEMM) operation is performed by the hipblasLtMatmul API.
The equation is:
$$D = Activation(alpha \cdot op(A) \cdot op(B) + beta \cdot op(C) + bias)$$
Where op( ) refers to in-place operations, such as transpose and non-transpose, and alpha and beta are scalars.
The supported activation functions are GELU and ReLU. The bias vector has one entry per row of matrix D and is broadcast across all columns of D.
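To make the semantics concrete, here is a minimal pure-Python reference sketch. It is not hipBLASLt code. It assumes the operation has the form D = Activation(alpha * op(A) * op(B) + beta * C + bias), with the bias indexed by row of D as described above. The names `reference_matmul`, `trans_a`, and `trans_b` are hypothetical, C is used without a transpose for brevity, and the exact erf-based GELU used here may differ from the approximation the library applies.

```python
import math


def transpose(m):
    """Return the transpose of a row-major list-of-lists matrix."""
    return [list(row) for row in zip(*m)]


def relu(x):
    return max(0.0, x)


def gelu(x):
    # Exact erf-based GELU (assumption: the library may use an approximation).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def reference_matmul(a, b, c, alpha=1.0, beta=0.0, bias=None,
                     activation=None, trans_a=False, trans_b=False):
    """Compute activation(alpha * op(A) @ op(B) + beta * C + bias) on nested lists."""
    if trans_a:
        a = transpose(a)
    if trans_b:
        b = transpose(b)
    m, k, n = len(a), len(b), len(b[0])
    act = activation if activation is not None else (lambda x: x)
    d = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            value = alpha * acc + beta * c[i][j]
            if bias is not None:
                # One bias entry per row of D, broadcast across its columns.
                value += bias[i]
            row.append(act(value))
        d.append(row)
    return d
```

For example, calling `reference_matmul` with A = [[1, 2], [3, 4]], B the 2x2 identity, C all ones, alpha = beta = 1, bias = [-10, 0], and `relu` gives [[0, 0], [4, 5]]: the first row is pushed negative by its bias entry and clamped to zero.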
The following table provides data type support. Note that fp8 and bf8 are only supported on the gfx94x platform.
A | B | C | D | Compute(Scale) |
---|---|---|---|---|
fp32 | fp32 | fp32 | fp32 | fp32 |
fp16 | fp16 | fp16 | fp16 | fp32 |
fp16 | fp16 | fp16 | fp32 | fp32 |
bf16 | bf16 | bf16 | bf16 | fp32 |
fp8/bf8 | fp8/bf8 | fp32 | fp32 | fp32 |
fp8/bf8 | fp8/bf8 | fp16 | fp16 | fp32 |
fp8/bf8 | fp8/bf8 | bf16 | bf16 | fp32 |
fp8/bf8 | fp8/bf8 | fp8 | fp8 | fp32 |
fp8/bf8 | fp8/bf8 | bf8 | bf8 | fp32 |
int8 | int8 | int8 | int8 | int32 |
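The table can be read as a lookup from the (A, B, C, D) types to the compute type. The sketch below encodes it that way as an illustration only; it does not query the library. It assumes that "fp8/bf8" in the A and B columns means each operand may independently be fp8 or bf8. The names `compute_type` and `requires_gfx94x` are hypothetical.

```python
# 8-bit float types, which the note above restricts to the gfx94x platform.
FP8_TYPES = {"fp8", "bf8"}

# Rows of the table whose A and B types are not fp8/bf8.
SUPPORTED = {
    ("fp32", "fp32", "fp32", "fp32"): "fp32",
    ("fp16", "fp16", "fp16", "fp16"): "fp32",
    ("fp16", "fp16", "fp16", "fp32"): "fp32",
    ("bf16", "bf16", "bf16", "bf16"): "fp32",
    ("int8", "int8", "int8", "int8"): "int32",
}

# (C, D) pairs listed for fp8/bf8 inputs; all of them use fp32 compute.
FP8_OUTPUTS = {
    ("fp32", "fp32"),
    ("fp16", "fp16"),
    ("bf16", "bf16"),
    ("fp8", "fp8"),
    ("bf8", "bf8"),
}


def compute_type(a, b, c, d):
    """Return the compute type for a type combination, or None if it is not listed."""
    if a in FP8_TYPES and b in FP8_TYPES:
        return "fp32" if (c, d) in FP8_OUTPUTS else None
    return SUPPORTED.get((a, b, c, d))


def requires_gfx94x(a, b, c, d):
    """True if any operand uses fp8 or bf8."""
    return any(t in FP8_TYPES for t in (a, b, c, d))
```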
Full documentation for hipBLASLt is available at rocm.docs.amd.com/projects/hipBLASLt.
Run the steps below to build documentation locally.
cd docs
pip3 install -r sphinx/requirements.txt
python3 -m sphinx -T -E -b html -d _build/doctrees -D language=en . _build/html
Alternatively, build with CMake:
cmake -DBUILD_DOCS=ON ...
To install hipBLASLt, you must meet the following requirements:
Required hardware:
- gfx90a card
- gfx94x card
- gfx110x card
Required software:
- Git
- CMake 3.16.8 or later
- Python 3.7 or later
- python3.7-venv or later
- AMD ROCm, version 5.5 or later
- hipBLAS-common
- roctracer
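Some of the software requirements above can be checked before starting a build. The following is a rough sketch, not part of hipBLASLt: it covers only Git, CMake, and Python, and it assumes `cmake --version` prints a dotted version number. It does not detect ROCm, hipBLAS-common, or roctracer. All function names are hypothetical.

```python
import re
import shutil
import subprocess
import sys


def parse_version(text):
    """Extract the first dotted version number from text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare version tuples numerically, so that 3.9 sorts below 3.16."""
    return found is not None and found >= minimum


def tool_version(command):
    """Run a version command and parse its output; None if the tool is missing."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True)
    return parse_version(result.stdout + result.stderr)


def check_prerequisites():
    """Return a mapping of requirement description to whether it appears satisfied."""
    return {
        "git": shutil.which("git") is not None,
        "cmake >= 3.16.8": meets_minimum(tool_version(["cmake", "--version"]), (3, 16, 8)),
        "python >= 3.7": sys.version_info[:2] >= (3, 7),
    }
```

Calling `check_prerequisites()` returns a dictionary that can be printed or used to stop early. Versions are compared as integer tuples rather than strings because a string comparison would rank "3.9" above "3.16".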
You can build hipBLASLt using the install.sh script:
# Clone hipBLASLt using git
git clone https://github.com/ROCmSoftwarePlatform/hipBLASLt
# Go to hipBLASLt directory
cd hipBLASLt
# Run install.sh script
# Command line options:
# -h|--help - prints help message
# -i|--install - install after build
# -d|--dependencies - install build dependencies
# -c|--clients - build library clients too (combines with -i & -d)
# -g|--debug - build with debug flag
./install.sh -idc
NOTE: To build hipBLASLt for ROCm <= 6.2, pass the --legacy_hipblas_direct flag to install.sh.
All unit tests are located in build/release/clients/staging/. To build these tests, you must build hipBLASLt with --clients.
You can find more information at the following links:
If you want to submit an issue, you can do so on GitHub.
To contribute to our repository, you can create a GitHub pull request.