Home
To build the rocBLAS library and clients, see 1.Build.
For example code and a Makefile that use rocBLAS, see 2.Example.
For instructions on how to run/use the client code, see 3.Running.
rocBLAS exports the functions listed in 4.Exported functions.
See 5.Logging to set environment variables that make rocBLAS output logging information for each rocBLAS call. Output is streamed to standard error. Use logging only for diagnostics, as it slows down the code.
rocBLAS uses Tensile for its gemm functions. Tensile can be tuned for specific gemm sizes. rocBLAS includes default training, and most users will never need to train. Information on training is in 6.Train.
For information on device and stream management, see 7.Device and stream management.
TRSM involves division, and the triangular matrices may be ill-conditioned. For more information, see 8.Numerical stability in TRSM.
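To illustrate why division by small diagonal entries matters, here is a minimal CPU-only sketch of forward substitution for a lower-triangular system L * x = b. It is not rocBLAS code and does not reflect how rocBLAS implements TRSM. The function name `forward_substitute` and the dense row-major layout are assumptions made only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not part of rocBLAS).
// Solves L * x = b for a dense, row-major, lower-triangular n x n matrix L
// by forward substitution. Each row ends with a division by the diagonal
// entry L[i][i], so a tiny diagonal entry magnifies any error in the
// numerator.
std::vector<double> forward_substitute(const std::vector<double>& L,
                                       const std::vector<double>& b)
{
    const std::size_t n = b.size();
    std::vector<double> x(n);
    for(std::size_t i = 0; i < n; ++i)
    {
        double sum = b[i];
        // Subtract the contribution of the already-solved unknowns.
        for(std::size_t j = 0; j < i; ++j)
            sum -= L[i * n + j] * x[j];
        // The division that makes the solve sensitive to small diagonals.
        x[i] = sum / L[i * n + i];
    }
    return x;
}
```

With L = {1, 0, 1, 1e-8} (a 2 x 2 matrix stored row by row), the right-hand side {1, 1} gives x[1] = 0, while {1, 1 + 1e-8} gives x[1] close to 1. A change of about 1e-8 in the input moves the solution by about 1, which is the kind of sensitivity an ill-conditioned triangular matrix can produce.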
Environment variables that can be set to profile rocBLAS kernels are described in 9.Profile rocBLAS kernels.