Getting Started
The most important prerequisite for Gust is the CUDA toolkit, version 5.5. Download it from NVIDIA's website and follow the installation instructions given there.
Additionally, you need to install JCuda. SBT manages the jar files, but JCuda also ships native libraries that must be visible to your JDK. This can be achieved by making the CLASSPATH point to JCuda's directory (Windows, Linux/Mac OS).
You also need Scala 2.11.1 or later and SBT 0.13 or later.
Currently the only way to get Gust is by cloning it from GitHub.
$ git clone https://github.com/piotrMocz/gust.git
$ cd gust
$ sbt
> console
Additionally, you have to compile all the *.cu files needed by Gust. This can be done by running gen-ptx.sh (Linux/Mac OS) or gen-ptx.bat (Windows). If you get errors at this point, check the compute capability of your GPU: it must be 3.0 or higher for Gust to run.
Gust aims to make its use as transparent as possible for the Breeze user. This can be thought of as providing an additional data structure to Breeze -- a dense matrix residing on the GPU but accessed and manipulated as an ordinary DenseMatrix. However, to use most of the operations backed by Gust, you need to provide an implicit handle for the cuBLAS library.
import jcuda.jcublas.{cublasHandle, JCublas2}
implicit val handle = new cublasHandle
JCublas2.cublasCreate(handle)
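The snippet above creates the handle but never releases it. A minimal sketch of the full handle lifecycle follows. It assumes JCuda is on the classpath and a CUDA-capable GPU is present; `JCublas2.setExceptionsEnabled` and `JCublas2.cublasDestroy` are JCuda calls that the original text does not mention, and the object name `HandleLifecycle` is made up for illustration.

```scala
import jcuda.jcublas.{cublasHandle, JCublas2}

object HandleLifecycle {
  def main(args: Array[String]): Unit = {
    // Ask JCuda to throw exceptions instead of returning cuBLAS status codes.
    JCublas2.setExceptionsEnabled(true)

    // The implicit handle is what Gust operations pick up from the enclosing scope.
    implicit val handle: cublasHandle = new cublasHandle
    JCublas2.cublasCreate(handle)
    try {
      // Gust operations that need the implicit handle would go here.
    } finally {
      // Release the cuBLAS context once all GPU work is done.
      JCublas2.cublasDestroy(handle)
    }
  }
}
```

Wrapping the work in `try`/`finally` is one way to make sure the handle is destroyed even if an operation throws.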
The basic data structure in Gust is CuMatrix, an analog of Breeze's DenseMatrix. It supports Floats and Doubles, but some operations work for Ints and Longs as well. There are a few ways to create a new CuMatrix and most of them work the same way as in Breeze:
Operation | Command |
---|---|
From a DenseMatrix | CuMatrix.fromDense[Double](denseMatrix) |
Empty matrix | CuMatrix.create[Double](m, n) |
Matrix with zeros | CuMatrix.zeros[Double](m, n) |
Matrix with ones | CuMatrix.ones[Double](m, n) |
Random matrix | CuMatrix.rand(m, n) |
Identity matrix | CuMatrix.eye[Double](n) |
For any non-standard matrix not covered above, it is best to create the matrix CPU-side with Breeze and then copy it to the GPU using the CuMatrix.fromDense method.
The GPU equivalent of DenseVector is CuVector, which can be created similarly to CuMatrix:
Operation | Command |
---|---|
From a DenseVector | CuVector.fromDense[Double](denseVector) |
Empty vector | CuVector.create[Double](n) |
Vector with zeros | CuVector.zeros[Double](n) |
Vector with ones | CuVector.ones[Double](n) |
Random vector | CuVector.rand(n) |
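As a rough illustration of the "build with Breeze, then copy" workflow described above, the sketch below constructs a matrix and a vector on the CPU and moves them to the GPU with the `fromDense` methods from the tables. The package path `gust.linalg.cuda` and the need for an implicit `cublasHandle` in scope are assumptions; the object name `CopyToGpu` is hypothetical.

```scala
import breeze.linalg.{DenseMatrix, DenseVector}
import gust.linalg.cuda.{CuMatrix, CuVector}   // assumed package path
import jcuda.jcublas.{cublasHandle, JCublas2}

object CopyToGpu {
  def main(args: Array[String]): Unit = {
    // Assumption: fromDense picks up the cuBLAS handle implicitly.
    implicit val handle: cublasHandle = new cublasHandle
    JCublas2.cublasCreate(handle)

    // A non-standard matrix built CPU-side: entry (i, j) = i + 10 * j.
    val dense = DenseMatrix.tabulate[Double](3, 3) { (i, j) => i + 10.0 * j }
    val gpuMatrix = CuMatrix.fromDense[Double](dense)

    // The same pattern for vectors.
    val denseVec = DenseVector(1.0, 2.0, 3.0)
    val gpuVector = CuVector.fromDense[Double](denseVec)

    JCublas2.cublasDestroy(handle)
  }
}
```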
The table below assumes that both a and b are CuMatrices, shaped appropriately for every operation.
Operation | Command |
---|---|
Element-wise copy | a := b |
Matrix addition | a + b |
Matrix inplace addition | a += b |
Matrix product | a * b |
Inplace matrix product | a *= b |
Element-wise product | a :* b |
Matrix/scalar multiplication | a * 2.0 |
Gust also supports many element-wise operations such as max, min, sin and cos.
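The operators in the table can be combined as in the following sketch. It uses only the constructors and operators listed in this document; the package path `gust.linalg.cuda` and the object name `CuMatrixOps` are assumptions, and a CUDA-capable GPU is required to run it.

```scala
import gust.linalg.cuda.CuMatrix               // assumed package path
import jcuda.jcublas.{cublasHandle, JCublas2}

object CuMatrixOps {
  def main(args: Array[String]): Unit = {
    implicit val handle: cublasHandle = new cublasHandle
    JCublas2.cublasCreate(handle)

    val a = CuMatrix.ones[Double](2, 2)
    val b = CuMatrix.eye[Double](2)

    val sum      = a + b     // matrix addition
    val product  = a * b     // matrix product
    val hadamard = a :* b    // element-wise product
    val scaled   = a * 2.0   // matrix/scalar multiplication

    a += b                   // in-place addition: a now holds a + b
    a := b                   // element-wise copy of b into a

    JCublas2.cublasDestroy(handle)
  }
}
```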
Right now Gust supports only a small part of the operations available in Breeze, but they are used in the exact same way. (The table below assumes that a is a general CuMatrix, b is a CuMatrix representing a column vector, i.e. b.cols == 1, and v is a CuVector.)
Operation | Command |
---|---|
Transposition | a.t |
Linear solve | a \ b |
Determinant | det(a) |
Trace | trace(a) |
Condition number | cond(a) |
Frobenius norm | norm(a) |
LU decomposition | LU(a) |
Cholesky decomposition | cholesky(a) // where a is SPD |
Singular Value Decomposition | svd(a) |
QR decomposition | qr(a) |
Construct a diagonal matrix from vector | diag(v) |
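Since these operations are described as working exactly as in Breeze, a CPU-side Breeze reference can be handy for checking GPU results. The sketch below uses plain Breeze `DenseMatrix` values only; it does not touch Gust, and the object name `BreezeReference` is made up. Under the document's description, replacing the inputs with appropriately shaped `CuMatrix` values should give the GPU equivalents.

```scala
import breeze.linalg._

object BreezeReference {
  def main(args: Array[String]): Unit = {
    // 2x2 system: a = [[4, 1], [2, 3]], b = [1, 2]^T (a column vector, b.cols == 1).
    val a = DenseMatrix((4.0, 1.0), (2.0, 3.0))
    val b = new DenseMatrix(2, 1, Array(1.0, 2.0))

    val x = a \ b          // linear solve: approximately [0.1, 0.6]^T
    val d = det(a)         // determinant: approximately 10
    val t = trace(a)       // trace: 7
    val at = a.t           // transposition

    println(s"x = ${x(0, 0)}, ${x(1, 0)}; det = $d; trace = $t; a.t(0, 1) = ${at(0, 1)}")
  }
}
```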
CuSparseMatrix is a sparse matrix stored on the GPU in the CSC (Compressed Sparse Column) format. It can be created in one of two ways: from a DenseMatrix, using the method CuSparseMatrix.fromDense, or from a CuMatrix, using the method CuSparseMatrix.fromCuMatrix.
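For readers unfamiliar with CSC, the following plain-Scala sketch shows what the format stores: the nonzero values in column order, the row index of each value, and a column-pointer array marking where each column starts. The `Csc` case class and `fromDense` helper are hypothetical and only illustrate the layout; they are not Gust's CuSparseMatrix and run entirely on the CPU.

```scala
import scala.collection.mutable.ArrayBuffer

object CscSketch {
  // Hypothetical container for illustration only.
  final case class Csc(rows: Int, cols: Int,
                       values: Array[Double], rowIdx: Array[Int], colPtr: Array[Int])

  // Convert an array-of-rows dense matrix to CSC by scanning column by column.
  def fromDense(m: Array[Array[Double]]): Csc = {
    val rows = m.length
    val cols = if (rows == 0) 0 else m(0).length
    val values = ArrayBuffer.empty[Double]
    val rowIdx = ArrayBuffer.empty[Int]
    val colPtr = new Array[Int](cols + 1)
    for (j <- 0 until cols) {
      colPtr(j) = values.length                 // where column j starts in values
      for (i <- 0 until rows if m(i)(j) != 0.0) {
        values += m(i)(j)
        rowIdx += i
      }
    }
    colPtr(cols) = values.length                // total number of nonzeros
    Csc(rows, cols, values.toArray, rowIdx.toArray, colPtr)
  }

  def main(args: Array[String]): Unit = {
    val dense = Array(
      Array(1.0, 0.0, 2.0),
      Array(0.0, 0.0, 3.0),
      Array(4.0, 5.0, 0.0))
    val csc = fromDense(dense)
    println(csc.values.mkString(", "))   // 1.0, 4.0, 5.0, 2.0, 3.0
    println(csc.rowIdx.mkString(", "))   // 0, 2, 2, 0, 1
    println(csc.colPtr.mkString(", "))   // 0, 2, 3, 5
  }
}
```

Column j occupies positions `colPtr(j)` until `colPtr(j + 1)` in `values` and `rowIdx`, which is why column-oriented operations are cheap in this layout.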
To create a sparse matrix, apart from a cublasHandle, you need to make an instance of cusparseHandle available. It is created just like the cublasHandle:
import jcuda.jcusparse.{cusparseHandle, JCusparse2}
implicit val sparseHandle = new cusparseHandle
JCusparse2.cusparseCreate(sparseHandle)
CuSparseMatrix currently features a subset of operations available for CuMatrix:
- transposition
- addition
- subtraction
- sparse matrix product
- sparse matrix-dense vector product
- linear solve
- norm
- LU and Cholesky factorizations
Note that both LU and Cholesky are incomplete factorizations.
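To illustrate what "incomplete" means in practice, here is a small CPU-side sketch of zero-fill incomplete Cholesky, IC(0), on a dense array. It is an assumption that Gust's factorizations are of the zero-fill kind; the code below is a generic textbook variant, not Gust's implementation, and all names in it are hypothetical. The factor L is only allowed nonzeros where A itself has them, so L * L^T reproduces A only approximately.

```scala
object IncompleteCholeskySketch {
  // IC(0): Cholesky restricted to the sparsity pattern of A (fill-in is dropped).
  // Assumes A is symmetric positive definite and that no pivot becomes non-positive.
  def ic0(a: Array[Array[Double]]): Array[Array[Double]] = {
    val n = a.length
    val l = Array.ofDim[Double](n, n)
    for (k <- 0 until n) {
      val diagSum = (0 until k).map(j => l(k)(j) * l(k)(j)).sum
      l(k)(k) = math.sqrt(a(k)(k) - diagSum)
      // Only update positions where A is nonzero; everything else stays zero.
      for (i <- k + 1 until n if a(i)(k) != 0.0) {
        val dot = (0 until k).map(j => l(i)(j) * l(k)(j)).sum
        l(i)(k) = (a(i)(k) - dot) / l(k)(k)
      }
    }
    l
  }

  // Dense product L * L^T, used only to inspect the approximation error.
  def timesTranspose(l: Array[Array[Double]]): Array[Array[Double]] = {
    val n = l.length
    Array.tabulate(n, n)((i, j) => (0 until n).map(k => l(i)(k) * l(j)(k)).sum)
  }

  def main(args: Array[String]): Unit = {
    // "Arrow" matrix: a full Cholesky factor would have fill-in at position (2, 1).
    val a = Array(
      Array(4.0, 1.0, 1.0),
      Array(1.0, 4.0, 0.0),
      Array(1.0, 0.0, 4.0))
    val l = ic0(a)
    val approx = timesTranspose(l)
    println(s"L(2)(1) = ${l(2)(1)}")                    // 0.0: the fill-in was dropped
    println(s"(L * L^T)(2)(1) = ${approx(2)(1)}")       // 0.25, whereas A(2)(1) is 0
  }
}
```

Incomplete factors like this are typically used as preconditioners for iterative solvers rather than as exact decompositions.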