Cuda-Matrix-Multiplication

A basic example of how to do matrix multiplication with CUDA and the cuBLAS library. For the example to work, you must have an Nvidia GPU that supports CUDA.
Multiplications are done using the cublasGemmEx function.

Overview

This simple app performs the basic matrix multiplication A x B = C, where the matrices have the following dimensions:

A: rowsA x rank
B: rank x colsB
C: rowsA x colsB
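
As a reference for the shapes above, here is a minimal single-threaded host sketch of the same product. The function name multiplyHost is hypothetical and is not taken from the repository. It assumes column-major storage (element (i, j) at index j * rows + i), which matches the access pattern used in this README.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the repository): C = A x B on the host.
// All matrices are column-major: element (i, j) lives at [j * rows + i].
std::vector<float> multiplyHost(const std::vector<float>& A,
                                const std::vector<float>& B,
                                std::size_t rowsA, std::size_t rank,
                                std::size_t colsB)
{
    std::vector<float> C(rowsA * colsB, 0.0f);
    for (std::size_t j = 0; j < colsB; ++j)        // column of B and C
    {
        for (std::size_t k = 0; k < rank; ++k)     // shared dimension
        {
            const float b = B[j * rank + k];       // B(k, j)
            for (std::size_t i = 0; i < rowsA; ++i)
            {
                C[j * rowsA + i] += A[k * rowsA + i] * b;  // C(i, j) += A(i, k) * B(k, j)
            }
        }
    }
    return C;
}
```

The innermost loop walks down a column, so consecutive iterations touch consecutive memory in both A and C.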

The matrices are 2D, but each one is stored in a flat array allocated through a single raw pointer:

float* A = new float[sizeA];

To access element (i, j), index the flat array in column-major order, which is the layout cuBLAS expects:

for (size_t i = 0; i < rows; ++i)
{
    for (size_t j = 0; j < cols; ++j)
    {
        cout << A[j * rows + i] << " ";
    }
    cout << endl;
}
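
The index arithmetic in the loop above can be wrapped in a small helper. This is only a sketch: the names idx and toString are hypothetical and do not come from the repository.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: flat index of element (i, j) in a column-major matrix.
inline std::size_t idx(std::size_t i, std::size_t j, std::size_t rows)
{
    return j * rows + i;
}

// Hypothetical helper: formats the matrix row by row, like the loop above.
std::string toString(const std::vector<float>& M, std::size_t rows, std::size_t cols)
{
    std::ostringstream out;
    for (std::size_t i = 0; i < rows; ++i)
    {
        for (std::size_t j = 0; j < cols; ++j)
        {
            out << M[idx(i, j, rows)] << " ";
        }
        out << "\n";
    }
    return out.str();
}
```

For example, the 2 x 3 matrix with rows (1 2 3) and (4 5 6) is stored as {1, 4, 2, 5, 3, 6}: the first column comes first, then the second, then the third.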

The input matrices can be either 16-bit or 32-bit floats. The result matrix is always 32-bit.

The difference in run time between the GPU (device) and the CPU (host, single-threaded) is very large.
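
One way to measure such a difference on the host side is a small wall-clock timer. This is a minimal sketch, not necessarily how the repository times its runs, and the name elapsedMs is hypothetical.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds.
template <typename F>
double elapsedMs(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

When timing the GPU path this way, keep in mind that cuBLAS calls and kernel launches can return before the device has finished, so the timed work should end with a synchronization such as cudaDeviceSynchronize().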

Nvidia also states that, starting with cuBLAS version 11, tensor cores are used automatically (link).

Build

Linux

Make sure you have all the dependencies required for CUDA to compile and run.

sudo apt install nvidia-cuda-toolkit

Build

make all

Run the executable

./main.out
