Matrix multiplication optimization in C (Advanced Architectures assignment @ UMinho)
The optimizations used were:
- matrix transposition
- blocked (tiled) memory access
- code vectorization
- CUDA
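
As a rough illustration of the first two items, here is a minimal sketch in C. It is not the code from this repository: the function names, the `float` element type, the row-major layout and the square `n x n` shape are all assumptions made for the example. The baseline reads `b` with a stride of `n`, the transposed variant reads both operands with unit stride, and the blocked variant works on `bs x bs` tiles so that data is reused while it is still in cache. Vectorization and CUDA are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: c = a * b for n x n row-major matrices.
 * The inner loop reads b column-wise, i.e. with stride n. */
void matmul_naive(size_t n, const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Writes the transpose of src into dst (both n x n, row-major). */
void transpose(size_t n, const float *src, float *dst)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Transposed variant: bt must hold b transposed, so the inner loop
 * walks a row of a and a row of bt, both with unit stride. */
void matmul_transposed(size_t n, const float *a, const float *bt, float *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * bt[j * n + k];
            c[i * n + j] = sum;
        }
    }
}

/* Blocked variant: iterates over bs x bs tiles; n does not need to be
 * a multiple of bs because each tile is clipped at the matrix edge. */
void matmul_blocked(size_t n, size_t bs, const float *a, const float *b,
                    float *c)
{
    assert(bs > 0);

    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0f;

    for (size_t ii = 0; ii < n; ii += bs) {
        size_t i_end = (ii + bs < n) ? ii + bs : n;
        for (size_t kk = 0; kk < n; kk += bs) {
            size_t k_end = (kk + bs < n) ? kk + bs : n;
            for (size_t jj = 0; jj < n; jj += bs) {
                size_t j_end = (jj + bs < n) ? jj + bs : n;
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        float aik = a[i * n + k];
                        /* Unit-stride accesses to both b and c. */
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

All three functions compute the same product, so the naive version can serve as a correctness reference when timing the others. The best block size depends on the cache sizes of the target machine and has to be found by measurement.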
All performance analyses were performed on the SeARCH cluster @ UMinho
This project is licensed under the MIT License - see the LICENSE file for details