perf-ninja/labs/memory_bound/loop_tiling_1 at main · dendibakh/perf-ninja

CMakeLists.txt
README.md
bench.cpp
init.cpp
solution.cpp
solution.hpp
validate.cpp

README.md

Loop tiling (also called blocking) is a technique for speeding up code that works with multi-dimensional arrays. It helps most when one of the memory access patterns on your array is column-wise, or when the loop accesses the same data several times. It is commonly applied to matrix multiplication and matrix rotation.

Every time the CPU loads an element of a matrix, it fetches the whole cache line that contains it, which also holds a few neighboring elements of the same row. If the matrix is big and you access it column-wise, your code may suffer from poor cache utilization: by the time you access the second element of the first row, it is no longer in the cache, because its cache line has been replaced by cache lines holding elements from other rows of the matrix.
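To make the two access patterns concrete, here is a minimal sketch. It assumes a square `n` x `n` matrix of `float` stored row-major in a single `std::vector`; the function names `sumRowWise` and `sumColumnWise` are hypothetical and are not taken from this lab's sources. Both functions compute the same sum and differ only in traversal order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: the matrix is n x n, stored row-major in one contiguous
// buffer, so element (row, col) lives at index row * n + col.

// Row-wise traversal: consecutive iterations read adjacent addresses,
// so every element of a fetched cache line is used before moving on.
double sumRowWise(const std::vector<float>& m, std::size_t n) {
  assert(m.size() == n * n);
  double sum = 0.0;
  for (std::size_t row = 0; row < n; ++row)
    for (std::size_t col = 0; col < n; ++col)
      sum += m[row * n + col];
  return sum;
}

// Column-wise traversal: consecutive iterations read addresses that are
// n elements apart, so each access lands on a different cache line.
double sumColumnWise(const std::vector<float>& m, std::size_t n) {
  assert(m.size() == n * n);
  double sum = 0.0;
  for (std::size_t col = 0; col < n; ++col)
    for (std::size_t row = 0; row < n; ++row)
      sum += m[row * n + col];
  return sum;
}
```

For a large `n`, the column-wise version is the one likely to show the cache behavior described above; how large the gap is depends on the machine and should be measured rather than assumed.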

So, instead of going through the whole matrix at once, you can split it into small chunks that fit entirely into the CPU cache. By processing the matrix in blocks (tiles), you reuse elements while they are still in the CPU cache, which gives your code a speed boost. The right value for TILE_SIZE has to be found experimentally, since it depends on both the hardware architecture and the algorithm itself. Hint: you can use Roofline performance analysis (in Intel Advisor or other tools) to determine what is limiting the performance of the loop.
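The idea can be sketched with a matrix transposition, chosen here only as an illustration and not necessarily the operation this lab asks you to optimize. The sketch assumes a square row-major matrix in a `std::vector<float>`; the function names and the value `16` for `TILE_SIZE` are placeholders, not recommendations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: 16 is an arbitrary starting point; tune it by measuring.
constexpr std::size_t TILE_SIZE = 16;

// Baseline: reads `in` row-wise but writes `out` column-wise.
void transposeNaive(const std::vector<float>& in, std::vector<float>& out,
                    std::size_t n) {
  assert(in.size() == n * n && out.size() == n * n);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      out[j * n + i] = in[i * n + j];
}

// Tiled version: the two outer loops pick a TILE_SIZE x TILE_SIZE block,
// the two inner loops process it completely before moving on, so the
// cache lines touched in `in` and `out` are reused while still cached.
void transposeTiled(const std::vector<float>& in, std::vector<float>& out,
                    std::size_t n) {
  assert(in.size() == n * n && out.size() == n * n);
  for (std::size_t ii = 0; ii < n; ii += TILE_SIZE) {
    for (std::size_t jj = 0; jj < n; jj += TILE_SIZE) {
      // Clamp the tile bounds so n need not be a multiple of TILE_SIZE.
      const std::size_t iEnd = std::min(ii + TILE_SIZE, n);
      const std::size_t jEnd = std::min(jj + TILE_SIZE, n);
      for (std::size_t i = ii; i < iEnd; ++i)
        for (std::size_t j = jj; j < jEnd; ++j)
          out[j * n + i] = in[i * n + j];
    }
  }
}
```

Both functions produce the same output; only the order of memory accesses changes. Whether the tiled version is actually faster, and at which `TILE_SIZE`, has to be confirmed by benchmarking on the target machine.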

Authored-by: @ibogosavljevic
