Minor Additions to Enable Tiling and Explicit Memory Movement Transformations #1636

Open · wants to merge 10 commits into base: main
Conversation

@ThrudPrimrose (Collaborator) commented on Sep 3, 2024

I made some minor additions that make implementing certain transformations easier for me. Below I explain each change and why I needed it.

  1. Add a gpu_force_syncthreads option that forces a call to __syncthreads in a map (changes in dace/codegen/targets/cuda.py and dace/sdfg/nodes.py).
  • I prefer to tile the work maps of a kernel (e.g., the K reduction in a sum-of-inner-products matrix multiplication) so that all newly created tiled maps lie inside the scope of the thread block map. When this is combined with shared memory, a __syncthreads call is needed within the thread block map, but code generation does not emit one for sequential maps nested inside a thread-block-scheduled map. This option lets me force that behavior (see the first sketch after this list).
  2. Add the skew option to the map tiling transformation.
  • Having every map start from 0 makes writing my transformations simpler, so I wanted the map tiling transformation to start the inner map at 0. The only way to get this behavior was to copy the skew parameter over from the strip mine transformation, since I would rather keep using map tiling instead of strip mine while still having the skew parameter (see the second sketch after this list).
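
A minimal sketch of how the first change might be used. The property name gpu_force_syncthreads comes from this PR; the kernel, the traversal, and the exact semantics of setting the flag are my assumptions for illustration, not code from the PR.

```python
import dace
from dace import dtypes
from dace.sdfg import nodes

N = dace.symbol('N')

@dace.program
def scale(a: dace.float64[N], b: dace.float64[N]):
    for i in dace.map[0:N]:
        b[i] = 2.0 * a[i]

sdfg = scale.to_sdfg()
sdfg.apply_gpu_transformations()

# Force a __syncthreads() at the end of every thread-block-scheduled map.
# (gpu_force_syncthreads is the property added by this PR; its exact
# behavior is assumed from the PR description.)
for node, _ in sdfg.all_nodes_recursive():
    if isinstance(node, nodes.MapEntry) and \
            node.map.schedule == dtypes.ScheduleType.GPU_ThreadBlock:
        node.map.gpu_force_syncthreads = True
```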
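A second sketch for the skew option on MapTiling. MapTiling and its tile_sizes option are existing DaCe API; the skew option is the parameter this PR copies over from StripMining, and passing it through the options dictionary is an assumption.

```python
import dace
from dace.transformation.dataflow import MapTiling

N = dace.symbol('N')

@dace.program
def double(a: dace.float64[N], b: dace.float64[N]):
    for i in dace.map[0:N]:
        b[i] = 2.0 * a[i]

sdfg = double.to_sdfg()

# Tile the map with a tile size of 32; with skew=True the inner (intra-tile)
# map is expected to start at 0 rather than at the tile offset.
sdfg.apply_transformations(MapTiling,
                           options={'tile_sizes': (32,), 'skew': True})
```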

@ThrudPrimrose marked this pull request as ready for review on October 15, 2024.