Minor Additions to Enable Tiling and Explicit Memory Movement Transformations #1636

Open · wants to merge 10 commits into base: main
Conversation

@ThrudPrimrose (Collaborator) commented on Sep 3, 2024

I made some minor additions that make implementing certain transformations easier for me. Below I explain each change and why I needed it.

  1. Add a gpu_force_syncthreads option that forces a call to __syncthreads in a map (changes in dace/codegen/targets/cuda.py and dace/sdfg/nodes.py).
  • I prefer to tile the work maps of a kernel (e.g., the K reduction in a sum-of-inner-products matrix multiplication) so that all newly created tiled maps lie inside the scope of the thread block map. When this is combined with shared memory, a __syncthreads call is needed within the thread block map, but code generation does not emit one for sequential maps nested inside a thread-block-scheduled map. This option lets me force that behavior (see the first sketch after this list).
  2. Add the skew option to the map tiling transformation.
  • Having every map start from 0 makes writing my transformations simpler, so I wanted the map tiling transformation to start the inner map at 0. The only way to get this behavior was to copy the skew parameter over from the strip mine transformation, since I would rather keep using map tiling instead of strip mine while still having the skew parameter (see the second sketch after this list).
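
A minimal sketch of how the first change might be used. The property name gpu_force_syncthreads comes from this PR; the kernel, the traversal, and the exact semantics of setting the flag are my assumptions for illustration, not code from the PR.

```python
import dace
from dace import dtypes
from dace.sdfg import nodes

N = dace.symbol('N')

@dace.program
def scale(a: dace.float64[N], b: dace.float64[N]):
    for i in dace.map[0:N]:
        b[i] = 2.0 * a[i]

sdfg = scale.to_sdfg()
sdfg.apply_gpu_transformations()

# Force a __syncthreads() at the end of every thread-block-scheduled map.
# (gpu_force_syncthreads is the property added by this PR; its exact
# behavior is assumed from the PR description.)
for node, _ in sdfg.all_nodes_recursive():
    if isinstance(node, nodes.MapEntry) and \
            node.map.schedule == dtypes.ScheduleType.GPU_ThreadBlock:
        node.map.gpu_force_syncthreads = True
```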
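A second sketch for the skew option on MapTiling. MapTiling and its tile_sizes option are existing DaCe API; the skew option is the parameter this PR copies over from StripMining, and passing it through the options dictionary is an assumption.

```python
import dace
from dace.transformation.dataflow import MapTiling

N = dace.symbol('N')

@dace.program
def double(a: dace.float64[N], b: dace.float64[N]):
    for i in dace.map[0:N]:
        b[i] = 2.0 * a[i]

sdfg = double.to_sdfg()

# Tile the map with a tile size of 32; with skew=True the inner (intra-tile)
# map is expected to start at 0 rather than at the tile offset.
sdfg.apply_transformations(MapTiling,
                           options={'tile_sizes': (32,), 'skew': True})
```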

@ThrudPrimrose marked this pull request as ready for review on October 15, 2024.