Investigate when barriers need to be generated #18

Closed
Bastacyclop opened this issue Jan 14, 2020 · 2 comments
Labels
bug (Something isn't working), generated code (Changes to the code that our compiler generates)

Comments

@Bastacyclop
Member

Currently, we generate a barrier after every parForLocal, and nowhere else. This is probably more than necessary in many cases and may be insufficient in others.
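To make the question concrete, here is a minimal sketch in plain C of one possible, more selective rule: fence a memory space only if the loop wrote to it and a different work-item may read that data afterwards. Everything in it (`LoopEffects`, `required_fence`, the `FENCE_*` constants) is hypothetical and only stands in for whatever the compiler actually tracks; it is not the existing implementation, and it ignores other hazards such as write-after-read.

```c
/* Hypothetical stand-ins for CLK_LOCAL_MEM_FENCE / CLK_GLOBAL_MEM_FENCE. */
enum { FENCE_NONE = 0, FENCE_LOCAL = 1, FENCE_GLOBAL = 2 };

/* Hypothetical summary of the memory effects of one parallel loop. */
typedef struct {
    int writes_local;          /* loop writes to local memory */
    int local_read_by_others;  /* another work-item reads that data later */
    int writes_global;         /* loop writes to global memory */
    int global_read_by_others; /* another work-item reads that data later */
} LoopEffects;

/* Returns the fence flags needed after the loop, or FENCE_NONE when no
   barrier is needed at all under this simplified rule. */
int required_fence(LoopEffects e) {
    int flags = FENCE_NONE;
    if (e.writes_local && e.local_read_by_others) {
        flags |= FENCE_LOCAL;
    }
    if (e.writes_global && e.global_read_by_others) {
        flags |= FENCE_GLOBAL;
    }
    return flags;
}
```

Under this rule, a loop that only fills a local buffer consumed by the next loop would get a local fence only, and a loop whose results are never read by another work-item would get no barrier.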

Bastacyclop added the bug label Jan 14, 2020
Bastacyclop added the generated code label Feb 13, 2020
bastian-koepcke pushed a commit that referenced this issue Sep 17, 2020
@Bastacyclop
Member Author

Bastacyclop commented Nov 23, 2020

This test generates too many barriers, which can lead to invalid results in #74:

// FIXME: does not pass in CI
ignore("harrisTileShiftInwardsWLParVecUnaligned(4) generates valid OpenCL") {
  import rise.openCL.DSL.{mapWorkGroup, mapLocal, toLocal}
  checkOCL(lowerOCL(
    ocl.harrisTileShiftInwardsPar(tileX, tileY, mapWorkGroup(_),
      ocl.harrisVecUnaligned2(4, mapLocal(_), toLocal))),
    LocalSize((4, 4)), GlobalSize((32, 32)))
}

The generated code looks like:

__kernel
void harris(global float* restrict output, int n0, int n1, const global float* restrict x0, local float* restrict x2694, local float* restrict x2801){
  /* mapWorkgroup */
  for (int wg_id_3073 = get_group_id(1);(wg_id_3073 < ((7 + n0) / 8));wg_id_3073 = (wg_id_3073 + get_num_groups(1))) {
    /* mapWorkgroup */
    for (int wg_id_3080 = get_group_id(0);(wg_id_3080 < ((3 + n1) / 8));wg_id_3080 = (wg_id_3080 + get_num_groups(0))) {
      /* mapLocal */
      for (int l_id_3082 = get_local_id(1);(l_id_3082 < 12);l_id_3082 = (l_id_3082 + get_local_size(1))) {
        /* mapLocal */
        for (int l_id_3087 = get_local_id(0);(l_id_3087 < 3);l_id_3087 = (l_id_3087 + get_local_size(0))) {
          /* oclReduceSeq */
          {
            // produce x2817 from x0
            vstore4(x2817, 0, (&(x2801[((4 * l_id_3087) + (14 * l_id_3082))])));
          }
        }
        
        // FIXME: this barrier should not be generated
        barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
      }
      
      // FIXME: this barrier should only fence local memory
      barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
      /* mapLocal */
      for (int l_id_3084 = get_local_id(1);(l_id_3084 < 10);l_id_3084 = (l_id_3084 + get_local_size(1))) {
        /* mapLocal */
        for (int l_id_3172 = get_local_id(0);(l_id_3172 < 3);l_id_3172 = (l_id_3172 + get_local_size(0))) {
          /* oclReduceSeq */
          {
            // produce x2748 from x2801
              // produce x2720 from x2801
              vstore4(x2748, 0, (&(x2694[((4 * l_id_3172) + (24 * l_id_3084))])));
              vstore4(x2720, 0, (&(x2694[((12 + (4 * l_id_3172)) + (24 * l_id_3084))])));
          }
        }
        
        // FIXME: this barrier should not be generated
        barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
      }
      
      // FIXME: this barrier should only fence local memory
      barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
      /* mapLocal */
      for (int l_id_3086 = get_local_id(1);(l_id_3086 < 8);l_id_3086 = (l_id_3086 + get_local_size(1))) {
        /* mapLocal */
        for (int l_id_3265 = get_local_id(0);(l_id_3265 < 2);l_id_3265 = (l_id_3265 + get_local_size(0))) {
          // produce output from x2694
        }
        
        // FIXME: this barrier should not be generated
        barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
      }
      
      // FIXME: this barrier should only fence local memory
      barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
    }
  }
}
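One plausible reason the barriers inside the outer mapLocal loops are harmful, and not just slow: OpenCL requires every work-item of a work-group to reach a barrier inside a loop on each iteration, so the loop must run the same number of times for all of them. The sketch below, in plain C, counts the iterations of a loop of the form `for (i = lid; i < bound; i += local_size)` for each local id. The helper names are made up for illustration, and whether this is the actual cause of the failure in #74 is an assumption.

```c
/* Iterations executed by work-item `lid` in
   for (i = lid; i < bound; i += local_size). */
int trip_count(int lid, int bound, int local_size) {
    if (lid >= bound) {
        return 0;
    }
    return (bound - lid + local_size - 1) / local_size;
}

/* Returns 1 if every work-item in [0, local_size) runs the loop the same
   number of times, so a barrier in the loop body is reached uniformly. */
int uniform_trip_count(int bound, int local_size) {
    int first = trip_count(0, bound, local_size);
    for (int lid = 1; lid < local_size; lid++) {
        if (trip_count(lid, bound, local_size) != first) {
            return 0;
        }
    }
    return 1;
}
```

With the local size of 4 used by the test, the loops bounded by 12 and 8 are uniform, but the loop bounded by 10 is not: local ids 0 and 1 run three iterations while ids 2 and 3 run two, so the barrier inside that loop is not reached by all work-items the same number of times.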

@Bastacyclop
Member Author

closed in favour of #224
