 for (int wg0 = get_group_id(1); wg0 < n; wg0 += get_num_groups(1)) {
   for (int wg1 = get_group_id(0); wg1 < m; wg1 += get_num_groups(0)) {
-    for (int l0 = get_local_id(1); l0 < o; l0 += get_local_size(1)) {
+    for (int l0 = get_local_id(1); l0 < ctt(o); l0 += get_local_size(1)) {
+      if (l0 < o) {
         for (int l1 = get_local_id(0); l1 < p; l1 += get_local_size(0)) {
           [...] // read from global input; write to local memory
         }
+      }
       barrier(CLK_LOCAL_MEM_FENCE);
+      if (l0 < o) {
         for (int l2 = get_local_id(0); l2 < p-2; l2 += get_local_size(0)) {
           [...] // read from local memory; write to global output
         }
+      }
       barrier(CLK_LOCAL_MEM_FENCE);
     }
   }
 }
Some work-items of a work-group might not enter the loop in line 3 (the l0 loop): a work-item whose get_local_id(1) is not less than o skips it entirely. As a result, the barriers inside this loop may be reached by only some of the work-items, leading to undefined behaviour.
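To make the divergence concrete, here is a minimal host-side sketch in plain C (not OpenCL, and not code generated by Shine) that counts how many times each work-item would execute the body of a strided loop, and therefore how many barriers it would reach. The helper names `barriers_reached` and `barriers_are_uniform` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Models one work-item running
 *   for (int l = local_id; l < bound; l += local_size) { ...; barrier(...); }
 * and returns how many loop iterations (and thus barriers) it executes. */
static int barriers_reached(int local_id, int local_size, int bound) {
    assert(local_size > 0 && local_id >= 0 && local_id < local_size);
    int count = 0;
    for (int l = local_id; l < bound; l += local_size) {
        count++;
    }
    return count;
}

/* Returns 1 if every work-item along this dimension reaches the same
 * number of barriers, and 0 if the counts diverge. */
static int barriers_are_uniform(int local_size, int bound) {
    int expected = barriers_reached(0, local_size, bound);
    for (int id = 1; id < local_size; id++) {
        if (barriers_reached(id, local_size, bound) != expected) {
            return 0;
        }
    }
    return 1;
}
```

With a local size of 4 and a bound of 6, work-items 0 and 1 run two iterations while work-items 2 and 3 run only one, so the counts diverge. With a bound of 8, all four work-items run two iterations.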
Previous work on Lift suffered from similar limitations and implemented a mix of compile-time and runtime checks to report the issue to the user. For Shine, an additional imperative DPIA pass could be implemented to fix the code as illustrated above:
(-): the buggy code that both Lift and Shine would generate.
(+): a potential fix, where the ctt function rounds a loop bound up to a multiple of the number of work-items involved.
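The issue does not define ctt, so the following is only a sketch of one way such a round-up could work, written as plain host-side C rather than OpenCL. It assumes the local size is passed explicitly as a second parameter (the kernel above calls ctt with one argument and would presumably read get_local_size(1) itself). `fixed_trip_count` is a hypothetical helper that models the trip count of the patched l0 loop.

```c
#include <assert.h>

/* Assumed behaviour of ctt: round bound up to the next multiple of
 * local_size, so the strided loop has the same trip count for every
 * work-item. */
static int ctt(int bound, int local_size) {
    assert(local_size > 0 && bound >= 0);
    return ((bound + local_size - 1) / local_size) * local_size;
}

/* Models one work-item running the patched loop
 *   for (int l = local_id; l < ctt(bound); l += local_size) { ... }
 * and returns how many iterations it executes. */
static int fixed_trip_count(int local_id, int local_size, int bound) {
    int count = 0;
    for (int l = local_id; l < ctt(bound, local_size); l += local_size) {
        count++;
    }
    return count;
}
```

For a bound of 6 and a local size of 4, the rounded bound is 8, so every work-item runs two iterations and reaches both barriers in each. The `if (l0 < o)` guards in the patch then keep the extra iterations from touching out-of-range elements.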
Issue related to #18 and remaining after #80.
In some cases, the generated barriers might not be encountered by all work-items. Example low-level Rise program and its generated OpenCL code: