-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MLIR][OpenMP] Make omp.taskloop into a loop wrapper #87253
Conversation
This patch introduces an operation intended to hold loop information associated to the `omp.distribute`, `omp.simdloop`, `omp.taskloop` and `omp.wsloop` operations. This is a stopgap solution to unblock work on transitioning these operations to becoming wrappers, as discussed in [this RFC](https://discourse.llvm.org/t/rfc-representing-combined-composite-constructs-in-the-openmp-dialect/76986). Long-term, this operation will likely be replaced by `omp.canonical_loop`, which is being designed to address missing support for loop transformations, etc.
This patch defines a common interface to be shared by all OpenMP loop wrapper operations. The main restrictions these operations must meet in order to be considered a wrapper are: - They contain a single region. - Their region contains a single block. - Their block only contains another loop wrapper or `omp.loop_nest` and a terminator. The new interface is attached to the `omp.parallel`, `omp.wsloop`, `omp.simdloop`, `omp.distribute` and `omp.taskloop` operations. It is not currently enforced that these operations meet the wrapper restrictions, which would break existing OpenMP loop-generating code. Rather, this will be introduced progressively in subsequent patches.
This patch updates the definition of `omp.taskloop` to enforce the restrictions of a wrapper operation.
@llvm/pr-subscribers-mlir-openmp @llvm/pr-subscribers-mlir Author: Sergio Afonso (skatrak) ChangesThis patch updates the definition of Patch is 22.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/87253.diff 4 Files Affected:
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index a7bf93deae2fb3..b7660d6cd0ea5d 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -985,10 +985,10 @@ def TaskOp : OpenMP_Op<"task", [AttrSizedOperandSegments,
}
def TaskloopOp : OpenMP_Op<"taskloop", [AttrSizedOperandSegments,
- AutomaticAllocationScope, RecursiveMemoryEffects,
- AllTypesMatch<["lowerBound", "upperBound", "step"]>,
+ AutomaticAllocationScope,
DeclareOpInterfaceMethods<LoopWrapperInterface>,
- ReductionClauseInterface]> {
+ RecursiveMemoryEffects, ReductionClauseInterface,
+ SingleBlockImplicitTerminator<"TerminatorOp">]> {
let summary = "taskloop construct";
let description = [{
The taskloop construct specifies that the iterations of one or more
@@ -996,21 +996,19 @@ def TaskloopOp : OpenMP_Op<"taskloop", [AttrSizedOperandSegments,
iterations are distributed across tasks generated by the construct and
scheduled to be executed.
- The `lowerBound` and `upperBound` specify a half-open range: the range
- includes the lower bound but does not include the upper bound. If the
- `inclusive` attribute is specified then the upper bound is also included.
- The `step` specifies the loop step.
-
- The body region can contain any number of blocks.
+ The body region can contain a single block which must contain a single
+ operation and a terminator. The operation must be another compatible loop
+ wrapper or an `omp.loop_nest`.
```
- omp.taskloop <clauses>
- for (%i1, %i2) : index = (%c0, %c0) to (%c10, %c10) step (%c1, %c1) {
- %a = load %arrA[%i1, %i2] : memref<?x?xf32>
- %b = load %arrB[%i1, %i2] : memref<?x?xf32>
- %sum = arith.addf %a, %b : f32
- store %sum, %arrC[%i1, %i2] : memref<?x?xf32>
- omp.terminator
+ omp.taskloop <clauses> {
+ omp.loop_nest (%i1, %i2) : index = (%c0, %c0) to (%c10, %c10) step (%c1, %c1) {
+ %a = load %arrA[%i1, %i2] : memref<?x?xf32>
+ %b = load %arrB[%i1, %i2] : memref<?x?xf32>
+ %sum = arith.addf %a, %b : f32
+ store %sum, %arrC[%i1, %i2] : memref<?x?xf32>
+ omp.yield
+ }
}
```
@@ -1087,11 +1085,7 @@ def TaskloopOp : OpenMP_Op<"taskloop", [AttrSizedOperandSegments,
created.
}];
- let arguments = (ins Variadic<IntLikeType>:$lowerBound,
- Variadic<IntLikeType>:$upperBound,
- Variadic<IntLikeType>:$step,
- UnitAttr:$inclusive,
- Optional<I1>:$if_expr,
+ let arguments = (ins Optional<I1>:$if_expr,
Optional<I1>:$final_expr,
UnitAttr:$untied,
UnitAttr:$mergeable,
@@ -1130,8 +1124,7 @@ def TaskloopOp : OpenMP_Op<"taskloop", [AttrSizedOperandSegments,
|`grain_size` `(` $grain_size `:` type($grain_size) `)`
|`num_tasks` `(` $num_tasks `:` type($num_tasks) `)`
|`nogroup` $nogroup
- ) `for` custom<LoopControl>($region, $lowerBound, $upperBound, $step,
- type($step), $inclusive) attr-dict
+ ) $region attr-dict
}];
let extraClassDeclaration = [{
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index 564c23201db4fd..c92e593c20dc45 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -1659,6 +1659,16 @@ LogicalResult TaskloopOp::verify() {
"the grainsize clause and num_tasks clause are mutually exclusive and "
"may not appear on the same taskloop directive");
}
+
+ if (!isWrapper())
+ return emitOpError() << "must be a loop wrapper";
+
+ if (LoopWrapperInterface nested = getNestedWrapper()) {
+ // Check for the allowed leaf constructs that may appear in a composite
+ // construct directly after TASKLOOP.
+ if (!isa<SimdLoopOp>(nested))
+ return emitError() << "only supported nested wrapper is 'omp.simdloop'";
+ }
return success();
}
diff --git a/mlir/test/Dialect/OpenMP/invalid.mlir b/mlir/test/Dialect/OpenMP/invalid.mlir
index 8f4103dabee5df..4e9f3b0cff064d 100644
--- a/mlir/test/Dialect/OpenMP/invalid.mlir
+++ b/mlir/test/Dialect/OpenMP/invalid.mlir
@@ -1561,10 +1561,11 @@ func.func @omp_cancellationpoint2() {
func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
%testmemref = "test.memref"() : () -> (memref<i32>)
// expected-error @below {{expected equal sizes for allocate and allocator variables}}
- "omp.taskloop"(%lb, %ub, %ub, %lb, %step, %step, %testmemref) ({
- ^bb0(%arg3: i32, %arg4: i32):
- "omp.terminator"() : () -> ()
- }) {operandSegmentSizes = array<i32: 2, 2, 2, 0, 0, 0, 0, 0, 1, 0, 0, 0>} : (i32, i32, i32, i32, i32, i32, memref<i32>) -> ()
+ "omp.taskloop"(%testmemref) ({
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ omp.yield
+ }
+ }) {operandSegmentSizes = array<i32: 0, 0, 0, 0, 0, 1, 0, 0, 0>} : (memref<i32>) -> ()
return
}
@@ -1574,10 +1575,11 @@ func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
%testf32 = "test.f32"() : () -> (!llvm.ptr)
%testf32_2 = "test.f32"() : () -> (!llvm.ptr)
// expected-error @below {{expected as many reduction symbol references as reduction variables}}
- "omp.taskloop"(%lb, %ub, %ub, %lb, %step, %step, %testf32, %testf32_2) ({
- ^bb0(%arg3: i32, %arg4: i32):
- "omp.terminator"() : () -> ()
- }) {operandSegmentSizes = array<i32: 2, 2, 2, 0, 0, 0, 2, 0, 0, 0, 0, 0>, reductions = [@add_f32]} : (i32, i32, i32, i32, i32, i32, !llvm.ptr, !llvm.ptr) -> ()
+ "omp.taskloop"(%testf32, %testf32_2) ({
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ omp.yield
+ }
+ }) {operandSegmentSizes = array<i32: 0, 0, 0, 2, 0, 0, 0, 0, 0>, reductions = [@add_f32]} : (!llvm.ptr, !llvm.ptr) -> ()
return
}
@@ -1585,12 +1587,12 @@ func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
%testf32 = "test.f32"() : () -> (!llvm.ptr)
- %testf32_2 = "test.f32"() : () -> (!llvm.ptr)
// expected-error @below {{expected as many reduction symbol references as reduction variables}}
- "omp.taskloop"(%lb, %ub, %ub, %lb, %step, %step, %testf32) ({
- ^bb0(%arg3: i32, %arg4: i32):
- "omp.terminator"() : () -> ()
- }) {operandSegmentSizes = array<i32: 2, 2, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0>, reductions = [@add_f32, @add_f32]} : (i32, i32, i32, i32, i32, i32, !llvm.ptr) -> ()
+ "omp.taskloop"(%testf32) ({
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ omp.yield
+ }
+ }) {operandSegmentSizes = array<i32: 0, 0, 0, 1, 0, 0, 0, 0, 0>, reductions = [@add_f32, @add_f32]} : (!llvm.ptr) -> ()
return
}
@@ -1600,10 +1602,11 @@ func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
%testf32 = "test.f32"() : () -> (!llvm.ptr)
%testf32_2 = "test.f32"() : () -> (!llvm.ptr)
// expected-error @below {{expected as many reduction symbol references as reduction variables}}
- "omp.taskloop"(%lb, %ub, %ub, %lb, %step, %step, %testf32, %testf32_2) ({
- ^bb0(%arg3: i32, %arg4: i32):
- "omp.terminator"() : () -> ()
- }) {in_reductions = [@add_f32], operandSegmentSizes = array<i32: 2, 2, 2, 0, 0, 2, 0, 0, 0, 0, 0, 0>} : (i32, i32, i32, i32, i32, i32, !llvm.ptr, !llvm.ptr) -> ()
+ "omp.taskloop"(%testf32, %testf32_2) ({
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ omp.yield
+ }
+ }) {in_reductions = [@add_f32], operandSegmentSizes = array<i32: 0, 0, 2, 0, 0, 0, 0, 0, 0>} : (!llvm.ptr, !llvm.ptr) -> ()
return
}
@@ -1611,12 +1614,12 @@ func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
%testf32 = "test.f32"() : () -> (!llvm.ptr)
- %testf32_2 = "test.f32"() : () -> (!llvm.ptr)
// expected-error @below {{expected as many reduction symbol references as reduction variables}}
- "omp.taskloop"(%lb, %ub, %ub, %lb, %step, %step, %testf32_2) ({
- ^bb0(%arg3: i32, %arg4: i32):
- "omp.terminator"() : () -> ()
- }) {in_reductions = [@add_f32, @add_f32], operandSegmentSizes = array<i32: 2, 2, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0>} : (i32, i32, i32, i32, i32, i32, !llvm.ptr) -> ()
+ "omp.taskloop"(%testf32) ({
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ omp.yield
+ }
+ }) {in_reductions = [@add_f32, @add_f32], operandSegmentSizes = array<i32: 0, 0, 1, 0, 0, 0, 0, 0, 0>} : (!llvm.ptr) -> ()
return
}
@@ -1638,9 +1641,10 @@ func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
%testf32 = "test.f32"() : () -> (!llvm.ptr)
%testf32_2 = "test.f32"() : () -> (!llvm.ptr)
// expected-error @below {{if a reduction clause is present on the taskloop directive, the nogroup clause must not be specified}}
- omp.taskloop reduction(@add_f32 -> %testf32 : !llvm.ptr, @add_f32 -> %testf32_2 : !llvm.ptr) nogroup
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- omp.terminator
+ omp.taskloop reduction(@add_f32 -> %testf32 : !llvm.ptr, @add_f32 -> %testf32_2 : !llvm.ptr) nogroup {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ omp.yield
+ }
}
return
}
@@ -1662,9 +1666,10 @@ combiner {
func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
%testf32 = "test.f32"() : () -> (!llvm.ptr)
// expected-error @below {{the same list item cannot appear in both a reduction and an in_reduction clause}}
- omp.taskloop reduction(@add_f32 -> %testf32 : !llvm.ptr) in_reduction(@add_f32 -> %testf32 : !llvm.ptr)
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- omp.terminator
+ omp.taskloop reduction(@add_f32 -> %testf32 : !llvm.ptr) in_reduction(@add_f32 -> %testf32 : !llvm.ptr) {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ omp.yield
+ }
}
return
}
@@ -1674,8 +1679,20 @@ func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
%testi64 = "test.i64"() : () -> (i64)
// expected-error @below {{the grainsize clause and num_tasks clause are mutually exclusive and may not appear on the same taskloop directive}}
- omp.taskloop grain_size(%testi64: i64) num_tasks(%testi64: i64)
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ omp.taskloop grain_size(%testi64: i64) num_tasks(%testi64: i64) {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ omp.yield
+ }
+ }
+ return
+}
+
+// -----
+
+func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
+ // expected-error @below {{op must be a loop wrapper}}
+ omp.taskloop {
+ %0 = arith.constant 0 : i32
omp.terminator
}
return
@@ -1683,6 +1700,21 @@ func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
// -----
+func.func @taskloop(%lb: i32, %ub: i32, %step: i32) {
+ // expected-error @below {{only supported nested wrapper is 'omp.simdloop'}}
+ omp.taskloop {
+ omp.distribute {
+ omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
+ omp.yield
+ }
+ omp.terminator
+ }
+ }
+ return
+}
+
+// -----
+
func.func @omp_threadprivate() {
%1 = llvm.mlir.addressof @_QFsubEx : !llvm.ptr
// expected-error @below {{op failed to verify that all of {sym_addr, tls_addr} have same type}}
diff --git a/mlir/test/Dialect/OpenMP/ops.mlir b/mlir/test/Dialect/OpenMP/ops.mlir
index 8d9acab67e0358..5b57bbc2cafda1 100644
--- a/mlir/test/Dialect/OpenMP/ops.mlir
+++ b/mlir/test/Dialect/OpenMP/ops.mlir
@@ -171,6 +171,23 @@ func.func @omp_loop_nest(%lb : index, %ub : index, %step : index) -> () {
omp.yield
}
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // CHECK: omp.loop_nest
+ // CHECK-SAME: (%{{.*}}) : index =
+ // CHECK-SAME: (%{{.*}}) to (%{{.*}}) step (%{{.*}})
+ "omp.loop_nest" (%lb, %ub, %step) ({
+ ^bb0(%iv2: index):
+ // CHECK: test.op1
+ "test.op1"(%lb) : (index) -> ()
+ // CHECK: test.op2
+ "test.op2"() : () -> ()
+ // CHECK: omp.yield
+ omp.yield
+ }) : (index, index, index) -> ()
+ omp.yield
+ }
+
return
}
@@ -209,6 +226,22 @@ func.func @omp_loop_nest_pretty(%lb : index, %ub : index, %step : index) -> () {
omp.yield
}
+ // TODO Remove induction variables from omp.wsloop.
+ omp.wsloop for (%iv) : index = (%lb) to (%ub) step (%step) {
+ // CHECK: omp.loop_nest
+ // CHECK-SAME: (%{{.*}}) : index =
+ // CHECK-SAME: (%{{.*}}) to (%{{.*}}) step (%{{.*}})
+ omp.loop_nest (%iv2) : index = (%lb) to (%ub) step (%step) {
+ // CHECK: test.op1
+ "test.op1"(%lb) : (index) -> ()
+ // CHECK: test.op2
+ "test.op2"() : () -> ()
+ // CHECK: omp.yield
+ omp.yield
+ }
+ omp.yield
+ }
+
return
}
@@ -1993,135 +2026,128 @@ func.func @omp_taskgroup_clauses() -> () {
// CHECK-LABEL: @omp_taskloop
func.func @omp_taskloop(%lb: i32, %ub: i32, %step: i32) -> () {
- // CHECK: omp.taskloop for (%{{.+}}) : i32 = (%{{.+}}) to (%{{.+}}) step (%{{.+}}) {
- omp.taskloop for (%i) : i32 = (%lb) to (%ub) step (%step) {
- // CHECK: omp.terminator
- omp.terminator
- }
-
- // CHECK: omp.taskloop for (%{{.+}}) : i32 = (%{{.+}}) to (%{{.+}}) step (%{{.+}}) {
- omp.taskloop for (%i) : i32 = (%lb) to (%ub) step (%step) {
- // CHECK: test.op1
- "test.op1"(%lb) : (i32) -> ()
- // CHECK: test.op2
- "test.op2"() : () -> ()
- // CHECK: omp.terminator
- omp.terminator
- }
-
- // CHECK: omp.taskloop for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- omp.taskloop for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
- }
-
- // CHECK: omp.taskloop for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) inclusive step (%{{.+}}, %{{.+}}) {
- omp.taskloop for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) inclusive step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop {
+ omp.taskloop {
+ omp.loop_nest (%i) : i32 = (%lb) to (%ub) step (%step) {
+ // CHECK: omp.yield
+ omp.yield
+ }
}
%testbool = "test.bool"() : () -> (i1)
- // CHECK: omp.taskloop if(%{{[^)]+}})
- // CHECK-SAME: for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- omp.taskloop if(%testbool)
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop if(%{{[^)]+}}) {
+ omp.taskloop if(%testbool) {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ // CHECK: omp.yield
+ omp.yield
+ }
}
- // CHECK: omp.taskloop final(%{{[^)]+}})
- // CHECK-SAME: for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- omp.taskloop final(%testbool)
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop final(%{{[^)]+}}) {
+ omp.taskloop final(%testbool) {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ // CHECK: omp.yield
+ omp.yield
+ }
}
- // CHECK: omp.taskloop untied
- // CHECK-SAME: for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- omp.taskloop untied
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop untied {
+ omp.taskloop untied {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ // CHECK: omp.yield
+ omp.yield
+ }
}
- // CHECK: omp.taskloop mergeable
- // CHECK-SAME: for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- omp.taskloop mergeable
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop mergeable {
+ omp.taskloop mergeable {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ // CHECK: omp.yield
+ omp.yield
+ }
}
%testf32 = "test.f32"() : () -> (!llvm.ptr)
%testf32_2 = "test.f32"() : () -> (!llvm.ptr)
- // CHECK: omp.taskloop in_reduction(@add_f32 -> %{{.+}} : !llvm.ptr, @add_f32 -> %{{.+}} : !llvm.ptr)
- // CHECK-SAME: for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- omp.taskloop in_reduction(@add_f32 -> %testf32 : !llvm.ptr, @add_f32 -> %testf32_2 : !llvm.ptr)
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop in_reduction(@add_f32 -> %{{.+}} : !llvm.ptr, @add_f32 -> %{{.+}} : !llvm.ptr) {
+ omp.taskloop in_reduction(@add_f32 -> %testf32 : !llvm.ptr, @add_f32 -> %testf32_2 : !llvm.ptr) {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ // CHECK: omp.yield
+ omp.yield
+ }
}
- // CHECK: omp.taskloop reduction(@add_f32 -> %{{.+}} : !llvm.ptr, @add_f32 -> %{{.+}} : !llvm.ptr)
- // CHECK-SAME: for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- omp.taskloop reduction(@add_f32 -> %testf32 : !llvm.ptr, @add_f32 -> %testf32_2 : !llvm.ptr)
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop reduction(@add_f32 -> %{{.+}} : !llvm.ptr, @add_f32 -> %{{.+}} : !llvm.ptr) {
+ omp.taskloop reduction(@add_f32 -> %testf32 : !llvm.ptr, @add_f32 -> %testf32_2 : !llvm.ptr) {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ // CHECK: omp.yield
+ omp.yield
+ }
}
- // CHECK: omp.taskloop in_reduction(@add_f32 -> %{{.+}} : !llvm.ptr) reduction(@add_f32 -> %{{.+}} : !llvm.ptr)
- // CHECK-SAME: for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- omp.taskloop in_reduction(@add_f32 -> %testf32 : !llvm.ptr) reduction(@add_f32 -> %testf32_2 : !llvm.ptr)
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop in_reduction(@add_f32 -> %{{.+}} : !llvm.ptr) reduction(@add_f32 -> %{{.+}} : !llvm.ptr) {
+ omp.taskloop in_reduction(@add_f32 -> %testf32 : !llvm.ptr) reduction(@add_f32 -> %testf32_2 : !llvm.ptr) {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ // CHECK: omp.yield
+ omp.yield
+ }
}
%testi32 = "test.i32"() : () -> (i32)
- // CHECK: omp.taskloop priority(%{{[^:]+}}: i32)
- // CHECK-SAME: for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- omp.taskloop priority(%testi32: i32)
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop priority(%{{[^:]+}}: i32) {
+ omp.taskloop priority(%testi32: i32) {
+ omp.loop_nest (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
+ // CHECK: omp.yield
+ omp.yield
+ }
}
%testmemref = "test.memref"() : () -> (memref<i32>)
- // CHECK: omp.taskloop allocate(%{{.+}} : memref<i32> -> %{{.+}} : memref<i32>)
- omp.taskloop allocate(%testmemref : memref<i32> -> %testmemref : memref<i32>)
- // CHECK-SAME: for (%{{.+}}, %{{.+}}) : i32 = (%{{.+}}, %{{.+}}) to (%{{.+}}, %{{.+}}) step (%{{.+}}, %{{.+}}) {
- for (%i, %j) : i32 = (%lb, %ub) to (%ub, %lb) step (%step, %step) {
- // CHECK: omp.terminator
- omp.terminator
+ // CHECK: omp.taskloop allocate(%{{.+}} : memref<i32> -> %{{.+}} : memref<i32>) {
+ omp.taskloop allocate(%te...
[truncated]
|
…/spr/loop-nest-02-wrapper-iface
…s/skatrak/spr/loop-nest-03-taskloop-mlir
This PR is now up to date and ready to land. I'll wait one more day to give time in case someone else wants to share any further feedback. Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This patch updates the definition of
omp.taskloop
to enforce the restrictions of a wrapper operation.