RemoveSingleIterationLoop crashes trying to simplify a `mod 0` affine expression #9244

dcaballe · 2022-05-28T02:25:07Z

#device_target_cpu = #hal.device.target<"cpu", {executable_targets = [#hal.executable.target<"llvm", "embedded-elf-x86_64", {cpu_features = "", data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", native_vector_size = 16 : index, target_triple = "x86_64-unknown-unknown-eabi-elf"}>]}>
#executable_layout = #hal.executable.layout<push_constants = 0, sets = [#hal.descriptor_set.layout<0, bindings = [#hal.descriptor_set.binding<0, storage_buffer>, #hal.descriptor_set.binding<1, storage_buffer>, #hal.descriptor_set.binding<2, storage_buffer>]>]>
#executable_target_embedded_elf_x86_64_ = #hal.executable.target<"llvm", "embedded-elf-x86_64", {cpu_features = "", data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", native_vector_size = 16 : index, target_triple = "x86_64-unknown-unknown-eabi-elf"}>
#map0 = affine_map<()[s0] -> (s0 ceildiv 4)>
#map1 = affine_map<()[s0] -> (s0 * 4)>
#map2 = affine_map<()[s0, s1] -> (-((s0 * -4 + 4) mod (s1 * 4)) + 4)>
#map3 = affine_map<(d0)[s0] -> (d0 + s0)>
#translation = #iree_codegen.translation_info<CPUDoubleTilingExpert workload_per_wg = [4]>
module attributes {hal.device.targets = [#device_target_cpu]} {
  hal.executable private @simple_mul_dispatch_0 {
    hal.executable.variant public @embedded_elf_x86_64, target = #executable_target_embedded_elf_x86_64_ {
      hal.executable.entry_point public @simple_mul_dispatch_0 ordinal(0) layout(#executable_layout) {translation_info = #translation} {
      ^bb0(%arg0: !hal.device, %arg1: index, %arg2: index, %arg3: index):
        %c1 = arith.constant 1 : index
        %0 = affine.apply #map0()[%arg1]
        hal.return %0, %c1, %c1 : index, index, index
      }
      builtin.module {
        func.func @simple_mul_dispatch_0() {
          %cst = arith.constant 0.000000e+00 : f32
          %c4 = arith.constant 4 : index
          %c0 = arith.constant 0 : index
          %0 = hal.interface.binding.subspan set(0) binding(0) type(storage_buffer) offset(%c0) alignment(64) : memref<4xf32>
          memref.assume_alignment %0, 64 : memref<4xf32>
          %1 = hal.interface.binding.subspan set(0) binding(1) type(storage_buffer) offset(%c0) alignment(64) : memref<4xf32>
          memref.assume_alignment %1, 64 : memref<4xf32>
          %2 = hal.interface.binding.subspan set(0) binding(2) type(storage_buffer) offset(%c0) alignment(64) : memref<4xf32>
          memref.assume_alignment %2, 64 : memref<4xf32>
          %workgroup_id_x = hal.interface.workgroup.id[0] : index
          %workgroup_count_x = hal.interface.workgroup.count[0] : index
          %3 = affine.apply #map1()[%workgroup_id_x]
          %4 = affine.apply #map1()[%workgroup_count_x]
          %5 = affine.apply #map2()[%workgroup_id_x, %workgroup_count_x]
          scf.for %arg0 = %3 to %5 step %4 {
            %6 = memref.subview %2[%arg0] [4] [1] : memref<4xf32> to memref<4xf32, #map3>
            %7 = memref.subview %0[%arg0] [4] [1] : memref<4xf32> to memref<4xf32, #map3>
            %8 = memref.subview %1[%arg0] [4] [1] : memref<4xf32> to memref<4xf32, #map3>
            %9 = vector.transfer_read %7[%c0], %cst {in_bounds = [true]} : memref<4xf32, #map3>, vector<4xf32>
            %10 = vector.transfer_read %8[%c0], %cst {in_bounds = [true]} : memref<4xf32, #map3>, vector<4xf32>
            %11 = arith.mulf %9, %10 : vector<4xf32>
            vector.transfer_write %11, %6[%c0] {in_bounds = [true]} : vector<4xf32>, memref<4xf32, #map3>
          }
          scf.for %arg0 = %5 to %c4 step %4 {
            %6 = memref.subview %2[%arg0] [4] [1] : memref<4xf32> to memref<4xf32, #map3>
            %7 = memref.subview %0[%arg0] [4] [1] : memref<4xf32> to memref<4xf32, #map3>
            %8 = memref.subview %1[%arg0] [4] [1] : memref<4xf32> to memref<4xf32, #map3>
            %9 = vector.transfer_read %7[%c0], %cst {in_bounds = [true]} : memref<4xf32, #map3>, vector<4xf32>
            %10 = vector.transfer_read %8[%c0], %cst {in_bounds = [true]} : memref<4xf32, #map3>, vector<4xf32>
            %11 = arith.mulf %9, %10 : vector<4xf32>
            vector.transfer_write %11, %6[%c0] {in_bounds = [true]} : vector<4xf32>, memref<4xf32, #map3>
          }
          return
        }
      }
    }
  }
}

RemoveSingleIterationLoop crashes when analyzing the last loop in the function above (scf.for %arg0 = %5 to %c4 step %4). The alwaysRunsFirstIteration utility invokes substituteMin, which tries to simplify the affine expression () -> (8 mod 0). The mod 0 expression is not handled properly, and compilation crashes with:

.../llvm-project/mlir/lib/IR/AffineExpr.cpp:1186: void mlir::SimpleAffineExprFlattener::visitModExpr(mlir::AffineBinaryOpExpr): Assertion `rhsConst > 0 && "RHS constant has to be positive"' failed.
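
For readers who want to check the arithmetic, here is a minimal Python sketch of what #map2 computes. It reads the map as the usual peeling split point, ub - ((ub - lb) mod step), with lb = s0 * 4, ub = 4 and step = s1 * 4. The function names peel_split_bound and map2 are hypothetical, and the zero workgroup count is only an assumed way to reach a zero modulus; it is not confirmed to be what substituteMin does internally.

```python
def peel_split_bound(lb: int, ub: int, step: int) -> int:
    """Split point between the main loop and the remainder loop.

    Computes ub - ((ub - lb) mod step). The modulus is only defined for a
    positive step, which is the same condition the MLIR assertion enforces.
    """
    if step <= 0:
        raise ValueError(f"cannot take (ub - lb) mod {step}: step must be positive")
    # Python's % is non-negative for a positive divisor, like affine mod.
    return ub - ((ub - lb) % step)


def map2(workgroup_id_x: int, workgroup_count_x: int) -> int:
    """Mirrors #map2: -((s0 * -4 + 4) mod (s1 * 4)) + 4."""
    lb = workgroup_id_x * 4       # same as #map1 applied to the workgroup id
    step = workgroup_count_x * 4  # same as #map1 applied to the workgroup count
    return peel_split_bound(lb, 4, step)


if __name__ == "__main__":
    print(map2(0, 1))  # one workgroup: split point equals the upper bound
    try:
        map2(0, 0)     # assumed degenerate case: the modulus becomes 0
    except ValueError as err:
        print("rejected:", err)
```

With one workgroup the split point equals the upper bound, so the second loop (from %5 to %c4) has no iterations. A step of zero has no valid result, so any simplification that substitutes a bound making s1 * 4 equal to 0 would need to bail out rather than assert.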

hanhanW · 2022-05-31T01:59:34Z

  %5 = affine.apply affine_map<()[s0, s1] -> (-((s0 * -4 + 4) mod (s1 * 4)) + 4)>()[%workgroup_id_x, %workgroup_count_x]

This map looks more complicated than any case I've seen before. Could you elaborate on how the IR is generated? What is the original IR (maybe at the Linalg level), and what are the configurations?

dcaballe · 2022-05-31T04:34:51Z

That code is generated after applying peeling to the original loop. In particular, it's from iree/samples/simple_embedding/simple_embedding_test.mlir. Perhaps @matthias-springer can help clarify how that expression is generated.

hanhanW · 2022-05-31T05:05:45Z

Peeling the wrong loops might break the assumptions in RemoveSingleIterationLoop. Do you have a full dump log (i.e., -mlir-print-ir-after-all)? How do we repro the issue? Are there patches to apply?

dcaballe · 2022-05-31T06:17:43Z

Sorry, I thought it was obvious but it's actually not! I tried iree-opt simplify_bug.mlir -iree-codegen-remove-single-iteration-loop on the initial code that I provided, and it looks like RemoveSingleIterationLoop is HAL-dependent. I updated the code above to also include the module, but I'm still not able to reproduce it with iree-opt. I'll check tomorrow when I have more time.

The loop before peeling is the following. It looks like a valid loop to peel to me:

#config = #iree_codegen.lowering_config<tile_sizes = [[4], [4], [0]]>                                                                                                                                                   
#device_target_cpu = #hal.device.target<"cpu", {executable_targets = [#hal.executable.target<"llvm", "embedded-elf-x86_64", {cpu_features = "", data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", native_vector_size = 16 : index, target_triple = "x86_64-unknown-unknown-eabi-elf"}>]}>                                                                                                                 
#executable_layout = #hal.executable.layout<push_constants = 0, sets = [#hal.descriptor_set.layout<0, bindings = [#hal.descriptor_set.binding<0, storage_buffer>, #hal.descriptor_set.binding<1, storage_buffer>, #hal.descriptor_set.binding<2, storage_buffer>]>]>                                                                                                                                                                            
#executable_target_embedded_elf_x86_64_ = #hal.executable.target<"llvm", "embedded-elf-x86_64", {cpu_features = "", data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", native_vector_size = 16 : index, target_triple = "x86_64-unknown-unknown-eabi-elf"}>                                                                                                                                                
#map0 = affine_map<()[s0] -> (s0 ceildiv 4)>                                                                                                                                                                            
#map1 = affine_map<()[s0] -> (s0 * 4)>                                                                                                                                                                                  
#map2 = affine_map<(d0) -> (d0)>                                                                                                                                                                                        
#translation = #iree_codegen.translation_info<CPUDoubleTilingExpert workload_per_wg = [4]>                                                                                                                              
module attributes {hal.device.targets = [#device_target_cpu]} {                                                                                                                                                         
  hal.executable private @simple_mul_dispatch_0 {                                                                                                                                                                       
    hal.executable.variant public @embedded_elf_x86_64, target = #executable_target_embedded_elf_x86_64_ {                                                                                                              
      hal.executable.entry_point public @simple_mul_dispatch_0 ordinal(0) layout(#executable_layout) {translation_info = #translation} {                                                                                
      ^bb0(%arg0: !hal.device, %arg1: index, %arg2: index, %arg3: index):                                                                                                                                               
        %c1 = arith.constant 1 : index                                                                                                                                                                                  
        %0 = affine.apply #map0()[%arg1]                                                                                                                                                                                
        hal.return %0, %c1, %c1 : index, index, index                                                                                                                                                                   
      }                                                                                                                                                                                                                 
      builtin.module {                                                                                                                                                                                                  
        func.func @simple_mul_dispatch_0() {                                                                                                                                                                            
          %c1 = arith.constant 1 : index                                                                                                                                                                                
          %c4 = arith.constant 4 : index                                                                                                                                                                                
          %c0 = arith.constant 0 : index                                                                                                                                                                                
          %0 = hal.interface.binding.subspan set(0) binding(0) type(storage_buffer) offset(%c0) alignment(64) : !flow.dispatch.tensor<readonly:4xf32>                                                                   
          %1 = hal.interface.binding.subspan set(0) binding(1) type(storage_buffer) offset(%c0) alignment(64) : !flow.dispatch.tensor<readonly:4xf32>                                                                   
          %2 = hal.interface.binding.subspan set(0) binding(2) type(storage_buffer) offset(%c0) alignment(64) : !flow.dispatch.tensor<writeonly:4xf32>                                                                  
          %workgroup_id_x = hal.interface.workgroup.id[0] : index                                                                                                                                                       
          %workgroup_count_x = hal.interface.workgroup.count[0] : index                                                                                                                                                 
          %3 = affine.apply #map1()[%workgroup_id_x]                                                                                                                                                                    
          %4 = affine.apply #map1()[%workgroup_count_x]                                                                                                                                                                 
          scf.for %arg0 = %3 to %c4 step %4 {                                                                                                                                                                           
            %5 = flow.dispatch.tensor.load %2, offsets = [%arg0], sizes = [4], strides = [1] : !flow.dispatch.tensor<writeonly:4xf32> -> tensor<4xf32>                                                                  
            %6 = flow.dispatch.tensor.load %0, offsets = [%arg0], sizes = [4], strides = [1] : !flow.dispatch.tensor<readonly:4xf32> -> tensor<4xf32>                                                                   
            %7 = flow.dispatch.tensor.load %1, offsets = [%arg0], sizes = [4], strides = [1] : !flow.dispatch.tensor<readonly:4xf32> -> tensor<4xf32>                                                                   
            %8 = linalg.generic {indexing_maps = [#map2, #map2, #map2], iterator_types = ["parallel"]} ins(%6, %7 : tensor<4xf32>, tensor<4xf32>) outs(%5 : tensor<4xf32>) attrs =  {__internal_linalg_transform__ = "1", lowering_config = #config, name = "mul.1"} {                                                                                                                                                                          
            ^bb0(%arg1: f32, %arg2: f32, %arg3: f32):                                                                                                                                                                   
              %9 = arith.mulf %arg1, %arg2 : f32                                                                                                                                                                        
              linalg.yield %9 : f32                                                                                                                                                                                     
            } -> tensor<4xf32>                                                                                                                                                                                          
            flow.dispatch.tensor.store %8, %2, offsets = [%arg0], sizes = [4], strides = [%c1] : tensor<4xf32> -> !flow.dispatch.tensor<writeonly:4xf32>                                                                
          }                                                                                                                                                                                                             
          return                                                                                                                                                                                                        
        }                                                                                                                                                                                                               
      }                                                                                                                                                                                                                 
    }                                                                                                                                                                                                                   
  }
}
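
As a rough illustration of why the loop above is a candidate for single-iteration removal, the following Python sketch enumerates the iterations each workgroup would execute. It assumes the workload equals the constant upper bound (4 here) and that the workgroup count comes from #map0 (ceildiv by the tile size 4). The helper names are hypothetical and this is not the pass's actual analysis.

```python
def ceildiv(a: int, b: int) -> int:
    """Ceiling division for a positive b, mirroring affine ceildiv."""
    return -(-a // b)


def distributed_iterations(workload: int, tile: int = 4) -> dict[int, list[int]]:
    """Induction-variable values executed by each workgroup.

    Models scf.for %arg0 = %3 to %ub step %4 with
      %3 = workgroup_id_x * tile      (#map1)
      %4 = workgroup_count_x * tile   (#map1)
    and workgroup_count_x = workload ceildiv tile (#map0).
    """
    count = ceildiv(workload, tile)
    iterations = {}
    for wg_id in range(count):
        lb = wg_id * tile
        step = count * tile
        iterations[wg_id] = list(range(lb, workload, step))
    return iterations


if __name__ == "__main__":
    for workload in (4, 10):
        print(workload, distributed_iterations(workload))
```

Under these assumptions every workgroup executes exactly one iteration, because the step (count * tile) is at least as large as the workload. Peeling the same loop introduces the split bound with a mod by the step, which is the expression the pass later fails to simplify.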

hanhanW · 2022-05-31T06:27:06Z

Sorry for the confusion. I meant to ask whether it can be reproduced through iree-translate. I'd like to see the full dump before jumping into this specific issue. The actual problem could originate elsewhere, and we might want to fix it there first.

E.g., what is the IR before and after peeling? Which loops are we targeting for peeling? It looks like the peeling transform is applied to distributed loops; I'd expect it to happen at the second tiling level. It would be clearer to me if you could provide the commit and IR dumps. We can also chat over VC if there are many details.

MaheshRavishankar · 2022-05-31T18:07:37Z

(Begin soap box) The use of RemoveSingleIterationLoop is a hack. We should really not be using it, but we have it for reasons that are not entirely technical. (End soap box).

For prototyping purposes, it might be worth just dropping this pass and seeing if things work. (Worst case, for development purposes, add a flag that guards the use of this pass and leave it defaulting to true.)

dcaballe · 2022-05-31T19:14:04Z

iree-opt --split-input-file --pass-pipeline='hal.executable(hal.executable.variant(builtin.module(func.func(iree-codegen-remove-single-iteration-loop))))' simplify_bug.mlir seems to repro it. I find it surprising that we can't run passes in isolation and need to run a "small pipeline" even for simple passes like this one. I guess that's because we have dependencies on the HAL dialect? Anyway...

I'm also attaching the output of print-ir-after-all. If you want to reproduce it yourself, you can pull https://github.com/dcaballe/iree/tree/peeling for IREE and https://github.com/dcaballe/llvm-project/tree/peeling for third-party/llvm.

If I remove RemoveSingleIterationLoop from the double tiling expert and peeling is disabled, tests/e2e/linalg_transform/linalg_transform.mlir.test and iree/compiler/Codegen/LLVMCPU/test/linalg_transform.mlir.test crash with:

iree-run-mlir: /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h:134: mlir::transform::TransformState::RegionScope::RegionScope(mlir::transform::TransformState &, mlir::Region &): Assertion `state.regionStack.back()->isProperAncestor(&region) && "scope started at a non-nested region"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.                                                          
Stack dump:                                                                                                                                                                                
0.      Program arguments: iree-run-mlir /usr/local/google/home/diegocaballero/iree2/tests/e2e/linalg_transform/linalg_transform.mlir --iree-hal-target-backends=dylib-llvm-aot --iree-codegen-use-linalg-transform-interp --linalg-transform-file-name=/usr/local/google/home/diegocaballero/iree2/tests/e2e/linalg_transform/linalg_transform_spec.mlir
 #0 0x0000000004852d4a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:565:11                
 #1 0x0000000004852efb PrintStackTraceSignalHandler(void*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1                                 
 #2 0x0000000004851596 llvm::sys::RunSignalHandlers() /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/lib/Support/Signals.cpp:103:5                  
 #3 0x0000000004853625 SignalHandler(int) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1               
 #4 0x00007f2428dd7200 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12200)                                                                                  
 #5 0x00007f24289878a1 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:50:1                                                                                        
 #6 0x00007f2428971546 abort ./stdlib/abort.c:81:7                                                                                                                   
 #7 0x00007f242897142f get_sysdep_segment_value ./intl/loadmsgcat.c:509:8                                                                                            
 #8 0x00007f242897142f _nl_load_domain ./intl/loadmsgcat.c:970:34                                                                                                    
 #9 0x00007f2428980222 (/lib/x86_64-linux-gnu/libc.so.6+0x31222)                                                                                                                           
#10 0x0000000007d98e84 mlir::transform::TransformState::RegionScope::RegionScope(mlir::transform::TransformState&, mlir::Region&) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h:135:7
#11 0x0000000007d9554b mlir::transform::TransformState::make_region_scope(mlir::Region&) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h:349:10
#12 0x0000000007e7cb90 mlir::transform::WithPDLPatternsOp::apply(mlir::transform::TransformResults&, mlir::transform::TransformState&) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Dialect/Transform/IR/TransformOps.cpp:313:22                                                                                                                                                                                                 
#13 0x0000000007e6c2d6 mlir::transform::detail::TransformOpInterfaceInterfaceTraits::Model<mlir::transform::WithPDLPatternsOp>::apply(mlir::transform::detail::TransformOpInterfaceInterfaceTraits::Concept const*, mlir::Operation*, mlir::transform::TransformResults&, mlir::transform::TransformState&) /usr/local/google/home/diegocaballero/iree2/build/debug/third_party/llvm-project/llvm/tools/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h.inc:55:56
#14 0x0000000007e6eb63 mlir::transform::TransformOpInterface::apply(mlir::transform::TransformResults&, mlir::transform::TransformState&) /usr/local/google/home/diegocaballero/iree2/build/debug/third_party/llvm-project/llvm/tools/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.cpp.inc:10:14
#15 0x0000000007e6e768 mlir::transform::TransformState::applyTransform(mlir::transform::TransformOpInterface) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp:126:24
#16 0x0000000007d53523 (anonymous namespace)::LinalgTransformInterp::runOnOperation() /usr/local/google/home/diegocaballero/iree2/llvm-external-projects/iree-dialects/lib/Dialect/LinalgTransform/Passes/TransformInterpreter.cpp:97:24
#17 0x0000000004ac8cba mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:471:21
#18 0x0000000004ac92b4 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:534:16
#19 0x0000000004acd571 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_4::operator()(mlir::OpPassManager&, mlir::Operation*) const /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:456:12
#20 0x0000000004acd2f2 mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_4>(long, mlir::OpPassManager&, mlir::Operation*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#21 0x0000000004d4b131 llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::operator()(mlir::OpPassManager&, mlir::Operation*) const /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#22 0x0000000004d48785 mlir::Pass::runPipeline(mlir::OpPassManager&, mlir::Operation*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Pass/Pass.h:195:12                        
#23 0x0000000007d0ab70 mlir::iree_compiler::(anonymous namespace)::LLVMCPULowerExecutableTargetPass::runOnOperation() /usr/local/google/home/diegocaballero/iree2/compiler/src/iree/compiler/Codegen/LLVMCPU/LLVMCPULowerExecutableTarget.cpp:236:14
#24 0x0000000004ac8cba mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:471:21
#25 0x0000000004ac92b4 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:534:16
#26 0x0000000004acd571 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_4::operator()(mlir::OpPassManager&, mlir::Operation*) const /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:456:12
#27 0x0000000004acd2f2 mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_4>(long, mlir::OpPassManager&, mlir::Operation*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
#28 0x0000000004d4b131 llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::operator()(mlir::OpPassManager&, mlir::Operation*) const /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
#29 0x0000000004d48785 mlir::Pass::runPipeline(mlir::OpPassManager&, mlir::Operation*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Pass/Pass.h:195:12                                          
#30 0x000000000782af37 mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass::runOnOperation() /usr/local/google/home/diegocaballero/iree2/compiler/src/iree/compiler/Dialect/HAL/Transforms/TranslateExecutables.cpp:67:16

If I enable peeling, more tests crash with:

FAILED: samples/static_library/simple_mul_c_module.h samples/static_library/simple_mul_c_module.o samples/static_library/simple_mul_emitc.h /usr/local/google/home/diegocaballero/iree2/build/debug/samples/static_library/simple_mul_c_module.h /usr/local/google/home/diegocaballero/iree2/build/debug/samples/static_library/simple_mul_c_module.o /usr/local/google/home/diegocaballero/iree2/build/debug/samples/static_library/simple_mul_emitc.h                                                                                                                                                                                                                                                                                                                                                                           
cd /usr/local/google/home/diegocaballero/iree2/build/debug/samples/static_library && /usr/local/google/home/diegocaballero/iree2/build/debug/tools/iree-compile --iree-mlir-to-vm-c-module --iree-hal-target-backends=dylib-llvm-aot --iree-llvm-link-embedded=false --iree-llvm-link-static --iree-llvm-static-library-output-path=simple_mul_c_module.o /usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir -o simple_mul_emitc.h                                                                                                                                                                                                                                                                                                                                                                
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:3:8: error: semi-affine expressions (modulo by non-const) are not supported                                                                                                                                                                                                                                                   
  %0 = "arith.mulf"(%arg0, %arg1) {name = "mul.1"} : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                             
       ^                                                                                                                                                                                                                                                                                                                                                                                                         
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:1:1: note: called from                                                                                                                                                                                                                                                                                                        
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                                               
^                                                                                                                                                                                                                                                                                                                                                                                                                
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:3:8: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal                                                                                                                                                                                                              
  %0 = "arith.mulf"(%arg0, %arg1) {name = "mul.1"} : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                             
       ^                                                                                                                                                                                                                                                                                                                                                                                                         
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:1:1: note: called from                                                                                                                                                                                                                                                                                                        
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                                               
^                                                                                                                                                                                                                                                                                                                                                                                                                
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:3:8: note: see current operation: %67 = "builtin.unrealized_conversion_cast"(%66) : (i64) -> index                                                                                                                                                                                                                            
  %0 = "arith.mulf"(%arg0, %arg1) {name = "mul.1"} : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                             
       ^                                                                                                                                                                                                                                                                                                                                                                                                         
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:3:8: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"llvm", "static", {cpu_features = "", data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", native_vector_size = 16 : index, target_triple = "x86_64-unknown-linux-gnu"}>
  %0 = "arith.mulf"(%arg0, %arg1) {name = "mul.1"} : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                             
       ^                                                                                                                                                                                                                                                                                                                                                                                                         
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:1:1: note: called from                                                                                                                                                                                                                                                                                                        
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                                               
^

But I'm not sure if this happens because RemoveSingleIterationLoop is disabled or this is a new unrelated issue. It seems related to the mod operation as well so, before investigating this further, it would be great to have confirmation from @matthias-springer that the first example is a good candidate for peeling and peeling is generating the right code. Perhaps we are hitting a gap in the peeling implementation.

MaheshRavishankar · 2022-05-31T19:53:13Z

iree-opt --split-input-file --pass-pipeline='hal.executable(hal.executable.variant(builtin.module(func.func(iree-codegen-remove-single-iteration-loop))))' simplify_bug.mlir seems to repro it. I find it surprising that we can't run passes in isolation and need to run a "small pipeline" even for simple passes like this one. I guess that's because we have dependencies on the HAL dialect? Anyways...

That particular pass needs to look at the hal.executable.entry_point to remove the loop. Hence it's not great, and that's also why you need the whole nesting...

I'm also attaching the output of print-ir-after-all. If you want to reproduce it yourself, you can pull https://github.com/dcaballe/iree/tree/peeling for IREE and https://github.com/dcaballe/llvm-project/tree/peeling for third-party/llvm.

If I remove RemoveSingleIterationLoop from the double tiling expert and peeling is disabled, tests/e2e/linalg_transform/linalg_transform.mlir.test and iree/compiler/Codegen/LLVMCPU/test/linalg_transform.mlir.test crash with:

Strange. I didn't know that linalg_transform relied on this. You can ignore these errors. If these are the only ones left, we can pull in Nicolas to help.

iree-run-mlir: /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h:134: mlir::transform::TransformState::RegionScope::RegionScope(mlir::transform::TransformState &, mlir::Region &): Assertion `state.regionStack.back()->isProperAncestor(&region) && "scope started at a non-nested region"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: iree-run-mlir /usr/local/google/home/diegocaballero/iree2/tests/e2e/linalg_transform/linalg_transform.mlir --iree-hal-target-backends=dylib-llvm-aot --iree-codegen-use-linalg-transform-interp --linalg-transform-file-name=/usr/local/google/home/diegocaballero/iree2/tests/e2e/linalg_transform/linalg_transform_spec.mlir
 #0 0x0000000004852d4a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:565:11                
 #1 0x0000000004852efb PrintStackTraceSignalHandler(void*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1                                 
 #2 0x0000000004851596 llvm::sys::RunSignalHandlers() /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/lib/Support/Signals.cpp:103:5                  
 #3 0x0000000004853625 SignalHandler(int) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1               
 #4 0x00007f2428dd7200 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12200)                                                                                  
 #5 0x00007f24289878a1 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:50:1                                                                                        
 #6 0x00007f2428971546 abort ./stdlib/abort.c:81:7                                                                                                                   
 #7 0x00007f242897142f get_sysdep_segment_value ./intl/loadmsgcat.c:509:8                                                                                            
 #8 0x00007f242897142f _nl_load_domain ./intl/loadmsgcat.c:970:34                                                                                                    
 #9 0x00007f2428980222 (/lib/x86_64-linux-gnu/libc.so.6+0x31222)                                                                                                                           
#10 0x0000000007d98e84 mlir::transform::TransformState::RegionScope::RegionScope(mlir::transform::TransformState&, mlir::Region&) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Dialect/Transform
/IR/TransformInterfaces.h:135:7                                                                                                                                                            
#11 0x0000000007d9554b mlir::transform::TransformState::make_region_scope(mlir::Region&) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h:349:10
#12 0x0000000007e7cb90 mlir::transform::WithPDLPatternsOp::apply(mlir::transform::TransformResults&, mlir::transform::TransformState&) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Dialect/Transform/IR/TransformOps.cpp:313:22                                                                                                                                                                                                 
#13 0x0000000007e6c2d6 mlir::transform::detail::TransformOpInterfaceInterfaceTraits::Model<mlir::transform::WithPDLPatternsOp>::apply(mlir::transform::detail::TransformOpInterfaceInterfaceTraits::Concept const*, mlir::Operation*, mlir
::transform::TransformResults&, mlir::transform::TransformState&) /usr/local/google/home/diegocaballero/iree2/build/debug/third_party/llvm-project/llvm/tools/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h.inc:55:56
#14 0x0000000007e6eb63 mlir::transform::TransformOpInterface::apply(mlir::transform::TransformResults&, mlir::transform::TransformState&) /usr/local/google/home/diegocaballero/iree2/build/debug/third_party/llvm-project/llvm/tools/mlir
/include/mlir/Dialect/Transform/IR/TransformInterfaces.cpp.inc:10:14                                                                                                                       
#15 0x0000000007e6e768 mlir::transform::TransformState::applyTransform(mlir::transform::TransformOpInterface) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp:1
26:24                                                                                                                                                                                      
#16 0x0000000007d53523 (anonymous namespace)::LinalgTransformInterp::runOnOperation() /usr/local/google/home/diegocaballero/iree2/llvm-external-projects/iree-dialects/lib/Dialect/LinalgTransform/Passes/TransformInterpreter.cpp:97:24
#17 0x0000000004ac8cba mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:471:21
#18 0x0000000004ac92b4 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /usr
/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:534:16                                                                                             
#19 0x0000000004acd571 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_4::operator()(mlir::OpPassManager&, mlir::Operation*) const /usr/local/google/home/diegocaballero
/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:456:12                                                                                                                              
#20 0x0000000004acd2f2 mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, 
unsigned int)::$_4>(long, mlir::OpPassManager&, mlir::Operation*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12   
#21 0x0000000004d4b131 llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::operator()(mlir::OpPassManager&, mlir::Operation*) const /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llv
m/include/llvm/ADT/STLFunctionalExtras.h:68:12                                                                                                                                             
#22 0x0000000004d48785 mlir::Pass::runPipeline(mlir::OpPassManager&, mlir::Operation*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Pass/Pass.h:195:12                        
#23 0x0000000007d0ab70 mlir::iree_compiler::(anonymous namespace)::LLVMCPULowerExecutableTargetPass::runOnOperation() /usr/local/google/home/diegocaballero/iree2/compiler/src/iree/compiler/Codegen/LLVMCPU/LLVMCPULowerExecutableTarget.
cpp:236:14                                                                                                                                                                                                              
#24 0x0000000004ac8cba mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:471:21
#25 0x0000000004ac92b4 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /usr
/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:534:16                                                                                                                          
#26 0x0000000004acd571 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_4::operator()(mlir::OpPassManager&, mlir::Operation*) const /usr/local/google/home/diegocaballero
/iree2/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:456:12                                                                                                                                                           
#27 0x0000000004acd2f2 mlir::LogicalResult llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, 
unsigned int)::$_4>(long, mlir::OpPassManager&, mlir::Operation*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12                                
#28 0x0000000004d4b131 llvm::function_ref<mlir::LogicalResult (mlir::OpPassManager&, mlir::Operation*)>::operator()(mlir::OpPassManager&, mlir::Operation*) const /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/llv
m/include/llvm/ADT/STLFunctionalExtras.h:68:12                                                                                                                                                                          
#29 0x0000000004d48785 mlir::Pass::runPipeline(mlir::OpPassManager&, mlir::Operation*) /usr/local/google/home/diegocaballero/iree2/third_party/llvm-project/mlir/include/mlir/Pass/Pass.h:195:12                                          
#30 0x000000000782af37 mlir::iree_compiler::IREE::HAL::TranslateTargetExecutableVariantsPass::runOnOperation() /usr/local/google/home/diegocaballero/iree2/compiler/src/iree/compiler/Dialect/HAL/Transforms/TranslateExecutables.cpp:67:16

If I enable peeling, more tests are crashing with:

FAILED: samples/static_library/simple_mul_c_module.h samples/static_library/simple_mul_c_module.o samples/static_library/simple_mul_emitc.h /usr/local/google/home/diegocaballero/iree2/build/debug/samples/static_library/simple_mul_c_module.h /usr/local/google/home/diegocaballero/iree2/build/debug/samples/static_library/simple_mul_c_module.o /usr/local/google/home/diegocaballero/iree2/build/debug/samples/static_library/simple_mul_emitc.h                                                                                                                                                                                                                                                                                                                                                                           
cd /usr/local/google/home/diegocaballero/iree2/build/debug/samples/static_library && /usr/local/google/home/diegocaballero/iree2/build/debug/tools/iree-compile --iree-mlir-to-vm-c-module --iree-hal-target-backends=dylib-llvm-aot --iree-llvm-link-embedded=false --iree-llvm-link-static --iree-llvm-static-library-output-path=simple_mul_c_module.o /usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir -o simple_mul_emitc.h                                                                                                                                                                                                                                                                                                                                                                
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:3:8: error: semi-affine expressions (modulo by non-const) are not supported                                                                                                                                                                                                                                                   
  %0 = "arith.mulf"(%arg0, %arg1) {name = "mul.1"} : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                             
       ^                                                                                                                                                                                                                                                                                                                                                                                                         
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:1:1: note: called from                                                                                                                                                                                                                                                                                                        
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                                               
^                                                                                                                                                                                                                                                                                                                                                                                                                
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:3:8: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal                                                                                                                                                                                                              
  %0 = "arith.mulf"(%arg0, %arg1) {name = "mul.1"} : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                             
       ^                                                                                                                                                                                                                                                                                                                                                                                                         
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:1:1: note: called from                                                                                                                                                                                                                                                                                                        
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                                               
^                                                                                                                                                                                                                                                                                                                                                                                                                
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:3:8: note: see current operation: %67 = "builtin.unrealized_conversion_cast"(%66) : (i64) -> index                                                                                                                                                                                                                            
  %0 = "arith.mulf"(%arg0, %arg1) {name = "mul.1"} : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                             
       ^                                                                                                                                                                                                                                                                                                                                                                                                         
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:3:8: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"llvm", "static", {cpu_features = "", data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", native_vector_size = 16 : index, target_triple = "x86_64-unknown-linux-gnu"}>
  %0 = "arith.mulf"(%arg0, %arg1) {name = "mul.1"} : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                             
       ^                                                                                                                                                                                                                                                                                                                                                                                                         
/usr/local/google/home/diegocaballero/iree2/samples/static_library/simple_mul.mlir:1:1: note: called from                                                                                                                                                                                                                                                                                                        
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32>                                                                                                                                                                                                                                                                                                                               
^

But I'm not sure if this happens because RemoveSingleIterationLoop is disabled or this is a new unrelated issue. It seems related to the mod operation as well so, before investigating this further, it would be great to have confirmation from @matthias-springer that the first example is a good candidate for peeling and peeling is generating the right code. Perhaps we are hitting a gap in the peeling implementation.

Yeah, this seems related to the mod 0 issue. Btw, the fact that mod 0 is appearing is itself strange. Is n mod 0 supposed to simplify to n?

matthias-springer · 2022-05-31T21:31:31Z

it would be great to have confirmation from @matthias-springer that the first example is a good candidate for peeling and peeling is generating the right code. Perhaps we are hitting a gap in the peeling implementation.

The loop looks fine. There are actually no requirements for the loop itself. Any loop can be peeled. (Unless every iteration is already a full iteration, then we may not do anything, I forgot...) After the peeling, we try to simplify affine.max and affine.min ops inside of the loop. If there are invalid affine expressions somewhere inside the loop, this may fail.

An affine expression that has a mod 0 sounds ill-formed to me. It's like a division by zero, which is also invalid. I'm wondering where this is coming from. Are you sure it is generated during peeling?

The loop peeling replaces the upper bound of the loop with a new bound. If we have many constant parts in the computation, this new bound can be very simple. In the general case, with an SSA value step (like we have it here), it can get pretty complex. The upper bound is computed in SCF/Transforms/LoopSpecialization.cpp:

// New upper bound: %ub - (%ub - %lb) mod %step
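To make that formula concrete, here is a minimal Python sketch of the split it describes. This is not the code from LoopSpecialization.cpp; the function names `peeled_upper_bound` and `peel` are made up for illustration, and the sketch assumes `lb <= ub` and a positive step.

```python
def peeled_upper_bound(lb: int, ub: int, step: int) -> int:
    """Split point from the comment above: ub - (ub - lb) mod step."""
    if step <= 0:
        # A zero step would make this a "mod 0", which is undefined.
        raise ValueError("step must be positive")
    return ub - (ub - lb) % step


def peel(lb: int, ub: int, step: int):
    """Return (full-step iterations, leftover iterations), assuming lb <= ub."""
    split = peeled_upper_bound(lb, ub, step)
    main = list(range(lb, split, step))       # every iteration covers a full step
    remainder = list(range(split, ub, step))  # at most one partial iteration
    return main, remainder


main, remainder = peel(0, 10, 4)
print(main, remainder)
```

With `lb = 0`, `ub = 10`, `step = 4`, the split point is 8, so the main loop gets the iterations starting at 0 and 4 and the remainder loop gets the partial one starting at 8.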

dcaballe · 2022-05-31T22:51:31Z

An affine expression that has a mod 0 sounds ill-formed to me.

The mod 0 seems to be a thing, based on this TODO comment. Not sure what it should resolve to, though.

Are you sure it is generated during peeling?

// New upper bound: %ub - (%ub - %lb) mod %step

You can search for 'Peel' in this dump. Peeling generates:

#map0 = affine_map<()[s0] -> (s0 ceildiv 4)>
#map1 = affine_map<()[s0] -> (s0 * 4)>
#map2 = affine_map<()[s0, s1, s2] -> (s1 - (s1 - s0) mod s2)>
#map3 = affine_map<(d0) -> (d0)>
...
          %3 = affine.apply #map1()[%workgroup_id_x]
          %4 = affine.apply #map1()[%workgroup_count_x]
          %5 = affine.apply #map2()[%3, %c4, %4]
          scf.for %arg0 = %3 to %5 step %4 {

Let's replace:

%c4 in map2: affine_map<()[s0, s1] -> (4 - (4 - s0) mod s1)>
%3 in map2: affine_map<()[s0, s1] -> (4 - (4 - s0 * 4) mod s1)>
%4 in map2: affine_map<()[s0, s1] -> (4 - (4 - s0 * 4) mod (s1 * 4))>
which can be canonicalized to affine_map<()[s0, s1] -> (-((s0 * -4 + 4) mod (s1 * 4)) + 4)>. This looks good to me.
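As a quick sanity check of that substitution, the following Python sketch evaluates the three-symbol map2 with the operands from the IR and compares it against the canonicalized two-symbol form. It is only an illustration: the function names are hypothetical, and it relies on Python's `%` returning a non-negative result for a positive divisor, which matches affine `mod` when the right-hand side is positive.

```python
def map2(s0: int, s1: int, s2: int) -> int:
    """#map2 as generated by peeling: s1 - (s1 - s0) mod s2."""
    return s1 - (s1 - s0) % s2


def peeled_bound(wg_id: int, wg_count: int) -> int:
    lb = wg_id * 4            # %3 = affine.apply #map1()[%workgroup_id_x]
    step = wg_count * 4       # %4 = affine.apply #map1()[%workgroup_count_x]
    return map2(lb, 4, step)  # %5 = affine.apply #map2()[%3, %c4, %4]


def canonical(wg_id: int, wg_count: int) -> int:
    """Canonicalized form: -((s0 * -4 + 4) mod (s1 * 4)) + 4."""
    return -((wg_id * -4 + 4) % (wg_count * 4)) + 4


# Both forms agree for every valid id when the workgroup count is positive.
for count in range(1, 6):
    for wg_id in range(count):
        assert peeled_bound(wg_id, count) == canonical(wg_id, count)
```

The check only covers positive workgroup counts; a count of 0 makes the divisor `s1 * 4` zero, which is where the expression stops being well defined.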

However, when we instantiate the map, s0 = workgroup_id_x and s1 = workgroup_count_x. workgroup_count_x should always be > 0, right? Something weird seems to be happening here in the simplifyMin call. I don't see any 'min' operation in the IR...

matthias-springer · 2022-05-31T23:27:05Z

There are in fact no affine.min/affine.max ops in the IR. The loop peeling is actually pretty straightforward in that case. And looking at the IR in the dump, it looks like it does the right thing.

The mod 0 still looks very suspicious to me. I'd recommend looking into how this is generated. Maybe there is a bug in SimplifyTrivialLoops. I have no idea what SimplifyTrivialLoops is doing, but it almost looks like it is trying to compute something given the bounds of some value (GetMinMaxExprFn). With that in mind, the function name simplifyMin would make sense ("lower bound") and maybe does not refer to an affine.min op.

hanhanW · 2022-06-01T07:45:20Z

I think I know why. There are issues in getNumWorkgroup. IREE can't infer the bound, so they are replaced with 0. (The same issue happens in dynamic shapes, but I don't know why unconditionally converting them to zeros works.)

I haven't found the documentation yet. My understanding is that setting the values to zero means "unknown". In that case, we should do nothing when they are zero.

This should fix the issue: #9258

For further exploration, I'd suggest disabling the pass in the pipeline.
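
The "do nothing when the count is zero" idea can be sketched in Python as follows. This is not the actual change in #9258: `UNKNOWN_COUNT`, `workgroup_id_range`, and `can_simplify_loop` are hypothetical names, and treating 0 as the "could not infer" placeholder is the assumption described in this comment.

```python
from typing import Optional, Tuple

# Assumption: 0 is the placeholder used when the count could not be inferred.
UNKNOWN_COUNT = 0


def workgroup_id_range(count: int) -> Optional[Tuple[int, int]]:
    """Return the inclusive (min, max) range of a workgroup id, or None if unknown."""
    if count == UNKNOWN_COUNT:
        # Nothing is known about the id, so no bounds are reported.
        return None
    return (0, count - 1)


def can_simplify_loop(count: int) -> bool:
    """Only attempt bound-based simplification when the range is known."""
    return workgroup_id_range(count) is not None


print(can_simplify_loop(0), can_simplify_loop(8))
```

Returning `None` instead of a range built from 0 keeps the placeholder from ever reaching an expression such as `mod (s1 * 4)`.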


dcaballe added bug 🐞 Something isn't working help wanted Extra attention is needed labels May 28, 2022

dcaballe assigned MaheshRavishankar and hanhanW May 28, 2022

dcaballe assigned matthias-springer May 31, 2022

hanhanW mentioned this issue Jun 1, 2022

Add checks for unknown workgroup count when inferring boundaries. #9258

Merged

hanhanW closed this as completed in #9258 Jun 2, 2022

hanhanW added a commit that referenced this issue Jun 2, 2022

Add checks for unknown workgroup count when inferring boundaries. (#9258) 697396e

The workgroup count is set to zero when we're not able to infer the bounds. Fixes #9244

dcaballe mentioned this issue Jun 2, 2022

Missing support for semi-affine modulo operations in Affine dialect #9279

Closed
