-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FXML.1923: PDLL support for native constraints with attribute results #24
FXML.1923: PDLL support for native constraints with attribute results #24
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, thank you very much for your explanation in the PR description. I appreciate that a lot. That's a nice feature. I have a few questions/suggestions:
- Not for this work perhaps: How hard is it to give these third-kind of native a different name (i.e. utility native) and not mixing what we want with constraints?
- I don't see a useful case at the moment to have declarations of natives for PDLL. When we add new natives, we have to define them here as well which wasn't the case for PDL.
- While I was working with PDLL and sometimes converted it to pdl_interp, I realized that if you have a constraint that is not dependent on the operation or values or attribute you are capturing, after lowering it to pdl_interp, the order of the natives are not matched with the one in PDLL. In fact, it moves it to the beginning of the matching section and that was problematic for some cases. I can help you to recreate that case and see if it would be problematic here as well.
- Important: What if we enable rewrite failure instead of adding this kind of constraint? This is not supported by PDL now. So, the idea is, we match, we go the rewrite section and there we can fail and stop rewriting and we go to the next matching.
It really depends on how we want to handle this operation:
A constraint is meant to always check something on the IR that you are matching (which is why it requires at least one operand) and will by the pass be ordered to a location where all of its operands have been checked and evaluated, i.e. when there are no operands to the beginning of the pattern (Note: This is also influenced by other patterns in the same file, especially when they call the same native constraint). I think what you are referring to are the constraints that should check for state in the pass it is applied in. They are quite special and out of scope for this extension. Nothing with regard to that has been changed so I expect the issues will be similar.
The semantics of a PDL pattern is: if we have a successful match of the lhs then the rhs should be able to completely perform the rewrite. This clear separation of matching and rewriting enables the creation of an efficient matcher for a set of patterns and avoiding error handling and aborting rewriting in the bytecode interpreter (which would introduce a runtime penalty). I think we should avoid mixing these concepts more than absolutely required as to not lose these properties. |
One possible change that I would like your opinion on would be the registering of constraint functions with results. |
I agree. That would be a better design IMO too. Separating them from the user's perspective reduces the confusion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really impressive! My only questions are about signaling a failure in a constraint with results.
mlir/include/mlir/IR/PatternMatch.h
Outdated
/// the low-level PDLValue form, and the results are manually appended to | ||
/// the given result list. | ||
/// | ||
/// * `ResultT (PatternRewriter &, ValueTs... values)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we signal a failure
with this interface? Don't we need something similar like FailureOr<ResultT>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
I think it makes sense to remove the template magic and under the hood conversion that is enabled by this. The problem is that not all valid native rewrite functions are also valid constraint with results functions. We would have to adjust how the conversion from PDL structures to built-in structure is done for this.
So for the scope of this PR we will only support constraints with results in form of
LogicalResult (PatternRewriter &, PDLResultList &, ArrayRef<PDLValue>)
// At this point in time the corresponding pdl.ApplyNativeConstraint op has | ||
// been deleted and the new pdl_interp.ApplyConstraint has not been created | ||
// yet. To enable use of results created by these operations we build a | ||
// placeholder value that will be replaced when the actual | ||
// pdl_interp.ApplyConstraint operation is created. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know enough about PDLInterp
, could you give me an example when this is relevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my description on top:
This is required in all scenarios where a native constraint returns a result that is used on the rhs
because said result has to be an input the the pdl_interp rewriter function that is created for the pattern.
However, the call to this function is always generated before the pdl_interp.apply_constraint operation is created.
An example for the output of this pass is the following:
PDL input:
module {
pdl.pattern @simple : benefit(1) {
%0 = operands
%1 = operation "test.op"(%0 : !pdl.range<value>)
%attr = apply_native_constraint "NativeConstraint"(%1 : !pdl.operation) : !pdl.attribute
rewrite %1 {
%3 = operation "test.success" {"someAttr" = %attr}
replace %1 with %3
}
}
}
The PDL_interp output is the following:
module {
pdl_interp.func @matcher(%arg0: !pdl.operation) {
pdl_interp.check_operation_name of %arg0 is "test.op" -> ^bb2, ^bb1
^bb1: // 4 preds: ^bb0, ^bb2, ^bb3, ^bb4
pdl_interp.finalize
^bb2: // pred: ^bb0
pdl_interp.check_result_count of %arg0 is 0 -> ^bb3, ^bb1
^bb3: // pred: ^bb2
%0 = pdl_interp.apply_constraint "NativeConstraint"(%arg0 : !pdl.operation) : !pdl.attribute -> ^bb4, ^bb1
^bb4: // pred: ^bb3
pdl_interp.record_match @rewriters::@pdl_generated_rewriter(%0, %arg0 : !pdl.attribute, !pdl.operation) : benefit(1), generatedOps(["test.success"]), loc([%arg0]), root("test.op") -> ^bb1
}
module @rewriters {
pdl_interp.func @pdl_generated_rewriter(%arg0: !pdl.attribute, %arg1: !pdl.operation) {
%0 = pdl_interp.get_results of %arg1 : !pdl.range<value>
%1 = pdl_interp.get_value_type of %0 : !pdl.range<type>
%2 = pdl_interp.create_operation "test.success" {"someAttr" = %arg0} -> (%1 : !pdl.range<type>)
pdl_interp.erase %arg1
pdl_interp.finalize
}
}
}
The pdl_interp.record_match
operation and the block it is in is not the last thing this pass generates for a matcher. This is done so the operation that is the last check before a successful match can have this block (^bb4) as successor (e.g. %0 = pdl_interp.apply_constraint
here). But the pdl_interp.record_match
needs the result of %0 = pdl_interp.apply_constraint
as operand.
So there is a cyclic dependency here, which we solve using the placeholder.
In short:
%0 = pdl_interp.apply_constraint
needs the success branch as successor.pdl_interp.record_match
is created with the success branch and needs the result%0
as operand
`__xray_customevent` and `__xray_typedevent` are built-in functions in Clang. With -fxray-instrument, they are lowered to intrinsics llvm.xray.customevent and llvm.xray.typedevent, respectively. These intrinsics are then lowered to TargetOpcode::{PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL}. The target is responsible for generating a code sequence that calls either `__xray_CustomEvent` (with 2 arguments) or `__xray_TypedEvent` (with 3 arguments). Before patching, the code sequence is prefixed by a branch instruction that skips the rest of the code sequence. After patching (compiler-rt/lib/xray/xray_AArch64.cpp), the branch instruction becomes a NOP and the function call will take effects. This patch implements the lowering process for {PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL} and implements the runtime. ``` // Lowering of PATCHABLE_EVENT_CALL .Lxray_sled_N: b #24 stp x0, x1, [sp, #-16]! x0 = reg of op0 x1 = reg of op1 bl __xray_CustomEvent ldrp x0, x1, [sp], #16 ``` As a result, two updated tests in compiler-rt/test/xray/TestCases/Posix/ now pass on AArch64. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D153320
…lvm#80904)" This reverts commit b1ac052. This commit breaks coroutine splitting for non-swift calling convention functions. In this example: ```ll ; ModuleID = 'repro.ll' source_filename = "stdlib/test/runtime/test_llcl.mojo" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @0 = internal constant { i32, i32 } { i32 trunc (i64 sub (i64 ptrtoint (ptr @craSH to i64), i64 ptrtoint (ptr getelementptr inbounds ({ i32, i32 }, ptr @0, i32 0, i32 1) to i64)) to i32), i32 64 } define dso_local void @af_suspend_fn(ptr %0, i64 %1, ptr %2) #0 { ret void } define dso_local void @craSH(ptr %0) #0 { %2 = call token @llvm.coro.id.async(i32 64, i32 8, i32 0, ptr @0) %3 = call ptr @llvm.coro.begin(token %2, ptr null) %4 = getelementptr inbounds { ptr, { ptr, ptr }, i64, { ptr, i1 }, i64, i64 }, ptr poison, i32 0, i32 0 %5 = call ptr @llvm.coro.async.resume() store ptr %5, ptr %4, align 8 %6 = call { ptr, ptr, ptr } (i32, ptr, ptr, ...) @llvm.coro.suspend.async.sl_p0p0p0s(i32 0, ptr %5, ptr @ctxt_proj_fn, ptr @af_suspend_fn, ptr poison, i64 -1, ptr poison) ret void } define dso_local ptr @ctxt_proj_fn(ptr %0) #0 { ret ptr %0 } ; Function Attrs: nomerge nounwind declare { ptr, ptr, ptr } @llvm.coro.suspend.async.sl_p0p0p0s(i32, ptr, ptr, ...) #1 ; Function Attrs: nounwind declare token @llvm.coro.id.async(i32, i32, i32, ptr) #2 ; Function Attrs: nounwind declare ptr @llvm.coro.begin(token, ptr writeonly) #2 ; Function Attrs: nomerge nounwind declare ptr @llvm.coro.async.resume() #1 attributes #0 = { "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+invpcid,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+pku,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+vaes,+vpclmulqdq,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" } attributes #1 = { nomerge nounwind } attributes #2 = { nounwind } ``` This verifier crashes after the `coro-split` pass with ``` cannot guarantee tail call due to mismatched parameter counts musttail call void @af_suspend_fn(ptr poison, i64 -1, ptr poison) LLVM ERROR: Broken function PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. Program arguments: opt ../../../reduced.ll -O0 #0 0x00007f1d89645c0e __interceptor_backtrace.part.0 /build/gcc-11-XeT9lY/gcc-11-11.4.0/build/x86_64-linux-gnu/libsanitizer/asan/../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4193:28 #1 0x0000556d94d254f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:22 #2 0x0000556d94d19a2f llvm::sys::RunSignalHandlers() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Signals.cpp:105:20 #3 0x0000556d94d1aa42 SignalHandler(int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:371:36 #4 0x00007f1d88e42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520) #5 0x00007f1d88e969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 #6 0x00007f1d88e969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10 #7 0x00007f1d88e969fc pthread_kill ./nptl/pthread_kill.c:89:10 #8 0x00007f1d88e42476 gsignal ./signal/../sysdeps/posix/raise.c:27:6 #9 0x00007f1d88e287f3 abort ./stdlib/abort.c:81:7 #10 0x0000556d8944be01 std::vector<llvm::json::Value, std::allocator<llvm::json::Value>>::size() const /usr/include/c++/11/bits/stl_vector.h:919:40 #11 0x0000556d8944be01 bool std::operator==<llvm::json::Value, std::allocator<llvm::json::Value>>(std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&, std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&) /usr/include/c++/11/bits/stl_vector.h:1893:23 #12 0x0000556d8944be01 llvm::json::operator==(llvm::json::Array const&, llvm::json::Array const&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Support/JSON.h:572:69 #13 0x0000556d8944be01 llvm::json::operator==(llvm::json::Value const&, llvm::json::Value const&) (.cold) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/JSON.cpp:204:28 #14 0x0000556d949ed2bd llvm::report_fatal_error(char const*, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/ErrorHandling.cpp:82:70 #15 0x0000556d8e37e876 llvm::SmallVectorBase<unsigned int>::size() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:91:32 #16 0x0000556d8e37e876 llvm::SmallVectorTemplateCommon<llvm::DiagnosticInfoOptimizationBase::Argument, void>::end() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:282:41 #17 0x0000556d8e37e876 llvm::SmallVector<llvm::DiagnosticInfoOptimizationBase::Argument, 4u>::~SmallVector() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1215:24 #18 0x0000556d8e37e876 llvm::DiagnosticInfoOptimizationBase::~DiagnosticInfoOptimizationBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:413:7 #19 0x0000556d8e37e876 llvm::DiagnosticInfoIROptimization::~DiagnosticInfoIROptimization() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:622:7 #20 0x0000556d8e37e876 llvm::OptimizationRemark::~OptimizationRemark() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:689:7 #21 0x0000556d8e37e876 operator() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2213:14 #22 0x0000556d8e37e876 emit<llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::CGSCCAnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&)::<lambda()> > /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h:83:12 #23 0x0000556d8e37e876 llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2212:13 #24 0x0000556d8c36ecb1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::CoroSplitPass, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 #25 0x0000556d91c1a84f llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:90:12 #26 0x0000556d8c3690d1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 #27 0x0000556d91c2162d llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:278:18 #28 0x0000556d8c369035 llvm::detail::PassModel<llvm::Module, llvm::ModuleToPostOrderCGSCCPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 #29 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20 #30 0x0000556d8e30979e llvm::CoroConditionalWrapper::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroConditionalWrapper.cpp:19:74 #31 0x0000556d8c365755 llvm::detail::PassModel<llvm::Module, llvm::CoroConditionalWrapper, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 #32 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20 #33 0x0000556d89818556 llvm::SmallPtrSetImplBase::isSmall() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33 #34 0x0000556d89818556 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:17 #35 0x0000556d89818556 llvm::SmallPtrSetImpl<llvm::AnalysisKey*>::~SmallPtrSetImpl() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:321:7 #36 0x0000556d89818556 llvm::SmallPtrSet<llvm::AnalysisKey*, 2u>::~SmallPtrSet() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:427:7 #37 0x0000556d89818556 llvm::PreservedAnalyses::~PreservedAnalyses() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/Analysis.h:109:7 #38 0x0000556d89818556 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/NewPMDriver.cpp:532:10 #39 0x0000556d897e3939 optMain /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/optdriver.cpp:737:27 #40 0x0000556d89455461 main /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/opt.cpp:25:33 #41 0x00007f1d88e29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16 #42 0x00007f1d88e29e40 call_init ./csu/../csu/libc-start.c:128:20 #43 0x00007f1d88e29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5 #44 0x0000556d897b6335 _start (/home/ubuntu/modular/.derived/third-party/llvm-project/build-relwithdebinfo-asan/bin/opt+0x150c335) Aborted (core dumped)
TestCases/Misc/Linux/sigaction.cpp fails because dlsym() may call malloc on failure. And then the wrapped malloc appears to access thread local storage using global dynamic accesses, thus calling ___interceptor___tls_get_addr, before REAL(__tls_get_addr) has been set, so we get a crash inside ___interceptor___tls_get_addr. For example, this can happen when looking up __isoc23_scanf which might not exist in some libcs. Fix this by marking the thread local variable accessed inside the debug checks as "initial-exec", which does not require __tls_get_addr. This is probably a better alternative to llvm#83886. This fixes a different crash but is related to llvm#46204. Backtrace: ``` #0 0x0000000000000000 in ?? () #1 0x00007ffff6a9d89e in ___interceptor___tls_get_addr (arg=0x7ffff6b27be8) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:2759 #2 0x00007ffff6a46bc6 in __sanitizer::CheckedMutex::LockImpl (this=0x7ffff6b27be8, pc=140737331846066) at /path/to/llvm/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:218 #3 0x00007ffff6a448b2 in __sanitizer::CheckedMutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:129 #4 __sanitizer::Mutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:167 #5 0x00007ffff6abdbb2 in __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock (mu=0x730000000580, this=<optimized out>) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383 #6 __sanitizer::SizeClassAllocator64<__tsan::AP64>::GetFromAllocator (this=0x7ffff7487dc0 <__tsan::allocator_placeholder>, stat=stat@entry=0x7ffff570db68, class_id=11, chunks=chunks@entry=0x7ffff5702cc8, n_chunks=n_chunks@entry=128) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_primary64.h:207 #7 0x00007ffff6abdaa0 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Refill (this=<optimized out>, c=c@entry=0x7ffff5702cb8, allocator=<optimized out>, class_id=<optimized out>) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:103 #8 0x00007ffff6abd731 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Allocate (this=0x7ffff6b27be8, allocator=0x7ffff5702cc8, class_id=140737311157448) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:39 #9 0x00007ffff6abc397 in __sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator64<__tsan::AP64>, __sanitizer::LargeMmapAllocatorPtrArrayDynamic>::Allocate (this=0x7ffff5702cc8, cache=0x7ffff6b27be8, size=<optimized out>, size@entry=175, alignment=alignment@entry=16) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_combined.h:69 #10 0x00007ffff6abaa6a in __tsan::user_alloc_internal (thr=0x7ffff7ebd980, pc=140737331499943, sz=sz@entry=175, align=align@entry=16, signal=true) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:198 #11 0x00007ffff6abb0d1 in __tsan::user_alloc (thr=0x7ffff6b27be8, pc=140737331846066, sz=11, sz@entry=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:223 #12 0x00007ffff6a693b5 in ___interceptor_malloc (size=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:666 #13 0x00007ffff7fce7f2 in malloc (size=175) at ../include/rtld-malloc.h:56 #14 __GI__dl_exception_create_format (exception=exception@entry=0x7fffffffd0d0, objname=0x7ffff7fc3550 "/path/to/llvm/compiler-rt/cmake-build-all-sanitizers/lib/linux/libclang_rt.tsan-x86_64.so", fmt=fmt@entry=0x7ffff7ff2db9 "undefined symbol: %s%s%s") at ./elf/dl-exception.c:157 #15 0x00007ffff7fd50e8 in _dl_lookup_symbol_x (undef_name=0x7ffff6af868b "__isoc23_scanf", undef_map=<optimized out>, ref=0x7fffffffd148, symbol_scope=<optimized out>, version=<optimized out>, type_class=0, flags=2, skip_map=0x7ffff7fc35e0) at ./elf/dl-lookup.c:793 --Type <RET> for more, q to quit, c to continue without paging-- #16 0x00007ffff656d6ed in do_sym (handle=<optimized out>, name=0x7ffff6af868b "__isoc23_scanf", who=0x7ffff6a3bb84 <__interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long)+36>, vers=vers@entry=0x0, flags=flags@entry=2) at ./elf/dl-sym.c:146 #17 0x00007ffff656d9dd in _dl_sym (handle=<optimized out>, name=<optimized out>, who=<optimized out>) at ./elf/dl-sym.c:195 #18 0x00007ffff64a2854 in dlsym_doit (a=a@entry=0x7fffffffd3b0) at ./dlfcn/dlsym.c:40 #19 0x00007ffff7fcc489 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd310, operate=0x7ffff64a2840 <dlsym_doit>, args=0x7fffffffd3b0) at ./elf/dl-catch.c:237 #20 0x00007ffff7fcc5af in _dl_catch_error (objname=0x7fffffffd368, errstring=0x7fffffffd370, mallocedp=0x7fffffffd367, operate=<optimized out>, args=<optimized out>) at ./elf/dl-catch.c:256 #21 0x00007ffff64a2257 in _dlerror_run (operate=operate@entry=0x7ffff64a2840 <dlsym_doit>, args=args@entry=0x7fffffffd3b0) at ./dlfcn/dlerror.c:138 #22 0x00007ffff64a28e5 in dlsym_implementation (dl_caller=<optimized out>, name=<optimized out>, handle=<optimized out>) at ./dlfcn/dlsym.c:54 #23 ___dlsym (handle=<optimized out>, name=<optimized out>) at ./dlfcn/dlsym.c:68 #24 0x00007ffff6a3bb84 in __interception::GetFuncAddr (name=0x7ffff6af868b "__isoc23_scanf", trampoline=140737311157448) at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:42 #25 __interception::InterceptFunction (name=0x7ffff6af868b "__isoc23_scanf", ptr_to_real=0x7ffff74850e8 <__interception::real___isoc23_scanf>, func=11, trampoline=140737311157448) at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:61 #26 0x00007ffff6a9f2d9 in InitializeCommonInterceptors () at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_common_interceptors.inc:10315 ``` Reviewed By: vitalybuka, MaskRay Pull Request: llvm#83890
In a similar manner as in https://reviews.llvm.org/D133494 use `TBL` to place bytes in the *upper* part of `i32` elements and then convert to float using fixed-point `scvtf`, i.e. scvtf Vd.4s, Vn.4s, #24
This PR adds support for PDLL native constraints to return results.
This is useful for situations where a pattern checks for certain constraints of multiple interdependent attributes and computes a new attribute value based on them. Currently we do this check natively in C++ during matching and after a successful match have to escape to native C++ again to do the computation during the rewriting part of the pattern. With this PR we can do the computation in C++ during matching and use the result in the rewriting part of the pattern. Effectively this enables a choice in the trade-off of memory consumption during matching vs recomputation of values.
This is a simplified example of a situation where this is useful: We have two operations with certain attributes that have interdependent constraints. For instance
attr_foo: one_of [0, 2, 4, 8]
,attr_bar: one_of [0, 2, 4, 8]
andattr_foo == attr_bar
. The pattern should only match if all conditions are true. The new operation should be created with a new attribute which is computed from the two matched attributes e.g.attr_baz = attr_foo * attr_bar
. For the check we already escape to native C++ and have all values at hand so it makes sense to directly compute the new attribute value as well:To achieve this the following changes were necessary:
Remove simple check in PDLL parser to allow native constraints to return results
Change PDL definition of
pdl.apply_native_constraint
to allow variadic resultsChange PDL_interp definition of
pdl_interp.apply_constraint
to allow variadic resultsAdjust PDLToPDLInterp Pass:
The input to the pass is an arbitrary number of PDL patterns. The pass collects all predicates
that are required to match all of the pdl patterns and tries to establish an ordering that allows
creation of a single efficient matcher function to match all of them. Hence, the pass does not support
the creation of a value on the matching side of a pattern out of the box. Simply referring to
the
pdl.apply_native_constraint
operation is also not possible because the pdl patterns are deletedbefore the pdl_interp dialect operations are created.
To solve this we record the type of the results in the predicate for the constraint and
create positions (how the pass refers to values that are not materialized yet) for them.
When a position is evaluated (i.e. used by an op on the rhs of the pattern) we emit a placeholder value
to enable the creation of the pdl_interp dialect operation. The placeholder value is replaced later
when the actual
pdl_interp.apply_constraint
operation is created.This is required in all scenarios where a native constraint returns a result that is used on the rhs
because said result has to be an input the the pdl_interp rewriter function that is created for the pattern.
However, the call to this function is always generated before the
pdl_interp.apply_constraint
operation is created.Modify Bytecode generator and interpreter:
Constraint functions which return results have a different type compared to existing constraint functions.
They have the same type as native rewrite functions. For this reason they are for now registered as rewrite functions.
Other options:
Now that constraints may produce results it is hard to say what is the exact difference between native constraints
and native rewrites. The only difference is that native constraints are evaluated during matching and native rewrites
are evaluated during rewriting. So it might even make sense to unify the two concepts and have a single type for both
and only a single function to register them, i.e.
registerNativeFunction
instead ofregisterConstraintFunction
andregisterRewriteFunction
.ByteCode generation:
ByteCode execution:
For a follow-up PR: