Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang-repl] Set up executor implicitly to account for init PTUs #84758

Merged

Conversation

weliveindetail
Copy link
Contributor

Until now the IncrExecutor is created lazily on the first execution request. In order to process the PTUs that come from initialization, we have to do it upfront implicitly.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Mar 11, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 11, 2024

@llvm/pr-subscribers-clang

Author: Stefan Gränitz (weliveindetail)

Changes

Until now the IncrExecutor is created lazily on the first execution request. In order to process the PTUs that come from initialization, we have to do it upfront implicitly.


Full diff: https://github.com/llvm/llvm-project/pull/84758.diff

3 Files Affected:

  • (modified) clang/lib/Interpreter/Interpreter.cpp (+26-4)
  • (modified) clang/test/Interpreter/execute.cpp (+2-2)
  • (modified) clang/test/Interpreter/inline-virtual.cpp (+1-1)
diff --git a/clang/lib/Interpreter/Interpreter.cpp b/clang/lib/Interpreter/Interpreter.cpp
index e293fefb524963..5e1161ca472b36 100644
--- a/clang/lib/Interpreter/Interpreter.cpp
+++ b/clang/lib/Interpreter/Interpreter.cpp
@@ -229,12 +229,30 @@ IncrementalCompilerBuilder::CreateCudaHost() {
 }
 
 Interpreter::Interpreter(std::unique_ptr<CompilerInstance> CI,
-                         llvm::Error &Err) {
-  llvm::ErrorAsOutParameter EAO(&Err);
+                         llvm::Error &ErrOut) {
+  llvm::ErrorAsOutParameter EAO(&ErrOut);
   auto LLVMCtx = std::make_unique<llvm::LLVMContext>();
   TSCtx = std::make_unique<llvm::orc::ThreadSafeContext>(std::move(LLVMCtx));
-  IncrParser = std::make_unique<IncrementalParser>(*this, std::move(CI),
-                                                   *TSCtx->getContext(), Err);
+  IncrParser = std::make_unique<IncrementalParser>(
+      *this, std::move(CI), *TSCtx->getContext(), ErrOut);
+  if (ErrOut)
+    return;
+
+  // Not all frontends support code-generation, e.g. ast-dump actions don't
+  if (IncrParser->getCodeGen()) {
+    if (llvm::Error Err = CreateExecutor()) {
+      ErrOut = joinErrors(std::move(ErrOut), std::move(Err));
+      return;
+    }
+
+    // Process the PTUs that came from initialization. For example -include will
+    // give us a header that's processed at initialization of the preprocessor.
+    for (PartialTranslationUnit &PTU : IncrParser->getPTUs())
+      if (llvm::Error Err = Execute(PTU)) {
+        ErrOut = joinErrors(std::move(ErrOut), std::move(Err));
+        return;
+      }
+  }
 }
 
 Interpreter::~Interpreter() {
@@ -375,6 +393,10 @@ Interpreter::Parse(llvm::StringRef Code) {
 llvm::Error Interpreter::CreateExecutor() {
   const clang::TargetInfo &TI =
       getCompilerInstance()->getASTContext().getTargetInfo();
+  if (!IncrParser->getCodeGen())
+    return llvm::make_error<llvm::StringError>("Operation failed. "
+                                               "No code generator available",
+                                               std::error_code());
   llvm::Error Err = llvm::Error::success();
   auto Executor = std::make_unique<IncrementalExecutor>(*TSCtx, Err, TI);
   if (!Err)
diff --git a/clang/test/Interpreter/execute.cpp b/clang/test/Interpreter/execute.cpp
index 6e73ed3927e815..534a54ed94fba2 100644
--- a/clang/test/Interpreter/execute.cpp
+++ b/clang/test/Interpreter/execute.cpp
@@ -7,6 +7,8 @@
 
 // RUN: cat %s | clang-repl | FileCheck %s
 // RUN: cat %s | clang-repl -Xcc -O2 | FileCheck %s
+// RUN: clang-repl -Xcc -include -Xcc %s | FileCheck %s
+// RUN: clang-repl -Xcc -fsyntax-only -Xcc -include -Xcc %s
 extern "C" int printf(const char *, ...);
 int i = 42;
 auto r1 = printf("i = %d\n", i);
@@ -19,5 +21,3 @@ auto r2 = printf("S[f=%f, m=0x%llx]\n", s.f, reinterpret_cast<unsigned long long
 
 inline int foo() { return 42; }
 int r3 = foo();
-
-%quit
diff --git a/clang/test/Interpreter/inline-virtual.cpp b/clang/test/Interpreter/inline-virtual.cpp
index 79ab8ed337ffea..d862b3354f61fe 100644
--- a/clang/test/Interpreter/inline-virtual.cpp
+++ b/clang/test/Interpreter/inline-virtual.cpp
@@ -14,7 +14,7 @@ struct A { int a; A(int a) : a(a) {} virtual ~A(); };
 // PartialTranslationUnit.
 inline A::~A() { printf("~A(%d)\n", a); }
 
-// Create one instance with new and delete it.
+// Create one instance with new and delete it. We crash here now:
 A *a1 = new A(1);
 delete a1;
 // CHECK: ~A(1)

@weliveindetail weliveindetail changed the title Clang repl implicit executor create [clang-repl] Set up executor implicitly to account for init PTUs Mar 11, 2024
@@ -14,7 +14,7 @@ struct A { int a; A(int a) : a(a) {} virtual ~A(); };
// PartialTranslationUnit.
inline A::~A() { printf("~A(%d)\n", a); }

// Create one instance with new and delete it.
// Create one instance with new and delete it. We crash here now:
A *a1 = new A(1);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With -O2 this test fails unexpectedly now. Minimal repro:

clang-repl> struct A { int a; A(int a) : a(a) {} virtual ~A() {} };
clang-repl> A *a1 = new A(1);
clang-repl: llvm/include/llvm/ADT/SmallVector.h:308: const_reference llvm::SmallVectorTemplateCommon<llvm::PointerAlignElem>::operator[](size_type) const [T = llvm::PointerAlignElem]: Assertion `idx < size()' failed.

The following still works and thus I assume it's related to c861d32:

clang-repl> struct A { int a; A(int a) : a(a) {} virtual ~A() {} }; A *a1 = new A(1);

@hahnjo Maybe we now process init code that clashes with your above fix. Do you think that's possible? Any idea how to handle it? If possible, I'd like to submit the patch as-is with the test XFAILed for later investigation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we even have initial PTUs in the default case? Also the minimal reproducer shows a more general version where the virtual destructor is actually defined inline (c861d32 addresses the case where it is out-of-line, which is special due to key virtual functions). So if that breaks entirely (which is critical for us), I'm personally not ok with just XFAILing it to land another change...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, probably is something worth fixing now. I could not see the stack trace on osx. Can you paste it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we even have initial PTUs in the default case?

Well, this test passes, if I comment out the for loop in the ctor that executes initial PTUs:
https://github.com/llvm/llvm-project/pull/84758/files#diff-b8484f1fc5b057f146ed5d9b6e2cd47c3f6f5ae879c7a0eee44f0a272581a88cR250-R254

Also the minimal reproducer shows a more general version where the virtual destructor is actually defined inline (c861d32 addresses the case where it is out-of-line, which is special due to key virtual functions).

Oh that's a good note, I had not considered the difference yet and actually they have different backtraces. Eventually, they both reach the same VTablePtr code though.

I could not see the stack trace on osx. Can you paste it here?

Here is a diff (inline left, out-of-line right):
Screenshot 2024-03-12 at 11 12 36

So if that breaks entirely (which is critical for us), I'm personally not ok with just XFAILing it to land another change...

What breaks here is the parser and this patch doesn't even touch it. Not sure I am missing something, but it seems that it triggers a bug that always existed and just didn't show up so far.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me share some more observations.

We execute the initial PTU and not only parse it. It shouldn't make a difference for the parser right? Still, I wasn't sure, especially since we unusally only parse runtime PTUs. So I made some experiments:

    Init      Runtime      Dtor-def  |  New     Delete  Automatic
(1) Execute   Execute   Out-of-line  |  fails   -       fails
                             Inline  |  fails   -       once*

(2) Execute   Parse     Out-of-line  |  fails   -       fails
                             Inline  |  once*   fails   once

(3) Parse     Any       Out-of-line  |  works   works   works
                             Inline  |  works   works   works

Without -O2 everything works.
Automatic = A a1(1);
once = Works exactly once (including shutdown)
once* = Same, but only if dtor has side-effects e.g. printf call

(1) is the behavior with this patch. (3) was the status-quo before this patch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, does valgrind say anything?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a report from valgrind and we seem to have a problem:

==858864== Warning: set address range perms: large range [0x108000, 0x11f2c000) (defined)
==858864== Invalid read of size 8
==858864==    at 0x11E99BE: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::length() const (basic_string.h:927)
==858864==    by 0x11F1E3B: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (basic_string.tcc:259)
==858864==    by 0x11EE135: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (basic_string.h:1387)
==858864==    by 0x11EB15C: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (basic_string.h:681)
==858864==    by 0x16F1819: llvm::DataLayout::operator=(llvm::DataLayout const&) (DataLayout.h:206)
==858864==    by 0x1A552E9: llvm::DataLayout::init(llvm::Module const*) (DataLayout.cpp:557)
==858864==    by 0x1A552B3: llvm::DataLayout::DataLayout(llvm::Module const*) (DataLayout.cpp:554)
==858864==    by 0x33797CA: clang::CodeGen::CodeGenTBAA::getVTablePtrAccessInfo(llvm::Type*) (CodeGenTBAA.cpp:274)
==858864==    by 0x3221BC3: clang::CodeGen::CodeGenModule::getTBAAVTablePtrAccessInfo(llvm::Type*) (CodeGenModule.cpp:1424)
==858864==    by 0x2DCA94E: clang::CodeGen::CodeGenFunction::InitializeVTablePointer(clang::CodeGen::CodeGenFunction::VPtr const&) (CGClass.cpp:2592)
==858864==    by 0x2DCAFC8: clang::CodeGen::CodeGenFunction::InitializeVTablePointers(clang::CXXRecordDecl const*) (CGClass.cpp:2676)
==858864==    by 0x2DC519C: clang::CodeGen::CodeGenFunction::EmitDestructorBody(clang::CodeGen::FunctionArgList&) (CGClass.cpp:1525)
==858864==  Address 0x13599270 is 512 bytes inside a block of size 800 free'd
==858864==    at 0x12D50B6F: operator delete(void*, unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==858864==    by 0x1E5A922: std::default_delete<llvm::Module>::operator()(llvm::Module*) const (unique_ptr.h:85)
==858864==    by 0x1E5A867: std::__uniq_ptr_impl<llvm::Module, std::default_delete<llvm::Module> >::reset(llvm::Module*) (unique_ptr.h:182)
==858864==    by 0x1E5A800: std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >::reset(llvm::Module*) (unique_ptr.h:456)
==858864==    by 0x1E4F8B8: std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >::operator=(decltype(nullptr)) (unique_ptr.h:397)
==858864==    by 0x1E4B5DC: llvm::orc::ThreadSafeModule::~ThreadSafeModule() (ThreadSafeModule.h:116)
==858864==    by 0x1F4DF4F: llvm::orc::IRMaterializationUnit::~IRMaterializationUnit() (Layer.h:31)
==858864==    by 0x1FCE1D5: llvm::orc::BasicIRLayerMaterializationUnit::~BasicIRLayerMaterializationUnit() (Layer.h:120)
==858864==    by 0x1FCE1F5: llvm::orc::BasicIRLayerMaterializationUnit::~BasicIRLayerMaterializationUnit() (Layer.h:120)
==858864==    by 0x1FCE235: std::default_delete<llvm::orc::BasicIRLayerMaterializationUnit>::operator()(llvm::orc::BasicIRLayerMaterializationUnit*) const (unique_ptr.h:85)
==858864==    by 0x1FCCFFD: std::unique_ptr<llvm::orc::BasicIRLayerMaterializationUnit, std::default_delete<llvm::orc::BasicIRLayerMaterializationUnit> >::~unique_ptr() (unique_ptr.h:361)
==858864==    by 0x1FCA870: llvm::orc::IRLayer::add(llvm::IntrusiveRefCntPtr<llvm::orc::ResourceTracker>, llvm::orc::ThreadSafeModule) (Layer.cpp:27)
==858864==  Block was alloc'd at
==858864==    at 0x12D4E013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==858864==    by 0x2CC7B86: (anonymous namespace)::CodeGeneratorImpl::CodeGeneratorImpl(clang::DiagnosticsEngine&, llvm::StringRef, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, clang::HeaderSearchOptions const&, clang::PreprocessorOptions const&, clang::CodeGenOptions const&, llvm::LLVMContext&, clang::CoverageSourceInfo*) (ModuleBuilder.cpp:87)
==858864==    by 0x2CC90C0: clang::CreateLLVMCodeGen(clang::DiagnosticsEngine&, llvm::StringRef, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, clang::HeaderSearchOptions const&, clang::PreprocessorOptions const&, clang::CodeGenOptions const&, llvm::LLVMContext&, clang::CoverageSourceInfo*) (ModuleBuilder.cpp:372)
==858864==    by 0x2C8749E: clang::BackendConsumer::BackendConsumer(clang::BackendAction, clang::DiagnosticsEngine&, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, clang::HeaderSearchOptions const&, clang::PreprocessorOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, clang::FileManager const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, llvm::SmallVector<clang::CodeGenAction::LinkModule, 4u>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >, llvm::LLVMContext&, clang::CoverageSourceInfo*) (CodeGenAction.cpp:127)
==858864==    by 0x2C8C2A0: clang::CodeGenAction::CreateASTConsumer(clang::CompilerInstance&, llvm::StringRef) (CodeGenAction.cpp:1051)
==858864==    by 0x22A1E64: clang::WrapperFrontendAction::CreateASTConsumer(clang::CompilerInstance&, llvm::StringRef) (FrontendAction.cpp:1207)
==858864==    by 0x229B7A5: clang::FrontendAction::CreateWrappedASTConsumer(clang::CompilerInstance&, llvm::StringRef) (FrontendAction.cpp:166)
==858864==    by 0x22A08F0: clang::FrontendAction::BeginSourceFile(clang::CompilerInstance&, clang::FrontendInputFile const&) (FrontendAction.cpp:955)
==858864==    by 0x2198EA6: clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (CompilerInstance.cpp:1061)
==858864==    by 0x2573ECA: clang::IncrementalParser::IncrementalParser(clang::Interpreter&, std::unique_ptr<clang::CompilerInstance, std::default_delete<clang::CompilerInstance> >, llvm::LLVMContext&, llvm::Error&) (IncrementalParser.cpp:211)
==858864==    by 0x2562095: std::_MakeUniq<clang::IncrementalParser>::__single_object std::make_unique<clang::IncrementalParser, clang::Interpreter&, std::unique_ptr<clang::CompilerInstance, std::default_delete<clang::CompilerInstance> >, llvm::LLVMContext&, llvm::Error&>(clang::Interpreter&, std::unique_ptr<clang::CompilerInstance, std::default_delete<clang::CompilerInstance> >&&, llvm::LLVMContext&, llvm::Error&) (unique_ptr.h:962)
==858864==    by 0x25599FD: clang::Interpreter::Interpreter(std::unique_ptr<clang::CompilerInstance, std::default_delete<clang::CompilerInstance> >, llvm::Error&) (Interpreter.cpp:236)
==858864== 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is with this patch or just current main?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, that's also the case with releases/16.x...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#89031 should unblock this PR.

@vgvassilev vgvassilev force-pushed the clang-repl-implicit-executor-create branch from e68cb56 to 0af28b7 Compare April 13, 2024 20:04
vgvassilev added a commit to vgvassilev/llvm-project that referenced this pull request Apr 17, 2024
Clang's CodeGen is designed to work with a single llvm::Module. In many cases
for convenience various CodeGen parts have a reference to the llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module which does
not map well to CodeGen's design. To work this around we clone the module and
pass it down.

With some effort it is possible to teach CodeGen to ask the CodeGenModule for
its current module and that would have an overall positive impact on CodeGen
improving the encapsulation of various parts but that's not resilient to future
regression.

This patch takes a more conservative approach and clones the llvm::Module before
passing it to the Jit. That's also not bullet proof because we have to guarantee
that CodeGen does not write on the blueprint. At that stage that seems more
consistent to what clang-repl already does to map each partial translation unit
to a new Module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock progress
on llvm#84758.
vgvassilev added a commit to vgvassilev/llvm-project that referenced this pull request Apr 19, 2024
…y access.

Clang's CodeGen is designed to work with a single llvm::Module. In many cases
for convenience various CodeGen parts have a reference to the llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module which does
not map well to CodeGen's design. To work this around we clone the module and
pass it down.

With some effort it is possible to teach CodeGen to ask the CodeGenModule for
its current module and that would have an overall positive impact on CodeGen
improving the encapsulation of various parts but that's not resilient to future
regression.

This patch takes a more conservative approach and keeps the first llvm::Module
empty intentionally and does not pass it to the Jit. That's also not bullet
proof because we have to guarantee that CodeGen does not write on the
blueprint. However, we have inserted some assertions to catch accidental
additions to that canary module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock progress
on llvm#84758.
vgvassilev added a commit to vgvassilev/llvm-project that referenced this pull request Apr 19, 2024
…y access.

Clang's CodeGen is designed to work with a single llvm::Module. In many cases
for convenience various CodeGen parts have a reference to the llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module which does
not map well to CodeGen's design. To work this around we clone the module and
pass it down.

With some effort it is possible to teach CodeGen to ask the CodeGenModule for
its current module and that would have an overall positive impact on CodeGen
improving the encapsulation of various parts but that's not resilient to future
regression.

This patch takes a more conservative approach and keeps the first llvm::Module
empty intentionally and does not pass it to the Jit. That's also not bullet
proof because we have to guarantee that CodeGen does not write on the
blueprint. However, we have inserted some assertions to catch accidental
additions to that canary module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock progress
on llvm#84758.
vgvassilev added a commit to vgvassilev/llvm-project that referenced this pull request Apr 19, 2024
…y access.

Clang's CodeGen is designed to work with a single llvm::Module. In many cases
for convenience various CodeGen parts have a reference to the llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module which does
not map well to CodeGen's design. To work this around we clone the module and
pass it down.

With some effort it is possible to teach CodeGen to ask the CodeGenModule for
its current module and that would have an overall positive impact on CodeGen
improving the encapsulation of various parts but that's not resilient to future
regression.

This patch takes a more conservative approach and keeps the first llvm::Module
empty intentionally and does not pass it to the Jit. That's also not bullet
proof because we have to guarantee that CodeGen does not write on the
blueprint. However, we have inserted some assertions to catch accidental
additions to that canary module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock progress
on llvm#84758.
vgvassilev added a commit to vgvassilev/llvm-project that referenced this pull request Apr 20, 2024
…y access.

Clang's CodeGen is designed to work with a single llvm::Module. In many cases
for convenience various CodeGen parts have a reference to the llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module which does
not map well to CodeGen's design. To work this around we clone the module and
pass it down.

With some effort it is possible to teach CodeGen to ask the CodeGenModule for
its current module and that would have an overall positive impact on CodeGen
improving the encapsulation of various parts but that's not resilient to future
regression.

This patch takes a more conservative approach and keeps the first llvm::Module
empty intentionally and does not pass it to the Jit. That's also not bullet
proof because we have to guarantee that CodeGen does not write on the
blueprint. However, we have inserted some assertions to catch accidental
additions to that canary module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock progress
on llvm#84758.
vgvassilev added a commit that referenced this pull request Apr 20, 2024
…y access. (#89031)

Clang's CodeGen is designed to work with a single llvm::Module. In many
cases
for convenience various CodeGen parts have a reference to the
llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module
which does
not map well to CodeGen's design. To work this around we clone the
module and
pass it down.

With some effort it is possible to teach CodeGen to ask the
CodeGenModule for
its current module and that would have an overall positive impact on
CodeGen
improving the encapsulation of various parts but that's not resilient to
future
regression.

This patch takes a more conservative approach and keeps the first
llvm::Module
empty intentionally and does not pass it to the Jit. That's also not
bullet
proof because we have to guarantee that CodeGen does not write on the
blueprint. However, we have inserted some assertions to catch accidental
additions to that canary module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock
progress
on #84758.
@vgvassilev vgvassilev force-pushed the clang-repl-implicit-executor-create branch from 0af28b7 to d22301d Compare April 20, 2024 17:57
vgvassilev added a commit that referenced this pull request Apr 20, 2024
…id memory access. (#89031)"

Original commit message: "

Clang's CodeGen is designed to work with a single llvm::Module. In many cases
for convenience various CodeGen parts have a reference to the llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module which does
not map well to CodeGen's design. To work this around we clone the module and
pass it down.

With some effort it is possible to teach CodeGen to ask the CodeGenModule for
its current module and that would have an overall positive impact on CodeGen
improving the encapsulation of various parts but that's not resilient to future
regression.

This patch takes a more conservative approach and keeps the first llvm::Module
empty intentionally and does not pass it to the Jit. That's also not bullet
proof because we have to guarantee that CodeGen does not write on the
blueprint. However, we have inserted some assertions to catch accidental
additions to that canary module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock progress
on #84758.
"

This patch reverts adc4f62 and removes
the check of `named_metadata_empty` of the first llvm::Module because on darwin
clang inserts some harmless metadata which we can ignore.
@vgvassilev vgvassilev force-pushed the clang-repl-implicit-executor-create branch from d22301d to b42376e Compare April 20, 2024 19:11
aniplcc pushed a commit to aniplcc/llvm-project that referenced this pull request Apr 21, 2024
…y access. (llvm#89031)

Clang's CodeGen is designed to work with a single llvm::Module. In many
cases
for convenience various CodeGen parts have a reference to the
llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module
which does
not map well to CodeGen's design. To work this around we clone the
module and
pass it down.

With some effort it is possible to teach CodeGen to ask the
CodeGenModule for
its current module and that would have an overall positive impact on
CodeGen
improving the encapsulation of various parts but that's not resilient to
future
regression.

This patch takes a more conservative approach and keeps the first
llvm::Module
empty intentionally and does not pass it to the Jit. That's also not
bullet
proof because we have to guarantee that CodeGen does not write on the
blueprint. However, we have inserted some assertions to catch accidental
additions to that canary module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock
progress
on llvm#84758.
aniplcc pushed a commit to aniplcc/llvm-project that referenced this pull request Apr 21, 2024
…id memory access. (llvm#89031)"

Original commit message: "

Clang's CodeGen is designed to work with a single llvm::Module. In many cases
for convenience various CodeGen parts have a reference to the llvm::Module
(TheModule or Module) which does not change when a new module is pushed.
However, the execution engine wants to take ownership of the module which does
not map well to CodeGen's design. To work this around we clone the module and
pass it down.

With some effort it is possible to teach CodeGen to ask the CodeGenModule for
its current module and that would have an overall positive impact on CodeGen
improving the encapsulation of various parts but that's not resilient to future
regression.

This patch takes a more conservative approach and keeps the first llvm::Module
empty intentionally and does not pass it to the Jit. That's also not bullet
proof because we have to guarantee that CodeGen does not write on the
blueprint. However, we have inserted some assertions to catch accidental
additions to that canary module.

This change will fixes a long-standing invalid memory access reported by
valgrind when we enable the TBAA optimization passes. It also unblock progress
on llvm#84758.
"

This patch reverts adc4f62 and removes
the check of `named_metadata_empty` of the first llvm::Module because on darwin
clang inserts some harmless metadata which we can ignore.
@weliveindetail weliveindetail force-pushed the clang-repl-implicit-executor-create branch from b42376e to e375d14 Compare April 22, 2024 08:09
@weliveindetail
Copy link
Contributor Author

#89031 should unblock this PR.

@vgvassilev It did. And it uncovered an issue that I didn't think about before: If we set up the executor in the constructor, then the interpreter instance isn't fully initialized yet when we create the LLJIT. Thus, we shouldn't call virtual functions on the interpreter yet (and if we do they will invoke the base-class methods and not the overidden ones). In particular, this breaks the CreateJITBuilder() mechanism: 0cf4788

For me this patch doesn't have high prio, so I'd probably just drop it and keep everything as is for now. What's your take?

@vgvassilev
Copy link
Contributor

For me it is high priority because without it I cannot land the pch support in clang repl. I believe that’s also somewhat important for your use case?

@weliveindetail
Copy link
Contributor Author

Taking this patch would mean that deriving from Interpreter won't hold as the mechanism for extensions! At least for customizing the JITBuilder. I propose to pass the LLJITBuilder to the constructor then and to store it as a member for later Create/ResetExecutor() calls. Let me prototype that..

@vgvassilev
Copy link
Contributor

I am fine changing it in any way that creates an executor as part of the initialization phase when we have codegen. Maybe we can do it in the builder class and then pass it to the interpreter?

@weliveindetail weliveindetail merged commit cb7995a into llvm:main May 28, 2024
8 checks passed
@weliveindetail weliveindetail deleted the clang-repl-implicit-executor-create branch May 28, 2024 09:22
@weliveindetail
Copy link
Contributor Author

Some bots report failures after this patch landed. I will push a fix soon.

@weliveindetail
Copy link
Contributor Author

@vgvassilev I pushed a quick fix for the tests. I think we should revisit/refactor the test story here though in the mid-term.

@vgvassilev
Copy link
Contributor

What do you mean?

@vvereschaka
Copy link
Contributor

@vgvassilev
Copy link
Contributor

I am not sure how this patch changed these tests to start failing on Windows. Do you have any clue?

weliveindetail added a commit that referenced this pull request May 28, 2024
@weliveindetail
Copy link
Contributor Author

@vvereschaka Thanks for the note! should be fixed now.

I am not sure how this patch changed these tests to start failing on Windows. Do you have any clue?

It's just cases I missed in the first quick-fix

What do you mean?

This is adding a lot of boilerplate, because each and ever test now looks like this:

#ifdef CLANG_INTERPRETER_PLATFORM_CANNOT_CREATE_LLJIT
TEST(CodeCompletionTest, DISABLED_TemplateFunctions) {
#else
 TEST(CodeCompletionTest, TemplateFunctions) {
#endif
  if (!HostSupportsJit())
    GTEST_SKIP();
  <<< actual test >>>
}

@vgvassilev LIT tests check host JIT support during config and bulk skip all affected tests. (1) Maybe we can do sth like this in unittests as well? Or otherwise: (2) Add an option to the Interpreter to skip init PTUs and roll-back my quick fixes? What do you think?

weliveindetail added a commit that referenced this pull request May 28, 2024
…st check host JIT support (#84758)

fea7399 had removed the unused function that was still there when I tested.
@vgvassilev
Copy link
Contributor

Can we create some text fixture that will do that automatically for us and reduce this copy-paste?

@weliveindetail
Copy link
Contributor Author

Oh indeed, it seems we can decide to skip tests in SetUp()

class FooTest : public ::testing:Test {
protected:
    void SetUp()
    {
        GTEST_SKIP();
    }
};

vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
…st check host JIT support (llvm#84758)

fea7399 had removed the unused function that was still there when I tested.
@vgvassilev
Copy link
Contributor

Looks like you already implemented it. Nice.

@jakeegan
Copy link
Member

Hi, I see you fixed some tests, but there's still a couple failing on AIX:

Clang :: Interpreter/incremental-mode.cpp
Clang-Unit :: Interpreter/./ClangReplInterpreterTests/2/3
Clang-Unit :: Interpreter/ExceptionTests/./ClangReplInterpreterExceptionTests/0/1

https://lab.llvm.org/buildbot/#/builders/214/builds/12340/steps/6/logs/stdio

@weliveindetail
Copy link
Contributor Author

Looks like you already implemented it. Nice.

Well, it's gonna be a bit more churn. Eventually, it will reduce coverage for the incremental features in the frontend that are independent from the JIT. To be honest, I don't mind because none of the targets seem relevant for me, just wanted to point that out.

weliveindetail added a commit that referenced this pull request May 30, 2024
@weliveindetail
Copy link
Contributor Author

@jakeegan Thanks for reporting! Should be fixed now. Hope we caught them all 🤞

@vgvassilev
Copy link
Contributor

Looks like you already implemented it. Nice.

Well, it's gonna be a bit more churn. Eventually, it will reduce coverage for the incremental features in the frontend that are independent from the JIT. To be honest, I don't mind because none of the targets seem relevant for me, just wanted to point that out.

We can test them in syntax-only mode. I am not too worried by that.

@weliveindetail
Copy link
Contributor Author

Here we go: #93816

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants