-
Notifications
You must be signed in to change notification settings - Fork 12.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release/18.x: Reland "[clang-repl] Keep the first llvm::Module empty to avoid invalid memory access. (#89031)" #90544
Conversation
@llvm/pr-subscribers-clang Author: None (llvmbot) ChangesBackport a3f07d3 Requested by: @nikic Full diff: https://github.com/llvm/llvm-project/pull/90544.diff 2 Files Affected:
diff --git a/clang/lib/Interpreter/IncrementalParser.cpp b/clang/lib/Interpreter/IncrementalParser.cpp
index 370bcbfee8b014..f5f32b9f3924ae 100644
--- a/clang/lib/Interpreter/IncrementalParser.cpp
+++ b/clang/lib/Interpreter/IncrementalParser.cpp
@@ -209,6 +209,10 @@ IncrementalParser::IncrementalParser(Interpreter &Interp,
if (Err)
return;
CI->ExecuteAction(*Act);
+
+ if (getCodeGen())
+ CachedInCodeGenModule = GenModule();
+
std::unique_ptr<ASTConsumer> IncrConsumer =
std::make_unique<IncrementalASTConsumer>(Interp, CI->takeASTConsumer());
CI->setASTConsumer(std::move(IncrConsumer));
@@ -224,11 +228,8 @@ IncrementalParser::IncrementalParser(Interpreter &Interp,
return; // PTU.takeError();
}
- if (CodeGenerator *CG = getCodeGen()) {
- std::unique_ptr<llvm::Module> M(CG->ReleaseModule());
- CG->StartModule("incr_module_" + std::to_string(PTUs.size()),
- M->getContext());
- PTU->TheModule = std::move(M);
+ if (getCodeGen()) {
+ PTU->TheModule = GenModule();
assert(PTU->TheModule && "Failed to create initial PTU");
}
}
@@ -364,6 +365,19 @@ IncrementalParser::Parse(llvm::StringRef input) {
std::unique_ptr<llvm::Module> IncrementalParser::GenModule() {
static unsigned ID = 0;
if (CodeGenerator *CG = getCodeGen()) {
+ // Clang's CodeGen is designed to work with a single llvm::Module. In many
+ // cases for convenience various CodeGen parts have a reference to the
+ // llvm::Module (TheModule or Module) which does not change when a new
+ // module is pushed. However, the execution engine wants to take ownership
+ // of the module which does not map well to CodeGen's design. To work this
+ // around we created an empty module to make CodeGen happy. We should make
+ // sure it always stays empty.
+ assert((!CachedInCodeGenModule ||
+ (CachedInCodeGenModule->empty() &&
+ CachedInCodeGenModule->global_empty() &&
+ CachedInCodeGenModule->alias_empty() &&
+ CachedInCodeGenModule->ifunc_empty())) &&
+ "CodeGen wrote to a readonly module");
std::unique_ptr<llvm::Module> M(CG->ReleaseModule());
CG->StartModule("incr_module_" + std::to_string(ID++), M->getContext());
return M;
diff --git a/clang/lib/Interpreter/IncrementalParser.h b/clang/lib/Interpreter/IncrementalParser.h
index e13b74c7f65948..f63bce50acd3b9 100644
--- a/clang/lib/Interpreter/IncrementalParser.h
+++ b/clang/lib/Interpreter/IncrementalParser.h
@@ -24,6 +24,7 @@
#include <memory>
namespace llvm {
class LLVMContext;
+class Module;
} // namespace llvm
namespace clang {
@@ -57,6 +58,10 @@ class IncrementalParser {
/// of code.
std::list<PartialTranslationUnit> PTUs;
+ /// When CodeGen is created the first llvm::Module gets cached in many places
+ /// and we must keep it alive.
+ std::unique_ptr<llvm::Module> CachedInCodeGenModule;
+
IncrementalParser();
public:
|
@nikic, I am confused. How is that an abi breaking change? |
@vgvassilev My bad, I completely missed that the changed header is in the lib/ directory, not in include/. |
Ok, maybe we could move forward? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
…id memory access. (llvm#89031)" Original commit message: " Clang's CodeGen is designed to work with a single llvm::Module. In many cases for convenience various CodeGen parts have a reference to the llvm::Module (TheModule or Module) which does not change when a new module is pushed. However, the execution engine wants to take ownership of the module which does not map well to CodeGen's design. To work this around we clone the module and pass it down. With some effort it is possible to teach CodeGen to ask the CodeGenModule for its current module and that would have an overall positive impact on CodeGen improving the encapsulation of various parts but that's not resilient to future regression. This patch takes a more conservative approach and keeps the first llvm::Module empty intentionally and does not pass it to the Jit. That's also not bullet proof because we have to guarantee that CodeGen does not write on the blueprint. However, we have inserted some assertions to catch accidental additions to that canary module. This change will fixes a long-standing invalid memory access reported by valgrind when we enable the TBAA optimization passes. It also unblock progress on llvm#84758. " This patch reverts adc4f62 and removes the check of `named_metadata_empty` of the first llvm::Module because on darwin clang inserts some harmless metadata which we can ignore. (cherry picked from commit a3f07d3)
@nikic (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. |
Backport a3f07d3
Requested by: @nikic