[llvm][mlir][flang][OpenMP] Emit __atomic_load and __atomic_compare_exchange libcalls for complex types in atomic update #92364
Conversation
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-mlir Author: None (NimishMishra)
Changes: This patch adds functionality to emit the relevant libcalls when an atomicrmw instruction cannot be emitted (for instance, for complex types). Fixes #83760. Full diff: https://github.com/llvm/llvm-project/pull/92364.diff
4 Files Affected:
diff --git a/flang/test/Integration/OpenMP/atomic-update-complex.f90 b/flang/test/Integration/OpenMP/atomic-update-complex.f90
new file mode 100644
index 0000000000000..7fe37673e7dce
--- /dev/null
+++ b/flang/test/Integration/OpenMP/atomic-update-complex.f90
@@ -0,0 +1,59 @@
+!===----------------------------------------------------------------------===!
+! This directory can be used to add Integration tests involving multiple
+! stages of the compiler (for eg. from Fortran to LLVM IR). It should not
+! contain executable tests. We should only add tests here sparingly and only
+! if there is no other way to test. Repeat this message in each test that is
+! added to this directory and sub-directories.
+!===----------------------------------------------------------------------===!
+
+!RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s
+
+!CHECK: define void @_QQmain() {
+!CHECK: %[[VAL_1:.*]] = alloca { float, float }, align 8
+!CHECK: %[[VAL_2:.*]] = alloca { float, float }, align 8
+!CHECK: %[[VAL_3:.*]] = alloca { float, float }, align 8
+!CHECK: %[[X_NEW_VAL:.*]] = alloca { float, float }, align 8
+!CHECK: %[[VAL_4:.*]] = alloca { float, float }, i64 1, align 8
+!CHECK: %[[VAL_5:.*]] = alloca { float, float }, i64 1, align 8
+!CHECK: store { float, float } { float 2.000000e+00, float 2.000000e+00 }, ptr %[[VAL_4]], align 4
+!CHECK: br label %entry
+
+program main
+ complex*8 ia, ib
+ ia = (2, 2)
+
+!CHECK: entry:
+!CHECK: call void @llvm.lifetime.start.p0(i64 8, ptr %[[VAL_3]])
+!CHECK: call void @__atomic_load(i64 8, ptr %[[VAL_4]], ptr %[[VAL_3]], i32 0)
+!CHECK: %[[VAL_6:.*]] = load { float, float }, ptr %[[VAL_3]], align 8
+!CHECK: call void @llvm.lifetime.end.p0(i64 8, ptr %[[VAL_3]])
+!CHECK: br label %.atomic.cont
+
+
+!CHECK: .atomic.cont
+!CHECK: %[[VAL_7:.*]] = phi { float, float } [ %[[VAL_6]], %entry ], [ {{.*}}, %.atomic.cont ]
+!CHECK: %[[VAL_8:.*]] = extractvalue { float, float } %[[VAL_7]], 0
+!CHECK: %[[VAL_9:.*]] = extractvalue { float, float } %[[VAL_7]], 1
+!CHECK: %[[VAL_10:.*]] = fadd contract float %[[VAL_8]], 1.000000e+00
+!CHECK: %[[VAL_11:.*]] = fadd contract float %[[VAL_9]], 1.000000e+00
+!CHECK: %[[VAL_12:.*]] = insertvalue { float, float } undef, float %[[VAL_10]], 0
+!CHECK: %[[VAL_13:.*]] = insertvalue { float, float } %[[VAL_12]], float %[[VAL_11]], 1
+!CHECK: store { float, float } %[[VAL_13]], ptr %[[X_NEW_VAL]], align 4
+!CHECK: %[[VAL_14:.*]] = load { float, float }, ptr %[[X_NEW_VAL]], align 4
+!CHECK: call void @llvm.lifetime.start.p0(i64 8, ptr %[[VAL_1]])
+!CHECK: store { float, float } %[[VAL_7]], ptr %[[VAL_1]], align 8
+!CHECK: call void @llvm.lifetime.start.p0(i64 8, ptr %[[VAL_2]])
+!CHECK: store { float, float } %[[VAL_14]], ptr %[[VAL_2]], align 8
+!CHECK: %[[VAL_15:.*]] = call zeroext i1 @__atomic_compare_exchange(i64 8, ptr %[[VAL_4]], ptr %[[VAL_1]], ptr %[[VAL_2]], i32 0, i32 0)
+!CHECK: call void @llvm.lifetime.end.p0(i64 8, ptr %[[VAL_2]])
+!CHECK: %[[VAL_16:.*]] = load { float, float }, ptr %[[VAL_1]], align 8
+!CHECK: %[[VAL_17:.*]] = insertvalue { { float, float }, i1 } poison, { float, float } %[[VAL_16]], 0
+!CHECK: %[[VAL_18:.*]] = insertvalue { { float, float }, i1 } %[[VAL_17]], i1 %[[VAL_15]], 1
+!CHECK: %[[VAL_19:.*]] = extractvalue { { float, float }, i1 } %[[VAL_18]], 0
+!CHECK: %[[VAL_20:.*]] = extractvalue { { float, float }, i1 } %[[VAL_18]], 1
+!CHECK: br i1 %[[VAL_20]], label %.atomic.exit, label %.atomic.cont
+ !$omp atomic update
+ ia = ia + (1, 1)
+ !$omp end atomic
+ print *, ia
+end program main
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index c9ee0c25194c2..8a2dd1e40f5c2 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -2418,14 +2418,25 @@ class OpenMPIRBuilder {
/// \param IsXBinopExpr true if \a X is Left H.S. in Right H.S. part of the
/// update expression, false otherwise.
/// (e.g. true for X = X BinOp Expr)
- ///
+ /// \param shouldEmitLibCall true if atomicrmw cannot be emitted for \a X
/// \returns A pair of the old value of X before the update, and the value
/// used for the update.
std::pair<Value *, Value *>
emitAtomicUpdate(InsertPointTy AllocaIP, Value *X, Type *XElemTy, Value *Expr,
AtomicOrdering AO, AtomicRMWInst::BinOp RMWOp,
AtomicUpdateCallbackTy &UpdateOp, bool VolatileX,
- bool IsXBinopExpr);
+ bool IsXBinopExpr, bool shouldEmitLibCall = false);
+
+ bool emitAtomicCompareExchangeLibCall(
+ Instruction *I, unsigned Size, Align Alignment, Value *PointerOperand,
+ Value *ValueOperand, Value *CASExpected, AtomicOrdering Ordering,
+ AtomicOrdering Ordering2, llvm::PHINode *PHI, llvm::BasicBlock *ContBB,
+ llvm::BasicBlock *ExitBB);
+
+ bool emitAtomicLoadLibCall(Instruction *I, unsigned Size, Align Alignment,
+ Value *PointerOperand, Value *ValueOperand,
+ Value *CASExpected, AtomicOrdering Ordering,
+ AtomicOrdering Ordering2, Value *&LoadedVal);
/// Emit the binary op. described by \p RMWOp, using \p Src1 and \p Src2 .
///
@@ -2487,14 +2498,15 @@ class OpenMPIRBuilder {
/// \param IsXBinopExpr true if \a X is Left H.S. in Right H.S. part of the
/// update expression, false otherwise.
/// (e.g. true for X = X BinOp Expr)
- ///
+ /// \param shouldEmitLibCall true if atomicrmw cannot be emitted for \a X
/// \return Insertion point after generated atomic update IR.
InsertPointTy createAtomicUpdate(const LocationDescription &Loc,
InsertPointTy AllocaIP, AtomicOpValue &X,
Value *Expr, AtomicOrdering AO,
AtomicRMWInst::BinOp RMWOp,
AtomicUpdateCallbackTy &UpdateOp,
- bool IsXBinopExpr);
+ bool IsXBinopExpr,
+ bool shouldEmitLibCall = false);
/// Emit atomic update for constructs: --- Only Scalar data types
/// V = X; X = X BinOp Expr ,
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 42ea20919a5e1..42afaa8aed36c 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -5936,7 +5936,8 @@ OpenMPIRBuilder::createAtomicWrite(const LocationDescription &Loc,
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicUpdate(
const LocationDescription &Loc, InsertPointTy AllocaIP, AtomicOpValue &X,
Value *Expr, AtomicOrdering AO, AtomicRMWInst::BinOp RMWOp,
- AtomicUpdateCallbackTy &UpdateOp, bool IsXBinopExpr) {
+ AtomicUpdateCallbackTy &UpdateOp, bool IsXBinopExpr,
+ bool shouldEmitLibCall) {
assert(!isConflictIP(Loc.IP, AllocaIP) && "IPs must not be ambiguous");
if (!updateToLocation(Loc))
return Loc.IP;
@@ -5955,7 +5956,7 @@ OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicUpdate(
});
emitAtomicUpdate(AllocaIP, X.Var, X.ElemTy, Expr, AO, RMWOp, UpdateOp,
- X.IsVolatile, IsXBinopExpr);
+ X.IsVolatile, IsXBinopExpr, shouldEmitLibCall);
checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Update);
return Builder.saveIP();
}
@@ -5993,10 +5994,180 @@ Value *OpenMPIRBuilder::emitRMWOpAsInstruction(Value *Src1, Value *Src2,
llvm_unreachable("Unsupported atomic update operation");
}
+bool OpenMPIRBuilder::emitAtomicCompareExchangeLibCall(
+ Instruction *I, unsigned Size, Align Alignment, Value *PointerOperand,
+ Value *ValueOperand, Value *CASExpected, AtomicOrdering Ordering,
+ AtomicOrdering Ordering2, llvm::PHINode *PHI, llvm::BasicBlock *ContBB,
+ llvm::BasicBlock *ExitBB) {
+
+ LLVMContext &Ctx = I->getContext();
+ Module *M = I->getModule();
+ const DataLayout &DL = M->getDataLayout();
+ IRBuilder<> Builder(I);
+ IRBuilder<> AllocaBuilder(&I->getFunction()->getEntryBlock().front());
+
+ Type *SizedIntTy = Type::getIntNTy(Ctx, Size * 8);
+
+ const Align AllocaAlignment = DL.getPrefTypeAlign(SizedIntTy);
+
+ ConstantInt *SizeVal64 = ConstantInt::get(Type::getInt64Ty(Ctx), Size);
+ assert(Ordering != AtomicOrdering::NotAtomic && "expect atomic MO");
+ Constant *OrderingVal =
+ ConstantInt::get(Type::getInt32Ty(Ctx), (int)toCABI(Ordering));
+ Constant *Ordering2Val = nullptr;
+ if (CASExpected) {
+ assert(Ordering2 != AtomicOrdering::NotAtomic && "expect atomic MO");
+ Ordering2Val =
+ ConstantInt::get(Type::getInt32Ty(Ctx), (int)toCABI(Ordering2));
+ }
+
+ bool HasResult = I->getType() != Type::getVoidTy(Ctx);
+ AllocaInst *AllocaCASExpected = nullptr;
+ AllocaInst *AllocaValue = nullptr;
+ AllocaInst *AllocaResult = nullptr;
+
+ Type *ResultTy;
+ SmallVector<Value *, 6> Args;
+ AttributeList Attr;
+
+ Args.push_back(ConstantInt::get(DL.getIntPtrType(Ctx), Size));
+
+ Value *PtrVal = PointerOperand;
+ PtrVal = Builder.CreateAddrSpaceCast(PtrVal, PointerType::getUnqual(Ctx));
+ Args.push_back(PtrVal);
+
+ if (CASExpected) {
+ AllocaCASExpected = AllocaBuilder.CreateAlloca(CASExpected->getType());
+ AllocaCASExpected->setAlignment(AllocaAlignment);
+ Builder.CreateLifetimeStart(AllocaCASExpected, SizeVal64);
+ Builder.CreateAlignedStore(CASExpected, AllocaCASExpected, AllocaAlignment);
+ Args.push_back(AllocaCASExpected);
+ }
+
+ if (ValueOperand) {
+ AllocaValue = AllocaBuilder.CreateAlloca(ValueOperand->getType());
+ AllocaValue->setAlignment(AllocaAlignment);
+ Builder.CreateLifetimeStart(AllocaValue, SizeVal64);
+ Builder.CreateAlignedStore(ValueOperand, AllocaValue, AllocaAlignment);
+ Args.push_back(AllocaValue);
+ }
+
+ if (!CASExpected && HasResult) {
+ AllocaResult = AllocaBuilder.CreateAlloca(I->getType());
+ AllocaResult->setAlignment(AllocaAlignment);
+ Builder.CreateLifetimeStart(AllocaResult, SizeVal64);
+ Args.push_back(AllocaResult);
+ }
+
+ Args.push_back(OrderingVal);
+
+ if (Ordering2Val)
+ Args.push_back(Ordering2Val);
+
+ ResultTy = Type::getInt1Ty(Ctx);
+ Attr = Attr.addRetAttribute(Ctx, Attribute::ZExt);
+
+ SmallVector<Type *, 6> ArgTys;
+ for (Value *Arg : Args)
+ ArgTys.push_back(Arg->getType());
+
+ FunctionType *FnType = FunctionType::get(ResultTy, ArgTys, false);
+ FunctionCallee LibcallFn =
+ M->getOrInsertFunction("__atomic_compare_exchange", FnType, Attr);
+ CallInst *Call = Builder.CreateCall(LibcallFn, Args);
+ Call->setAttributes(Attr);
+ Value *Result = Call;
+
+ if (ValueOperand)
+ Builder.CreateLifetimeEnd(AllocaValue, SizeVal64);
+
+ Type *FinalResultTy = I->getType();
+ Value *V = PoisonValue::get(FinalResultTy);
+ Value *ExpectedOut = Builder.CreateAlignedLoad(
+ CASExpected->getType(), AllocaCASExpected, AllocaAlignment);
+ Builder.CreateLifetimeEnd(AllocaCASExpected, SizeVal64);
+ V = Builder.CreateInsertValue(V, ExpectedOut, 0);
+ V = Builder.CreateInsertValue(V, Result, 1);
+ I->replaceAllUsesWith(V);
+ Value *PreviousVal = Builder.CreateExtractValue(V, /*Idxs=*/0);
+ Value *SuccessFailureVal = Builder.CreateExtractValue(V, /*Idxs=*/1);
+ PHI->addIncoming(PreviousVal, Builder.GetInsertBlock());
+ Builder.CreateCondBr(SuccessFailureVal, ExitBB, ContBB);
+ return true;
+}
+
+bool OpenMPIRBuilder::emitAtomicLoadLibCall(
+ Instruction *I, unsigned Size, Align Alignment, Value *PointerOperand,
+ Value *ValueOperand, Value *CASExpected, AtomicOrdering Ordering,
+ AtomicOrdering Ordering2, Value *&LoadedVal) {
+
+ LLVMContext &Ctx = I->getContext();
+ Module *M = I->getModule();
+ const DataLayout &DL = M->getDataLayout();
+ IRBuilder<> Builder(I);
+ IRBuilder<> AllocaBuilder(&I->getFunction()->getEntryBlock().front());
+
+ Type *SizedIntTy = Type::getIntNTy(Ctx, Size * 8);
+
+ const Align AllocaAlignment = DL.getPrefTypeAlign(SizedIntTy);
+
+ ConstantInt *SizeVal64 = ConstantInt::get(Type::getInt64Ty(Ctx), Size);
+ assert(Ordering != AtomicOrdering::NotAtomic && "expect atomic MO");
+ Constant *OrderingVal =
+ ConstantInt::get(Type::getInt32Ty(Ctx), (int)toCABI(Ordering));
+ Constant *Ordering2Val = nullptr;
+
+ bool HasResult = I->getType() != Type::getVoidTy(Ctx);
+ AllocaInst *AllocaCASExpected = nullptr;
+ AllocaInst *AllocaValue = nullptr;
+ AllocaInst *AllocaResult = nullptr;
+
+ Type *ResultTy;
+ SmallVector<Value *, 6> Args;
+ AttributeList Attr;
+
+ Args.push_back(ConstantInt::get(DL.getIntPtrType(Ctx), Size));
+
+ Value *PtrVal = PointerOperand;
+ PtrVal = Builder.CreateAddrSpaceCast(PtrVal, PointerType::getUnqual(Ctx));
+ Args.push_back(PtrVal);
+
+ if (!CASExpected && HasResult) {
+ AllocaResult = AllocaBuilder.CreateAlloca(I->getType());
+ AllocaResult->setAlignment(AllocaAlignment);
+ Builder.CreateLifetimeStart(AllocaResult, SizeVal64);
+ Args.push_back(AllocaResult);
+ }
+
+ Args.push_back(OrderingVal);
+
+ if (Ordering2Val)
+ Args.push_back(Ordering2Val);
+
+ ResultTy = Type::getVoidTy(Ctx);
+
+ SmallVector<Type *, 6> ArgTys;
+ for (Value *Arg : Args)
+ ArgTys.push_back(Arg->getType());
+
+ FunctionType *FnType = FunctionType::get(ResultTy, ArgTys, false);
+ FunctionCallee LibcallFn =
+ M->getOrInsertFunction("__atomic_load", FnType, Attr);
+ CallInst *Call = Builder.CreateCall(LibcallFn, Args);
+ Call->setAttributes(Attr);
+
+ LoadedVal =
+ Builder.CreateAlignedLoad(I->getType(), AllocaResult, AllocaAlignment);
+ Builder.CreateLifetimeEnd(AllocaResult, SizeVal64);
+ I->replaceAllUsesWith(LoadedVal);
+ return true;
+}
+
std::pair<Value *, Value *> OpenMPIRBuilder::emitAtomicUpdate(
InsertPointTy AllocaIP, Value *X, Type *XElemTy, Value *Expr,
AtomicOrdering AO, AtomicRMWInst::BinOp RMWOp,
- AtomicUpdateCallbackTy &UpdateOp, bool VolatileX, bool IsXBinopExpr) {
+ AtomicUpdateCallbackTy &UpdateOp, bool VolatileX, bool IsXBinopExpr,
+ bool shouldEmitLibCall) {
// TODO: handle the case where XElemTy is not byte-sized or not a power of 2
// or a complex datatype.
bool emitRMWOp = false;
@@ -6018,7 +6189,7 @@ std::pair<Value *, Value *> OpenMPIRBuilder::emitAtomicUpdate(
emitRMWOp &= XElemTy->isIntegerTy();
std::pair<Value *, Value *> Res;
- if (emitRMWOp) {
+ if (emitRMWOp && !shouldEmitLibCall) {
Res.first = Builder.CreateAtomicRMW(RMWOp, X, Expr, llvm::MaybeAlign(), AO);
// not needed except in case of postfix captures. Generate anyway for
// consistency with the else part. Will be removed with any DCE pass.
@@ -6027,6 +6198,64 @@ std::pair<Value *, Value *> OpenMPIRBuilder::emitAtomicUpdate(
Res.second = Res.first;
else
Res.second = emitRMWOpAsInstruction(Res.first, Expr, RMWOp);
+ } else if (shouldEmitLibCall) {
+ LoadInst *OldVal =
+ Builder.CreateLoad(XElemTy, X, X->getName() + ".atomic.load");
+ OldVal->setAtomic(AO);
+ const DataLayout &LoadDL = OldVal->getModule()->getDataLayout();
+ unsigned LoadSize =
+ LoadDL.getTypeStoreSize(OldVal->getPointerOperand()->getType());
+
+ Value *LoadedVal = nullptr;
+ emitAtomicLoadLibCall(OldVal, LoadSize, OldVal->getAlign(),
+ OldVal->getPointerOperand(), nullptr, nullptr,
+ OldVal->getOrdering(), AtomicOrdering::NotAtomic,
+ LoadedVal);
+
+ BasicBlock *CurBB = Builder.GetInsertBlock();
+ Instruction *CurBBTI = CurBB->getTerminator();
+ CurBBTI = CurBBTI ? CurBBTI : Builder.CreateUnreachable();
+ BasicBlock *ExitBB =
+ CurBB->splitBasicBlock(CurBBTI, X->getName() + ".atomic.exit");
+ BasicBlock *ContBB = CurBB->splitBasicBlock(CurBB->getTerminator(),
+ X->getName() + ".atomic.cont");
+ ContBB->getTerminator()->eraseFromParent();
+ Builder.restoreIP(AllocaIP);
+ AllocaInst *NewAtomicAddr = Builder.CreateAlloca(XElemTy);
+ NewAtomicAddr->setName(X->getName() + "x.new.val");
+ Builder.SetInsertPoint(ContBB);
+ llvm::PHINode *PHI = Builder.CreatePHI(OldVal->getType(), 2);
+ PHI->addIncoming(LoadedVal, CurBB);
+ Value *OldExprVal = PHI;
+
+ Value *Upd = UpdateOp(OldExprVal, Builder);
+ Builder.CreateStore(Upd, NewAtomicAddr);
+ LoadInst *DesiredVal = Builder.CreateLoad(XElemTy, NewAtomicAddr);
+ AtomicOrdering Failure =
+ llvm::AtomicCmpXchgInst::getStrongestFailureOrdering(AO);
+ AtomicCmpXchgInst *Result = Builder.CreateAtomicCmpXchg(
+ X, PHI, DesiredVal, llvm::MaybeAlign(), AO, Failure);
+
+ const DataLayout &DL = Result->getModule()->getDataLayout();
+ unsigned Size = DL.getTypeStoreSize(Result->getCompareOperand()->getType());
+
+ emitAtomicCompareExchangeLibCall(
+ Result, Size, Result->getAlign(), Result->getPointerOperand(),
+ Result->getNewValOperand(), Result->getCompareOperand(),
+ Result->getSuccessOrdering(), Result->getFailureOrdering(), PHI, ContBB,
+ ExitBB);
+
+ Result->eraseFromParent();
+ OldVal->eraseFromParent();
+ Res.first = OldExprVal;
+ Res.second = Upd;
+ if (UnreachableInst *ExitTI =
+ dyn_cast<UnreachableInst>(ExitBB->getTerminator())) {
+ CurBBTI->eraseFromParent();
+ Builder.SetInsertPoint(ExitBB);
+ } else {
+ Builder.SetInsertPoint(ExitTI);
+ }
} else {
IntegerType *IntCastTy =
IntegerType::get(M.getContext(), XElemTy->getScalarSizeInBits());
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index bfd7d65912bdb..9ea4d96de4f95 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -1621,7 +1621,7 @@ convertOmpAtomicUpdate(omp::AtomicUpdateOp &opInst,
// Convert values and types.
auto &innerOpList = opInst.getRegion().front().getOperations();
- bool isRegionArgUsed{false}, isXBinopExpr{false};
+ bool isRegionArgUsed{false}, isXBinopExpr{false}, shouldEmitLibCall{false};
llvm::AtomicRMWInst::BinOp binop;
mlir::Value mlirExpr;
// Find the binary update operation that uses the region argument
@@ -1640,8 +1640,7 @@ convertOmpAtomicUpdate(omp::AtomicUpdateOp &opInst,
}
}
if (!isRegionArgUsed)
- return opInst.emitError("no atomic update operation with region argument"
- " as operand found inside atomic.update region");
+ shouldEmitLibCall = true;
llvm::Value *llvmExpr = moduleTranslation.lookupValue(mlirExpr);
llvm::Value *llvmX = moduleTranslation.lookupValue(opInst.getX());
@@ -1679,7 +1678,7 @@ convertOmpAtomicUpdate(omp::AtomicUpdateOp &opInst,
llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder);
builder.restoreIP(ompBuilder->createAtomicUpdate(
ompLoc, allocaIP, llvmAtomicX, llvmExpr, atomicOrdering, binop, updateFn,
- isXBinopExpr));
+ isXBinopExpr, shouldEmitLibCall));
return updateGenStatus;
}
@llvm/pr-subscribers-mlir-openmp Author: None (NimishMishra)
Hello all. For this particular patch, my main approach was to leverage AtomicExpandPass to automatically expand atomic load and atomic compare-exchange into the relevant libcalls (and, as a result, refactor such code out of Clang too). However, I was not able to get the pass to run on the desired function housing these instructions, so I extracted the necessary functionality and created this PR. Please do suggest if there is a way to improve upon this. For now, this patch fixes the issue of complex types in atomics. For instance, given the following source:
The final value printed with
I was looking through a C example (given below),
and the LLVM IR generated by Clang looks a bit different (as given below)
The code that gets called is llvm-project/clang/lib/CodeGen/CGAtomic.cpp, line 1764 (at commit 33bf08e).
@@ -1640,8 +1640,7 @@ convertOmpAtomicUpdate(omp::AtomicUpdateOp &opInst,
     }
   }
   if (!isRegionArgUsed)
-    return opInst.emitError("no atomic update operation with region argument"
-                            " as operand found inside atomic.update region");
+    shouldEmitLibCall = true;
The code here has changed (as you know).
The code in Clang seems to do the following for this boolean,
UseLibcall = !C.getTargetInfo().hasBuiltinAtomic(
AtomicSizeInBits, C.toBits(lvalue.getAlignment()));
We should probably do something similar and delegate this decision to the OpenMPIRBuilder. If that is not possible, then an alternative would be to look at the block argument and, if it is a struct, use a libcall.
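A minimal sketch of what such a check could look like on the OpenMPIRBuilder side, mirroring the spirit of Clang's hasBuiltinAtomic() check; the helper name and the MaxInlineAtomicWidthBits parameter are assumptions for illustration, since the IRBuilder does not currently query the target for this:

#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Type.h"
#include "llvm/Support/Alignment.h"
#include "llvm/Support/MathExtras.h"

// Sketch only: decide whether an atomic update on ElemTy must fall back to the
// __atomic_* libcalls. MaxInlineAtomicWidthBits is assumed to be supplied by
// the frontend/target; it is not an existing OpenMPIRBuilder API.
static bool shouldUseAtomicLibcall(llvm::Type *ElemTy, llvm::Align Alignment,
                                   const llvm::DataLayout &DL,
                                   uint64_t MaxInlineAtomicWidthBits) {
  uint64_t SizeInBits = DL.getTypeStoreSizeInBits(ElemTy).getFixedValue();
  bool SizeOK = SizeInBits <= MaxInlineAtomicWidthBits &&
                llvm::isPowerOf2_64(SizeInBits / 8);
  bool AlignOK = SizeInBits <= Alignment.value() * 8;
  return !(SizeOK && AlignOK); // true -> emit __atomic_* libcalls
}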
Okay thanks. I'll make this change in the next update
What happened to the idea to refactor the language-independent part of CGAtomic into LLVMFrontend to avoid duplicating the functionality?
Thanks @kiranchandramohan and @Meinersbur for the comments. I did try making the IR emitted here closer to Clang's, as the first option. From what I understand, there are differences in the way Clang and AtomicExpandPass lower these atomics. I sent this PR out for review to request suggestions on exactly this. Is it okay if Clang is also made dependent on the OpenMPIRBuilder for emitting these libcalls?
Hi @kiranchandramohan. The main issue I faced was determining which of the two IRs (Clang codegen vs. AtomicExpandPass) we should aim for. Since I could not determine that, I thought to send this initial PR first and seek clarification.
Clang shouldn't use OpenMPIRBuilder for non-OpenMP stuff. The approach would move the language-independent parts of CGAtomic into LLVMFrontend.
The answer is both. Comments in CGAtomic justify why part is lowered there and part in AtomicExpandPass; I would keep that split unless there is a reason not to.
Thanks @Meinersbur for the clarification. I'll work on these changes then. I think I'll begin by studying some existing functionality where Clang interfaces with the LLVM Frontend libraries, and then follow similar steps to refactor the atomic libcall emission code into the LLVM Frontend. This should let Clang emit libcalls just as it does now. Finally, we can let the IRBuilder use this code in the LLVM Frontend so that Flang also depends on the same code. Could you suggest some existing functionality where Clang interfaces with the LLVM Frontend? I'll explore anyway, but if you have something in mind, please do share.
46e58a1 to 6423811
I see you squashed in my proposed refactoring of AtomicInfo. I think it should be its own PR. It also was just a sketch and needs to be completed (check-clang is failing) and needs some cleanup (auto -> type, emitAtomicLibcalls2). Ideally, it would be NFC for Clang. I was hoping you would continue with my sketch, but I could work on it as well.
@@ -6033,6 +6034,52 @@ std::pair<Value *, Value *> OpenMPIRBuilder::emitAtomicUpdate(
      Res.second = Res.first;
    else
      Res.second = emitRMWOpAsInstruction(Res.first, Expr, RMWOp);
  } else if (RMWOp == llvm::AtomicRMWInst::BinOp::BAD_BINOP) {
BAD_BINOP? Doesn't this implement BinOp::Xchg?
  Builder.CreateStore(Upd, NewAtomicAddr);
  AtomicOrdering Failure =
      llvm::AtomicCmpXchgInst::getStrongestFailureOrdering(AO);
  auto Result = atomicInfo.EmitAtomicCompareExchangeLibcall(
- auto Result = atomicInfo.EmitAtomicCompareExchangeLibcall(
+ auto Result = atomicInfo.EmitAtomicCompareExchange(
The decision of whether to use a libcall should be made by AtomicInfo itself.
    LoadInst *OldVal =
        Builder.CreateLoad(XElemTy, X, X->getName() + ".atomic.load");
    OldVal->setAtomic(AO);
    const DataLayout &LoadDL = OldVal->getModule()->getDataLayout();
    unsigned LoadSize =
        LoadDL.getTypeStoreSize(OldVal->getPointerOperand()->getType());
    OpenMPIRBuilder::AtomicInfo atomicInfo(&Builder, XElemTy, LoadSize * 8,
                                           LoadSize * 8, OldVal->getAlign(),
                                           OldVal->getAlign(), true, X);
    auto AtomicLoadRes = atomicInfo.EmitAtomicLoadLibcall(AO);
    BasicBlock *CurBB = Builder.GetInsertBlock();
    Instruction *CurBBTI = CurBB->getTerminator();
    CurBBTI = CurBBTI ? CurBBTI : Builder.CreateUnreachable();
    BasicBlock *ExitBB =
        CurBB->splitBasicBlock(CurBBTI, X->getName() + ".atomic.exit");
    BasicBlock *ContBB = CurBB->splitBasicBlock(CurBB->getTerminator(),
                                                X->getName() + ".atomic.cont");
    ContBB->getTerminator()->eraseFromParent();
    Builder.restoreIP(AllocaIP);
    AllocaInst *NewAtomicAddr = Builder.CreateAlloca(XElemTy);
    NewAtomicAddr->setName(X->getName() + "x.new.val");
    Builder.SetInsertPoint(ContBB);
    llvm::PHINode *PHI = Builder.CreatePHI(OldVal->getType(), 2);
    PHI->addIncoming(AtomicLoadRes.first, CurBB);
    Value *OldExprVal = PHI;
    Value *Upd = UpdateOp(OldExprVal, Builder);
    Builder.CreateStore(Upd, NewAtomicAddr);
    AtomicOrdering Failure =
        llvm::AtomicCmpXchgInst::getStrongestFailureOrdering(AO);
    auto Result = atomicInfo.EmitAtomicCompareExchangeLibcall(
        AtomicLoadRes.second, NewAtomicAddr, AO, Failure);
    LoadInst *PHILoad = Builder.CreateLoad(XElemTy, Result.first);
    PHI->addIncoming(PHILoad, Builder.GetInsertBlock());
    Builder.CreateCondBr(Result.second, ExitBB, ContBB);
    OldVal->eraseFromParent();
    Res.first = OldExprVal;
    Res.second = Upd;

    if (UnreachableInst *ExitTI =
            dyn_cast<UnreachableInst>(ExitBB->getTerminator())) {
      CurBBTI->eraseFromParent();
      Builder.SetInsertPoint(ExitBB);
    } else {
      Builder.SetInsertPoint(ExitTI);
Could you add some comment explaining the code+control flow that is generated here?
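For reference, a behavioral sketch of what this block generates, written against the generic libatomic entry points (the struct name and the relaxed memory order are illustrative only; the real code threads the construct's ordering through):

#include <cstddef>

extern "C" void __atomic_load(size_t size, void *ptr, void *ret, int memorder);
extern "C" bool __atomic_compare_exchange(size_t size, void *ptr, void *expected,
                                          void *desired, int success_memorder,
                                          int failure_memorder);

struct Complex { float Re, Im; }; // stand-in for the { float, float } payload

// The __atomic_load is the 'entry' snapshot, the loop body corresponds to
// .atomic.cont (PHI + UpdateOp + compare-exchange libcall), and a successful
// exchange branches to .atomic.exit.
static void atomicUpdateAdd(Complex *X, Complex Delta) {
  Complex Expected, Desired;
  __atomic_load(sizeof(Complex), X, &Expected, /*memorder=*/0);
  do {
    Desired.Re = Expected.Re + Delta.Re; // UpdateOp applied to the snapshot
    Desired.Im = Expected.Im + Delta.Im;
    // On failure, __atomic_compare_exchange rewrites Expected with the current
    // value of *X, which is what feeds the PHI on the next loop iteration.
  } while (!__atomic_compare_exchange(sizeof(Complex), X, &Expected, &Desired,
                                      /*success=*/0, /*failure=*/0));
}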
I could complete the portion for Clang. I am currently working on supporting types (currently, the libcall arguments are accepted as pointers). Apart from cleanup and support for types, is there anything else you had in mind that I could cover?
RFC for refactoring CGAtomic: https://discourse.llvm.org/t/rfc-refactoring-cgatomic-into-llvmfrontend/80168
LLVM Buildbot has detected a new failure on a builder. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/80/builds/4471
Here is the relevant piece of the build log for the reference:
LLVM Buildbot has detected a new failure on a builder. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/172/builds/4300
Here is the relevant piece of the build log for the reference:
LLVM Buildbot has detected a new failure on a builder. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/130/builds/4430
Here is the relevant piece of the build log for the reference:
LLVM Buildbot has detected a new failure on a builder. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/20/builds/3960
Here is the relevant piece of the build log for the reference:
It seems that the aarch64 buildbots are failing on the atomic-capture integration test (the same IR passes when the MLIR tests run). The pre-merge CI was fine, so I went ahead with the merge (https://buildkite.com/llvm-project/github-pull-requests/builds/106435). I'm not entirely sure where the issue could be. Should I submit another PR to remove this particular integration test (no changes would be needed to the MLIR tests)?
LLVM Buildbot has detected a new failure on a builder. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/143/builds/2578
Here is the relevant piece of the build log for the reference:
LLVM Buildbot has detected a new failure on a builder. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/161/builds/2501
Here is the relevant piece of the build log for the reference:
LLVM Buildbot has detected a new failure on a builder. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/17/builds/3088
Here is the relevant piece of the build log for the reference:
Some of the test failures look like LLVM going ahead and replacing something with a loop. This can happen when a specific intrinsic is not supported natively by a target architecture.
LLVM Buildbot has detected a new failure on a builder. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/4/builds/2603
Here is the relevant piece of the build log for the reference:
LLVM Buildbot has detected a new failure on a builder. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/41/builds/2503
Here is the relevant piece of the build log for the reference:
Merged #110969, which reverts the integration test.
I think the problem might have been that
Oh. I didn't know that. Thanks.
…xchange libcalls for complex types in atomic update (llvm#92364): This patch adds functionality to emit the relevant libcalls when an atomicrmw instruction cannot be emitted (for instance, for complex types). The IRBuilder is modified to directly emit __atomic_load and __atomic_compare_exchange libcalls. The added functions follow a similar codegen path to Clang, so that LLVM Flang generates IR very close to what Clang produces. Fixes llvm#83760 and llvm#75138. Co-authored-by: Michael Kruse <llvm-project@meinersbur.de>
An integration test added in llvm#92364 causes aarch64 buildbot failures. Reverting the same. The relevant functionality added in llvm#92364 is still tested by the MLIR tests added in that PR, which pass on the relevant aarch64 builds.
@NimishMishra
Hi @DanielCChen, I merged #110969, which removes this test case. Would a rebase and re-trigger help?
I see some recent failures on
Thanks @NimishMishra! I will rebase and rebuild.
Thanks. Please rebase over 71b2c4d. I think the buildbots would work fine now; it should solve some recent issues with https://lab.llvm.org/buildbot/#/builders/157/builds/9665 as well.
This build on my latest commit (https://lab.llvm.org/buildbot/#/builders/157/builds/9666) seems OK. Let me know if the issue is also solved on your end. Thanks!
namespace llvm {

template <typename IRBuilderTy> struct AtomicInfo {
It is not necessary to template over IRBuilderTy, you should be using IRBuilderBase instead.
  uint64_t AtomicSizeInBits;
  uint64_t ValueSizeInBits;
  llvm::Align AtomicAlign;
  llvm::Align ValueAlign;
You should not be using llvm:: prefixes inside namespace llvm {}.
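A minimal sketch of the shape these two comments suggest: hold an IRBuilderBase pointer instead of templating over the builder type, and drop the llvm:: prefixes inside the namespace (member list trimmed to the fields quoted above; this is not the actual definition in the patch):

#include <cstdint>
#include "llvm/IR/IRBuilder.h"
#include "llvm/Support/Alignment.h"

namespace llvm {
// Sketch only.
struct AtomicInfo {
  IRBuilderBase *Builder;
  uint64_t AtomicSizeInBits;
  uint64_t ValueSizeInBits;
  Align AtomicAlign;
  Align ValueAlign;
};
} // namespace llvm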
namespace {} // namespace

namespace llvm {} // end namespace llvm
Why is there a cpp file that doesn't contain any code?
Thanks @nikic. I will prepare an NFC to address all of your comments.
This patch adds functionality to emit the relevant libcalls when an atomicrmw instruction cannot be emitted (for instance, for complex types). The IRBuilder is modified to directly emit __atomic_load and __atomic_compare_exchange libcalls. The added functions follow a similar codegen path to Clang, so that LLVM Flang generates IR very close to what Clang produces.
Fixes #83760 and #75138
Co-authored-by: Michael Kruse llvm-project@meinersbur.de
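For reference, the two generic libatomic entry points emitted on this path have the following C-level signatures (consistent with the calls checked in the added integration test):

void __atomic_load(size_t size, void *ptr, void *ret, int memorder);
bool __atomic_compare_exchange(size_t size, void *ptr, void *expected,
                               void *desired, int success_memorder,
                               int failure_memorder);

The i32 0 ordering arguments seen in the test correspond to the relaxed (monotonic) memory order in the C ABI encoding.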