-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAG: Simplify demanded bits for truncating atomic_store #90113
DAG: Simplify demanded bits for truncating atomic_store #90113
Conversation
It's really unfortunate that STORE and ATOMIC_STORE are separate opcodes. This duplicates a basic simplify demanded for the truncating case. This avoids some AMDGPU lit regressions in a future patch. I'm not sure how to craft a test that exposes this without first introducing the regressions by promoting half to i16.
@llvm/pr-subscribers-llvm-selectiondag Author: Matt Arsenault (arsenm) ChangesIt's really unfortunate that STORE and ATOMIC_STORE are separate opcodes. This duplicates a basic simplify demanded for the truncating case. This avoids some AMDGPU lit regressions in a future patch. I'm not sure how to craft a test that exposes this without first introducing the regressions by promoting half to i16. Full diff: https://github.com/llvm/llvm-project/pull/90113.diff 1 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index aa746f1c7b7b3b..f115a39a6953ce 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -530,6 +530,7 @@ namespace {
bool refineExtractVectorEltIntoMultipleNarrowExtractVectorElts(SDNode *N);
SDValue visitSTORE(SDNode *N);
+ SDValue visitATOMIC_STORE(SDNode *N);
SDValue visitLIFETIME_END(SDNode *N);
SDValue visitINSERT_VECTOR_ELT(SDNode *N);
SDValue visitEXTRACT_VECTOR_ELT(SDNode *N);
@@ -1909,6 +1910,7 @@ SDValue DAGCombiner::visit(SDNode *N) {
case ISD::BR_CC: return visitBR_CC(N);
case ISD::LOAD: return visitLOAD(N);
case ISD::STORE: return visitSTORE(N);
+ case ISD::ATOMIC_STORE: return visitATOMIC_STORE(N);
case ISD::INSERT_VECTOR_ELT: return visitINSERT_VECTOR_ELT(N);
case ISD::EXTRACT_VECTOR_ELT: return visitEXTRACT_VECTOR_ELT(N);
case ISD::BUILD_VECTOR: return visitBUILD_VECTOR(N);
@@ -21096,6 +21098,24 @@ SDValue DAGCombiner::replaceStoreOfInsertLoad(StoreSDNode *ST) {
ST->getMemOperand()->getFlags());
}
+SDValue DAGCombiner::visitATOMIC_STORE(SDNode *N) {
+ AtomicSDNode *ST = cast<AtomicSDNode>(N);
+ SDValue Val = ST->getVal();
+ EVT VT = Val.getValueType();
+ EVT MemVT = ST->getMemoryVT();
+
+ if (MemVT.bitsLT(VT)) { // Is truncating store
+ APInt TruncDemandedBits = APInt::getLowBitsSet(VT.getScalarSizeInBits(),
+ MemVT.getScalarSizeInBits());
+ // See if we can simplify the operation with SimplifyDemandedBits, which
+ // only works if the value has a single use.
+ if (SimplifyDemandedBits(Val, TruncDemandedBits))
+ return SDValue(N, 0);
+ }
+
+ return SDValue();
+}
+
SDValue DAGCombiner::visitSTORE(SDNode *N) {
StoreSDNode *ST = cast<StoreSDNode>(N);
SDValue Chain = ST->getChain();
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
MemVT.getScalarSizeInBits()); | ||
// See if we can simplify the operation with SimplifyDemandedBits, which | ||
// only works if the value has a single use. | ||
if (SimplifyDemandedBits(Val, TruncDemandedBits)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need any of the worklist fiddling that the corresponding code in visitSTORE does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments there suggest it's due to merge optimizations triggering, which I assumed don't apply for the atomic case. It's probably not important to revisit these aggressively in any case
Implement the promotion in the DAG. Depends #90113
It's really unfortunate that STORE and ATOMIC_STORE are separate opcodes. This duplicates a basic simplify demanded for the truncating case. This avoids some AMDGPU lit regressions in a future patch.
I'm not sure how to craft a test that exposes this without first introducing the regressions by promoting half to i16.