-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve support for x86_fp80 (long double) #1604
Conversation
in the forward and reverse modes: - The testcases for division and sine already pass. - In the "fp80" testcases adapted from EnzymeAD#1600, the opt call fails.
Commit b01d87f adds Next, I'll think about extending the bit-trick handling. |
@@ -650,6 +652,8 @@ class TypeTree : public std::enable_shared_from_this<TypeTree> { | |||
chunk = 8; | |||
} else if (flt->isHalfTy()) { | |||
chunk = 2; | |||
} else if (flt->isX86_FP80Ty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might as well here (within isFloat) simplify this to chunk = dl.getTypeSizeInBits(flt) / 8;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, changed in 24c8098
@@ -731,6 +735,8 @@ class TypeTree : public std::enable_shared_from_this<TypeTree> { | |||
chunk = 8; | |||
} else if (flt->isHalfTy()) { | |||
chunk = 2; | |||
} else if (flt->isX86_FP80Ty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here about chunk = dl.getTypeSizeInBits(flt) / 8;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't simplify the code here like at the other places because there is no DataLayout
instance (as far as I can see).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would it look like to add an argument for a datalayout here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In every place where IsAllFloat
is called, there is a DataLayout instance lying around. In 6a70777, I add a DataLayout argument to the signature of IsAllFloat, in order to simplify its implementation in the same way as in the other places. (Could you please take another look and confirm that this change is correct?)
@@ -554,6 +554,8 @@ class TypeTree : public std::enable_shared_from_this<TypeTree> { | |||
chunk = 8; | |||
} else if (flt->isHalfTy()) { | |||
chunk = 2; | |||
} else if (flt->isX86_FP80Ty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also changed in 24c8098
enzyme/Enzyme/GradientUtils.cpp
Outdated
@@ -5181,6 +5181,8 @@ Value *GradientUtils::invertPointerM(Value *const oval, IRBuilder<> &BuilderM, | |||
chunk = 8; | |||
} else if (flt->isHalfTy()) { | |||
chunk = 2; | |||
} else if (flt->isX86_FP80Ty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also changed in 24c8098
Does anyone know what the purpose of this piece of code in AdjointGenerator.h is: switch (BO.getOpcode()) {
case Instruction::And: {
// If & against 0b10000000000 and a float the result is 0
// [...]
auto FT = TR.query(&BO).IsAllFloat(size);
auto eFT = FT;
if (FT)
for (int i = 0; i < 2; ++i) {
auto CI = dyn_cast<ConstantInt>(BO.getOperand(i));
if (CI && dl.getTypeSizeInBits(eFT) ==
dl.getTypeSizeInBits(CI->getType())) {
if (eFT->isDoubleTy() && CI->getValue() == -134217728) {
setDiffe(&BO, Constant::getNullValue(diffTy), Builder2);
// Derivative is zero (equivalent to rounding as just chopping off
// bits of mantissa), no update
return;
}
}
} Is some weird bit-trick handled here? How does the bit-trick work? 134217728 is 2**27, but I don't see right now how real arithmetic could be hidden here... |
From the comment above it looks like it is saying float & signbit -> 0 derivative. Also note that it was checking against a negative number (which is twos complement has different bits set) |
Co-authored-by: William Moses <gh@wsmoses.com>
Could we also update the shift indices function that already manually handled fp80 to use the datalayout size check as well? |
and use it to simplify the implementation of IsAllFloat.
Yes, I also changed it in 24c8098. |
I can now compile and run the other example from #1600, https://fwd.gymni.ch/HHKrZj, on my local system. The |
LGTM, just fix the format per CI. Yeah those tests are presently expected issues for 11 |
If you're up for it, it may also be useful to test fp128 in a follow up PR. |
Thanks! I might look at bit-tricks and/or fp128 in a separate PR in the next days. |
This is an attempt to increase Enzyme's coverage of the LLVM type
x86_fp80
, in response to #1600. I'm not at all familiar with Enzyme from a developer perspective, and very happy about any kind of input.Existing coverage
Some
x86_fp80
coverage is already there, e.g. because the differentiation of elementary operations and floating-point conversions is defined independent of the type (if my understanding is correct).Hence, the new division and sine testcases suggested in this PR pass from the beginning.
Missing coverage
A few locations in the code where
x86_fp80
support is missing can be identified by searching forisDoubleTy
andisFloatTy
in the code:I'm not sure whether additional work is necessary at other places. Please post if you are aware of some. Eventually, tests should also tell us.