Improve support for x86_fp80 (long double) #1604

maxaehle · 2024-01-04T17:26:42Z

This is an attempt to increase Enzyme's coverage of the LLVM type x86_fp80, in response to #1600. I'm not at all familiar with Enzyme from a developer perspective, and very happy about any kind of input.

Existing coverage

Some x86_fp80 coverage is already there, e.g. because the differentiation of elementary operations and floating-point conversions is defined independently of the type (if my understanding is correct).

Hence, the new division and sine testcases suggested in this PR pass from the beginning.

Missing coverage

A few locations in the code where x86_fp80 support is missing can be identified by searching for isDoubleTy and isFloatTy in the code:

Handling of bit-tricks (where real arithmetic is performed using e.g. bitwise logical operations), e.g. in AdjointGenerator.h and TypeAnalysis.cpp.
A particular case distinction that appears four times in TypeTree.h. At one of them the 80-bit case is already considered, at the other three it is not.
There are some more places, related to e.g. conversion from floating-point type to integer type of the same size, obtaining the type as a string, etc.

I'm not sure whether additional work is necessary at other places. Please post if you are aware of some. Eventually, tests should also tell us.

This PR adds testcases for float and x86_fp80 in the forward and reverse modes:
- The testcases for division and sine already pass.
- In the "fp80" testcases adapted from EnzymeAD#1600, the opt call fails.

maxaehle · 2024-01-04T17:32:06Z

Commit b01d87f adds x86_fp80 support in TypeTree.h and at the "some more places" mentioned above. With these changes, opt can successfully form differentiated code for the new fp80 testcases (before b01d87f, opt would crash as reported in #1600). 895ed2e updates the expected output of the opt run.

Next, I'll think about extending the bit-trick handling.

wsmoses · 2024-01-04T17:32:29Z

enzyme/Enzyme/TypeAnalysis/TypeTree.h

@@ -650,6 +652,8 @@ class TypeTree : public std::enable_shared_from_this<TypeTree> {
                chunk = 8;
              } else if (flt->isHalfTy()) {
                chunk = 2;
+              } else if (flt->isX86_FP80Ty()) {


We might as well here (within isFloat) simplify this to chunk = dl.getTypeSizeInBits(flt) / 8;

Thanks, changed in 24c8098
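
To illustrate the suggestion, here is a minimal standalone sketch. It does not use LLVM: `FloatKind`, `sizeInBits`, `chunkBranchy` and `chunkFromSize` are hypothetical stand-ins for the `llvm::Type` checks and for `dl.getTypeSizeInBits(flt)`, and the value returned for an unhandled type is a placeholder chosen for this sketch, not Enzyme's behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the floating-point types discussed here.
enum class FloatKind { Half, Float, Double, X86FP80 };

// Stand-in for dl.getTypeSizeInBits(flt): bit width of each format.
inline std::size_t sizeInBits(FloatKind k) {
  switch (k) {
  case FloatKind::Half:    return 16;
  case FloatKind::Float:   return 32;
  case FloatKind::Double:  return 64;
  case FloatKind::X86FP80: return 80;
  }
  assert(false && "unknown FloatKind");
  return 0;
}

// Branch-per-type style: every new type needs its own branch.
inline std::size_t chunkBranchy(FloatKind k) {
  if (k == FloatKind::Double) return 8;
  if (k == FloatKind::Float)  return 4;
  if (k == FloatKind::Half)   return 2;
  return 0; // placeholder: x86_fp80 is not handled by this chain
}

// Size-derived style: the chunk follows from the type's width.
inline std::size_t chunkFromSize(FloatKind k) {
  return sizeInBits(k) / 8;
}
```

With the size-derived form, `chunkFromSize(FloatKind::X86FP80)` evaluates to 10 (80 bits / 8) without any type-specific branch, and the other kinds agree with the branch-per-type version.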

wsmoses · 2024-01-04T17:32:53Z

enzyme/Enzyme/TypeAnalysis/TypeTree.h

@@ -731,6 +735,8 @@ class TypeTree : public std::enable_shared_from_this<TypeTree> {
        chunk = 8;
      } else if (flt->isHalfTy()) {
        chunk = 2;
+      } else if (flt->isX86_FP80Ty()) {


Same comment here about chunk = dl.getTypeSizeInBits(flt) / 8;

I couldn't simplify the code here like at the other places because there is no DataLayout instance (as far as I can see).

What would it look like to add an argument for a datalayout here?

In every place where IsAllFloat is called, there is a DataLayout instance lying around. In 6a70777, I add a DataLayout argument to the signature of IsAllFloat, in order to simplify its implementation in the same way as in the other places. (Could you please take another look and confirm that this change is correct?)
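
For readers unfamiliar with this kind of change, the following toy model sketches what threading a layout object through such a query could look like. Everything here is hypothetical (`Kind`, `ToyLayout`, `ToyTypeMap`, `isAllKind`); it is not Enzyme's `IsAllFloat` and makes the assumption that the query walks the byte range in steps of the size-derived chunk.

```cpp
#include <cstddef>
#include <map>

// Hypothetical stand-ins; not Enzyme's or LLVM's real types.
enum class Kind { Integer, Double, X86FP80 };

// Plays the role of a DataLayout for the toy kinds.
struct ToyLayout {
  std::size_t sizeInBits(Kind k) const {
    switch (k) {
    case Kind::Integer: return 64;
    case Kind::Double:  return 64;
    case Kind::X86FP80: return 80;
    }
    return 0;
  }
};

// Maps a byte offset to the kind of value assumed to start there.
using ToyTypeMap = std::map<std::size_t, Kind>;

// Returns true if every chunk-sized slot in [0, size) is known to hold
// `expected`. The chunk comes from the layout argument instead of a
// hard-coded branch per type.
inline bool isAllKind(const ToyTypeMap &types, std::size_t size,
                      Kind expected, const ToyLayout &layout) {
  const std::size_t chunk = layout.sizeInBits(expected) / 8;
  if (chunk == 0)
    return false;
  for (std::size_t off = 0; off < size; off += chunk) {
    auto it = types.find(off);
    if (it == types.end() || it->second != expected)
      return false;
  }
  return true;
}
```

Because the caller supplies the layout, the function itself needs no knowledge of individual type widths; in this toy model a 20-byte range with x86_fp80 entries at offsets 0 and 10 is accepted.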

wsmoses · 2024-01-04T17:33:50Z

enzyme/Enzyme/TypeAnalysis/TypeTree.h

@@ -554,6 +554,8 @@ class TypeTree : public std::enable_shared_from_this<TypeTree> {
                chunk = 8;
              } else if (flt->isHalfTy()) {
                chunk = 2;
+              } else if (flt->isX86_FP80Ty()) {


also changed in 24c8098

wsmoses · 2024-01-04T17:34:02Z

enzyme/Enzyme/GradientUtils.cpp

@@ -5181,6 +5181,8 @@ Value *GradientUtils::invertPointerM(Value *const oval, IRBuilder<> &BuilderM,
                chunk = 8;
              } else if (flt->isHalfTy()) {
                chunk = 2;
+              } else if (flt->isX86_FP80Ty()) {


also changed in 24c8098

maxaehle · 2024-01-04T17:34:32Z

Does anyone know what the purpose of this piece of code in AdjointGenerator.h is:

switch (BO.getOpcode()) {
    case Instruction::And: {
      // If & against 0b10000000000 and a float the result is 0
      
      // [...]
      auto FT = TR.query(&BO).IsAllFloat(size);
      auto eFT = FT;
      if (FT)
        for (int i = 0; i < 2; ++i) {
          auto CI = dyn_cast<ConstantInt>(BO.getOperand(i));
          if (CI && dl.getTypeSizeInBits(eFT) ==
                        dl.getTypeSizeInBits(CI->getType())) {
            if (eFT->isDoubleTy() && CI->getValue() == -134217728) {
              setDiffe(&BO, Constant::getNullValue(diffTy), Builder2);
              // Derivative is zero (equivalent to rounding as just chopping off
              // bits of mantissa), no update
              return;
            }
          }
        }

Is some weird bit-trick handled here? How does the bit-trick work? 134217728 is 2**27, but I don't see right now how real arithmetic could be hidden here...

wsmoses · 2024-01-04T17:42:22Z

From the comment above it looks like it is saying float & signbit -> 0 derivative.

Also note that it was checking against a negative number (which in two's complement has different bits set).
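
One way to inspect the constant is to look at its bit pattern directly. The sketch below assumes a 64-bit IEEE-754 `double` and a 64-bit integer; the helper names `maskPattern`, `applyMask` and `checkMask` are made up for this illustration. Under those assumptions, -134217728 is -2^27, i.e. every bit set except the low 27, so the AND keeps the sign, the exponent and the upper mantissa bits and clears the low 27 mantissa bits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// The constant from the snippet, viewed as an unsigned 64-bit pattern.
inline std::uint64_t maskPattern() {
  return static_cast<std::uint64_t>(std::int64_t{-134217728});
}

// AND the bit pattern of a double with the mask and reinterpret the result.
inline double applyMask(double x) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch expects a 64-bit double");
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits &= maskPattern();
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

// Small self-check of the statements above.
inline void checkMask() {
  // -2^27 == ~(2^27 - 1): all bits set except the low 27.
  assert(maskPattern() == 0xFFFFFFFFF8000000ULL);
  // 1.0 has an all-zero mantissa field, so it passes through unchanged.
  assert(applyMask(1.0) == 1.0);
  // 1.0 + epsilon differs from 1.0 only in the lowest mantissa bit,
  // which the mask clears.
  assert(applyMask(1.0 + std::numeric_limits<double>::epsilon()) == 1.0);
  // The sign bit is preserved.
  assert(applyMask(-1.0) == -1.0);
}
```

This matches the code comment in the snippet about chopping off bits of the mantissa: the masked value is piecewise constant in the discarded low bits.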

wsmoses · 2024-01-04T17:57:24Z

Could we also update the shift indices function that already manually handled fp80 to use the datalayout size check as well?

maxaehle · 2024-01-04T18:19:49Z

Could we also update the shift indices function that already manually handled fp80 to use the datalayout size check as well?

Yes, I also changed it in 24c8098.

maxaehle · 2024-01-04T18:35:32Z

I can now compile and run the other example from #1600, https://fwd.gymni.ch/HHKrZj, on my local system. The Integration CI LLVM 11 Release ubuntu-20.04 run is marked as "failed" because three testcases unexpectedly pass now. They all use the C++ header <random> in a similar way to this example, so that is probably good news.

wsmoses · 2024-01-04T18:43:33Z

LGTM, just fix the format per CI.

Yeah, those tests are currently expected failures on LLVM 11.

wsmoses · 2024-01-04T18:44:44Z

If you're up for it, it may also be useful to test fp128 in a follow up PR.

maxaehle · 2024-01-04T18:58:17Z

Thanks! I might look at bit-tricks and/or fp128 in a separate PR in the next days.

maxaehle added 3 commits January 4, 2024 17:26

Add testcases for float and x86_fp80

c11f61c

in the forward and reverse modes: - The testcases for division and sine already pass. - In the "fp80" testcases adapted from EnzymeAD#1600, the opt call fails.

Improve x86_fp80 support

b01d87f

Update expected output of fp80 testcases

895ed2e

wsmoses reviewed Jan 4, 2024

Use DataLayout::getTypeSizeInBits

24c8098

Co-authored-by: William Moses <gh@wsmoses.com>

Let IsAllFloat take an additional DataLayout argument

6a70777

and use it to simplify the implementation of IsAllFloat.

wsmoses approved these changes Jan 4, 2024

clang-format

3b4955f

wsmoses merged commit 4dcea04 into EnzymeAD:main Jan 4, 2024
38 of 54 checks passed

This was referenced Jan 5, 2024

Clang segfaults for program using header <random> vgvassilev/clad#700

Closed

Support more C math.h functions #1605

Merged
