Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integer division of types larger than 128-bits reports a fatal error in the DAG legalizer #44994

Closed
erichkeane opened this issue Apr 23, 2020 · 13 comments
Labels
bugzilla Issues migrated from bugzilla llvm:codegen

Comments

@erichkeane
Copy link
Collaborator

erichkeane commented Apr 23, 2020

Bugzilla Link 45649
Version trunk
OS Linux
CC @Quuxplusone,@chfast,@efriedma-quic,@LebedevRI,@RKSimon,@nunoplopes,@rotateright

Extended Description

https://godbolt.org/z/nCa-4f

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i129 @_Z3divU7_ExtIntILi129EEiS_(i129 %0, i129 %1) #0  {
  %3 = alloca i129, align 8
  %4 = alloca i129, align 8
  store i129 %0, i129* %3, align 8
  store i129 %1, i129* %4, align 8
  %5 = load i129, i129* %3, align 8 
  %6 = load i129, i129* %4, align 8
  %7 = sdiv i129 %5, %6
  ret i129 %7 
}

LLVM ERROR: Unsupported library call operation!

PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.

Stack dump:

0.	Program arguments: /opt/compiler-explorer/clang-trunk/bin/llc -o ./output.s -x86-asm-syntax=intel <source> 
1.	Running pass 'Function Pass Manager' on module '<source>'.
2.	Running pass 'X86 DAG->DAG Instruction Selection' on function '@_Z3divU7_ExtIntILi129EEiS_'
 #0 0x000056238dd98b3a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x24a3b3a)
 #1 0x000056238dd96994 llvm::sys::RunSignalHandlers() (/opt/compiler-explorer/clang-trunk/bin/llc+0x24a1994)
 #2 0x000056238dd96ae3 SignalHandler(int) (/opt/compiler-explorer/clang-trunk/bin/llc+0x24a1ae3)
 #3 0x00007fca314ff890 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12890)
 #4 0x00007fca303dae97 raise (/lib/x86_64-linux-gnu/libc.so.6+0x3ee97)
 #5 0x00007fca303dc801 abort (/lib/x86_64-linux-gnu/libc.so.6+0x40801)
 #6 0x000056238dd1660e llvm::report_fatal_error(llvm::Twine const&, bool) (/opt/compiler-explorer/clang-trunk/bin/llc+0x242160e)
 #7 0x000056238dd16738 (/opt/compiler-explorer/clang-trunk/bin/llc+0x2421738)
 #8 0x000056238dc40488 llvm::TargetLowering::makeLibCall(llvm::SelectionDAG&, llvm::RTLIB::Libcall, llvm::EVT, llvm::ArrayRef<llvm::SDValue>, llvm::TargetLowering::MakeLibCallOptions, llvm::SDLoc const&, llvm::SDValue) const (/opt/compiler-explorer/clang-trunk/bin/llc+0x234b488)
 #9 0x000056238dcb8e50 llvm::DAGTypeLegalizer::ExpandIntRes_SDIV(llvm::SDNode*, llvm::SDValue&, llvm::SDValue&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x23c3e50)
#10 0x000056238dcccf8b llvm::DAGTypeLegalizer::ExpandIntegerResult(llvm::SDNode*, unsigned int) (/opt/compiler-explorer/clang-trunk/bin/llc+0x23d7f8b)
#11 0x000056238dc5faa5 llvm::DAGTypeLegalizer::run() (/opt/compiler-explorer/clang-trunk/bin/llc+0x236aaa5)
#12 0x000056238dc6019e llvm::SelectionDAG::LegalizeTypes() (/opt/compiler-explorer/clang-trunk/bin/llc+0x236b19e)
#13 0x000056238dbff250 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/opt/compiler-explorer/clang-trunk/bin/llc+0x230a250)
#14 0x000056238dc036ac llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x230e6ac)
#15 0x000056238dc0560d llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (.part.809) (/opt/compiler-explorer/clang-trunk/bin/llc+0x231060d)
#16 0x000056238cbc61f2 (anonymous namespace)::X86DAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x12d11f2)
#17 0x000056238d330e10 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x1a3be10)
#18 0x000056238d6e2277 llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x1ded277)
#19 0x000056238d6e2961 llvm::FPPassManager::runOnModule(llvm::Module&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x1ded961)
#20 0x000056238d6e2d61 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x1dedd61)
#21 0x000056238bff568e compileModule(char**, llvm::LLVMContext&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x70068e)
#22 0x000056238bf5612f main (/opt/compiler-explorer/clang-trunk/bin/llc+0x66112f)
#23 0x00007fca303bdb97 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b97)
#24 0x000056238bfeefaa _start (/opt/compiler-explorer/clang-trunk/bin/llc+0x6f9faa)

Compiler returned: 255
@RKSimon
Copy link
Collaborator

RKSimon commented Apr 23, 2020

We don't have much support for arbitrary sized integer division, and once we can't legalize to a wider type it does tend to fail - I'm not sure if this is likely to change anytime soon.

We don't even handle the likes of:

define i129 @div_i31_sext_i129(i31 %0, i31 %1) {
  %3 = sext i31 %0 to i129
  %4 = sext i31 %1 to i129
  %5 = sdiv i129 %3, %4
  ret i129 %5
}

I may be wrong here but we should be able to legalize (as long as the sign/zero bits are at least half the bitwidth) to:

define i129 @div_i31_sext_i129(i31 %0, i31 %1) {
  %3 = sext i31 %0 to i64
  %4 = sext i31 %1 to i64
  %5 = sdiv i64 %3, %4
  %6 = sext i64 %5 to i129
  ret i129 %6
}

Alive2 (with smaller types): http://volta.cs.utah.edu:8080/z/FtCZCa

----------------------------------------
define i19 @src(i7 %0, i7 %1) {
%2:
  %3 = sext i7 %0 to i19
  %4 = sext i7 %1 to i19
  %5 = sdiv i19 %3, %4
  ret i19 %5
}
=>
define i19 @tgt(i7 %0, i7 %1) {
%2:
  %3 = sext i7 %0 to i16
  %4 = sext i7 %1 to i16
  %5 = sdiv i16 %3, %4
  %6 = sext i16 %5 to i19
  ret i19 %6
}
Transformation seems to be correct!

@efriedma-quic
Copy link
Collaborator

Trying to emit general division when we don't have an instruction or libcall is really inconvenient. There is a general algorithm for splitting a divide, but it doesn't map onto SelectionDAG operations easily. And we don't want to ship an arbitrary precision divide implementation in compiler-rt.

It would be straightforward to emit a better error message, at least.

@erichkeane
Copy link
Collaborator Author

It is unfortunate that we cannot support this division... I believe there is interest from the Rust community in implementing these as well (based on some responses I received to the _ExtInt patch), so it would be nice if we could figure out how to get at least usable code here.

And we don't want to ship an arbitrary precision divide implementation in compiler-rt
This is also unfortunate.

It would be straightforward to emit a better error message, at least.

It seems wrong to me (though my LLVM-IR knowledge is limited) that we have IR that passes the versifier that cannot be emitted. Seems to be a violation of the VM-like-model of LLVM.

@efriedma-quic
Copy link
Collaborator

And we don't want to ship an arbitrary precision divide implementation in compiler-rt
This is also unfortunate.

If there was really enough demand, we could, I guess, but I don't want to maintain such an implementation. A high-quality division implementation is really complicated.

It would be straightforward to emit a better error message, at least.

It seems wrong to me (though my LLVM-IR knowledge is limited) that we have
IR that passes the versifier that cannot be emitted. Seems to be a violation
of the VM-like-model of LLVM.

There's always going to be certain constructs that aren't supported by all targets. Stuff like target-specific intrinsics, exotic floating-point types, exception handling models that only exist on specific targets, etc. We try to avoid this when it's practical, but that isn't always feasible.

@erichkeane
Copy link
Collaborator Author

There's always going to be certain constructs that aren't supported by all targets. Stuff like target-specific intrinsics, exotic floating-point types, exception handling models that only exist on specific targets, etc. We try to avoid this when it's practical, but that isn't always feasible.

While I agree in concept that there are certain things that aren't supported on all targets, this is 99.9992% of our supported integers, running 2 of the basic 5 operands on it. While a touch cheeky, it seems like it is pretty high visibility (from an IR model, if not necessarily from a consuming language perspective).

That said, I suspect we'll get a number of similar issues filed (where LLVM poorly handles integers >128 bits) now that it is exposed in a consuming language.

@Quuxplusone
Copy link
Contributor

A high-quality division implementation is really complicated.

My two cents: I think it would suffice to provide a low-quality division implementation -- just something simple so that people could compile expressions like ei / 2, ei / 10, or ei % bigprime and get codegen that worked (even if it were slow). Going from "the code doesn't compile" to "the code is less than perfectly efficient" would still be a big leap forward.

FWIW, I think here is a correct implementation of "divmod" for power-of-two-sized unsigned integers: https://github.com/Quuxplusone/WideIntProofOfConcept

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 21, 2020

A high-quality division implementation is really complicated.

My two cents: I think it would suffice to provide a low-quality division
implementation -- just something simple so that people could compile
expressions like ei / 2, ei / 10, or ei % bigprime and get codegen
that worked (even if it were slow). Going from "the code doesn't compile" to
"the code is less than perfectly efficient" would still be a big leap
forward.

FWIW, I think here is a correct implementation of "divmod" for
power-of-two-sized unsigned integers:
https://github.com/Quuxplusone/WideIntProofOfConcept

Hi all, I encountered this bug when writing a JIT for Ethereum VM which has 256-bit integers. So here is a low quality but correct implementation of such a divmod algorithm:

https://godbolt.org/z/9fEhco

#include <stdio.h>
typedef _ExtInt(256) I;
void udiv256(I *n, I *d, I* q) {
    *q = 0;
    while (*n >= *d) {
        I i = 0, d_t = *d;
        while (*n >= (d_t << (I)1) && ++i)
            d_t <<= 1;
        *q |= (I)1 << i;
        *n -= d_t;
    }
}
void sdiv256(I* n, I* d, I* q) {
    I ret = (I)1;
    if (*n < (I)0) { ret *= (I)-1; *n = -*n; }
    if (*d < (I)0) { ret *= (I)-1; *d = -*d; }
    udiv256(n, d, q);
    *q *= ret;
}

int main() {
    I a = 0xAA00;
    I b = 0x1;
    I c = 0;
    udiv256(&a, &b, &c);
    for (int i = 0; i < 32; i++) {
        printf("%02X", ((unsigned char*)(&c))[i]);
    }
    I ret = a % b;
}

@jtmott-intel
Copy link
Contributor

mentioned in issue llvm/llvm-bugzilla-archive#46337

@AaronBallman
Copy link
Collaborator

This issue happens in all backends, btw, and is causing us to specify BITINT_MAXWIDTH as 128 for Clang 14 instead of supporting the documented largest integer width in LLVM, which is quite unfortunate IMO.

@mgehre-amd
Copy link
Contributor

mgehre-amd commented Jan 25, 2022

We are considering to built an extra library in compiler-rt to to ship arbitrary precision division and others, called libclang_rt.bitint (or similar). We didn't want to make it part of compiler-rt's builtins library, so it would also work when linking against libgcc instead.
Clang would then always link against libclang_rt.bitint when it is available. Does that sound like the correct approach?

@erichkeane
Copy link
Collaborator Author

We are considering to built an extra library in compiler-rt to to ship arbitrary precision division and others, called libclang_rt.bitint (or similar). We didn't want to make it part of compiler-rt's builtins library, so it would also work when linking against libgcc instead. Clang would then always link against libclang_rt.bitint when it is available. Does that sound like the correct approach?

That seems like a reasonable approach to me.

@mgehre-amd
Copy link
Contributor

We have made an implementation of this with a new library in compiler-rt, a little change to the clang driver to unconditionally link this library and the changes in SelectionDAG to properly call the new functions. The prototypes are like

void __udivmodei5(unsigned int *quo,
                               unsigned int *rem, unsigned int *a,
                               unsigned int *b,
                               unsigned int bits)

and it's using Knuth's Algorithm (adapted from llvm/ADT/APInt.h).
I will start to prepare this for upstreaming.

@mgehre-amd
Copy link
Contributor

mgehre-amd added a commit that referenced this issue Aug 26, 2022
Adds a pass ExpandLargeDivRem to expand div/rem instructions
with more than 128 bits into a loop computing that value.

As discussed on https://reviews.llvm.org/D120327, this approach has the advantage
that it is independent of the runtime library. This also helps the clang driver,
which otherwise would need to understand enough about the runtime library
to know whether to allow _BitInts with more than 128 bits.

Targets are still free to disable this pass and instead provide a faster
implementation in a runtime library.

Fixes #44994

Differential Revision: https://reviews.llvm.org/D126644
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla llvm:codegen
Projects
None yet
Development

No branches or pull requests

8 participants