Cannot select llvm.canonicalize.f64 #32650

Open
nagisa opened this issue Jun 4, 2017 · 8 comments
Labels: backend:X86, bugzilla (issues migrated from bugzilla)

Comments

@nagisa
Member

nagisa commented Jun 4, 2017

Bugzilla Link 33303
Version 4.0
OS Linux
CC @topperc,@hfinkel,@RKSimon,@arsenm,@rotateright

Extended Description

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nounwind readnone uwtable
define double @max(double, double) unnamed_addr #0 {
start:
  %2 = fcmp olt double %0, %1
  %3 = fcmp uno double %0, 0.000000e+00
  %or.cond.i.i = or i1 %3, %2
  %4 = select i1 %or.cond.i.i, double %1, double %0
  %5 = tail call double @llvm.canonicalize.f64(double %4) #2
  ret double %5
}

; Function Attrs: nounwind readnone
declare double @llvm.canonicalize.f64(double) #1

attributes #0 = { nounwind readnone uwtable }
attributes #1 = { nounwind readnone }
attributes #2 = { nounwind }

This fails during instruction selection when compiled with `llc test.ll`:

LLVM ERROR: Cannot select: 0x2258ce8: f64 = fcanonicalize 0x2258c80
  0x2258c80: f64 = X86ISD::FOR 0x2258a10, 0x2258bb0
    0x2258a10: f64 = X86ISD::FANDN 0x2258ef0, 0x2258c18
      0x2258ef0: f64 = X86ISD::FSETCC 0x22588d8, 0x22588d8, Constant:i8<3>
        0x22588d8: f64,ch = CopyFromReg 0x21eb3c0, Register:f64 %vreg0
          0x2258870: f64 = Register %vreg0
        0x22588d8: f64,ch = CopyFromReg 0x21eb3c0, Register:f64 %vreg0
          0x2258870: f64 = Register %vreg0
        0x2258a78: i8 = Constant<3>
      0x2258c18: f64 = X86ISD::FMAX 0x22589a8, 0x22588d8
        0x22589a8: f64,ch = CopyFromReg 0x21eb3c0, Register:f64 %vreg1
          0x2258940: f64 = Register %vreg1
        0x22588d8: f64,ch = CopyFromReg 0x21eb3c0, Register:f64 %vreg0
          0x2258870: f64 = Register %vreg0
    0x2258bb0: f64 = X86ISD::FAND 0x2258ef0, 0x22589a8
      0x2258ef0: f64 = X86ISD::FSETCC 0x22588d8, 0x22588d8, Constant:i8<3>
        0x22588d8: f64,ch = CopyFromReg 0x21eb3c0, Register:f64 %vreg0
          0x2258870: f64 = Register %vreg0
        0x22588d8: f64,ch = CopyFromReg 0x21eb3c0, Register:f64 %vreg0
          0x2258870: f64 = Register %vreg0
        0x2258a78: i8 = Constant<3>
      0x22589a8: f64,ch = CopyFromReg 0x21eb3c0, Register:f64 %vreg1
        0x2258940: f64 = Register %vreg1
In function: max

This happens with the x86, ARM, PowerPC, MIPS, and other backends.
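
For readers less used to IR, the body of `max` in the reproducer can be read roughly as the C++ below. This is only a sketch of the compare-and-select part; the function name `maxBeforeCanonicalize` is made up for illustration, and the final `llvm.canonicalize.f64` call (the part that fails to select) is deliberately left out.

```cpp
#include <cmath>

// Rough C++ reading of the IR body of @max, minus the canonicalize call.
// Hypothetical helper name; not taken from the original report.
double maxBeforeCanonicalize(double a, double b) {
    bool lessThan = a < b;        // fcmp olt: ordered less-than, false if either side is NaN
    bool aIsNaN = std::isnan(a);  // fcmp uno %0, 0.0: true only when %0 is NaN
    return (aIsNaN || lessThan) ? b : a;  // or + select
}
```

So the second argument is returned when the first is NaN or smaller, and the first argument otherwise (including when only the second is NaN).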

@nagisa
Member Author

nagisa commented Jun 4, 2017

This also happens with 3.9.

@llvmbot
Member

llvmbot commented Jun 4, 2017

@RKSimon
Collaborator

RKSimon commented Feb 18, 2018

What should SSE (or x87 even, although I really don't want to deal with pseudos and unnormals....) be expected to do for float canonicalization?

http://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic

As a stop gap would it make sense for a canonicalize to expand to just a copy of the original value? Is that better or worse than just crashing?

@rotateright
Contributor

> What should SSE (or x87 even, although I really don't want to deal with
> pseudos and unnormals....) be expected to do for float canonicalization?
>
> http://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic
>
> As a stop gap would it make sense for a canonicalize to expand to just a
> copy of the original value? Is that better or worse than just crashing?

"This function should always be implementable as multiplication by 1.0, provided that the compiler does not constant fold the operation."

DAGCombiner has that fold of course, and it's not limited to pre-legalization. But any target that cares would be handling this node given that it's 2.5 years since:
https://reviews.llvm.org/rL241977 ?

Eg:
https://reviews.llvm.org/rL266272
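
As a rough illustration of the LangRef sentence quoted above, the sketch below multiplies a signaling NaN by a 1.0 that the optimizer cannot see through and checks that the result is a quiet NaN. It assumes IEEE 754 binary64 `double` with the common "top mantissa bit set means quiet" encoding (as on x86-64 SSE2) and no fast-math flags; the helper names are hypothetical and none of this is LLVM code.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a double as its 64-bit pattern and back.
std::uint64_t toBits(double d) {
    std::uint64_t b;
    std::memcpy(&b, &d, sizeof b);
    return b;
}

double fromBits(std::uint64_t b) {
    double d;
    std::memcpy(&d, &b, sizeof d);
    return d;
}

// Assumption: the most significant mantissa bit marks a NaN as quiet.
constexpr std::uint64_t kQuietBit = 0x0008000000000000ULL;
constexpr std::uint64_t kSignalingNaN = 0x7FF0000000000001ULL;

bool isQuietNaN(double d) {
    return std::isnan(d) && (toBits(d) & kQuietBit) != 0;
}

// Multiply by 1.0 loaded through a volatile, so the compiler cannot
// fold the multiplication away at compile time.
double canonicalizeViaMul(double x) {
    volatile double one = 1.0;
    return x * one;
}
```

Under those assumptions, `isQuietNaN(canonicalizeViaMul(fromBits(kSignalingNaN)))` holds, while ordinary values such as `2.5` pass through unchanged.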

@arsenm
Contributor

arsenm commented Feb 19, 2018

>> What should SSE (or x87 even, although I really don't want to deal with
>> pseudos and unnormals....) be expected to do for float canonicalization?
>>
>> http://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic
>>
>> As a stop gap would it make sense for a canonicalize to expand to just a
>> copy of the original value? Is that better or worse than just crashing?
>
> "This function should always be implementable as multiplication by 1.0,
> provided that the compiler does not constant fold the operation."
>
> DAGCombiner has that fold of course, and it's not limited to
> pre-legalization. But any target that cares would be handling this node
> given that it's 2.5 years since:
> https://reviews.llvm.org/rL241977 ?
>
> Eg:
> https://reviews.llvm.org/rL266272

This exists to avoid the multiply by one fold. This needs to be selected to the instruction directly and can't be legalized / lowered to the equivalent multiply
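
To make the point about the fold concrete, here is a toy model of a combiner that simplifies `x * 1.0` to `x` but has no rule for a dedicated canonicalize node. It is not LLVM's SelectionDAG API; `Node`, `Op`, `combine` and the builder functions are hypothetical names used only for illustration.

```cpp
#include <memory>
#include <utility>

// Toy expression node; a stand-in for a DAG node, not LLVM's SDNode.
enum class Op { Value, MulByOne, Canonicalize };

struct Node {
    Op op;
    std::shared_ptr<Node> operand;  // null for Op::Value
};

using NodePtr = std::shared_ptr<Node>;

NodePtr value() { return std::make_shared<Node>(Node{Op::Value, nullptr}); }
NodePtr mulByOne(NodePtr x) { return std::make_shared<Node>(Node{Op::MulByOne, std::move(x)}); }
NodePtr canonicalize(NodePtr x) { return std::make_shared<Node>(Node{Op::Canonicalize, std::move(x)}); }

// Applies the "x * 1.0 -> x" simplification recursively.
// Canonicalize nodes are rebuilt around their simplified operand and kept.
NodePtr combine(const NodePtr& n) {
    if (!n->operand) return n;            // leaf value: nothing to simplify
    NodePtr inner = combine(n->operand);
    if (n->op == Op::MulByOne) return inner;  // the multiply disappears
    return std::make_shared<Node>(Node{n->op, inner});
}
```

In this model `combine(mulByOne(x))` returns `x` itself, so a canonicalize that was lowered to a multiply would vanish, whereas `combine(canonicalize(x))` still has `Op::Canonicalize` at its root.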

@rotateright
Contributor

> This exists to avoid the multiply by one fold. This needs to be selected to
> the instruction directly and can't be legalized / lowered to the equivalent
> multiply

IIUC, that means every target is expected to duplicate something like this:

def : Pat<
  (fcanonicalize f32:$src),
  (V_MUL_F32_e64 0, CONST.FP32_ONE, 0, $src, 0, 0)
>;

for each FP data type to avoid crashing?

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
@RKSimon
Collaborator

RKSimon commented Jan 27, 2023

We still don't handle this: https://gcc.godbolt.org/z/1rd9cTzcT (and even amdgpu fails for <8 x half>)

@RKSimon
Collaborator

RKSimon commented May 20, 2023

@arsenm This is the amdgpu canonicalize failure for v8f16 that I mentioned at EuroLLVM: https://gcc.godbolt.org/z/9jGfj7E3E

arsenm added a commit that referenced this issue May 22, 2023
This assert should have the same set of vector types as the binary
and ternary case (although this assert is kind of pointless, the code
should work for any vector type as-is).

Fixes part of issue #32650.