Codegen error with AVX-512BW #4377

Open
Barinzaya opened this issue Oct 14, 2024 · 4 comments


@Barinzaya
Contributor

Barinzaya commented Oct 14, 2024

Context

	Odin:    dev-2024-10-nightly
	OS:      Arch Linux, Linux 6.11.2-zen1-1-zen
	CPU:     AMD Ryzen 9 9950X 16-Core Processor
	RAM:     61886 MiB
	Backend: LLVM 18.1.6

Expected Behavior

The code in the snippet below should compile without issues, and should execute without issues if AVX-512BW is available on the machine.

Current Behavior

When building the code in the snippet below (and other similarly-constructed code involving masks), an LLVM error (see below) is produced and the Odin compiler aborts. This only happens when relevant parts of the AVX-512 instruction set are enabled (in this case avx512bw), either via an attribute or via the command-line. When enabling other SIMD instruction sets (e.g. avx2), the code builds without issue.

In the sample code below, this also occurs when swapping main for a test procedure with the same body and attempting to run tests (odin test).

Failure Information (for bugs)

Example error:

LLVM ERROR: Cannot select: 0x740fd819bd00: v16i1 = setcc 0x740fd819b830, 0x740fd819c320, setgt:ch
  0x740fd819b830: v16i16 = sub 0x740fd819b6e0, 0x740fd819b7c0
    0x740fd819b6e0: v16i16,ch = load<(load (s256) from %ir.0 + 32, basealign 64)> 0x740fd819b590, 0x740fd819bfa0, undef:i64
      0x740fd819bfa0: i64 = add 0x740fd819b8a0, Constant:i64<32>
        0x740fd819b8a0: i64,ch = CopyFromReg 0x740fd8ae4360, Register:i64 %1
          0x740fd819b9f0: i64 = Register %1
        0x740fd819c080: i64 = Constant<32>
      0x740fd819c0f0: i64 = undef
    0x740fd819b7c0: v16i16,ch = load<(load (s256) from constant-pool)> 0x740fd8ae4360, 0x740fd819b520, undef:i64
      0x740fd819b520: i64 = X86ISD::Wrapper TargetConstantPool:i64<<16 x i16> <i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384>> 0
        0x740fd819bf30: i64 = TargetConstantPool<<16 x i16> <i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384, i16 16384>> 0
      0x740fd819c0f0: i64 = undef
  0x740fd819c320: v16i16 = bitcast 0x740fd819c160
    0x740fd819c160: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
      0x740fd819c240: i32 = Constant<-1>
In function: mre.foo
fish: Job 1, '~/Downloads/odin-linux-amd64-de…' terminated by signal SIGABRT (Abort)

Pointer values change with each build.

Steps to Reproduce

  1. Create an Odin source file mre.odin with the following code:
package mre

import "core:simd"

@(enable_target_feature = "avx512bw")
foo :: proc(src: #simd[32]u16, dst: ^[32]u16) {
	simd.masked_store(dst, src, simd.lanes_lt(src - auto_cast 16384, auto_cast 32768))
}

main :: proc() {
	a : [32]u16
	foo({}, &a)
}
  2. Attempt to build this file (odin build mre.odin -file).

This error also occurs if the enable_target_feature attribute is removed and the target feature is enabled via the command line (-target-features:avx512bw). It seems to be highly dependent on compiler flags: it does not occur if -o:size, -o:speed, or -o:aggressive is given, and it only seems to occur with some microarches (e.g. the default x86-64-v2 and x86-64-v3 fail; x86-64 and x86-64-v4 work).

@laytan
Collaborator

laytan commented Oct 15, 2024

Looks like you also need avx512vl enabled to make codegen happy.
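
For reference, a minimal sketch of that workaround: the same snippet from the report with avx512vl added to the attribute. It assumes enable_target_feature accepts a comma-separated feature list, and it has not been checked against every microarch and optimization setting mentioned in this thread.

```odin
package mre

import "core:simd"

// Same procedure as in the report; the only change is the extra
// avx512vl feature (assumption: features are given comma-separated).
@(enable_target_feature = "avx512bw,avx512vl")
foo :: proc(src: #simd[32]u16, dst: ^[32]u16) {
	simd.masked_store(dst, src, simd.lanes_lt(src - auto_cast 16384, auto_cast 32768))
}

main :: proc() {
	a: [32]u16
	foo({}, &a)
}
```

If the feature is enabled on the command line instead of via the attribute, the equivalent would presumably be -target-features:avx512bw,avx512vl.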

@laytan
Collaborator

laytan commented Oct 15, 2024

Could be this, which they say may be fixed in LLVM 19: llvm/llvm-project#111380

@Barinzaya
Contributor Author

Barinzaya commented Oct 15, 2024

> Looks like you also need avx512vl enabled to make codegen happy.

There are a lot of different ways to make the codegen happy, too. Setting optimization flags sometimes does it, changing the microarch sometimes does it (even if it's one that doesn't support AVX-512)... probably others too. This is extremely sensitive to compiler flags.

@laytan
Collaborator

laytan commented Oct 15, 2024

Optimization modes affecting it makes sense: the optimizer probably just removes the entire function, because the program doesn't have any side effects. It is very hard to get LLVM (with optimizations) to behave in a bug reproduction because it can just remove things.

And even if you get it to not remove your function it could be optimizing it to very different instructions.

The microarch affecting it is a little weird, especially if it doesn't enable the avx512 features.
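
One way to make a reproduction harder to optimize away is to feed it input that is unknown at compile time and to print the result. The following is a rough sketch along those lines, not a confirmed reproduction: the seed derived from the argument count and the fill pattern are arbitrary choices for illustration, and whether the error still appears with -o:speed and friends is untested.

```odin
package mre

import "core:fmt"
import "core:os"
import "core:simd"

@(enable_target_feature = "avx512bw")
foo :: proc(src: #simd[32]u16, dst: ^[32]u16) {
	simd.masked_store(dst, src, simd.lanes_lt(src - auto_cast 16384, auto_cast 32768))
}

main :: proc() {
	// The argument count is only known at run time, so the input
	// cannot be constant-folded (illustrative choice, not from the report).
	seed := u16(len(os.args))

	input: [32]u16
	for &v, i in input {
		v = seed * u16(i) * 1024 // arbitrary pattern; wraps on overflow
	}

	out: [32]u16
	foo(simd.from_array(input), &out)

	// Printing the result is a side effect, so the call to foo
	// cannot be dropped as dead code.
	fmt.println(out)
}
```

Even then, as noted above, the optimizer may lower the body to different instructions than the unoptimized build does.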
