Small SIMD test fails with --release but passes without #50154

danielrh · 2018-04-22T09:06:06Z

Full repro here:
https://github.com/danielrh/simd_playground run with cargo test --release
failing test is here:

#![feature(stdsimd)]
mod test {
    #[test]
    fn baseline() {
        use std::simd::*;
        let symbol = 2i16;
        let inc = 1i16;
        let data = i16x16::new(4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64);
        let one_to_16 = i16x16::new(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
        let increment_v = i16x16::splat(inc);
        let mask_v = unsafe {
            ::std::arch::x86_64::_mm256_cmpgt_epi16(
                ::std::arch::x86_64::__m256i::from_bits(one_to_16),
                ::std::arch::x86_64::__m256i::from_bits(i16x16::splat(i16::from(symbol))),
            )
        };
        let output = data + (increment_v & i16x16::from_bits(mask_v));
        let mut xfinal = [0i16; 16];
        output.store_unaligned(&mut xfinal);
        assert_eq!(xfinal, [4, 8, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61, 65]);
    }
}

using:

binary: rustc
commit-hash: ac3c2288f9f9d977acb46406ba60033d65165a7b
commit-date: 2018-04-18
host: x86_64-apple-darwin
release: 1.27.0-nightly
LLVM version: 6.0

on OSX 10.12.6 (16G1314)

and also on linux

rustc --verbose --version
rustc 1.27.0-nightly (ac3c2288f 2018-04-18)
binary: rustc
commit-hash: ac3c2288f9f9d977acb46406ba60033d65165a7b
commit-date: 2018-04-18
host: x86_64-unknown-linux-gnu
release: 1.27.0-nightly
LLVM version: 6.0

I was unable to reproduce the above problem by making a simple main with the above function, so it could be something to do with the build options.

One more note: the same exact test and code have been working for months with stdsimd 0.0.3 and 0.0.4 crate. I couldn't get that crate to build with nightly 1.27.0, so I couldn't see if it still worked.

sfackler · 2018-04-22T16:25:10Z

Are you compiling with the avx2 target feature enabled?
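
For readers following along, the question above distinguishes two separate things: whether the binary was compiled with AVX2 enabled, and whether the CPU running it supports AVX2. The sketch below shows one way to check both on stable Rust. The function names are hypothetical and chosen only for illustration; `cfg!` and `is_x86_feature_detected!` are the standard library facilities.

```rust
/// True if the whole crate was *compiled* with AVX2 enabled, for example via
/// `-C target-feature=+avx2` or a `-C target-cpu` value that implies it.
/// (Hypothetical helper name, for illustration only.)
pub fn compiled_with_avx2() -> bool {
    cfg!(target_feature = "avx2")
}

/// True if the CPU executing this code reports AVX2 support at runtime.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn cpu_has_avx2() -> bool {
    is_x86_feature_detected!("avx2")
}

/// On non-x86 targets there is no AVX2 to detect.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn cpu_has_avx2() -> bool {
    false
}
```

With a default `cargo test --release` and no `RUSTFLAGS`, `compiled_with_avx2()` would be expected to return false on a stock x86_64 target even when `cpu_has_avx2()` returns true.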

danielrh · 2018-04-22T18:45:45Z

Great question: I can avoid the bug by specifying RUSTFLAGS="-C target-cpu=core-avx-i" cargo test --release as the build command line

But in the past, llvm has been able to polyfill the instructions down to SSE2 with stdsimd-0.0.4 and therefore I didn't have to specify any flag at all.

I've noticed that asking LLVM to polyfill the instructions from avx2 down to core-avx-i actually improves performance on most available AVX2 hardware unless the AVX2 instructions are sufficiently dense. This is because AVX2 instructions downclock the chip for some time, so it's often better to keep the full clock speed and reserve the AVX2 code path for newer chips like Skylake.

I really loved writing AVX2 intrinsics and having the compiler match them to my desired architecture...having 4 code paths (avx2, avx and SSE4.2 and SSE2) to match my desired targets is a significant support burden, so from my perspective, the old behavior with the SSE2 polyfill was ideal.

hanna-kruppe · 2018-04-22T18:49:40Z

This... really isn't what those intrinsics are for. Sometimes the path of least resistance for the compiler is to treat some intrinsics as a generic operation that can be lowered to other instruction sets as well, but that is not at all guaranteed. If you want something portable across different SIMD instruction sets, you should use the (in-development) portable SIMD types, not AVX2 intrinsics.

danielrh · 2018-04-23T03:26:15Z

Good suggestion about the portable simd types: I was able to make this helper function which seems to translate into the avx2 intrinsic when needed. Would this kind of thing be worth providing for all portable SIMD types?

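#[inline(always)]
fn cmp_gt_i16x16(lhs: i16x16, rhs: i16x16) -> i16x16 {
    // rhs - lhs is negative exactly when lhs > rhs, as long as the
    // subtraction does not overflow the i16 range.
    let lz = rhs - lhs;
    // Keep only the sign bit of each lane.
    let sign_bit = lz & i16x16::splat(-32768);
    // The arithmetic shift smears the sign bit: -1 (all ones) or 0 per lane.
    sign_bit >> 15
}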

I do, however, think that this should either error out in the development build or, preferably, yield a compiler error (or at least a SIGILL) instead of silently producing wrong arithmetic results in release. Also, I suspect this is a recent LLVM bug in their polyfill, and it may still be worth correcting.
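
To make the sign-bit trick in the helper above concrete, here is a scalar sketch of what a single lane computes. It assumes lane subtraction wraps on overflow, as hardware SIMD subtraction does; the function names are hypothetical. It also shows a caveat worth knowing: when `rhs - lhs` overflows the `i16` range, the trick reports the wrong answer, so it only matches a true greater-than comparison for inputs whose difference fits in an `i16`.

```rust
/// One lane of the sign-bit trick: -1 (all ones) if `lhs > rhs`, else 0.
/// `wrapping_sub` stands in for the wrapping arithmetic of a SIMD lane.
/// (Hypothetical name, for illustration only.)
pub fn cmp_gt_lane(lhs: i16, rhs: i16) -> i16 {
    let diff = rhs.wrapping_sub(lhs);
    // Isolate the sign bit, then smear it across the lane with an
    // arithmetic right shift.
    (diff & i16::MIN) >> 15
}

/// Reference comparison with no overflow hazard.
pub fn cmp_gt_exact(lhs: i16, rhs: i16) -> i16 {
    if lhs > rhs { -1 } else { 0 }
}
```

For the values in the original test (lanes 1 through 16 compared against 2), both functions agree. For an input pair such as `lhs = -1`, `rhs = i16::MAX`, the difference wraps and `cmp_gt_lane` returns -1 even though `lhs` is not greater than `rhs`.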

alexcrichton · 2018-05-07T20:45:34Z

This is an issue with opt-level 3 specifically and I believe is a bug inside of LLVM. The problem is that we're passing all arguments by reference (the SIMD arguments) and LLVM is accidentally promoting them to by-value which is known to produce bugs.

Specifically, LLVM's argument promotion pass ("Promote 'by reference' arguments to scalars", run on each SCC) is promoting pass-by-reference to pass-by-value, which is invalid in the sense of how we're expecting to use these functions.

I've opened an upstream LLVM bug at https://bugs.llvm.org/show_bug.cgi?id=37358

gnzlbg · 2018-07-05T08:13:48Z

Reproducer in the playground.
Using #[target_feature(enable = "avx")] fixes the issue: https://play.rust-lang.org/?gist=3789e402a11276bd7d618b8bc13ac3d7&version=nightly&mode=release&edition=2015
@danielrh workaround: the following solution using std::simd works as well (playground), is portable, and should emit reasonable assembly on the most common targets (stdsimd has some workarounds for these types of cases for SSE2, SSE4.2, AVX2, arm+v7+neon, and aarch64+neon; on other archs your mileage might vary greatly, but bug reports and PRs are welcome):

fn baseline() {
    let data = i16x16::new(4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64);
    let one_to_16 = i16x16::new(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
    let output = one_to_16.gt(i16x16::splat(2i16)).select(data + 1i16, data);
    // note: if the mask is often false for all lanes you could guard the select
    // behind an `if mask.any() { ... }`
    assert_eq!(
        output,
        i16x16::new(4, 8, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61, 65)
    );
}

@alexcrichton I've pinged a couple more people on that LLVM bug; it would be nice to get this fixed for LLVM 7. The code in the OP uses unsafe to call an AVX function, but this is safe (defined behavior) if the code path is only reached on a CPU with AVX. Therefore, this bug is turning defined behavior into undefined behavior, and all of this in stable Rust, so we should probably mark this with I-unsound.

Is there a way to tell which of the I-unsound issues affect stable Rust? There are a lot of them, and many affect only nightly Rust, but it is hard to tell them apart.
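
The "only reached on a CPU with AVX" condition above can be sketched as follows on stable Rust: the intrinsic calls live in a function marked `#[target_feature(enable = "avx2")]` (the `_mm256_cmpgt_epi16` intrinsic from the OP is an AVX2 instruction), and a safe wrapper calls it only after a runtime check, falling back to scalar code otherwise. The function names are hypothetical, and this is an illustration of the dispatch pattern under those assumptions, not a claimed workaround for the miscompile discussed in this issue.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar reference: add `inc` to every lane whose 1-based index exceeds `symbol`.
pub fn bump_scalar(data: &[i16; 16], symbol: i16, inc: i16) -> [i16; 16] {
    let mut out = *data;
    for (i, lane) in out.iter_mut().enumerate() {
        if (i as i16 + 1) > symbol {
            *lane = lane.wrapping_add(inc);
        }
    }
    out
}

/// AVX2 version of the OP's computation. Callers must ensure AVX2 is available.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn bump_avx2(data: &[i16; 16], symbol: i16, inc: i16) -> [i16; 16] {
    let one_to_16: [i16; 16] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16];
    let idx = _mm256_loadu_si256(one_to_16.as_ptr() as *const __m256i);
    let v = _mm256_loadu_si256(data.as_ptr() as *const __m256i);
    // Lanes where index > symbol become all ones, others zero.
    let mask = _mm256_cmpgt_epi16(idx, _mm256_set1_epi16(symbol));
    // Keep the increment only in the selected lanes, then add.
    let add = _mm256_and_si256(_mm256_set1_epi16(inc), mask);
    let sum = _mm256_add_epi16(v, add);
    let mut out = [0i16; 16];
    _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum);
    out
}

/// Safe entry point: use AVX2 only when the running CPU reports it.
pub fn bump(data: &[i16; 16], symbol: i16, inc: i16) -> [i16; 16] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just confirmed at runtime.
            return unsafe { bump_avx2(data, symbol, inc) };
        }
    }
    bump_scalar(data, symbol, inc)
}
```

Calling `bump(&data, 2, 1)` with the OP's `data` should give the same array the original test asserts, whichever path is taken.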

alexcrichton · 2018-07-05T14:04:10Z

Thanks for the extra pings @gnzlbg! Let's see how that plays out...

hellow554 · 2018-07-23T08:55:02Z

@gnzlbg Your playground does not work anymore :(

2 | use std::simd::i16x16;
  |          ^^^^ Could not find `simd` in `std`

gnzlbg · 2018-07-23T08:56:46Z

@hellow554 you need to use the packed_simd crate, std::simd is (hopefully temporarily) not part of std anymore.

gnzlbg · 2018-09-04T10:58:20Z

So it appears that this won't be fixed in LLVM any time soon, and AFAICT this is not something we can easily warn about on the Rust side of things for the time being :/

raphlinus · 2018-10-14T15:21:29Z

I find this bug unfortunate, as I'm trying to do safe wrappers (called fearless_simd), but this basically makes that approach unworkable. If the bug won't be fixed soon, maybe we should document the danger zone.

I read the LLVM issue. It's interesting that this bug has persisted so long without getting triggered; it's evidence that the ways people use C++ and Rust are quite different in spite of the similar approaches to zero-cost abstractions etc.

alexcrichton · 2018-10-14T19:42:54Z

I'm hoping that I woke up on the right side of the bed this morning as after reading #55059 I was struck with inspiration about how we might solve this, manifested in #55073. If others are familiar with LLVM review on that would be greatly appreciated!

rustc: Fix (again) simd vectors by-val in ABI

The issue of passing around SIMD types as values between functions has seen [quite a lot] of [discussion], and although we thought [we fixed it][quite a lot] it [wasn't]! This PR is a change to rustc to, again, try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument is passed by-value in LLVM's function type, then if the caller and callee disagree on target features a miscompile happens. We solve this by never passing SIMD vectors by-value, but LLVM will still thwart us with its argument promotion pass to promote by-ref SIMD arguments to by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before codegen, will take yet another look at the LLVM module and demote any by-value SIMD arguments we see. This is a very manual attempt by us to ensure the codegen for a module keeps working, and it unfortunately is likely producing suboptimal code, even in release mode. The saving grace for this, in theory, is that if SIMD types are passed by-value across a boundary in release mode it's pretty unlikely to be performance sensitive (as it's already doing a load/store, and otherwise perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely copied from LLVM's own argument promotion pass, only doing the reverse. In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154

alexcrichton · 2018-10-23T08:03:51Z

I'm posting a revert for the fix in #55281 because I don't think the fix was quite right (causing segfaults for me). LLVM, however, in the meantime should have an official fix, so this should hopefully get closed out in the near future once that lands.

mati865 · 2019-02-12T13:10:48Z

Upstream fix has landed in Rust's LLVM fork: rust-lang/llvm-project@3d36e5c

gnzlbg · 2019-02-12T13:33:42Z

Does this still reproduce with nightly? If not we can close this.

andersk · 2019-02-12T21:46:32Z

Fails with nightly-2019-01-26 == rustc 1.33.0-nightly (bf669d1 2019-01-25).
Passes with nightly-2019-01-27 == rustc 1.33.0-nightly (20c2cba 2019-01-26).

This is consistent with the fix being in the LLVM update of #57675, so yeah, I think we can close this.

araspik · 2019-08-09T20:41:20Z

Is there any reason this is not yet closed?

alexcrichton added A-SIMD Area: SIMD (Single Instruction Multiple Data) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. labels May 7, 2018

hanna-kruppe mentioned this issue Jul 23, 2018

_mm256_loadu_si256 only loads 128 bits when compiled with default cargo build --release #52636

Closed

XAMPPRocky added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels Sep 25, 2018

hanna-kruppe mentioned this issue Sep 26, 2018

Release builds using AVX code produce incorrect output #54583

Closed

hanna-kruppe mentioned this issue Oct 14, 2018

Miscompilation of SIMD when crossing target_feature boundaries #55059

Closed

alexcrichton mentioned this issue Oct 14, 2018

rustc: Fix (again) simd vectors by-val in ABI #55073

Merged

bors closed this as completed in #55073 Oct 21, 2018

alexcrichton reopened this Oct 23, 2018

kali mentioned this issue Dec 18, 2018

x86_64 simd fmadd (and other) bug (release only) #56950

Closed

parched mentioned this issue Jan 7, 2019

Incorrect optimization calling function with __m256d parameters #57427

Closed

ignopeverell mentioned this issue Feb 3, 2019

Grin binary release fails on CPU without avx2 support mimblewimble/grin#2494

Closed

zzau13 mentioned this issue Feb 4, 2019

Rust stable version 4x times performance loss in avx2 rust-lang/stdarch#674

Closed

KyleSiefring mentioned this issue Feb 8, 2019

target_feature doesn't trickle down to closures and internal fns #58279

Open

nikic closed this as completed Aug 9, 2019
