Inlining causes miscompilation of code that mixes target features #116573

Open
RalfJung opened this issue Oct 9, 2023 · 57 comments
Labels
A-codegen Area: Code generation
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc.
C-bug Category: This is a bug.
I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness
P-high High priority
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RalfJung
Member

RalfJung commented Oct 9, 2023

The following code ought to be completely fine and UB-free:

use std::mem::transmute;
#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

extern "C" fn no_target_feature(_dummy: f32, x: __m256) {
    let val = unsafe { transmute::<_, [u32; 8]>(x) };
    dbg!(val);
}

#[inline(always)]
fn no_target_feature_intermediate(dummy: f32, x: __m256) {
    no_target_feature(dummy, x);
}

#[target_feature(enable = "avx")]
unsafe fn with_target_feature(x: __m256) {
    // Critical call: caller and callee have different target features.
    // However, we use the Rust ABI, so this is fine.
    no_target_feature_intermediate(0.0, x);
}

fn main() {
    assert!(is_x86_feature_detected!("avx"));
    // SAFETY: we checked that the `avx` feature is present.
    unsafe {
        with_target_feature(transmute([1; 8]));
    }
}

There's some unsafe going on, but the safety comment explains why that is okay. We are even taking care to follow the target-feature related ABI rules (see #115476); all calls between functions with different target-features use the "Rust" ABI.

And yet, when built without optimizations, this prints:

[src/main.rs:9] val = [
    1,
    1,
    1,
    1,
    538976288,
    538976288,
    538976288,
    538976288,
]

The value got clobbered while being passed through the various functions.

Replacing inline(always) with inline(never) makes the issue disappear. But inline attributes must never cause miscompilation, so there's still a soundness bug here.

I don't know if this is the MIR inliner (Cc @rust-lang/wg-mir-opt) or the LLVM inliner going wrong.

@RalfJung RalfJung added the I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness label Oct 9, 2023
@rustbot rustbot added needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Oct 9, 2023
@saethlin
Member

saethlin commented Oct 9, 2023

This still miscompiles with -Zmir-opt-level=0. The playground does not let you pass flags (rust-lang/rust-playground#781), so I generally advise against using it. godbolt supports flags, execution, setting environment variables, and picking among old toolchains: https://godbolt.org/z/WcMq14MPG

@saethlin saethlin added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-bug Category: This is a bug. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Oct 9, 2023
@RalfJung
Member Author

RalfJung commented Oct 9, 2023

Okay so it's an LLVM bug then it seems. Cc @nikic

@tmiasko tmiasko added A-codegen Area: Code generation A-mir-opt Area: MIR optimizations A-mir-opt-inlining Area: MIR inlining labels Oct 9, 2023
@nikic
Contributor

nikic commented Oct 9, 2023

Is there a way to reproduce this without #[inline(always)]? Forcing inlining disables target-feature safety checks in LLVM.

(Incidentally, there was an attempt to not do that in LLVM 17, but this was reverted due to the large amount of regressions it caused. People rely on that a lot, including in Rust.)

@saethlin
Member

saethlin commented Oct 9, 2023

Forcing inlining disables target-feature safety checks in LLVM.

Are you saying #[inline(always)] is unsound?

@RalfJung
Member Author

RalfJung commented Oct 9, 2023

Yeah that is no good, we can't have (safe!) attributes just override checks which are needed for soundness.

I don't know a reproducer without inline(always), but I consider this a critical bug even with inline(always).

@RalfJung
Member Author

RalfJung commented Oct 9, 2023

(Incidentally, there was an attempt to not do that in LLVM 17, but this was reverted due to the large amount of regressions it caused. People rely on that a lot, including in Rust.)

Perf regressions are acceptable when fixing soundness bugs. We then have to see how much of the perf we can get back without compromising soundness.

@briansmith
Contributor

briansmith commented Oct 9, 2023

There wouldn't necessarily need to be a perf regression. I would expect it to compile down to the code that would exist as if the intermediate function were not there:

use std::mem::transmute;
#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

extern "C" fn no_target_feature(_dummy: f32, x: __m256) {
    let val = unsafe { transmute::<_, [u32; 8]>(x) };
    dbg!(val);
}

#[target_feature(enable = "avx")]
unsafe fn with_target_feature(x: __m256) {
    // Critical call: caller and callee have different target features.
    // The compiler needs to deal with the ABI transition here.
    no_target_feature(0.0, x);
}

fn main() {
    assert!(is_x86_feature_detected!("avx"));
    // SAFETY: we checked that the `avx` feature is present.
    unsafe {
        with_target_feature(transmute([1; 8]));
    }
}

I would expect that a function, when inlined, doesn't have effects on ABI issues in itself.

BTW, I hope to be writing code just like this very soon, but instead of extern "C" we'll sometimes force the sysv ABI (even on Windows), ideally.

@RalfJung
Member Author

RalfJung commented Oct 9, 2023

I would expect it to compile down to the code that would exist as if the intermediate function were not there:

That's what it does, and that's the bug. That code is wrong, see #116558.

Basically LLVM tied together flags affecting ABI and flags relevant for codegen, and I think that was a huge mistake. This issue and #116558 show why.

BTW, I hope to be writing code just like this very soon, but instead of extern "C" we'll sometimes force the sysv ABI (even on Windows), ideally.

This issue affects all non-"Rust" ABIs.

@workingjubilee
Member

workingjubilee commented Oct 9, 2023

If inline(always) is unsound, we need to castrate it so it's just inline. We can do that on our end, without any need to consult LLVM for its preferences.

@nikic
Contributor

nikic commented Oct 9, 2023

(Incidentally, there was an attempt to not do that in LLVM 17, but this was reverted due to the large amount of regressions it caused. People rely on that a lot, including in Rust.)

Perf regressions are acceptable when fixing soundness bugs. We then have to see how much of the perf we can get back without compromising soundness.

It's a bit more complex than that. Examples of regressions this caused are:

  • Compiler crashes due to selection failures. Builtins were not inlined into functions with target features that would allow them to select, causing a crash.
  • Failure to inline function with inline assembly where all functions were consistently annotated with target features.
  • Failure to inline platform vector intrinsics into functions.

You might call this "just a perf issue", but inlining of platform vector intrinsics is an important part of their semantics. They are useless if this does not happen reliably.

These issues are not fundamental, but caused by target feature checks being too conservative, especially for non-X86 targets.

The semantics of always_inline can be changed, but it would require some work to make sure we have at least somewhat accurate compatibility checks across targets.


I believe something that was discussed in the past but never happened, is that we should add a lint for calling a function with less target features, while passing vector values to it. Independent of the soundness issues discussed here, the lack of inlining makes this a performance footgun, and it's almost certainly not what people want to do.
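A rough sketch of the condition such a lint might check, written as a plain Rust model rather than against rustc's actual lint infrastructure. `FnInfo`, `should_warn`, and the feature sets below are hypothetical names chosen only for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical summary of a function, just enough for this sketch.
struct FnInfo {
    /// Target features enabled for the function, e.g. via `#[target_feature]`.
    features: BTreeSet<&'static str>,
}

impl FnInfo {
    fn new(features: &[&'static str]) -> Self {
        FnInfo { features: features.iter().copied().collect() }
    }
}

/// Warn when the callee lacks at least one feature the caller has
/// *and* a vector value is passed across the call.
fn should_warn(caller: &FnInfo, callee: &FnInfo, passes_vector: bool) -> bool {
    let callee_has_fewer = caller.features.difference(&callee.features).next().is_some();
    passes_vector && callee_has_fewer
}

fn main() {
    let with_avx = FnInfo::new(&["sse2", "avx"]);
    let baseline = FnInfo::new(&["sse2"]);

    // AVX caller passing a vector to a baseline callee: flagged.
    assert!(should_warn(&with_avx, &baseline, true));
    // Same call without vector arguments: not flagged.
    assert!(!should_warn(&with_avx, &baseline, false));
    // Baseline caller, AVX callee: the callee has more features, so this
    // particular condition does not fire.
    assert!(!should_warn(&baseline, &with_avx, true));
}
```

A real lint would have to decide what counts as a "vector value" (for example, whether structs containing vectors qualify); that question is left open here.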

@briansmith
Contributor

If inline(always) is unsound, we need to castrate it so it's just inline.

Pretty much anybody who would write code like the above would very much appreciate at least a warning if that is going to happen. If/when I see such a warning I would remove the "intermediate" wrapper. Then I would rewrite the code into the form I shared.

I would expect it to compile down to the code that would exist as if the intermediate function were not there:

That's what it does, and that's the bug. That code is wrong, see #116558.

Then how is this particular issue a distinct bug from #116558, especially considering that nobody wants their #[inline(always)] function to not be inlined?

@workingjubilee
Member

@briansmith: Pretty much anybody who would write code like the above would very much appreciate at least a warning if that is going to happen. If/when I see such a warning I would remove the "intermediate" wrapper. Then I would rewrite the code into the form I shared.

Completely understandable. We should design a lint that will fire on all cases here.

@nikic: I believe something that was discussed in the past but never happened, is that we should add a lint for calling a function with less target features, while passing vector values to it. Independent of the soundness issues discussed here, the lack of inlining makes this a performance footgun, and it's almost certainly not what people want to do.

Sounds good to me.

@nikic: You might call this "just a perf issue", but inlining of platform vector intrinsics is an important part of their semantics. They are useless if this does not happen reliably.

And yeah, nikic is right here. We might have to hack in an #[rustc_REALLY_always_inline] for usage by core::arch, while we work on fixing the soundness issues. I think that's fine.

@RalfJung
Member Author

RalfJung commented Oct 9, 2023

Then how is this particular issue a distinct bug from #116558, especially considering that nobody wants their #[inline(always)] function to not be inlined?

This is a clear soundness bug IMO, #116558 is "just" very odd semantics and ABI footguns. I think we should resolve #116558 by refusing to compile the example there but I'm not convinced that will suffice to fix this soundness bug.

Failure to inline platform vector intrinsics into functions.

If the caller had the target feature, they should still get inlined, no? And if someone calls an AVX2 intrinsic from a function that doesn't have the AVX2 feature then surely exploding that code is fine, it should probably not even compile...

@workingjubilee
Member

People use dynamic feature dispatch, however?
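For readers unfamiliar with the pattern, here is a minimal sketch of dynamic feature dispatch. The function names (`sum`, `sum_avx2`, `sum_fallback`) are made up for this example; only `#[target_feature]` and `is_x86_feature_detected!` are real language and standard-library features.

```rust
/// Portable fallback, compiled with the crate's baseline features.
fn sum_fallback(xs: &[u32]) -> u32 {
    xs.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Same computation, but compiled with AVX2 enabled, so the backend may use
/// AVX2 instructions here. Only sound to call once AVX2 is known to be present.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(xs: &[u32]) -> u32 {
    xs.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Entry point without extra target features; picks an implementation at run time.
pub fn sum(xs: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: we just checked that `avx2` is available.
            return unsafe { sum_avx2(xs) };
        }
    }
    sum_fallback(xs)
}

fn main() {
    assert_eq!(sum(&[1, 2, 3, 4]), 10);
}
```

Note that only a slice crosses the boundary between the two feature sets in this sketch; no vector-typed value is passed by value, which is the situation this issue is concerned with.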

@nikic
Contributor

nikic commented Oct 9, 2023

Failure to inline platform vector intrinsics into functions.

If the caller had the target feature, they should still get inlined, no? And if someone calls an AVX2 intrinsic from a function that doesn't have the AVX2 feature then surely exploding that code is fine, it should probably not even compile...

The relevant case is more along the lines of: The caller has features +a,+b and the platform intrinsic has +a. LLVM refuses to inline because this is potentially unsafe. LLVM's default assumption about what is safe to inline are very conservative. If the target doesn't tell it that e.g. subset inlining is always safe, it's only going to inline if the target features are exactly the same. Not all targets implement the necessary hook to provide a more precise compatibility check.

Or to give a less obvious example, you have a function with +armv8-a and an intrinsic with +armv7-a. That's not a case of subset inlining and requires special handling (and I wouldn't be able to say off the top of my head whether that is universally safe in the first place or not).
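To make the two rules concrete, here is a small model of them over plain sets of feature names. This is only an illustration of the logic described above, not LLVM's actual compatibility hook; `exact_match`, `subset_ok`, and the `features` helper are hypothetical.

```rust
use std::collections::BTreeSet;

type Features = BTreeSet<&'static str>;

fn features(list: &[&'static str]) -> Features {
    list.iter().copied().collect()
}

/// Conservative default: inline only if both sides have identical features.
fn exact_match(caller: &Features, callee: &Features) -> bool {
    caller == callee
}

/// More permissive rule a target may opt into: every feature of the callee
/// must also be enabled in the caller.
fn subset_ok(caller: &Features, callee: &Features) -> bool {
    callee.is_subset(caller)
}

fn main() {
    // Caller has +a,+b; the platform intrinsic only has +a.
    let caller = features(&["a", "b"]);
    let intrinsic = features(&["a"]);
    assert!(!exact_match(&caller, &intrinsic)); // default rule refuses
    assert!(subset_ok(&caller, &intrinsic)); // subset rule allows

    // +armv8-a vs +armv7-a: neither equal nor a subset, so both rules refuse
    // and a target would need special handling to allow it.
    let v8 = features(&["armv8-a"]);
    let v7 = features(&["armv7-a"]);
    assert!(!exact_match(&v8, &v7));
    assert!(!subset_ok(&v8, &v7));
}
```

Neither rule says anything about ABI: as this issue shows, a call that is moved into a caller with more features can still end up using a different calling convention for vector arguments.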

@workingjubilee
Member

Inlining across Arm "major versions" is honestly pretty dangerous because they routinely retire older instructions on the majors.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Oct 25, 2023
…oli-obk

Require target features to match exactly during inlining

In general it is not correct to inline a callee with target features
that are a subset of the caller's. Require target features to match exactly
during inlining.

The exact match could be potentially relaxed, but this would require
identifying the specific features that are allowed to differ, those that need
to match, and those that can be present in caller but not in callee.

This resolves the MIR part of rust-lang#116573. For other concerns with respect to
the previous implementation also see areInlineCompatible in LLVM.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Oct 25, 2023
Rollup merge of rust-lang#117141 - tmiasko:inline-target-features, r=oli-obk

@tmiasko tmiasko removed A-mir-opt Area: MIR optimizations A-mir-opt-inlining Area: MIR inlining labels Oct 25, 2023
@RalfJung
Member Author

RalfJung commented Oct 28, 2023

I am trying to catch LLVM in the act of moving a call instruction between functions with different target features, but so far I have not succeeded. Somehow when I translate the example here to LLVM IR and pass that to clang, it doesn't get the optimizations I am hoping for. Here is what I got so far -- does anyone have an idea how to produce such an example?

Here's another version, still doesn't get inlined though.

@RalfJung
Member Author

I think I finally got it. Not sure what is different about this compared to my previous attempts...

@RalfJung
Member Author

Here's an LLVM issue for the problem: llvm/llvm-project#70563

@sarah-ek

sarah-ek commented Nov 20, 2023

i found an example that doesn't use extern "C"

it should print (0, 1, 2, 3), but instead, when executed in release mode on the playground, it shows (0, 1, 206158430224, 140735013294640) for me

https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=f9070ae872e66ba389fcba256e4f00fc

use core::arch::x86_64::__m256i;
use core::hint::black_box;
use core::mem::transmute;

#[allow(non_camel_case_types)]
#[derive(Copy, Clone, Debug)]
pub struct u64x4(u64, u64, u64, u64);

#[inline(never)]
#[target_feature(enable = "avx")]
unsafe fn return_as_is_avx(a: __m256i) -> __m256i {
    a
}

#[inline(never)]
unsafe fn return_as_is(a: u64x4) -> u64x4 {
    transmute(return_as_is_avx(transmute(a)))
}

#[target_feature(enable = "avx")]
#[inline]
unsafe fn imbue_avx<F: Fn()>(f: F) -> F::Output {
    f()
}

pub unsafe fn buggy() {
    imbue_avx(
        #[inline(always)]
        || {
            dbg!(return_as_is(black_box(u64x4(0, 1, 2, 3))));
        },
    );
}

pub fn main() {
    assert!(is_x86_feature_detected!("avx"));
    unsafe {
        buggy();
    }
}

@RalfJung
Member Author

RalfJung commented Nov 20, 2023

On Zulip, someone suggested this might be due to LLVM turning a ptr argument into a by-val argument as an optimization.

(Please mention such observations when carrying issues from Zulip to GitHub, or else people will have to waste time re-discovering the same thing!)

@sarah-ek

im not sure if that's what's causing the issue. even when passing the argument with multiple indirections and black_boxing the reference so it doesn't get promoted, i still get the same issue

https://godbolt.org/z/EaGxGjWhT

#[inline(never)]
#[target_feature(enable = "avx")]
unsafe fn return_as_is_avx(a: &&__m256i) -> u64x4 {
    transmute(**black_box(a))
}

#[inline(never)]
unsafe fn return_as_is(a: u64x4) -> u64x4 {
    return_as_is_avx(&&transmute(a))
}

this is the asm for return_as_is_avx, so it is performing the pointer dereferences

example::return_as_is_avx:
  mov qword ptr [rsp - 8], rsi
  lea rax, [rsp - 8]
  mov rax, qword ptr [rsp - 8]
  mov rax, qword ptr [rax]
  vmovaps ymm0, ymmword ptr [rax]
  vmovups ymmword ptr [rdi], ymm0
  vzeroupper
  ret

output

[/app/example.rs:30] return_as_is(black_box(u64x4(0, 1, 2, 3))) = u64x4(
    0,
    1,
    206158430224,
    140726132754736,
)

@sarah-ek

this part looks suspicious to me

i might be misreading this, but it looks like return_as_is is expecting the input to be split in xmm0 and xmm1

example::return_as_is:
  push rbp
  mov rbp, rsp
  and rsp, -32
  sub rsp, 96
  movaps xmmword ptr [rsp + 48], xmm1  // <--
  movaps xmmword ptr [rsp + 32], xmm0  // <--
  lea rax, [rsp + 32]
  mov qword ptr [rsp + 24], rax
  lea rsi, [rsp + 24]
  call example::return_as_is_avx
  mov rsp, rbp
  pop rbp
  ret

but in imbue_avx it might be getting passed in one register ymm0 (no mention of xmm1 or ymm1)

example::imbue_avx:
  push r14
  push rbx
  sub rsp, 168
  vmovaps ymm0, ymmword ptr [rip + .LCPI6_0]
  vmovups ymmword ptr [rsp + 80], ymm0
  lea r14, [rsp + 80]
  vmovups ymm0, ymmword ptr [rsp + 80]  // <--
  lea rbx, [rsp + 136]
  mov rdi, rbx
  call example::return_as_is
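If that reading of the assembly is right, the effect can be modeled with a toy register file. The sketch below is purely illustrative: `Regs`, `pass_in_ymm0`, and `receive_in_xmm0_xmm1` are made-up names, and the only hardware fact it relies on is that XMMn is the low 128 bits of YMMn.

```rust
/// Toy register file: two 256-bit YMM registers, each as four u64 lanes.
/// XMMn is modeled as lanes 0..2 (the low 128 bits) of YMMn.
struct Regs {
    ymm: [[u64; 4]; 2],
}

/// Caller compiled with AVX: passes the whole value in ymm0.
fn pass_in_ymm0(regs: &mut Regs, value: [u64; 4]) {
    regs.ymm[0] = value;
}

/// Callee compiled without AVX: expects the low half in xmm0
/// and the high half in xmm1.
fn receive_in_xmm0_xmm1(regs: &Regs) -> [u64; 4] {
    let mut out = [0u64; 4];
    out[..2].copy_from_slice(&regs.ymm[0][..2]); // xmm0
    out[2..].copy_from_slice(&regs.ymm[1][..2]); // xmm1
    out
}

fn main() {
    // Arbitrary stale contents standing in for whatever the registers
    // happened to hold before the call.
    const STALE: u64 = 0xDEAD_BEEF;
    let mut regs = Regs { ymm: [[STALE; 4]; 2] };

    pass_in_ymm0(&mut regs, [0, 1, 2, 3]);
    let seen = receive_in_xmm0_xmm1(&regs);

    // The low half arrives intact; the high half is whatever xmm1 held.
    assert_eq!(seen, [0, 1, STALE, STALE]);
    println!("{seen:?}");
}
```

The shape of the result, with the first two `u64` values correct and the last two garbage, matches the output reported above, which is at least consistent with a caller/callee disagreement about which registers carry the value.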

@RalfJung
Member Author

RalfJung commented Nov 20, 2023 via email

@sarah-ek

i don't think it's a closure issue, still happens if i get rid of it https://godbolt.org/z/cW9GdPWdM

@RalfJung
Member Author

Then the only other idea I have is that LLVM tries to optimize passing u64x4 (which is defined as a regular tuple struct here) but applies the optimization in an inconsistent way. That might be worth an LLVM bug report, if someone can turn this into an LLVM IR example.

Interestingly, one can even remove the target-feature from return_as_is_avx, the issue remains.

@sarah-ek

Interestingly, one can even remove the target-feature from return_as_is_avx, the issue remains.

could you post an example? i can't reproduce this

@RalfJung
Member Author

Here you go: https://godbolt.org/z/nqf8Ee9PM

@sarah-ek

thanks! i tried reproducing the issue outside of godbolt/playground and i noticed an interesting pattern.

this is the project structure

// src/lib.rs
use std::arch::x86_64::__m256i;
use std::hint::black_box;
use std::mem::transmute;

#[allow(non_camel_case_types)]
#[derive(Copy, Clone)]
pub struct u64x4(u64, u64, u64, u64);

#[inline(never)]
pub unsafe fn return_as_is_avx(a: &&__m256i) -> u64x4 {
    transmute(**black_box(a))
}

#[inline(never)]
pub unsafe fn return_as_is(a: u64x4) -> u64x4 {
    return_as_is_avx(&&transmute(a))
}

#[inline(always)]
pub unsafe fn buggy_intermediate() {
    let result = return_as_is(black_box(u64x4(13, 14, 15, 16)));
    println!("({}, {}, {}, {})", result.0, result.1, result.2, result.3)
}

#[target_feature(enable = "avx")]
#[inline(never)]
pub unsafe fn buggy_avx() {
    buggy_intermediate();
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    pub fn test_inner() {
        if !is_x86_feature_detected!("avx") {
            return;
        }
        unsafe { buggy_avx() };
    }
}
// tests/bug.rs
use abi_bug::*;

#[test]
pub fn test_outer() {
    if !is_x86_feature_detected!("avx") {
        return;
    }
    unsafe { buggy_avx() };
}

test_inner shows the wrong result, but test_outer shows the correct one.

after disassembling the test binaries, it looks like the one in src/lib.rs uses the fastcc calling convention, since it sees that everything is in the same crate. but when it's exported and used from tests/bug.rs, it uses the usual calling convention.

so this might be a bug with fastcc

@sarah-ek

here's an example of the buggy llvm-ir (i think, im not very familiar with llvm)

https://godbolt.org/z/hTfKxcETY

@sarah-ek

actually, i don't think fastcc is the issue. it looks like u64x4 is being turned into <4 x i64> at some point in the signature of return_as_is, which results in the abi mismatch between the caller and the callee

im not sure how it's being promoted to <4 x i64>, but since (i assume) it's rustc that generates this IR, maybe the bug is fixable on our side?

@RalfJung
Member Author

I'm pretty sure rustc doesn't automatically do such transformations, so it's likely an LLVM optimization.

@sarah-ek

looks like you're right, with RUSTFLAGS="-C llvm-args=-print-after-all", it looks like the transformation is being done by the argument promotion pass

@RalfJung
Member Author

RalfJung commented Aug 15, 2024

#127731 will make the original example not compile any more, and thus fix the easiest way to hit the soundness bug.

The "obvious" way to still reproduce the issue involves swapping the role of which functions have target features and which don't. However, Rust rejects having inline(always) and target_feature attributes on the same function (I think that wasn't even meant as a soundness fix, just pointing out that this can't always inline).

Now I wonder, how does -C target-feature work in multi-crate situations? Does each function remember which target features were enabled when it was declared? Or does this flag affect all functions that just happen to be codegen'd in the rustc invocation that has the flag?

Basically the plan would be to compile one crate with target-feature=+avx that contains

#[inline(never)]
unsafe extern "C" fn with_target_feature(_dummy: f32, x: __m256) {
    let val = unsafe { transmute::<_, [u32; 8]>(x) };
    dbg!(val);
}

#[inline(always)]
unsafe fn with_target_feature_intermediate(dummy: f32, x: __m256) {
    with_target_feature(dummy, x);
}

And then another crate without avx that does

#[inline(never)]
unsafe fn no_target_feature(x: __m256) {
    assert!(is_x86_feature_detected!("avx"));
    with_target_feature_intermediate(0.0, x);
}

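fn main() {
    // `no_target_feature` is an `unsafe fn`, so the call needs an `unsafe` block.
    // SAFETY: `no_target_feature` checks for `avx` before going any further.
    unsafe { no_target_feature(transmute([1; 8])) };
}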

Now if the target feature information is preserved per-function, LLVM should be inlining with_target_feature_intermediate into no_target_feature and that then causes UB.

But is there a way to get rustc + LLVM to actually do that?

@RalfJung
Member Author

Assuming that -C flags apply to everything we codegen in the current crate, and that we don't mix LLVM IR from different crates ourselves (LTO does that, and I wonder how LLVM merges bitcode files with different target features...), it might be the case that #127731 indeed fixes the issue for us. In that case we can only have inline(always) on functions that have the fewest target features, meaning that whatever function calls they make only use types that can be used everywhere in the current file, so no matter where we move those calls they will always maintain the right ABI.

If we had #[target_feature(disable = ...)], we could again reproduce the issue I think, but that's not a thing. There are features like target_feature(enable = "softfloat") that take away things; those could cause the same problem, so we'll have to be careful to never let people use them with per-function target_feature.

@veluca93
Contributor

The original example seems to work correctly with rustc 1.80.1:

veluca@veluca4 ~/rust master $ rustc -O test_inline.rs && ./test_inline
warning: `extern` fn uses type `std::arch::x86_64::__m256`, which is not FFI-safe
 --> test_inline.rs:7:49
  |
7 | extern "C" fn no_target_feature(_dummy: f32, x: __m256) {
  |                                                 ^^^^^^ not FFI-safe
  |
  = help: consider adding a `#[repr(C)]` or `#[repr(transparent)]` attribute to this struct
  = note: this struct has unspecified layout
  = note: `#[warn(improper_ctypes_definitions)]` on by default

warning: 1 warning emitted

[test_inline.rs:9:5] val = [
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
]

@RalfJung
Member Author

@nikic any idea if something changed on the LLVM side that would explain why this does not reproduce any more?

@RalfJung
Member Author

RalfJung commented Aug 16, 2024

Ah, try running the example without optimizations. That still reproduces it on the playground.

[src/main.rs:9:5] val = [
    1,
    1,
    1,
    1,
    0,
    0,
    8192,
    0,
]

@veluca93
Contributor

To be on the safer side, it can be useful to add #[inline(never)] to the extern "C" function:

#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;
use std::mem::transmute;

#[inline(never)]
extern "C" fn no_target_feature(_dummy: f32, x: __m256) {
    let val = unsafe { transmute::<_, [u32; 8]>(x) };
    dbg!(val);
}

#[inline(always)]
fn no_target_feature_intermediate(dummy: f32, x: __m256) {
    no_target_feature(dummy, x);
}

#[target_feature(enable = "avx")]
unsafe fn with_target_feature(x: __m256) {
    // Critical call: caller and callee have different target features.
    // However, we use the Rust ABI, so this is fine.
    no_target_feature_intermediate(0.0, x);
}

fn main() {
    assert!(is_x86_feature_detected!("avx"));
    // SAFETY: we checked that the `avx` feature is present.
    unsafe {
        with_target_feature(transmute([1; 8]));
    }
}

@RalfJung
Member Author

Okay, so the issue still exists. But the question remains whether there's a reproducer that works even with #127731, i.e. a reproducer that only uses vector types on extern "C" functions that have the right feature flags enabled.

The original reproducer was a "no target feature" function taking a vector and having its call site incorrectly inlined into a "with target feature" function. That can't happen any more, so we need a "with target feature" function and coerce LLVM into inlining its call site into a "no target feature" function. It will only do that when the "with target feature" function has inline(always), which means the target feature must be enabled via -Ctarget-feature rather than the attribute. Now is there a way to mix code from crates compiled with and without -Ctarget-feature to cause a problem?

Cc @nikic @comex

@tgross35 tgross35 added the A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. label Aug 27, 2024