Inlining + target_feature broken in powerpc64 #60637

Open
gnzlbg opened this issue May 8, 2019 · 1 comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-SIMD Area: SIMD (Single Instruction Multiple Data) C-bug Category: This is a bug. O-PowerPC Target: PowerPC processors T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@gnzlbg
Contributor

gnzlbg commented May 8, 2019

godbolt

#![feature(repr_simd, powerpc_target_feature)]
#![allow(non_camel_case_types)]

#[repr(simd)] pub struct u32x4(u32, u32, u32, u32);

impl u32x4 {
    #[inline]
    // #[inline(always)]
    fn splat(x: u32) -> Self {
        u32x4(x, x, x, x)
    }
}

#[target_feature(enable = "altivec")]
pub unsafe fn splat_u32x4(x: u32) -> u32x4 {
    u32x4::splat(x)
}

With #[inline], this code emits a call from splat_u32x4 to u32x4::splat (b example::u32x4::splat) that is never eliminated, even though the method is private to the module. With #[inline(always)], u32x4::splat is inlined into splat_u32x4, and no code is generated for u32x4::splat.

#[inline] should not be needed here, much less #[inline(always)], yet without #[inline(always)] this produces bad codegen.

Removing the #[target_feature] attribute from splat_u32x4 fixes the issue, with no #[inline] necessary: godbolt. So there must be some interaction between inlining and target features going on here.
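For reference, the #[inline(always)] workaround described above can be sketched in a form that builds on stable Rust. This is only an illustration under stated assumptions: U32x4 is a hypothetical plain-array stand-in for the #[repr(simd)] type, and sse2 on x86_64 stands in for altivec, which needs the nightly powerpc_target_feature gate. It shows the shape of the workaround, not a reproduction of the powerpc64 codegen; whether the call is actually eliminated still has to be checked in the generated assembly.

```rust
// Hypothetical stand-in for the issue's #[repr(simd)] u32x4, using a plain
// array so that no nightly features are required.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct U32x4(pub [u32; 4]);

impl U32x4 {
    // Workaround from the issue: request forced inlining so the helper body
    // is compiled inside the caller, with the caller's target features.
    #[inline(always)]
    fn splat(x: u32) -> Self {
        U32x4([x; 4])
    }
}

// Assumption: sse2 on x86_64 replaces "altivec" here. sse2 is part of the
// x86_64 baseline, so calling this function is sound on every x86_64 CPU.
// On other architectures the attribute is simply omitted.
#[cfg_attr(target_arch = "x86_64", target_feature(enable = "sse2"))]
pub unsafe fn splat_u32x4(x: u32) -> U32x4 {
    U32x4::splat(x)
}

fn main() {
    // Safe to call: see the note on sse2 above.
    let v = unsafe { splat_u32x4(7) };
    println!("{:?}", v);
}
```

On powerpc64 the same structure applies with the original nightly attributes from the snippet at the top of this issue.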

@jonas-schievink jonas-schievink added C-bug Category: This is a bug. O-PowerPC Target: PowerPC processors T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 8, 2019
@Enselic Enselic added the A-SIMD Area: SIMD (Single Instruction Multiple Data) label Nov 20, 2023
@sayantn
Contributor

sayantn commented Jun 29, 2024

This is an issue on the LLVM side. See this comment. I think this happens on all architectures except ARM and x86.

@workingjubilee workingjubilee added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Jun 29, 2024