Bad codegen for bitwise OR/AND masks #303

Open
Nugine opened this issue Sep 22, 2022 · 4 comments
Labels
I-scalarize Impact: code that should be vectorized, isn't I-slow Impact: Slowww

Nugine commented Sep 22, 2022

https://rust.godbolt.org/z/5obPq9W3G

#![feature(portable_simd)]

use std::simd::{u8x16, LaneCount, Simd, SimdElement, SimdInt, SimdPartialOrd, SupportedLaneCount};

fn splat<T, const N: usize>(x: T) -> Simd<T, N>
where
    T: SimdElement,
    LaneCount<N>: SupportedLaneCount,
{
    Simd::splat(x)
}

pub fn is_hex(chunk: &[u8; 16]) -> bool {
    let x = u8x16::from_array(*chunk).cast();
    let m1 = x.simd_gt(splat(b'0' - 1));
    let m2 = x.simd_lt(splat(b'9' + 1));
    let m3 = x.simd_gt(splat(b'a' - 1));
    let m4 = x.simd_lt(splat(b'f' + 1));
    let m = (m1 & m2) | (m3 & m4);
    m.all()
}

pub fn is_ascii(chunk: &[u8; 16]) -> bool {
    let x = u8x16::from_array(*chunk);
    let m = x.cast::<i8>().simd_lt(splat(0));
    m.all()
}

The m.all() expression in is_hex should compile to a single AArch64 horizontal-reduction instruction (umaxv, or one of uminv/smaxv/sminv), as it does in is_ascii. Instead, it generates many scalar instructions, which leads to poor performance.
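For readers comparing the two outputs, here is a minimal scalar sketch of the lane-wise logic that is_hex expresses, written for stable Rust. The name is_hex_scalar is a hypothetical helper and does not appear in the issue. It only restates the mask combination (m1 & m2) | (m3 & m4) followed by all(), and is not meant as a workaround for the codegen problem.

```rust
/// Scalar reference for the check performed by `is_hex` in the issue.
/// `is_hex_scalar` is a hypothetical name used only for illustration.
pub fn is_hex_scalar(chunk: &[u8; 16]) -> bool {
    chunk.iter().all(|&x| {
        // Same four comparisons as the SIMD version, one byte at a time.
        let m1 = x > b'0' - 1;
        let m2 = x < b'9' + 1;
        let m3 = x > b'a' - 1;
        let m4 = x < b'f' + 1;
        // A byte passes if it is in '0'..='9' or in 'a'..='f'.
        (m1 & m2) | (m3 & m4)
    })
}
```

Like the original, this accepts only digits and lowercase a to f, so uppercase hex digits are rejected. A helper like this could serve as an oracle when checking that a vectorized variant still returns the same results.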

@Nugine Nugine changed the title Bad codgen for bitwise OR/AND masks Bad codegen for bitwise OR/AND masks Sep 22, 2022
@calebzulawski
Member

I think this is a duplicate of #146. I haven't seen any updates to llvm/llvm-project#50466.

@workingjubilee
Member

Hmm. This may or may not be the same bug, given that it doesn't involve the same instructions. It depends on whether the same machine optimization passes are failing to do their work.

@calebzulawski
Member

I'm pretty sure it's the same class of bug. The previous one just referred to "or" and this one "and". I reference both in my LLVM issue, at least.

@jhorstmann

This might have been fixed by the LLVM 18 update on nightly: https://rust.godbolt.org/z/cojTW48PM
