Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add vshl and vshr neon instructions #1111

Merged
merged 6 commits into from
Apr 9, 2021
Merged

Conversation

SparrowLii
Copy link
Member

@SparrowLii SparrowLii commented Apr 8, 2021

This PR adds vshl, vshl_n, vshll, vshll_high, vshr_n, vshrn, vshrn_high neon instructions.
It makes some modifications to the code generator to generate the correct suffixes.

@rust-highfive
Copy link

r? @Amanieu

(rust-highfive has picked a reviewer for you, use r? to override)

@SparrowLii
Copy link
Member Author

SparrowLii commented Apr 8, 2021

I used manual implementation in vshrq_n_s8 and vshrq_n_u8, because on the aarch64 architecture, calling simd_shr on the i8x16 type seems to have some problems.
Actually, when I test it like this:

let a = i8x16::new(4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64);
let b = i8x16::new(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2);
unsafe {
    let c = simd_shr(a, b);
    println!("{:?}", c);
}

On my x86_64, the result was OK:

i8x16(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)

But on my aarch64, it got the wrong result:

i8x16(1, 1, 3, 2, 5, 3, 7, 4, 9, 5, 11, 6, 13, 7, 15, 8)

The strange part is that in this commit, no such error occurred in the CI test.
I am not sure under what circumstances this error will occur. To be safe, I used manual implementation on both vshrq_n_s8 and vshrq_n_u8.

@bors
Copy link
Contributor

bors commented Apr 8, 2021

☔ The latest upstream changes (presumably 5260eb1) made this pull request unmergeable. Please resolve the merge conflicts.

@Amanieu
Copy link
Member

Amanieu commented Apr 8, 2021

Does the incorrect result with simd_shr only happen on your aarch64 machine? Does it happen when running on CI?

@SparrowLii
Copy link
Member Author

SparrowLii commented Apr 9, 2021

It only happens on my aarch64. It is all right when running the CI.

@Amanieu
Copy link
Member

Amanieu commented Apr 9, 2021

If you can give me a small self-contained test program I can look into this issue. I would prefer if we use simd_shr here like we do for the other types.

@SparrowLii
Copy link
Member Author

Here is all my test program.
It's OK if we use simd_shr uniformly, as well as we can see if others have the same problem.

@Amanieu
Copy link
Member

Amanieu commented Apr 9, 2021

This is a bug in LLVM:

# Debug
0000000000401a38 <simd_shr>:
  401a38:       3dc00000        ldr     q0, [x0]
  401a3c:       3dc00021        ldr     q1, [x1]
  401a40:       6e60b821        neg     v1.8h, v1.8h <---- THIS IS WRONG
  401a44:       4e214400        sshl    v0.16b, v0.16b, v1.16b
  401a48:       3d800100        str     q0, [x8]
  401a4c:       d65f03c0        ret

# Release
000000000040181c <simd_shr>:
  40181c:       3dc00020        ldr     q0, [x1]
  401820:       3dc00001        ldr     q1, [x0]
  401824:       6e20b800        neg     v0.16b, v0.16b
  401828:       4e204420        sshl    v0.16b, v1.16b, v0.16b
  40182c:       3d800100        str     q0, [x8]
  401830:       d65f03c0        ret

It doesn't happen in CI because we build with optimizations enabled.

I will open a bug in LLVM for this.

@SparrowLii
Copy link
Member Author

Thanks for helping!

@Amanieu
Copy link
Member

Amanieu commented Apr 9, 2021

Let's just merge it with simd_shr, we can fix the LLVM bug in rustc separately.

@SparrowLii
Copy link
Member Author

That's all right

@Amanieu Amanieu merged commit 92ca389 into rust-lang:master Apr 9, 2021
@Amanieu
Copy link
Member

Amanieu commented Apr 9, 2021

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants