_mm512_set4_epi64 reverses order of arguments #1555

Closed
tslnc04 opened this issue Apr 7, 2024 · 0 comments · Fixed by #1557


tslnc04 commented Apr 7, 2024

The implementation in core::arch for _mm512_set4_epi64 is

pub unsafe fn _mm512_set4_epi64(d: i64, c: i64, b: i64, a: i64) -> __m512i {
    let r = i64x8::new(d, c, b, a, d, c, b, a);
    transmute(r)
}

so the first argument provided (d) becomes the first lane, i.e. bits 63:0.
However, the Intel Intrinsics Guide defines it as

__m512i _mm512_set4_epi64 (__int64 d, __int64 c, __int64 b, __int64 a)
dst[63:0] := a
dst[127:64] := b
dst[191:128] := c
dst[255:192] := d
dst[319:256] := a
dst[383:320] := b
dst[447:384] := c
dst[511:448] := d
dst[MAX:512] := 0

which means that the last argument provided (a) becomes the first lane.
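
To make the mismatch concrete, here is a minimal scalar sketch of the two lane layouts. The function names are hypothetical and model the vectors as plain arrays (index 0 is bits 63:0); they are not the real intrinsics and need no AVX-512 hardware.

```rust
/// Lane layout the Intel pseudocode above gives for
/// `_mm512_set4_epi64(d, c, b, a)`: the last argument lands in lane 0.
/// Hypothetical scalar model, not the real intrinsic.
pub fn set4_epi64_expected(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    [a, b, c, d, a, b, c, d]
}

/// Lane layout produced by the implementation quoted in this issue,
/// where `i64x8::new(d, c, b, a, d, c, b, a)` puts its first argument
/// in lane 0. Hypothetical scalar model as well.
pub fn set4_epi64_reported(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    [d, c, b, a, d, c, b, a]
}
```

With the arguments `(3, 2, 1, 0)`, the expected layout is `[0, 1, 2, 3, 0, 1, 2, 3]`, while the reported implementation yields `[3, 2, 1, 0, 3, 2, 1, 0]`.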

The implementation for _mm512_set_epi64 is correct though, which leads to a disparity between _mm512_set4_epi64 and _mm512_set_epi64 that doesn't exist in C. I've created this gist to show this difference between C and Rust.

tslnc04 added a commit to tslnc04/stdarch that referenced this issue Apr 7, 2024
Fixes rust-lang#1555 by changing the implementations of _mm512_set4_epi64 and _mm512_setr4_epi64 to use _mm512_set_epi64. This makes these implementations consistent with the other _mm512_set[r]4_* implementations and brings their behavior in line with what the Intrinsics Guide describes.
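
The approach described in this commit message can be sketched with scalar stand-ins. All names below are hypothetical, and the argument orders passed to the `set_epi64` model are an assumption derived from the Intrinsics Guide semantics, not copied from the actual patch.

```rust
/// Scalar model of `_mm512_set_epi64(e7, ..., e0)`: the last argument
/// (e0) lands in lane 0. Hypothetical helper, not the real intrinsic.
pub fn set_epi64_model(
    e7: i64, e6: i64, e5: i64, e4: i64,
    e3: i64, e2: i64, e1: i64, e0: i64,
) -> [i64; 8] {
    [e0, e1, e2, e3, e4, e5, e6, e7]
}

/// `set4` repeats (d, c, b, a) in argument order, so `a` ends up in lane 0.
pub fn set4_epi64_model(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    set_epi64_model(d, c, b, a, d, c, b, a)
}

/// `setr4` is the reversed variant: `d` ends up in lane 0.
pub fn setr4_epi64_model(d: i64, c: i64, b: i64, a: i64) -> [i64; 8] {
    set_epi64_model(a, b, c, d, a, b, c, d)
}
```

Routing both functions through one helper means the argument-to-lane convention is defined in a single place, which is what keeps `set4` and `set` from disagreeing.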
Amanieu pushed a commit to tslnc04/stdarch that referenced this issue Apr 9, 2024
Amanieu pushed a commit that referenced this issue Apr 10, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 12, 2024
Update stdarch submodule

`asm_experimental_arch` is required in `core` as we're now using unstable inline assembly when building Arm64EC.

Brings in the fix for <rust-lang/stdarch#1555> (cc `@tslnc04`).

r? `@Amanieu`
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Apr 12, 2024
Rollup merge of rust-lang#123833 - dpaoliello:stdarch, r=Amanieu
