Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complete vld1 instructions with some corrections #1216

Merged
merged 3 commits into from
Sep 18, 2021

Conversation

SparrowLii
Copy link
Member

@SparrowLii SparrowLii commented Sep 10, 2021

This PR does the following:
(1) completes vld1 neon instructions, mainly for *_p64s
(2) replaces crypto feature with aes for arm
(3) adds "core_arch/src/arm_shared/neon" dir in stdarch-verify. Based on this, the function names of vmull_n_s16, vmull_n_s32, vmull_n_u16, vmull_n_u32, vqdmulhq_n_s16, vqdmulhq_n_s32 have been corrected
(4) optimizes stdarch-gen so that vld2* (and other) instructions can be generated more clearly(For the convenience of reviewing, they are not generated in this PR)

@rust-highfive
Copy link

r? @Amanieu

(rust-highfive has picked a reviewer for you, use r? to override)

name = vmull
n-suffix
name = vmull_n
no-q
Copy link
Contributor

@hkratz hkratz Sep 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this change should be documented in the PR or in an extra commit/extra PR as it changes the names of public intrinsics (to their correct names).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing it out! I added relevant documents to the description of the PR.

@@ -3568,7 +3733,7 @@ generate int16x4_t:i16:int16x4_t, int32x2_t:i32:int32x2_t

/// Vector saturating doubling multiply high with scalar
name = vqdmulhq_n
out-suffix
no-q
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

@SparrowLii SparrowLii changed the title Complete vld1 instructions with some optimization Complete vld1 instructions with some corrections Sep 10, 2021
#[target_feature(enable = "neon,aes")]
#[cfg_attr(test, assert_instr(ldr))]
pub unsafe fn vld1q_p64(ptr: *const p64) -> poly64x2_t {
transmute(u64x2::new(*ptr, *ptr.offset(1)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a concern raised in #1148 that this doesn't get optimized down to a single instruction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, all others have been already changed to use read_unaligned() instead which cannot be seen in the diff as it this PR is based on an older commit.

Copy link
Member Author

@SparrowLii SparrowLii Sep 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I will rebase the PR and make changes accordingly.

@hkratz
Copy link
Contributor

hkratz commented Sep 14, 2021

@Amanieu Can this be merged? Fixing #1212 and #1217 requires changes to stdarch-gen and this PR contains a big refactoring of it. I would rather like to build on this instead of running into rebase conflicts.

@Amanieu Amanieu merged commit 30b3eb3 into rust-lang:master Sep 18, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 27, 2021
Update stdarch submodule

This is mainly to fix the critical issue of aarch64 store intrinsics overwriting additional memory, see rust-lang/stdarch#1220

Changes:
* aarch64/armv7: additional vld1/vst1 intrinsics + perf fixes for existing ones
  * rust-lang/stdarch#1205
  * rust-lang/stdarch#1207
  * rust-lang/stdarch#1216
* armv7: Make FMA work with vfpv4 and optimize
  * rust-lang/stdarch#1219
* Non-visible changes to the testing framework
  * rust-lang/stdarch#1208
  * rust-lang/stdarch#1211
  * rust-lang/stdarch#1213
  * rust-lang/stdarch#1215
  * rust-lang/stdarch#1218
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants