Change aarch64 vld1* instructions to not cause individual loads #1207
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Amanieu (or someone else) soon. Please see the contribution instructions for more information.
I think at least the vst1* instruction should also use
@hkratz Thanks! Could you update the vst1 intrinsics as well, like @SparrowLii suggested?
Force-pushed: f6c48fb to a6d925e
LGTM. Are you planning on implementing something else? (The PR is still a draft.) I don't think a test is needed for this.
I was thinking of a regression test for #1148, but it is not strictly necessary and afaics there is no easy way to do it.
Update stdarch submodule

This is mainly to fix the critical issue of aarch64 store intrinsics overwriting additional memory, see rust-lang/stdarch#1220

Changes:
* aarch64/armv7: additional vld1/vst1 intrinsics + perf fixes for existing ones
  * rust-lang/stdarch#1205
  * rust-lang/stdarch#1207
  * rust-lang/stdarch#1216
* armv7: Make FMA work with vfpv4 and optimize
  * rust-lang/stdarch#1219
* Non-visible changes to the testing framework
  * rust-lang/stdarch#1208
  * rust-lang/stdarch#1211
  * rust-lang/stdarch#1213
  * rust-lang/stdarch#1215
  * rust-lang/stdarch#1218
The vld1* intrinsics are required to always compile to a single load instruction (see the ARM developer documentation). The current implementation emits individual loads in LLVM IR, which LLVM's optimization passes do not always combine into a single load instruction. This change emits a single load in all cases.
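To illustrate the difference described above, here is a minimal, portable sketch rather than the actual stdarch code. `U8x16` is a hypothetical stand-in for a 128-bit NEON vector type, and both function names are invented for this example. It assumes the "single load" is expressed as one whole-vector unaligned read, which may differ from what the patch actually does.

```rust
use std::ptr;

/// Hypothetical stand-in for a 128-bit NEON vector such as `uint8x16_t`.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
struct U8x16([u8; 16]);

/// Element-wise form: one scalar load per lane. The optimizer may or may
/// not merge these sixteen loads into a single vector load.
///
/// Safety: `p` must be valid for reading 16 bytes.
unsafe fn load_elementwise(p: *const u8) -> U8x16 {
    let mut lanes = [0u8; 16];
    for (i, lane) in lanes.iter_mut().enumerate() {
        *lane = unsafe { *p.add(i) };
    }
    U8x16(lanes)
}

/// Single-load form (assumed approach): read the whole vector in one
/// operation, so only one load of the vector type appears in the IR.
///
/// Safety: `p` must be valid for reading 16 bytes.
unsafe fn load_single(p: *const u8) -> U8x16 {
    unsafe { ptr::read_unaligned(p as *const U8x16) }
}
```

Both functions return the same value for any valid pointer; the difference is only in how many load operations the compiler sees before optimization.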
cc @SparrowLii
TODO:
Fixes #1148.