add rotate_left/right #496
It is not as easy as for the intrinsics but it is possible. You can create a test in the For examples, all intrinsics in the Let me know if you run into issues while giving it a try.
Any concerns about organization etc?
26bf899 to e14710b
Huh, some of these are being compiled from Also not sure what's happening with the other failures
coresimd/ppsv/api/rotates.rs
#![allow(unused)]

/// Trait used for overloading rotates
pub trait Rotate<T> {
~~Given that you provide a specific implementation for each vector type below, this doesn't really need to be a trait.
In general we try to keep the traits we use to a minimum, and most of them are sealed (not exposed in the public API of the library).~~ EDIT: it's ok, see below.
coresimd/ppsv/api/rotates.rs
    }
}

impl super::api::Rotate<$elem_ty> for $id {
Aha, I see why you need the trait now.

#[inline]
#[target_feature(enable = "avx512f")]
#[cfg_attr(test, assert_instr(vprorvq))]
The assert_instr and target_feature need to be guarded on x86 and x86_64. Same for all other functions below:
#[cfg_attr(any(target_arch = "x86", target_arch = "x86_64"), target_feature(enable = ...))]
#[cfg_attr(any(target_arch = "x86", target_arch = "x86_64"), assert_instr(....))]
should be enough.
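As a minimal sketch of the guard described above, the following applies target_feature only when compiling for x86 or x86_64, so the same source still builds on other architectures. The function name rotate_left_lanes and the use of a plain array are assumptions for illustration, and sse2 stands in for the AVX-512 features only because it is part of the x86_64 baseline. assert_instr is left out because it comes from the crate's own test tooling.

```rust
// Hypothetical helper, not the PR's code: rotates every lane of a plain array.
// The attribute is applied only on x86/x86_64; elsewhere it is dropped entirely.
#[cfg_attr(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature(enable = "sse2")
)]
pub unsafe fn rotate_left_lanes(x: [u32; 4], n: u32) -> [u32; 4] {
    // Each lane is rotated independently by the same amount.
    x.map(|lane| lane.rotate_left(n))
}
```

Callers must wrap the call in an unsafe block and are responsible for making sure the enabled feature is actually present on the running CPU.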
So this looks good.
The only thing that worries me is adding a trait called Rotate to core, which is a very general name and also raises the question of why the integer types don't implement it.
I think it would be less controversial to just implement this for vectors as methods, without the trait, and require people to use splat. You can then post in the RFC rust-lang/rfcs#2366 about adding a trait for this, and see what the consensus is.
So... I am not fundamentally opposed to adding the trait. In fact I think using the trait makes rotate nicer to use, but there are organizational concerns about exposing traits in the std library, so they really must be worth the effort. For things like shuffles and select, there really aren't any good alternatives to using traits, but for rotates, this is a bit less the case. Also, for select and shuffle those traits are sealed and not part of the library API, but this one would be in the library API, which kind of puts it at the same level as FromBits/IntoBits, which are not proposed in the RFC. Does that make sense?
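To make the trait-free alternative concrete, here is a small sketch of rotates as inherent methods, with callers using splat when they want a single rotate amount for all lanes. U32x4 is a hypothetical stand-in built on an array, not the crate's vector type, and the per-lane loop is only illustrative.

```rust
/// Hypothetical stand-in for a packed vector type (not the crate's `u32x4`).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct U32x4(pub [u32; 4]);

impl U32x4 {
    /// Broadcast one value to every lane.
    pub fn splat(v: u32) -> Self {
        U32x4([v; 4])
    }

    /// Rotate each lane left by the amount in the matching lane of `n`.
    pub fn rotate_left(self, n: Self) -> Self {
        let mut out = self.0;
        for (lane, amount) in out.iter_mut().zip(n.0) {
            *lane = lane.rotate_left(amount);
        }
        U32x4(out)
    }
}
```

With this shape a scalar rotate amount is written as `v.rotate_left(U32x4::splat(1))`, which is slightly more verbose than the trait-based call but adds nothing to the public trait surface.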
Could there be some way to seal the Otherwise I agree. Rotates probably aren't used that much anyway.
Making progress. The only failures now are a couple shift implementations and left/right swapping.
I am not sure. IIUC sealed traits that have to be imported by users don't work because they can't be imported. That is, if you have an implementation detail that's not exposed to users, you can make it a sealed trait. If you want to express a One could try to make the
impl f32x4 {
    fn rotate_right<T: RotateArg<Self>>(self, x: T) -> Self {
        let x: f32x4 = x.to_vec();
        // ... x and self are f32x4, so implement for vectors as usual ...
    }
}
trait RotateArg<T> {
    fn to_vec(self) -> T;
}
impl RotateArg<f32x4> for f32x4 {
    fn to_vec(self) -> f32x4 { self }
}
impl RotateArg<f32x4> for f32 {
    fn to_vec(self) -> f32x4 { f32x4::splat(self) }
}
But note that we could extend a non-generic
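The argument-conversion idea in the sketch above can be tried out on stable Rust with a stand-in type. The version below is an assumption-laden illustration: U64x2 is a hypothetical array-backed type, an integer element type is used so that rotating is meaningful, and only the RotateArg and to_vec names come from the comment.

```rust
/// Hypothetical stand-in for a packed vector type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct U64x2(pub [u64; 2]);

/// Converts a rotate argument (scalar or vector) into a full vector.
pub trait RotateArg<V> {
    fn to_vec(self) -> V;
}

impl RotateArg<U64x2> for U64x2 {
    fn to_vec(self) -> U64x2 {
        self
    }
}

impl RotateArg<U64x2> for u64 {
    // A scalar amount is broadcast to every lane.
    fn to_vec(self) -> U64x2 {
        U64x2([self; 2])
    }
}

impl U64x2 {
    /// Accepts either a vector of per-lane amounts or a single scalar amount.
    pub fn rotate_right<T: RotateArg<Self>>(self, n: T) -> Self {
        let n: U64x2 = n.to_vec();
        // The cast keeps the amount modulo 64, which is all `rotate_right` uses.
        U64x2([
            self.0[0].rotate_right(n.0[0] as u32),
            self.0[1].rotate_right(n.0[1] as u32),
        ])
    }
}
```

The method itself stays inherent, so users never need to import the helper trait to call `v.rotate_right(1u64)` or `v.rotate_right(other)`.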
Is there some way to do
Easily only using For what exactly do you need it? (IIRC the
Some compilations turn a left rotate into a right. Not sure why.
That's weird. As long as the tests pass that might just be LLVM optimizations at work. If you tell me which one exactly I can take a look.
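One possible explanation for the left/right swap, not confirmed in this thread, is that the two rotates are interchangeable: rotating left by n gives the same bits as rotating right by the lane width minus n, so a compiler is free to pick either instruction. The helper name below is hypothetical and the example uses scalar u32 values rather than vectors.

```rust
/// Computes a left rotate using only a right rotate.
/// For 32-bit lanes, left by `n` equals right by `(32 - n) mod 32`.
pub fn rotate_left_via_right(x: u32, n: u32) -> u32 {
    let right_amount = (32 - (n % 32)) % 32;
    x.rotate_right(right_amount)
}
```

If this is what is happening, an instruction check that expects only the left-rotate mnemonic would fail even though the results are correct.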
This reverts commit be0b786.
Nearly there. Just a few still doing
Nvm I was misusing godbolt
I think I've reached the end of my understanding of why these are still producing bad code
I see you are only enabling the avx512f target feature, could you try enabling avx512vl as well? https://www.felixcloutier.com/x86/VPROLD:VPROLVD:VPROLQ:VPROLVQ.html mentions both feature flags for the instructions that do not take immediate mode arguments, and the run-time ones seem to be the only ones failing. The Intel Intrinsics Guide agrees that both features are required: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=vprolvd&expand=4395,4405
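When experimenting with this locally, it can help to confirm that the machine running the tests reports both features. The sketch below uses the standard is_x86_feature_detected! macro; the function names and the returned labels are hypothetical, and the check says nothing about which instruction the compiler actually emits.

```rust
/// True when the running CPU reports both features mentioned above.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn has_avx512_variable_rotates() -> bool {
    std::is_x86_feature_detected!("avx512f") && std::is_x86_feature_detected!("avx512vl")
}

/// On other architectures the features can never be present.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn has_avx512_variable_rotates() -> bool {
    false
}

/// Names the code path a caller could dispatch to (labels are illustrative).
pub fn rotate_path() -> &'static str {
    if has_avx512_variable_rotates() {
        "avx512f+avx512vl"
    } else {
        "portable"
    }
}
```

A test that asserts on the generated instruction could be skipped when rotate_path() returns "portable".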
That didn't seem to change anything (I may have implemented it incorrectly though)
#[cfg_attr(any(target_arch = "x86", target_arch = "x86_64"), target_feature(enable = "avx512f"))]
#[cfg_attr(any(target_arch = "x86", target_arch = "x86_64"), target_feature(enable = "avx512vl"))]
#[cfg_attr(any(target_arch = "x86", target_arch = "x86_64"), assert_instr(vpro))]
unsafe fn rotate_right_variable(x: u64x8) -> u64x8 {
Could you add tests here for 128-bit wide and 256-bit wide vectors as well?
Looking at the implementation of The LLVM-IR intrinsic that these should generate is So what you can do is emit the LLVM-IR that is being generated for the variable length rotates (e.g. using One weird thing is that in clang these require Note also that if instead of While those examples use So it seems that somewhere down the chain we are making a mistake that prevents LLVM from using rotates here, we just need to find out exactly where :D
I think the Intel docs say
@TheIronBorn sorry about the organisational issues, but would you mind sending this PR to https://github.com/gnzlbg/ppv ? The ppsv module will be removed from I'll leave this open here until we remove the ppsv module, and before removal will try to port this PR myself if you don't beat me to it.
The ppsv module has been removed and this has already been merged upstream. @TheIronBorn I'll try to propagate the
(also removed some duplicate tests)
It would be good to test that on supported hardware this compiles to the correct instruction, but I don't know how to do that.
Also some scalar convenience impls would be good.
(rust-random/rand#377)