Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Portable vector shuffles. #387

Closed
wants to merge 4 commits into from
Closed

Portable vector shuffles. #387

wants to merge 4 commits into from

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Mar 20, 2018

This PR implements an API for portable vector shuffles.

I've opened #388 to discuss this API.

@gnzlbg gnzlbg mentioned this pull request Mar 20, 2018
@alexcrichton
Copy link
Member

Nice!

I'll admit though that I'm pretty wary about landing this, so much so that I think we'll want to keep this out of stdsimd for now if we can. I feel like the API for shuffles here is pretty up in the air (especially wrt language support), and I'm also not certain of the impact of this change once we include it in the standard library itself.

The stability of exported macros in libstd is historically a tricky topic (and even the exported traits here) and since this module will be directly included into libstd I'm hesitant to include this. I feel like in the long run we'll either want this to be a procedural macro in rustc and also have more MIR/typeck support for preventing errors at compile time.

How critical are shuffles though to the first pass of a portable API?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Mar 20, 2018

I feel like in the long run we'll either want this to be a procedural macro in rustc and also have more MIR/typeck support for preventing errors at compile time.

I thought about this. Do you have a pointer to some macro that is implemented like this in rustc? I might give this a shot. About the type checking, the only thing that isn't type checked here is that the indices access the vectors in bounds (it is checked in trans). As mentioned in the comments, this should be possible in MIR/typeck, but not in librust_typeck/check/intrinsics.rs.

Also, I forgot to mention that the intrinsics should probably be annotated with the macro that checks that [T; N] is a compile-time constant in typeck instead of doing that in trans as well.

How critical are shuffles through to the first pass of a portable API?

They aren't in the first pass so they aren't critical at all. I just wanted to open an issue about a possible design, and thought that should better come with an implementation. I could add a #[cfg(feature = "stdbuild")] to the files and tests here so that these are not included in libstd builds but... i am just going to close this for now.

@gnzlbg gnzlbg closed this Mar 20, 2018
@alexcrichton
Copy link
Member

Hm so thinking more implementation wise this would probably actually not be much of a procedural macro but rather almost entirely a typeck thing. In typeck we can do things like const eval and otherwise type checking so I the only reason we'd want to use a procedural macro would be to perhaps use a special AST node that can't be syntactically constructed (like asm!).

In that sense it may make sense to just leave these as intrinsics and just have the intrinsic be super specially typechecked?

Before implementing that though this is probably something we'd want agreement on via an RFC before having the implementation

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Mar 20, 2018

In that sense it may make sense to just leave these as intrinsics and just have the intrinsic be super specially typechecked?

cc @eddyb because were were talking about this a couple of hours ago. Basically if we are going to go through all that trouble, we must type check that the indices in the array of constants are in bounds. In particular, that for the one vector case they are in range [0, T::lanes()) and for the two-vectors case in range [0, 2*T::lanes()).

An alternative would be to have the macro in this PR have zero monomorphization time errors. The naive way to do that would be to not error on an index out-of-bounds in trans, and instead, insert a panic. But honestly I prefer the monomorphization-time error to that solution.

@danielrh
Copy link

Another way of approaching this is to have the function take in a trait with associated consts. I haven't found a less clunky way of doing it yet, but it could be something like this, where indices are checked at compile time:

struct SIMD {
    pub data: [i16;8],
}
macro_rules! check_indices {
    () => {fn check_indices() {
        let _test0: [u8;7 - Self::INDEX[0]] = [0;7 - Self::INDEX[0]];
        let _test1: [u8;7 - Self::INDEX[1]] = [0;7 - Self::INDEX[1]];
        let _test2: [u8;7 - Self::INDEX[2]] = [0;7 - Self::INDEX[2]];
        let _test3: [u8;7 - Self::INDEX[3]] = [0;7 - Self::INDEX[3]];
        let _test4: [u8;7 - Self::INDEX[4]] = [0;7 - Self::INDEX[4]];
        let _test5: [u8;7 - Self::INDEX[5]] = [0;7 - Self::INDEX[5]];
        let _test6: [u8;7 - Self::INDEX[6]] = [0;7 - Self::INDEX[6]];
        let _test7: [u8;7 - Self::INDEX[7]] = [0;7 - Self::INDEX[7]];
    }}
}
trait ConstIndices {
    const INDEX: [usize; 8];
    fn check_indices();
}

struct Backwards {}
impl ConstIndices for Backwards {
    const INDEX: [usize;8] = [7,6,5,4,3,2,1,0];
    check_indices!();
}

fn shuffle<Indices:ConstIndices>(vv: SIMD, _ind:Indices) -> SIMD{
    let v = vv.data;
    SIMD{
        data:[v[Indices::INDEX[0]],
             v[Indices::INDEX[1]],
             v[Indices::INDEX[2]],
             v[Indices::INDEX[3]],
             v[Indices::INDEX[4]],
             v[Indices::INDEX[5]],
             v[Indices::INDEX[6]],
             v[Indices::INDEX[7]]]
    }
}

fn main() {
    let _result = shuffle(SIMD{data:[2;8]}, Backwards{});
}

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Apr 27, 2018

Another way of approaching this is to have the function take in a trait with associated consts.

Note that the number of indices is variable: you can use shuffles to create smaller or larger vectors than the input ones:

// Given:
let a: i32x8;
let b: i32x8;

// All of these work:
let c: i32x2 = shuffle!(a, b, [3, 15]);
let d: i32x4 = shuffle!(a, b, [1, 15, 3, 12]);
let e: i32x16 = shuffle!(a, b, [0, 1, ..., 15]);

IIUC the associated const approach is going to need a little bit more work, since without const generics, the length of the associated const array cannot be generic either.

@danielrh
Copy link

Not sure this is still a good idea, but just to throw this out there with existing mechanisms: what if you had a separate function for narrowing or widening a vector that didn't take arguments (either 0 padding it or repeating it, whatever was easiest/fastest)
eg

// Given:
let a: i32x8;
let b: i32x8;

let c: i32x2 = i32x2::prefix_trunc(shuffle!(a, b, [3, 15, 0, 0]));
let d: i32x4 = i32x4::prefix_trunc(shuffle!(a, b, [1, 15, 3, 12, 0,  0, 0 ,0]));
let e: i32x16 = shuffle!(i32x16::concat(a, b), [0, 1, ..., 15]);

and then teach the optimizer to fuse the two shuffles you do internally
that way the type system stays simple... but it is a little more verbose than it could be.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Apr 28, 2018

Ideally shuffle would be just a method on vectors with the following signature:

fn shuffle<const N: usize, R>(self, other: Self, const indices: [usize; N]) -> R 
    where R: SimdVector<Item=Self::Item, Length=N> { ... }

There are multiple problems that we currently have to face:

    1. lack of const generics
    1. lack of const function arguments

We can workaround lack of const generics by using a trait on arrays, so we can specify:

fn shuffle<I: Indices, R>(self, other: Self, const indices: I) -> R 
    where R: SimdVector<Item=Self::Item, Length=I::Length> { ... }

We can work around lack of const function arguments by making it a shuffle! macro instead, which means that we loose method position, but otherwise that's not too bad.

We could make it a very special free function (instead of a macro), by implementing it in MIR typeck as @alexcrichton suggested. There we can require that indices is an array of const items, inspect the array values to error if the indices are out-of-bounds at compile-time, etc.

So we would get a magic "function" with this signature instead:

fn shuffle<T: SimdVector, R, /*N is magic*/>(a: T, b: T, /*magically const*/ indices: [usize; N]) -> R 
    where R: SimdVector<Item=T::Item, Length=N> { ... }

This PR implements it as a macro in the language, because that's basically the only way we currently have to do this with the available compiler magic, but I agree with @alexcrichton that doing this in MIR typeck is the best path forward. Maybe as the language gets const generics and const function arguments, the shuffle "function" signature can become less and less magical.

FWIW, once you have shuffle, you can implement a.concat(b) on top of it without any magic:

trait Concat: SimdVector {
    type Result: SimdVector<Item=Self::Item>;
    fn concat(self, other: Self) -> Self::Result;
}

impl Concat for u32x4 {
    type Result = u32x8;
    fn concat(self, other: u32x4) -> u32x8 {
        shuffle!(self, other, [0, 1, 2, 3, 4, 5, 6, 7])
    }
}

let a: u32x4;
let b: u32x4;
let c: u32x8 = a.concat(b);

I think that adding concat to std::simd is something worth doing, but I prefer to nail down shuffle first.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jun 4, 2018

@alexcrichton shall I reopen and merge this. In a nutshell, I agree that it would be better to move this macro to rustc, but I don't have the time to do it, and its API is something worth getting experience with in the meantime.

@danielrh
Copy link

danielrh commented Jun 4, 2018 via email

@gnzlbg gnzlbg reopened this Jun 12, 2018
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jun 13, 2018

@alexcrichton maybe we could ask feedback for the lib teams on this?

}
};
($vec:expr, [$($l:expr),*]) => {
shuffle!($vec, $vec, [$($l),*])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will evaluate $vec twice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Amanieu

What's the best way to fix this? Just:

{
    let v = $vec;
    shuffle!(v, v, ...)
}

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

match $vec { v => shuffle!(v, v, ...) } is preferred because of let ... in ... semantics, which regular let doesn't have (less relevant here, but all the temporaries in $vec stay alive for the duration of the match).

@Amanieu
Copy link
Member

Amanieu commented Jun 15, 2018

Are there any plans to support single-element vectors (e.g. u64x1)? NEON has such types and LLVM does not use the same codegen as scalar types for these (for integer types, values are kept in SIMD registers rather then being first moved to a general-purpose register).

@Amanieu
Copy link
Member

Amanieu commented Jun 15, 2018

This is somewhat relevant to this issue since we will need to add a simd_shuffle1 intrinsic to support this, and I was wondering if it was worth extending this to the generic API as well.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jun 15, 2018 via email

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jun 21, 2018

I am holding this PR until we have an idea about how to resolve: rust-lang/rfcs#2366 (comment)

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Jul 16, 2018

Superseeded by https://github.com/gnzlbg/ppv

@gnzlbg gnzlbg closed this Jul 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants