Add mask method to extract the mask as an integer #166
@GabrielMajeri I think that to implement this portably we might want to generate LLVM IR that looks like this (cc @rkruppe - https://gcc.godbolt.org/z/VNbIFQ):

```llvm
define i8 @m32x4_to_i8(<4 x i32>) {
  %a = trunc <4 x i32> %0 to <4 x i1>
  %b = bitcast <4 x i1> %a to i4
  %c = zext i4 %b to i8
  ret i8 %c
}
```

Since we can't directly use the …
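For intuition, here is a scalar Rust sketch of what that IR computes (a hypothetical helper, not part of the packed_simd API): the `trunc` to `i1` keeps only each lane's low bit, and each lane then contributes one bit to the packed result.

```rust
// Scalar sketch of the mask-to-bitmask conversion the IR above performs:
// each lane of a 4 x i32 mask contributes one bit to the result. On a
// well-formed mask every lane is all-ones (-1) or all-zeros (0).
fn m32x4_to_i8(lanes: [i32; 4]) -> u8 {
    let mut bits = 0u8;
    for (i, &lane) in lanes.iter().enumerate() {
        // `trunc <4 x i32> ... to <4 x i1>` keeps only the lowest bit
        // of each lane; the bitcast packs lane i into bit i.
        bits |= ((lane & 1) as u8) << i;
    }
    bits
}

fn main() {
    assert_eq!(m32x4_to_i8([-1, 0, -1, 0]), 0b0101);
    assert_eq!(m32x4_to_i8([0, 0, 0, -1]), 0b1000);
}
```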
After optimizations, that IR becomes:

```llvm
define i8 @m32x4_to_i8_opt(<4 x i32>) {
  %2 = and <4 x i32> %0, <i32 1, i32 1, i32 1, i32 1>
  %a = icmp ne <4 x i32> %2, zeroinitializer
  %b = bitcast <4 x i1> %a to i4
  %c = zext i4 %b to i8
  ret i8 %c
}
```

We could also generate IR like this:

```llvm
define i8 @m32x4_to_i8_2(<4 x i32>) {
  %a = icmp ne <4 x i32> %0, zeroinitializer
  %b = bitcast <4 x i1> %a to i4
  %c = zext i4 %b to i8
  ret i8 %c
}
```

but for some reason the quality of the generated assembly differs significantly, and the unoptimized IR seems to produce the best code (https://gcc.godbolt.org/z/I-Sco7):

```asm
m32x4_to_i8: # unoptimized
    vpslld    $0x1f,%xmm0,%xmm0
    vpsrad    $0x1f,%xmm0,%xmm0
    vmovmskps %xmm0,%eax
    retq
    nop

m32x4_to_i8_opt:
    vpbroadcastd 0x0(%rip),%xmm1 # 19 <m32x4_to_i8_opt+0x9>
    vpand     %xmm1,%xmm0,%xmm0
    vpcmpeqd  %xmm1,%xmm0,%xmm0
    vmovmskps %xmm0,%eax
    retq
    nopw      %cs:0x0(%rax,%rax,1)

m32x4_to_i8_2:
    vpxor     %xmm1,%xmm1,%xmm1
    vpcmpeqd  %xmm1,%xmm0,%xmm0
    vpcmpeqd  %xmm1,%xmm1,%xmm1
    vpxor     %xmm1,%xmm0,%xmm0
    vmovmskps %xmm0,%eax
    retq
```
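One plausible reason the variants lower so differently: `trunc` and `icmp ne` disagree on lanes that are neither all-zeros nor all-ones, so the optimizer is not allowed to rewrite one form into the other. A scalar Rust sketch of the divergence (hypothetical helper names):

```rust
// `trunc` keeps the low bit of each lane, while `icmp ne` tests the whole
// lane against zero. They agree only on well-formed mask lanes (0 or -1).
fn bit_via_trunc(lane: i32) -> u8 {
    (lane & 1) as u8 // like `trunc i32 ... to i1`
}

fn bit_via_icmp_ne(lane: i32) -> u8 {
    (lane != 0) as u8 // like `icmp ne i32 %lane, 0`
}

fn main() {
    // On well-formed mask lanes the two agree:
    assert_eq!(bit_via_trunc(-1), bit_via_icmp_ne(-1));
    assert_eq!(bit_via_trunc(0), bit_via_icmp_ne(0));
    // On an arbitrary value such as 2 they diverge, which is why the
    // two IR forms are not interchangeable for the optimizer:
    assert_eq!(bit_via_trunc(2), 0);
    assert_eq!(bit_via_icmp_ne(2), 1);
}
```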
Add intrinsic to create an integer bitmask from a vector mask

This PR adds a new simd intrinsic: `simd_bitmask(vector) -> unsigned integer`, which creates an integer bitmask from a vector mask by extracting one bit from each vector lane. This is required to implement rust-lang/packed_simd#166.

EDIT: the reason we need an intrinsic for this is that we have to truncate the vector lanes to an `<i1 x N>` vector, and then bitcast that to an `iN` integer (while making sure that we only materialize `i8`, ..., `i64` types - that is, no `i1`, `i2`, or `i4`), and we can't do any of that in a Rust library.

r? @rkruppe
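The `vmovmskps` instruction that all three assembly listings above end with is also exposed directly through `std::arch`; the following sketch (the `mask_bits` helper name is made up, and the example assumes an x86_64 target) shows the same collection of per-lane sign bits into an integer:

```rust
// Sketch: collect the sign bit of each 32-bit lane into the low 4 bits of
// an integer, exactly what `vmovmskps` does in the listings above.
#[cfg(target_arch = "x86_64")]
fn mask_bits() -> i32 {
    use std::arch::x86_64::{_mm_castsi128_ps, _mm_movemask_ps, _mm_set_epi32};
    unsafe {
        // Build a 4 x i32 mask with lanes [set, clear, set, clear];
        // note _mm_set_epi32 takes lanes in reverse order (e3, e2, e1, e0).
        let mask = _mm_set_epi32(0, -1, 0, -1);
        // movmskps: bit i of the result is the sign bit of lane i.
        _mm_movemask_ps(_mm_castsi128_ps(mask))
    }
}

fn main() {
    #[cfg(target_arch = "x86_64")]
    assert_eq!(mask_bits(), 0b0101);
}
```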
A common operation on masks is converting the vector into an integer where each bit denotes a lane.
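That operation can be specified by a scalar model (a sketch; `bitmask_scalar` is a made-up name, not the proposed API): truncate each lane to one bit, pack lane `i` into bit `i`, and zero-extend to the result width, so for fewer than eight lanes the high bits of the `u8` are zero.

```rust
// Scalar model of the mask -> integer conversion for up to 8 lanes:
// lane i contributes its low bit as bit i of the result; the remaining
// high bits of the u8 stay zero (the `zext` step).
fn bitmask_scalar(lanes: &[i64]) -> u8 {
    assert!(lanes.len() <= 8);
    lanes
        .iter()
        .enumerate()
        .fold(0u8, |acc, (i, &lane)| acc | (((lane & 1) as u8) << i))
}

fn main() {
    // A 2-lane mask fills only the low 2 bits.
    assert_eq!(bitmask_scalar(&[-1, -1]), 0b11);
    assert_eq!(bitmask_scalar(&[0, -1, -1, 0]), 0b0110);
}
```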