[X86] X86FixupVectorConstantsPass - add support for VPMOVSX/VPMOVZX constant extension #71078

RKSimon · 2023-11-02T16:49:34Z

X86FixupVectorConstantsPass currently just handles folding full vector loads to broadcasts, but we're missing the opportunity to use sign/zero extension loads for non-uniform cases.

void foo(const int *src, float *dst) {
    for (int i = 0; i != 16; ++i) {
        *dst++ = (float)(*src++ + ((i % 8) + 1));
    }
}

llc -mcpu=x86-64-v3

foo(int const*, float*): # @foo(int const*, float*)
  vmovdqa .LCPI0_0(%rip), %ymm0 # ymm0 = [1,2,3,4,5,6,7,8]
  vpaddd (%rdi), %ymm0, %ymm1
  vcvtdq2ps %ymm1, %ymm1
  vmovups %ymm1, (%rsi)
  vpaddd 32(%rdi), %ymm0, %ymm0
  vcvtdq2ps %ymm0, %ymm0
  vmovups %ymm0, 32(%rsi)
  vzeroupper
  retq

We can reduce the size of the constant pool entry from 32 bytes to 8 bytes by replacing the vmovdqa load with vpmovzxbd, which loads eight bytes and zero-extends each one to a 32-bit element.
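To make the size argument concrete, here is a minimal standalone sketch of the kind of check involved: given the 32-bit elements of a vector constant, find the narrowest element width (8 or 16 bits) from which every element can be recovered by zero or sign extension. This is not the code in X86FixupVectorConstantsPass; `ExtKind`, `Narrowing`, `findNarrowing`, and `poolBytes` are hypothetical names, and trying zero extension before sign extension is an arbitrary choice made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: which extension recovers the original elements,
// and how wide each stored element needs to be.
enum class ExtKind { None, Zero, Sign };

struct Narrowing {
  ExtKind kind;
  unsigned srcBits; // stored element width in bits (32 means "not narrowed")
};

// True if zero-extending the low `bits` bits of v reproduces v.
static bool fitsZext(std::int32_t v, unsigned bits) {
  assert(bits > 0 && bits < 32);
  return (static_cast<std::uint32_t>(v) >> bits) == 0;
}

// True if sign-extending the low `bits` bits of v reproduces v.
static bool fitsSext(std::int32_t v, unsigned bits) {
  assert(bits > 0 && bits < 32);
  const std::int64_t lo = -(std::int64_t{1} << (bits - 1));
  const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
  return v >= lo && v <= hi;
}

// Try the narrowest width first. Assumption for this sketch: zero extension
// is preferred when both kinds of extension would work.
Narrowing findNarrowing(const std::vector<std::int32_t> &elts) {
  const unsigned widths[] = {8, 16};
  for (unsigned bits : widths) {
    bool allZext = true, allSext = true;
    for (std::int32_t v : elts) {
      allZext = allZext && fitsZext(v, bits);
      allSext = allSext && fitsSext(v, bits);
    }
    if (allZext)
      return {ExtKind::Zero, bits};
    if (allSext)
      return {ExtKind::Sign, bits};
  }
  return {ExtKind::None, 32};
}

// Size in bytes of a constant pool entry holding numElts elements.
std::size_t poolBytes(std::size_t numElts, unsigned bits) {
  return numElts * bits / 8;
}
```

For the constant in the issue, `findNarrowing({1, 2, 3, 4, 5, 6, 7, 8})` selects zero extension from 8-bit elements, so `poolBytes(8, 8)` is 8 bytes instead of the 32 bytes needed for the full-width entry. A constant containing negative values such as `{-1, 2, -3, 4}` would need the sign-extending (VPMOVSX) form instead.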

llvmbot · 2023-11-02T16:49:52Z

@llvm/issue-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

…maller vector constant data If we already have a YMM/ZMM constant whose lower bits match a smaller XMM/YMM constant, then ensure we reuse the same constant pool entry. Extends the similar combines we already have to reuse VBROADCAST_LOAD/SUBV_BROADCAST_LOAD constant loads. This is mainly a canonicalization, but should make it easier for us to merge constant loads in a future commit (related to both #70947 and better X86FixupVectorConstantsPass usage for #71078).

…maller vector constant data (REAPPLIED) If we already have a YMM/ZMM constant whose lower bits match a smaller XMM/YMM constant, then ensure we reuse the same constant pool entry. Extends the similar combines we already have to reuse VBROADCAST_LOAD/SUBV_BROADCAST_LOAD constant loads. This is mainly a canonicalization, but should make it easier for us to merge constant loads in a future commit (related to both #70947 and better X86FixupVectorConstantsPass usage for #71078). Reapplied with a fix to ensure we don't 'flip-flop' between multiple matching constants: only perform the fold if the new constant pool entry is larger than the current entry.
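The "only fold if the new entry is larger" rule from the reapplied commit can be modelled outside the compiler. The sketch below is an illustration under simplifying assumptions, not the DAG combine itself: constant pool entries are treated as raw little-endian byte arrays, so "matching lower bits" becomes a prefix comparison, and `ConstantBytes`, `lowBitsMatch`, and `pickEntry` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Raw bytes of one constant pool entry (little-endian, lowest element first).
using ConstantBytes = std::vector<std::uint8_t>;

// True if `big` is strictly larger than `small` and its low bytes equal
// `small`. Requiring a strictly larger size is what stops two equal-sized
// matching constants from repeatedly replacing each other.
bool lowBitsMatch(const ConstantBytes &small, const ConstantBytes &big) {
  return big.size() > small.size() &&
         std::equal(small.begin(), small.end(), big.begin());
}

// Returns the index of the pool entry that a load of pool[current] should
// use: the largest entry whose low bytes match, or `current` itself if no
// larger match exists.
std::size_t pickEntry(const std::vector<ConstantBytes> &pool,
                      std::size_t current) {
  assert(current < pool.size());
  std::size_t best = current;
  for (std::size_t i = 0; i < pool.size(); ++i) {
    if (lowBitsMatch(pool[current], pool[i]) &&
        pool[i].size() > pool[best].size())
      best = i;
  }
  return best;
}
```

With a pool holding `{1,2,3,4}` at index 0 and `{1,2,3,4,5,6,7,8}` at index 1, `pickEntry(pool, 0)` returns 1, so the narrower load reads the low half of the wider entry. With two identical entries, each one keeps its own index, which is the stable behaviour the reapplied fix describes.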

This helps ensure the encoding details are next to the EVEX tag. Noticed while preparing to add more constant commenting as part of llvm#73783 and llvm#71078.

…or future patches. NFC. Add a helper to convert a raw APInt bit stream into ConstantDataVector elements. This was used internally by rebuildSplatableConstant but will be reused in future patches for #73783 and #71078.

RKSimon · 2024-01-29T13:01:46Z

VPMOVSX cases: #79815

RKSimon · 2024-02-07T12:56:03Z

VPMOVSX: b5d35fe
VPMOVZX: 7bfcf8c

RKSimon added the backend:X86 label Nov 2, 2023

RKSimon mentioned this issue Nov 29, 2023

[X86] X86FixupVectorConstantsPass - add support for VMOVD/VMOVQ/VMOVSS/VMOVSD zero upper constant loads #73783

Closed

RKSimon mentioned this issue Jan 18, 2024

[X86] Emit verbose (constant) comments before EVEX compression tag #78585

Merged

RKSimon mentioned this issue Jan 24, 2024

[X86] X86FixupVectorConstants - shrink vector load to movsd/movss/movd/movq 'zero upper' instructions #79000

Merged

RKSimon closed this as completed Feb 7, 2024
