[X86] X86FixupVectorConstantsPass - add support for VPMOVSX/VPMOVZX constant extension #71078
@llvm/issue-subscribers-backend-x86 Author: Simon Pilgrim (RKSimon)
X86FixupVectorConstantsPass currently just handles folding full vector loads to broadcasts, but we're missing the opportunity to use sign/zero extension loads for non-uniform cases.
```c
void foo(const int *src, float *dst) {
  for (int i = 0; i != 16; ++i) {
    *dst++ = (float)(*src++ + ((i % 8) + 1));
  }
}
```
Output from llc -mcpu=x86-64-v3:
```asm
foo(int const*, float*):                # @foo(int const*, float*)
        vmovdqa   .LCPI0_0(%rip), %ymm0 # ymm0 = [1,2,3,4,5,6,7,8]
        vpaddd    (%rdi), %ymm0, %ymm1
        vcvtdq2ps %ymm1, %ymm1
        vmovups   %ymm1, (%rsi)
        vpaddd    32(%rdi), %ymm0, %ymm0
        vcvtdq2ps %ymm0, %ymm0
        vmovups   %ymm0, 32(%rsi)
        vzeroupper
        retq
```
We can reduce the size of the constant pool entry by replacing the vmovdqa load with vpmovzxbd: all eight values fit in a byte, so the entry shrinks from 32 bytes to 8 bytes.
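The check behind this idea can be sketched in plain C. This is only an illustration, not code from X86FixupVectorConstantsPass: the helper names `fits_zext`, `fits_sext` and `narrowest_pool_bytes` are hypothetical, and the sketch only considers 32-bit elements narrowed to 8 or 16 bits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if every element can be rebuilt by
   zero-extending a `bits`-wide value (bits must be 8 or 16). */
static bool fits_zext(const int32_t *elts, size_t n, unsigned bits) {
    for (size_t i = 0; i < n; ++i) {
        if (((uint32_t)elts[i] >> bits) != 0)
            return false;
    }
    return true;
}

/* Hypothetical helper: true if every element can be rebuilt by
   sign-extending a `bits`-wide value (bits must be 8 or 16). */
static bool fits_sext(const int32_t *elts, size_t n, unsigned bits) {
    const int32_t lo = -((int32_t)1 << (bits - 1));
    const int32_t hi = ((int32_t)1 << (bits - 1)) - 1;
    for (size_t i = 0; i < n; ++i) {
        if (elts[i] < lo || elts[i] > hi)
            return false;
    }
    return true;
}

/* Size in bytes of the smallest constant pool entry that could feed an
   extending load (VPMOVZX/VPMOVSX style), or the full 4 * n bytes if no
   narrowing is possible. */
static size_t narrowest_pool_bytes(const int32_t *elts, size_t n) {
    static const unsigned widths[] = {8, 16};
    for (size_t w = 0; w < 2; ++w) {
        if (fits_zext(elts, n, widths[w]) || fits_sext(elts, n, widths[w]))
            return n * (widths[w] / 8);
    }
    return n * sizeof(int32_t);
}
```

For the `[1,2,3,4,5,6,7,8]` constant above, `narrowest_pool_bytes` reports 8 bytes, compared with 32 bytes for the full ymm load.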
RKSimon added a commit that referenced this issue on Nov 17, 2023:
…maller vector constant data
If we already have a YMM/ZMM constant whose lower bits match a smaller XMM/YMM constant, then ensure we reuse the same constant pool entry. Extends the similar combines we already have to reuse VBROADCAST_LOAD/SUBV_BROADCAST_LOAD constant loads. This is mainly a canonicalization, but should make it easier for us to merge constant loads in a future commit (related to both #70947 and better X86FixupVectorConstantsPass usage for #71078).
RKSimon added a commit that referenced this issue on Nov 20, 2023:
…maller vector constant data (REAPPLIED)
If we already have a YMM/ZMM constant whose lower bits match a smaller XMM/YMM constant, then ensure we reuse the same constant pool entry. Extends the similar combines we already have to reuse VBROADCAST_LOAD/SUBV_BROADCAST_LOAD constant loads. This is mainly a canonicalization, but should make it easier for us to merge constant loads in a future commit (related to both #70947 and better X86FixupVectorConstantsPass usage for #71078). Reapplied with a fix to ensure we don't 'flip-flop' between multiple matching constants: only perform the fold if the new constant pool entry is larger than the current entry.
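The reuse rule in this commit message can be illustrated with a small C sketch. It is not the actual LLVM combine: `can_reuse_wider_constant` is a hypothetical name, constants are modelled as raw little-endian byte arrays, and "lower bits" is taken to mean the lowest-addressed bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: may a load of the narrow constant be redirected to
   the existing wider constant pool entry?  The strict size comparison
   mirrors the fix described above: folding only towards a larger entry
   means two equally sized matches can never swap back and forth. */
static bool can_reuse_wider_constant(const unsigned char *narrow,
                                     size_t narrow_bytes,
                                     const unsigned char *wide,
                                     size_t wide_bytes) {
    if (wide_bytes <= narrow_bytes)
        return false;
    /* The narrow constant must equal the low bytes of the wide one. */
    return memcmp(narrow, wide, narrow_bytes) == 0;
}
```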
RKSimon added a commit to RKSimon/llvm-project that referenced this issue on Jan 18, 2024:
This helps ensure the encoding details are next to the EVEX tag. Noticed while preparing to add more constant commenting as part of llvm#73783 and llvm#71078.
RKSimon added a commit that referenced this issue on Jan 18, 2024
VPMOVSX cases: #79815