Compiler no longer producing single horizontal add instruction after ab2c499d #49736

dyung · 2021-05-18T17:38:24Z


Bugzilla Link	50392
Version	trunk
OS	All
CC	@alexey-bataev,@anton-afanasyev,@RKSimon

Extended Description

One of our internal tests compiles the following code and verifies that, with optimizations enabled and targeting btver2, the resulting assembly contains a single horizontal add instruction.

#include <immintrin.h>

__attribute__((noinline))
__m256d add_pd_002(__m256d a, __m256d b) {
  __m256d r = (__m256d){ a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
  return __builtin_shufflevector(r, a, 0, -1, 2, 3);
}

Prior to commit ab2c499, when compiled with "-g0 -O3 -march=btver2", the compiler would produce the following code:

        vinsertf128     $1, %xmm1, %ymm0, %ymm0
        vhaddpd %ymm1, %ymm0, %ymm0
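
To see why this two-instruction sequence is sufficient, here is a rough scalar model of it in plain C. It is only a sketch: the `model_*` helper names are made up for illustration, each register is modelled as an array of doubles, and the operand order reflects a reading of the AT&T listing above (destination last). Lane 1 of the result is undefined in the source (the `-1` in the shuffle mask), so any value is acceptable there.

```c
/* Scalar model: each 256-bit ymm register is an array of 4 doubles.
 * Helper names are hypothetical and exist only for this illustration. */

/* vinsertf128 $1: copy `base`, then overwrite the upper two lanes
 * with the lower two lanes of `ins`. */
static void model_vinsertf128_hi(const double base[4], const double ins[4],
                                 double dst[4]) {
    dst[0] = base[0];
    dst[1] = base[1];
    dst[2] = ins[0];
    dst[3] = ins[1];
}

/* 256-bit vhaddpd: pairwise sums within each 128-bit half,
 * alternating between the first and second source. */
static void model_vhaddpd256(const double s1[4], const double s2[4],
                             double dst[4]) {
    dst[0] = s1[0] + s1[1];
    dst[1] = s2[0] + s2[1];
    dst[2] = s1[2] + s1[3];
    dst[3] = s2[2] + s2[3];
}

/* The old two-instruction sequence, with a = ymm0 and b = ymm1 on entry. */
static void model_old_codegen(const double a[4], const double b[4],
                              double out[4]) {
    double t[4];
    model_vinsertf128_hi(a, b, t); /* t   = {a0, a1, b0, b1}             */
    model_vhaddpd256(t, b, out);   /* out = {a0+a1, b0+b1, b0+b1, b2+b3} */
}
```

Calling `model_old_codegen` with `a = {1, 2, 3, 4}` and `b = {10, 20, 30, 40}` fills lanes 0, 2 and 3 with `a[0] + a[1]`, `b[0] + b[1]` and `b[2] + b[3]`, which are the three lanes the test function actually defines.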

Following that change, the compiler now produces this code instead:

        vextractf128    $1, %ymm1, %xmm2
        vhaddpd %xmm0, %xmm0, %xmm0
        vhaddpd %xmm2, %xmm1, %xmm1
        vperm2f128      $2, %ymm0, %ymm1, %ymm0 # ymm0 = ymm0[0,1],ymm1[0,1]
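
For comparison, a rough scalar model of this four-instruction sequence in plain C. As before this is only a sketch: the `model_*` names are hypothetical, registers are modelled as arrays of doubles, and the operand order reflects a reading of the AT&T listing above. It produces the same defined lanes as the source function, just with two 128-bit horizontal adds and two lane-crossing operations instead of one 256-bit horizontal add.

```c
/* Scalar model: a ymm register is 4 doubles, an xmm register is 2 doubles.
 * Helper names are hypothetical and exist only for this illustration. */

/* vextractf128 $1: take the upper two lanes of a 4-lane register. */
static void model_vextractf128_hi(const double src[4], double dst[2]) {
    dst[0] = src[2];
    dst[1] = src[3];
}

/* 128-bit vhaddpd: one pairwise sum from each source. */
static void model_vhaddpd128(const double s1[2], const double s2[2],
                             double dst[2]) {
    dst[0] = s1[0] + s1[1];
    dst[1] = s2[0] + s2[1];
}

/* The new four-instruction sequence, with a = ymm0 and b = ymm1 on entry. */
static void model_new_codegen(const double a[4], const double b[4],
                              double out[4]) {
    double hi_b[2], sum_a[2], sum_b[2];
    model_vextractf128_hi(b, hi_b);   /* xmm2 = {b2, b3}       */
    model_vhaddpd128(a, a, sum_a);    /* xmm0 = {a0+a1, a0+a1} */
    model_vhaddpd128(b, hi_b, sum_b); /* xmm1 = {b0+b1, b2+b3} */
    /* vperm2f128 $2: low half from xmm0, high half from xmm1. */
    out[0] = sum_a[0];
    out[1] = sum_a[1];
    out[2] = sum_b[0];
    out[3] = sum_b[1];
}
```

With `a = {1, 2, 3, 4}` and `b = {10, 20, 30, 40}`, lanes 0, 2 and 3 again hold `a[0] + a[1]`, `b[0] + b[1]` and `b[2] + b[3]`, so the regression is about instruction count, not correctness.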

dyung · 2021-05-18T17:39:42Z

Link to review of the change that caused the regression: https://reviews.llvm.org/D98714

anton-afanasyev · 2021-05-18T17:54:59Z

To add more detail: this should be fixed once Alexey's patch for non-power-of-two vector sizes (https://reviews.llvm.org/D57059) lands.

I've kept the old PR50392 tag since this is such an old issue....

…"binop (shuffle), (shuffle)" (llvm#114101)
Add foldPermuteOfBinops - to fold a permute (single source shuffle) through a binary op that is being fed by other shuffles.
Fixes llvm#94546
Fixes llvm#49736

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021

RKSimon added a commit that referenced this issue May 8, 2022

[SLP][X86] Add test coverage for PR50392 / Issue #49736

9a12138

RKSimon mentioned this issue Oct 29, 2024

[VectorCombine] Fold "shuffle (binop (shuffle, shuffle)), undef" --> "binop (shuffle), (shuffle)" #114101

Merged

RKSimon self-assigned this Oct 29, 2024

RKSimon added a commit that referenced this issue Oct 30, 2024

[PhaseOrdering][X86] Add additional test coverage for #49736

2de1fc8

RKSimon closed this as completed in #114101 Oct 31, 2024

RKSimon closed this as completed in 92af82a Oct 31, 2024

EugeneZelenko added the vectorizers label Oct 31, 2024

NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this issue Nov 4, 2024

[PhaseOrdering][X86] Add additional test coverage for llvm#49736

2404710
