Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WRONG code when enabling Loop Vectorizer with -Os #82665

Closed
JonPsson1 opened this issue Feb 22, 2024 · 6 comments · Fixed by #84339
Closed

WRONG code when enabling Loop Vectorizer with -Os #82665

JonPsson1 opened this issue Feb 22, 2024 · 6 comments · Fixed by #84339

Comments

@JonPsson1
Copy link
Contributor

This program should print 1:

cat wrong0.i

int printf(const char *, ...);
int *IntPtr = 0;
int Val, ValB;
int *IntPtrArr5[5] = {0, 0, 0, 0, 0};
int *IntPtr2 = &ValB;
int main() {
  for (int IV1 = 0; IV1 < 5; IV1++)
    IntPtrArr5[IV1] = &Val;
  for (int IV2 = 1; IV2 <= 4; IV2 += 1) {
    int **PtrPtr = &IntPtr2;    // -> IntPtr2          
    int **PtrPtr2 = &IntPtr;    // -> IntPtr
    *PtrPtr2 = IntPtrArr5[IV2]; // -> IntPtr -> Val  
    *IntPtr2 ^= 1;              //    ValB = 1         Val = 1 (2nd iter)
    *PtrPtr = IntPtrArr5[IV2];  // -> IntPtr2 -> Val
  }
  printf("checksum = %X\n", Val);
}

In the first iteration, IntPtr2 points to ValB and sets it to 1. Then, only in the second iteration does IntPtr2 point to Val and set that to 1. I wonder if the loopvectorizer has done something wrong, or if there is some later pass possibly that is messing it up..?

rm a.out; clang -O0 -march=z16 wrong0.i -o a.out -w; ./a.out
checksum = 1

rm a.out; clang -Os -march=z16 wrong0.i -o a.out -w -fno-vectorize -fno-slp-vectorize; ./a.out
checksum = 1

rm a.out; clang -Os -march=z16 wrong0.i -o a.out -w -fno-slp-vectorize; ./a.out
checksum = 0

@fhahn @david-arm @nilanjana87

@fhahn
Copy link
Contributor

fhahn commented Feb 22, 2024

Hm interesting, let me take a look

@fhahn
Copy link
Contributor

fhahn commented Mar 7, 2024

Reproduces on AArch64 with -mllvm -force-vector-width=2 (probably on X86 too). The issue seems to be that we miss the dependence for the indirect memory access in the loop (similar to #69744). Looking into a fix

fhahn added a commit that referenced this issue Mar 7, 2024
fhahn added a commit to fhahn/llvm-project that referenced this issue Mar 7, 2024
At the moment, getUnderlyingObjects simply continues for phis that do
not refer to the same underlying object in loops, without adding them to
the list of underlying objects, effectively ignoring those phis.

Instead of ignoring those phis, add them to the list of underlying
objects. This fixes a miscompile where LoopAccessAnalysis fails to
identify a memory dependence, because no underlying objects can be found
for a set of memory accesses.

Fixes llvm#82665.
fhahn added a commit that referenced this issue Mar 12, 2024
…her (#84339)

At the moment, getUnderlyingObjects simply continues for phis that do
not refer to the same underlying object in loops, without adding them to
the list of underlying objects, effectively ignoring those phis.

Instead of ignoring those phis, add them to the list of underlying
objects. This fixes a miscompile where LoopAccessAnalysis fails to
identify a memory dependence, because no underlying objects can be found
for a set of memory accesses.

Fixes #82665.

PR: #84339
@nikic nikic added this to the LLVM 18.X Release milestone Mar 12, 2024
@nikic
Copy link
Contributor

nikic commented Mar 12, 2024

/cherry-pick b274b23

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 12, 2024

Failed to cherry-pick: b274b23

https://github.com/llvm/llvm-project/actions/runs/8252583153

Please manually backport the fix and push it to your github fork. Once this is done, please create a pull request

@nikic
Copy link
Contributor

nikic commented Mar 12, 2024

/cherry-pick 4cfd4a7 b274b23

llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Mar 12, 2024
Test case for llvm#82665.

(cherry picked from commit 4cfd4a7)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Mar 12, 2024
…her (llvm#84339)

At the moment, getUnderlyingObjects simply continues for phis that do
not refer to the same underlying object in loops, without adding them to
the list of underlying objects, effectively ignoring those phis.

Instead of ignoring those phis, add them to the list of underlying
objects. This fixes a miscompile where LoopAccessAnalysis fails to
identify a memory dependence, because no underlying objects can be found
for a set of memory accesses.

Fixes llvm#82665.

PR: llvm#84339
(cherry picked from commit b274b23)
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 12, 2024

/pull-request #84950

llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Mar 13, 2024
Test case for llvm#82665.

(cherry picked from commit 4cfd4a7)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Mar 13, 2024
…her (llvm#84339)

At the moment, getUnderlyingObjects simply continues for phis that do
not refer to the same underlying object in loops, without adding them to
the list of underlying objects, effectively ignoring those phis.

Instead of ignoring those phis, add them to the list of underlying
objects. This fixes a miscompile where LoopAccessAnalysis fails to
identify a memory dependence, because no underlying objects can be found
for a set of memory accesses.

Fixes llvm#82665.

PR: llvm#84339
(cherry picked from commit b274b23)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

5 participants