[AMDGPU] Fix crash in allowsMisalignedMemoryAccesses with i1 #105794

Merged

@kerbowa kerbowa commented Aug 23, 2024

No description provided.

llvmbot commented Aug 23, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-amdgpu

Author: Austin Kerbow (kerbowa)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/105794.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+2-1)
  • (modified) llvm/test/CodeGen/AMDGPU/load-local-i1.ll (+13)
  • (added) llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/load-i1-misaligned.ll (+20)
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index c954c0aa71f734..8d21f529b42740 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -1695,7 +1695,8 @@ bool SITargetLowering::allowsMisalignedMemoryAccessesImpl(
     if (!Subtarget->hasUnalignedDSAccessEnabled() && Alignment < Align(4))
       return false;
 
-    Align RequiredAlignment(PowerOf2Ceil(Size/8)); // Natural alignment.
+    Align RequiredAlignment(
+        PowerOf2Ceil(std::max(Size / 8, 1u))); // Natural alignment.
     if (Subtarget->hasLDSMisalignedBug() && Size > 32 &&
         Alignment < RequiredAlignment)
       return false;
diff --git a/llvm/test/CodeGen/AMDGPU/load-local-i1.ll b/llvm/test/CodeGen/AMDGPU/load-local-i1.ll
index 578170941efaaa..43d102e4655b23 100644
--- a/llvm/test/CodeGen/AMDGPU/load-local-i1.ll
+++ b/llvm/test/CodeGen/AMDGPU/load-local-i1.ll
@@ -462,4 +462,17 @@ define amdgpu_kernel void @local_sextload_v64i1_to_v64i64(ptr addrspace(3) %out,
   ret void
 }
 
+; FUNC-LABEL: {{^}}local_load_i1_misaligned:
+; SICIVI: s_mov_b32 m0
+; GFX9-NOT: m0
+define amdgpu_kernel void @local_load_i1_misaligned(ptr addrspace(3) %in, ptr addrspace (3) %out) #0 {
+  %in.gep.1 = getelementptr i1, ptr addrspace(3) %in, i32 1
+  %load.1 = load <16 x i1>, ptr addrspace(3) %in.gep.1, align 4
+  %load.2 = load <8 x i1>, ptr addrspace(3) %in, align 1
+  %out.gep.1 = getelementptr i1, ptr addrspace(3) %out, i32 16
+  store <16 x i1> %load.1, ptr addrspace(3) %out
+  store <8 x i1> %load.2, ptr addrspace(3) %out.gep.1
+  ret void
+}
+
 attributes #0 = { nounwind }
diff --git a/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/load-i1-misaligned.ll b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/load-i1-misaligned.ll
new file mode 100644
index 00000000000000..6f3d2cb69090eb
--- /dev/null
+++ b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/load-i1-misaligned.ll
@@ -0,0 +1,20 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -mtriple=amdgcn-amd-amdhsa --mcpu=gfx940 -passes=load-store-vectorizer -S -o - %s | FileCheck %s
+
+; Don't crash when checking for misaligned accesses with sub-byte size.
+
+define void @misaligned_access_i1(ptr addrspace(3) %in) #0 {
+; CHECK-LABEL: define void @misaligned_access_i1(
+; CHECK-SAME: ptr addrspace(3) [[IN:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:    [[IN_GEP_1:%.*]] = getelementptr i1, ptr addrspace(3) [[IN]], i32 1
+; CHECK-NEXT:    [[TMP1:%.*]] = load <16 x i1>, ptr addrspace(3) [[IN_GEP_1]], align 4
+; CHECK-NEXT:    [[TMP2:%.*]] = load <8 x i1>, ptr addrspace(3) [[IN]], align 1
+; CHECK-NEXT:    ret void
+;
+  %in.gep.1 = getelementptr i1, ptr addrspace(3) %in, i32 1
+
+  %1 = load <16 x i1>, ptr addrspace(3) %in.gep.1, align 4
+  %2 = load <8 x i1>, ptr addrspace(3) %in, align 1
+  ret void
+}
+

@@ -1695,7 +1695,8 @@ bool SITargetLowering::allowsMisalignedMemoryAccessesImpl(
     if (!Subtarget->hasUnalignedDSAccessEnabled() && Alignment < Align(4))
       return false;
 
-    Align RequiredAlignment(PowerOf2Ceil(Size/8)); // Natural alignment.
+    Align RequiredAlignment(
+        PowerOf2Ceil(std::max(Size / 8, 1u))); // Natural alignment.
Use divideCeil?
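
For context on this suggestion, here is a minimal standalone sketch of the three ways to compute the natural alignment: the original `Size/8`, the `std::max(Size / 8, 1u)` clamp from the diff, and the rounding-up division the reviewer proposes. It assumes `Size` is a bit count, as the `Size > 32` check in the hunk suggests. The helpers `powerOf2Ceil` and `divideCeil` are local stand-ins written for this illustration, not the LLVM `MathExtras.h` functions themselves, and the `naturalAlign*` names are hypothetical. For a sub-byte size such as i1, the original expression yields 0, which `Align`'s constructor rejects with an assertion. The two fixes agree there, but differ for sizes that are not a multiple of 8 bits.

```cpp
#include <algorithm>
#include <cstdint>

// Stand-in for llvm::PowerOf2Ceil (assumption: same contract, i.e. the
// smallest power of two >= A, and 0 when A is 0).
constexpr uint64_t powerOf2Ceil(uint64_t A) {
  if (A == 0)
    return 0;
  uint64_t P = 1;
  while (P < A)
    P <<= 1;
  return P;
}

// Stand-in for llvm::divideCeil: integer division rounding up.
constexpr uint64_t divideCeil(uint64_t Numerator, uint64_t Denominator) {
  return (Numerator + Denominator - 1) / Denominator;
}

// Original expression: truncating division, so sub-byte sizes give 0.
constexpr uint64_t naturalAlignOld(unsigned SizeInBits) {
  return powerOf2Ceil(SizeInBits / 8);
}

// Expression from the diff: clamp the byte count to at least 1.
constexpr uint64_t naturalAlignMax(unsigned SizeInBits) {
  return powerOf2Ceil(std::max(SizeInBits / 8, 1u));
}

// Reviewer's suggestion: round the byte count up instead of clamping.
constexpr uint64_t naturalAlignCeil(unsigned SizeInBits) {
  return powerOf2Ceil(divideCeil(SizeInBits, 8));
}

// i1: the old form produces 0, which is not a valid alignment.
static_assert(naturalAlignOld(1) == 0, "truncation yields zero");
static_assert(naturalAlignMax(1) == 1, "clamped to one byte");
static_assert(naturalAlignCeil(1) == 1, "rounded up to one byte");

// 12 bits: clamping still truncates to 1 byte, rounding up gives 2.
static_assert(naturalAlignMax(12) == 1, "12 / 8 truncates to 1");
static_assert(naturalAlignCeil(12) == 2, "ceil(12 / 8) is 2");

// Byte-multiple sizes are unaffected by either fix.
static_assert(naturalAlignMax(128) == 16 && naturalAlignCeil(128) == 16,
              "both forms agree on whole-byte sizes");
```

The `static_assert` lines double as the usage example: the file compiles only if each stated value holds. Whether the difference at non-byte-multiple sizes matters here depends on which sizes can reach this code path, which this thread does not state.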

@kerbowa kerbowa force-pushed the fix-crash-in-allowsMisalignedMemoryAccesses-i1 branch from bb2ce3d to f01dd7e Compare August 23, 2024 16:50
@kerbowa kerbowa merged commit ceb587a into llvm:main Aug 23, 2024
6 of 8 checks passed
dmpolukhin pushed a commit to dmpolukhin/llvm-project that referenced this pull request Sep 2, 2024