Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang][debug] Improve handling of cyclic derived types. #122770

Merged
merged 1 commit into from
Jan 20, 2025
Merged

Conversation

abidh
Copy link
Contributor

@abidh abidh commented Jan 13, 2025

When RecordType is converted to corresponding DIType, we cache the information to avoid doing the conversion again.

Our conversion of RecordType looks like this:

ConvertRecordType(RecordType Ty)

  1. If type Ty is already in the cache, then return the corresponding item.
  2. Create a place holder DICompositeTypeAttr (called ty_self below) for Ty
  3. Put Ty->ty_self in the cache
  4. Convert members of Ty. This may cause ConvertRecordType to be called again with other types.
  5. Create final DICompositeTypeAttr
  6. Replace the ty_self in the cache with one created in step 5 end

The purpose of creating ty_self is to handle cases where a member may have reference to parent type.

Now consider the code below:

type t1
  type(t2), pointer :: p1
end type
type t2
   type(t1), pointer :: p2
end type

While processing t1, we could have a structure like below. t1 -> t2 -> t1_self

The t2 created during handling of t1 cant be cached on its own as it contains a place holder reference. It will fail an assert in MLIR if it is processed standalone. To avoid this problem, we have a check in the step 6 above to not cache such types. But this check was not tight enough. It just checked if a type should not have a place holder reference to another type. It missed the following case where the place holder reference can be in a type further down the line.

type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type

So while processing t1, we have to stop caching of not only t3 but also of t2. This PR improves the check and moves the logic inside convertRecordType.

Please note that this limitation of why a type cant have a placeholder reference is because of how such references are resolved in the mlir. Please see the discussion at the end of this PR.

I have to change getDerivedType so that it will also get the derived type for things like type(t2), pointer :: p1 which are wrapped in BoxType. Happy to move it to a new function or a local helper in case this change is problematic.

Fixes #122024.

When RecordType is converted to corresponding DIType, we cache the
information to avoid doing the conversion again.

Our conversion of RecordType looks like this:

ConvertRecordType(RecordType Ty)
1. If type Ty is already in the cache, then return the corresponding item.
2. Create a place holder DICompositeTypeAttr (called ty_self below) for Ty
3. Put Ty->ty_self in the cache
4. Convert members of Ty. This may cause ConvertRecordType to be called
   agin with other types.
5. Create final DICompositeTypeAttr
6. Replace the ty_self in the cache with one created in step 5
end

The purpose of creating ty_self is to handle cases where a member may
have reference to parent type.

Now consider the code below:

type t1
  type(t2), pointer :: p1
end type
type t2
   type(t1), pointer :: p2
end type

While processing t1, we could have a structure like below.
t1 -> t2 -> t1_self

The t2 created during handling of t1 cant be cached on its own as it contains a
place holder reference. It will fail an assert in MLIR if it is processed
standalone. We previously had a check in the step 6 above to not cache it.
But this check was not tight enough. It just checked if a type should not have
a place holder reference to another type. It missed the following case where
the place holder reference can be in a type further down the line.

type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type

So while processing t1, we have to stop caching of not only t3 but also
of t2. This PR improves the check and moves the logic inside
convertRecordType.

Please note that this limitation of why a type cant have a placeholder
reference is because of how such references are resolved in the mlir. Please
see the discussion at the end of the following PR.
llvm#106571

Fixes llvm#122024.
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Jan 13, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 13, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Abid Qadeer (abidh)

Changes

When RecordType is converted to corresponding DIType, we cache the information to avoid doing the conversion again.

Our conversion of RecordType looks like this:

ConvertRecordType(RecordType Ty)

  1. If type Ty is already in the cache, then return the corresponding item.
  2. Create a place holder DICompositeTypeAttr (called ty_self below) for Ty
  3. Put Ty->ty_self in the cache
  4. Convert members of Ty. This may cause ConvertRecordType to be called again with other types.
  5. Create final DICompositeTypeAttr
  6. Replace the ty_self in the cache with one created in step 5 end

The purpose of creating ty_self is to handle cases where a member may have reference to parent type.

Now consider the code below:

type t1
  type(t2), pointer :: p1
end type
type t2
   type(t1), pointer :: p2
end type

While processing t1, we could have a structure like below. t1 -> t2 -> t1_self

The t2 created during handling of t1 cant be cached on its own as it contains a place holder reference. It will fail an assert in MLIR if it is processed standalone. To avoid this problem, we have a check in the step 6 above to not cache such types. But this check was not tight enough. It just checked if a type should not have a place holder reference to another type. It missed the following case where the place holder reference can be in a type further down the line.

type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type

So while processing t1, we have to stop caching of not only t3 but also of t2. This PR improves the check and moves the logic inside convertRecordType.

Please note that this limitation of why a type cant have a placeholder reference is because of how such references are resolved in the mlir. Please see the discussion at the end of this PR.

I have to change getDerivedType so that it will also get the derived type for things like type(t2), pointer :: p1 which are wrapped in BoxType. Happy to move it to a new function or a local helper in case this change is problematic.

Fixes #122024.


Full diff: https://github.com/llvm/llvm-project/pull/122770.diff

3 Files Affected:

  • (modified) flang/lib/Optimizer/Dialect/FIRType.cpp (+1)
  • (modified) flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp (+58-50)
  • (added) flang/test/Integration/debug-cyclic-derived-type-3.f90 (+32)
diff --git a/flang/lib/Optimizer/Dialect/FIRType.cpp b/flang/lib/Optimizer/Dialect/FIRType.cpp
index d25e5651f1142f..14f42d357675b4 100644
--- a/flang/lib/Optimizer/Dialect/FIRType.cpp
+++ b/flang/lib/Optimizer/Dialect/FIRType.cpp
@@ -210,6 +210,7 @@ mlir::Type getDerivedType(mlir::Type ty) {
           return seq.getEleTy();
         return p.getEleTy();
       })
+      .Case<fir::BoxType>([](auto p) { return getDerivedType(p.getEleTy()); })
       .Default([](mlir::Type t) { return t; });
 }
 
diff --git a/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp b/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
index 8ae3d313d881c5..3f77547b91ae5d 100644
--- a/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
+++ b/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
@@ -273,55 +273,6 @@ static mlir::LLVM::DITypeAttr getUnderlyingType(mlir::LLVM::DITypeAttr Ty) {
   return Ty;
 }
 
-// Currently, the handling of recursive debug type in mlir has some limitations.
-// Those limitations were discussed at the end of the thread for following PR.
-// https://github.com/llvm/llvm-project/pull/106571
-//
-// Problem could be explained with the following example code:
-//  type t2
-//   type(t1), pointer :: p1
-// end type
-// type t1
-//   type(t2), pointer :: p2
-// end type
-// In the description below, type_self means a temporary type that is generated
-// as a place holder while the members of that type are being processed.
-//
-// If we process t1 first then we will have the following structure after it has
-// been processed.
-// t1 -> t2 -> t1_self
-// This is because when we started processing t2, we did not have the complete
-// t1 but its place holder t1_self.
-// Now if some entity requires t2, we will already have that in cache and will
-// return it. But this t2 refers to t1_self and not to t1. In mlir handling,
-// only those types are allowed to have _self reference which are wrapped by
-// entity whose reference it is. So t1 -> t2 -> t1_self is ok because the
-// t1_self reference can be resolved by the outer t1. But standalone t2 is not
-// because there will be no way to resolve it. Until this is fixed in mlir, we
-// avoid caching such types. Please see DebugTranslation::translateRecursive for
-// details on how mlir handles recursive types.
-static bool canCacheThisType(mlir::LLVM::DICompositeTypeAttr comTy) {
-  for (auto el : comTy.getElements()) {
-    if (auto mem =
-            mlir::dyn_cast_if_present<mlir::LLVM::DIDerivedTypeAttr>(el)) {
-      mlir::LLVM::DITypeAttr memTy = getUnderlyingType(mem.getBaseType());
-      if (auto baseTy =
-              mlir::dyn_cast_if_present<mlir::LLVM::DICompositeTypeAttr>(
-                  memTy)) {
-        // We will not cache a type if one of its member meets the following
-        // conditions:
-        // 1. It is a structure type
-        // 2. It is a place holder type (getIsRecSelf() is true)
-        // 3. It is not a self reference. It is ok to have t1_self in t1.
-        if (baseTy.getTag() == llvm::dwarf::DW_TAG_structure_type &&
-            baseTy.getIsRecSelf() && (comTy.getRecId() != baseTy.getRecId()))
-          return false;
-      }
-    }
-  }
-  return true;
-}
-
 std::pair<std::uint64_t, unsigned short>
 DebugTypeGenerator::getFieldSizeAndAlign(mlir::Type fieldTy) {
   mlir::Type llvmTy;
@@ -343,6 +294,7 @@ mlir::LLVM::DITypeAttr DebugTypeGenerator::convertRecordType(
   if (iter != typeCache.end())
     return iter->second;
 
+  bool canCacheThisType = true;
   llvm::SmallVector<mlir::LLVM::DINodeAttr> elements;
   mlir::MLIRContext *context = module.getContext();
   auto recId = mlir::DistinctAttr::create(mlir::UnitAttr::get(context));
@@ -406,6 +358,62 @@ mlir::LLVM::DITypeAttr DebugTypeGenerator::convertRecordType(
         /*extra data=*/nullptr);
     elements.push_back(tyAttr);
     offset += llvm::alignTo(byteSize, byteAlign);
+
+    // Currently, the handling of recursive debug type in mlir has some
+    // limitations that were discussed at the end of the thread for following
+    // PR.
+    // https://github.com/llvm/llvm-project/pull/106571
+    //
+    // Problem could be explained with the following example code:
+    //  type t2
+    //   type(t1), pointer :: p1
+    // end type
+    // type t1
+    //   type(t2), pointer :: p2
+    // end type
+    // In the description below, type_self means a temporary type that is
+    // generated
+    // as a place holder while the members of that type are being processed.
+    //
+    // If we process t1 first then we will have the following structure after
+    // it has been processed.
+    // t1 -> t2 -> t1_self
+    // This is because when we started processing t2, we did not have the
+    // complete t1 but its place holder t1_self.
+    // Now if some entity requires t2, we will already have that in cache and
+    // will return it. But this t2 refers to t1_self and not to t1. In mlir
+    // handling, only those types are allowed to have _self reference which are
+    // wrapped by entity whose reference it is. So t1 -> t2 -> t1_self is ok
+    // because the t1_self reference can be resolved by the outer t1. But
+    // standalone t2 is not because there will be no way to resolve it. Until
+    // this is fixed in mlir, we avoid caching such types. Please see
+    // DebugTranslation::translateRecursive for details on how mlir handles
+    // recursive types.
+    // The code below checks for situation where it will be unsafe to cache
+    // a type to avoid this problem. We do that in 2 situations.
+    // 1. If a member is record type, then its type would have been processed
+    // before reaching here. If it is not in the cache, it means that it was
+    // found to be unsafe to cache. So any type containing it will also not
+    // be cached
+    // 2. The type of the member is found in the cache but it is a place holder.
+    // In this case, its recID should match the recID of the type we are
+    // processing. This helps us to cache the following type.
+    // type t
+    //  type(t), allocatable :: p
+    // end type
+    mlir::Type baseTy = getDerivedType(fieldTy);
+    if (auto recTy = mlir::dyn_cast<fir::RecordType>(baseTy)) {
+      auto iter = typeCache.find(recTy);
+      if (iter == typeCache.end())
+        canCacheThisType = false;
+      else {
+        if (auto tyAttr =
+                mlir::dyn_cast<mlir::LLVM::DICompositeTypeAttr>(iter->second)) {
+          if (tyAttr.getIsRecSelf() && tyAttr.getRecId() != recId)
+            canCacheThisType = false;
+        }
+      }
+    }
   }
 
   auto finalAttr = mlir::LLVM::DICompositeTypeAttr::get(
@@ -414,7 +422,7 @@ mlir::LLVM::DITypeAttr DebugTypeGenerator::convertRecordType(
       /*baseType=*/nullptr, mlir::LLVM::DIFlags::Zero, offset * 8,
       /*alignInBits=*/0, elements, /*dataLocation=*/nullptr, /*rank=*/nullptr,
       /*allocated=*/nullptr, /*associated=*/nullptr);
-  if (canCacheThisType(finalAttr)) {
+  if (canCacheThisType) {
     typeCache[Ty] = finalAttr;
   } else {
     auto iter = typeCache.find(Ty);
diff --git a/flang/test/Integration/debug-cyclic-derived-type-3.f90 b/flang/test/Integration/debug-cyclic-derived-type-3.f90
new file mode 100644
index 00000000000000..ef9aed13cc514c
--- /dev/null
+++ b/flang/test/Integration/debug-cyclic-derived-type-3.f90
@@ -0,0 +1,32 @@
+! RUN: %flang_fc1 -emit-llvm -debug-info-kind=standalone %s -o -
+
+! mainly test that this program does not cause an assertion failure
+! testcase for issue 122024
+
+module m1
+  type t1
+    type(t2),pointer :: x1
+  end type
+  type t2
+    type(t3),pointer :: x2
+  end type
+  type t3
+    type(t1),pointer :: x3
+  end type
+end
+
+program test
+  use m1
+  type(t1),pointer :: foo
+  allocate(foo)
+  allocate(foo%x1)
+  allocate(foo%x1%x2)
+  allocate(foo%x1%x2%x3)
+  call sub1(foo%x1)
+  print *,'done'
+end program
+
+subroutine sub1(bar)
+  use m1
+  type(t2) :: bar
+end subroutine

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @abidh.

Copy link
Contributor

@pawosm-arm pawosm-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me.

@abidh abidh merged commit af91372 into llvm:main Jan 20, 2025
11 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 20, 2025

LLVM Buildbot has detected a new failure on builder flang-aarch64-libcxx running on linaro-flang-aarch64-libcxx while building flang at step 5 "build-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/89/builds/14726

Here is the relevant piece of the build log for the reference
Step 5 (build-unified-tree) failure: build (failure)
...
87.125 [160/4/7180] Linking CXX shared library lib/libFIRAnalysis.so.20.0git
87.131 [159/4/7181] Creating library symlink lib/libFIRAnalysis.so
87.233 [158/4/7182] Linking CXX shared library lib/libFIRTestOpenACCInterfaces.so.20.0git
87.240 [157/4/7183] Creating library symlink lib/libFIRTestOpenACCInterfaces.so
87.524 [157/3/7184] Linking CXX shared library lib/libFIRCodeGen.so.20.0git
87.531 [156/3/7185] Creating library symlink lib/libFIRCodeGen.so
87.869 [155/3/7186] Linking CXX executable bin/fir-lsp-server
88.052 [155/2/7187] Linking CXX shared library lib/libFlangOpenMPTransforms.so.20.0git
88.058 [154/2/7188] Creating library symlink lib/libFlangOpenMPTransforms.so
90.135 [154/1/7189] Building CXX object tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o
FAILED: tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o 
/usr/local/bin/c++ -DFLANG_INCLUDE_TESTS=1 -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/tools/flang/lib/Optimizer/Transforms -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/lib/Optimizer/Transforms -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/include -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/tools/flang/include -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/include -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/llvm/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/llvm/../mlir/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/tools/mlir/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/tools/clang/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/llvm/../clang/include -stdlib=libc++ -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion           -Wcovered-switch-default -Wno-nested-anon-types -O3 -DNDEBUG -std=c++17 -fPIC  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o -MF tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o.d -o tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o -c /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
../llvm-project/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp:262:31: error: function 'getUnderlyingType' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
  262 | static mlir::LLVM::DITypeAttr getUnderlyingType(mlir::LLVM::DITypeAttr Ty) {
      |                               ^~~~~~~~~~~~~~~~~
1 error generated.
ninja: build stopped: subcommand failed.

@abidh
Copy link
Contributor Author

abidh commented Jan 20, 2025

The build is failing due to a function that is unused now. I have opened #123602 to fix the issue and will merge it.

abidh added a commit to abidh/llvm-project that referenced this pull request Jan 26, 2025
Since llvm#122770, we have seen
that compile time have become extremely slow for cyclic derived types.
In 122770, we made the criteria to cache a type very strict. As a result,
some types which are safe to cache were also being re-generated every
type they were required. This increased the compile time and also the
size of the debug info.

Please see the description of PR# 122770. We decided that when
processing t1, the type generated for t2 and t3 were not safe to
cached. But our algorithm also denied caching to t1 which as top level
type was safe.

type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type

I have tinkered the check a bit so that top level type is always cached.
To detect a top level type, we use a depth counter that get incremented
before call to convertRecordType and decremented after it returns.

After this change, the following file from Fujitsu get compiled under
a minute.
https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0394/0394_0031.f90

The smaller testcase present in issue 124049 takes around half second. I
also added check to make sure that duplicate entries of the
DICompositeType are not present in the IR.

Fixes llvm#124049 and llvm#123960.
abidh added a commit that referenced this pull request Jan 27, 2025
…s. (#124473)

Since #122770, we have seen
that compile time have become extremely slow for cyclic derived types.
In #122770, we made the criteria to cache a derived type very strict. As
a result, some types which are safe to cache were also being
re-generated every type they were required. This increased the compile
time and also the size of the debug info.

Please see the description of PR# 122770. We decided that when
processing `t1`, the type generated for `t2` and `t3` were not safe to
cached. But our algorithm also denied caching to `t1` which as top level
type was safe.

```
type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type
```

I have tinkered the check a bit so that top level type is always cached.
To detect a top level type, we use a depth counter that get incremented
before call to `convertRecordType` and decremented after it returns.

After this change, the following
[file](https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0394/0394_0031.f90)
from Fujitsu get compiled around 40s which is same as it was before
#122770.


The smaller testcase present in issue #124049 takes less than half a
second. I also added check to make sure that duplicate entries of the
`DICompositeType` are not present in the IR.

Fixes #124049 and #123960.
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 27, 2025
…erived types. (#124473)

Since llvm/llvm-project#122770, we have seen
that compile time have become extremely slow for cyclic derived types.
In #122770, we made the criteria to cache a derived type very strict. As
a result, some types which are safe to cache were also being
re-generated every type they were required. This increased the compile
time and also the size of the debug info.

Please see the description of PR# 122770. We decided that when
processing `t1`, the type generated for `t2` and `t3` were not safe to
cached. But our algorithm also denied caching to `t1` which as top level
type was safe.

```
type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type
```

I have tinkered the check a bit so that top level type is always cached.
To detect a top level type, we use a depth counter that get incremented
before call to `convertRecordType` and decremented after it returns.

After this change, the following
[file](https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0394/0394_0031.f90)
from Fujitsu get compiled around 40s which is same as it was before
#122770.

The smaller testcase present in issue #124049 takes less than half a
second. I also added check to make sure that duplicate entries of the
`DICompositeType` are not present in the IR.

Fixes #124049 and #123960.
@abidh abidh deleted the 122024 branch February 5, 2025 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[flang][debug] Unexpected Assertion `!attr.getIsRecSelf() && "unbound DI recursive self reference"' failure
5 participants