[flang][debug] Improve handling of cyclic derived types. #122770

abidh · 2025-01-13T19:18:42Z

When RecordType is converted to corresponding DIType, we cache the information to avoid doing the conversion again.

Our conversion of RecordType looks like this:

ConvertRecordType(RecordType Ty)

If type Ty is already in the cache, then return the corresponding item.
Create a place holder DICompositeTypeAttr (called ty_self below) for Ty
Put Ty->ty_self in the cache
Convert members of Ty. This may cause ConvertRecordType to be called again with other types.
Create final DICompositeTypeAttr
Replace the ty_self in the cache with one created in step 5 end

The purpose of creating ty_self is to handle cases where a member may have reference to parent type.

Now consider the code below:

type t1
  type(t2), pointer :: p1
end type
type t2
   type(t1), pointer :: p2
end type

While processing t1, we could have a structure like below. t1 -> t2 -> t1_self

The t2 created during handling of t1 cant be cached on its own as it contains a place holder reference. It will fail an assert in MLIR if it is processed standalone. To avoid this problem, we have a check in the step 6 above to not cache such types. But this check was not tight enough. It just checked if a type should not have a place holder reference to another type. It missed the following case where the place holder reference can be in a type further down the line.

type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type

So while processing t1, we have to stop caching of not only t3 but also of t2. This PR improves the check and moves the logic inside convertRecordType.

Please note that this limitation of why a type cant have a placeholder reference is because of how such references are resolved in the mlir. Please see the discussion at the end of this PR.

I have to change getDerivedType so that it will also get the derived type for things like type(t2), pointer :: p1 which are wrapped in BoxType. Happy to move it to a new function or a local helper in case this change is problematic.

Fixes #122024.

When RecordType is converted to corresponding DIType, we cache the information to avoid doing the conversion again. Our conversion of RecordType looks like this: ConvertRecordType(RecordType Ty) 1. If type Ty is already in the cache, then return the corresponding item. 2. Create a place holder DICompositeTypeAttr (called ty_self below) for Ty 3. Put Ty->ty_self in the cache 4. Convert members of Ty. This may cause ConvertRecordType to be called agin with other types. 5. Create final DICompositeTypeAttr 6. Replace the ty_self in the cache with one created in step 5 end The purpose of creating ty_self is to handle cases where a member may have reference to parent type. Now consider the code below: type t1 type(t2), pointer :: p1 end type type t2 type(t1), pointer :: p2 end type While processing t1, we could have a structure like below. t1 -> t2 -> t1_self The t2 created during handling of t1 cant be cached on its own as it contains a place holder reference. It will fail an assert in MLIR if it is processed standalone. We previously had a check in the step 6 above to not cache it. But this check was not tight enough. It just checked if a type should not have a place holder reference to another type. It missed the following case where the place holder reference can be in a type further down the line. type t1 type(t2), pointer :: p1 end type type t2 type(t3), pointer :: p2 end type type t3 type(t1), pointer :: p3 end type So while processing t1, we have to stop caching of not only t3 but also of t2. This PR improves the check and moves the logic inside convertRecordType. Please note that this limitation of why a type cant have a placeholder reference is because of how such references are resolved in the mlir. Please see the discussion at the end of the following PR. llvm#106571 Fixes llvm#122024.

llvmbot · 2025-01-13T19:19:20Z

@llvm/pr-subscribers-flang-fir-hlfir

Author: Abid Qadeer (abidh)

Changes

When RecordType is converted to corresponding DIType, we cache the information to avoid doing the conversion again.

Our conversion of RecordType looks like this:

ConvertRecordType(RecordType Ty)

If type Ty is already in the cache, then return the corresponding item.
Create a place holder DICompositeTypeAttr (called ty_self below) for Ty
Put Ty->ty_self in the cache
Convert members of Ty. This may cause ConvertRecordType to be called again with other types.
Create final DICompositeTypeAttr
Replace the ty_self in the cache with one created in step 5 end

The purpose of creating ty_self is to handle cases where a member may have reference to parent type.

Now consider the code below:

type t1
  type(t2), pointer :: p1
end type
type t2
   type(t1), pointer :: p2
end type

While processing t1, we could have a structure like below. t1 -> t2 -> t1_self

The t2 created during handling of t1 cant be cached on its own as it contains a place holder reference. It will fail an assert in MLIR if it is processed standalone. To avoid this problem, we have a check in the step 6 above to not cache such types. But this check was not tight enough. It just checked if a type should not have a place holder reference to another type. It missed the following case where the place holder reference can be in a type further down the line.

type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type

So while processing t1, we have to stop caching of not only t3 but also of t2. This PR improves the check and moves the logic inside convertRecordType.

Please note that this limitation of why a type cant have a placeholder reference is because of how such references are resolved in the mlir. Please see the discussion at the end of this PR.

I have to change getDerivedType so that it will also get the derived type for things like type(t2), pointer :: p1 which are wrapped in BoxType. Happy to move it to a new function or a local helper in case this change is problematic.

Fixes #122024.

Full diff: https://github.com/llvm/llvm-project/pull/122770.diff

3 Files Affected:

(modified) flang/lib/Optimizer/Dialect/FIRType.cpp (+1)
(modified) flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp (+58-50)
(added) flang/test/Integration/debug-cyclic-derived-type-3.f90 (+32)

diff --git a/flang/lib/Optimizer/Dialect/FIRType.cpp b/flang/lib/Optimizer/Dialect/FIRType.cpp
index d25e5651f1142f..14f42d357675b4 100644
--- a/flang/lib/Optimizer/Dialect/FIRType.cpp
+++ b/flang/lib/Optimizer/Dialect/FIRType.cpp
@@ -210,6 +210,7 @@ mlir::Type getDerivedType(mlir::Type ty) {
           return seq.getEleTy();
         return p.getEleTy();
       })
+      .Case<fir::BoxType>([](auto p) { return getDerivedType(p.getEleTy()); })
       .Default([](mlir::Type t) { return t; });
 }
 
diff --git a/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp b/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
index 8ae3d313d881c5..3f77547b91ae5d 100644
--- a/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
+++ b/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
@@ -273,55 +273,6 @@ static mlir::LLVM::DITypeAttr getUnderlyingType(mlir::LLVM::DITypeAttr Ty) {
   return Ty;
 }
 
-// Currently, the handling of recursive debug type in mlir has some limitations.
-// Those limitations were discussed at the end of the thread for following PR.
-// https://github.com/llvm/llvm-project/pull/106571
-//
-// Problem could be explained with the following example code:
-//  type t2
-//   type(t1), pointer :: p1
-// end type
-// type t1
-//   type(t2), pointer :: p2
-// end type
-// In the description below, type_self means a temporary type that is generated
-// as a place holder while the members of that type are being processed.
-//
-// If we process t1 first then we will have the following structure after it has
-// been processed.
-// t1 -> t2 -> t1_self
-// This is because when we started processing t2, we did not have the complete
-// t1 but its place holder t1_self.
-// Now if some entity requires t2, we will already have that in cache and will
-// return it. But this t2 refers to t1_self and not to t1. In mlir handling,
-// only those types are allowed to have _self reference which are wrapped by
-// entity whose reference it is. So t1 -> t2 -> t1_self is ok because the
-// t1_self reference can be resolved by the outer t1. But standalone t2 is not
-// because there will be no way to resolve it. Until this is fixed in mlir, we
-// avoid caching such types. Please see DebugTranslation::translateRecursive for
-// details on how mlir handles recursive types.
-static bool canCacheThisType(mlir::LLVM::DICompositeTypeAttr comTy) {
-  for (auto el : comTy.getElements()) {
-    if (auto mem =
-            mlir::dyn_cast_if_present<mlir::LLVM::DIDerivedTypeAttr>(el)) {
-      mlir::LLVM::DITypeAttr memTy = getUnderlyingType(mem.getBaseType());
-      if (auto baseTy =
-              mlir::dyn_cast_if_present<mlir::LLVM::DICompositeTypeAttr>(
-                  memTy)) {
-        // We will not cache a type if one of its member meets the following
-        // conditions:
-        // 1. It is a structure type
-        // 2. It is a place holder type (getIsRecSelf() is true)
-        // 3. It is not a self reference. It is ok to have t1_self in t1.
-        if (baseTy.getTag() == llvm::dwarf::DW_TAG_structure_type &&
-            baseTy.getIsRecSelf() && (comTy.getRecId() != baseTy.getRecId()))
-          return false;
-      }
-    }
-  }
-  return true;
-}
-
 std::pair<std::uint64_t, unsigned short>
 DebugTypeGenerator::getFieldSizeAndAlign(mlir::Type fieldTy) {
   mlir::Type llvmTy;
@@ -343,6 +294,7 @@ mlir::LLVM::DITypeAttr DebugTypeGenerator::convertRecordType(
   if (iter != typeCache.end())
     return iter->second;
 
+  bool canCacheThisType = true;
   llvm::SmallVector<mlir::LLVM::DINodeAttr> elements;
   mlir::MLIRContext *context = module.getContext();
   auto recId = mlir::DistinctAttr::create(mlir::UnitAttr::get(context));
@@ -406,6 +358,62 @@ mlir::LLVM::DITypeAttr DebugTypeGenerator::convertRecordType(
         /*extra data=*/nullptr);
     elements.push_back(tyAttr);
     offset += llvm::alignTo(byteSize, byteAlign);
+
+    // Currently, the handling of recursive debug type in mlir has some
+    // limitations that were discussed at the end of the thread for following
+    // PR.
+    // https://github.com/llvm/llvm-project/pull/106571
+    //
+    // Problem could be explained with the following example code:
+    //  type t2
+    //   type(t1), pointer :: p1
+    // end type
+    // type t1
+    //   type(t2), pointer :: p2
+    // end type
+    // In the description below, type_self means a temporary type that is
+    // generated
+    // as a place holder while the members of that type are being processed.
+    //
+    // If we process t1 first then we will have the following structure after
+    // it has been processed.
+    // t1 -> t2 -> t1_self
+    // This is because when we started processing t2, we did not have the
+    // complete t1 but its place holder t1_self.
+    // Now if some entity requires t2, we will already have that in cache and
+    // will return it. But this t2 refers to t1_self and not to t1. In mlir
+    // handling, only those types are allowed to have _self reference which are
+    // wrapped by entity whose reference it is. So t1 -> t2 -> t1_self is ok
+    // because the t1_self reference can be resolved by the outer t1. But
+    // standalone t2 is not because there will be no way to resolve it. Until
+    // this is fixed in mlir, we avoid caching such types. Please see
+    // DebugTranslation::translateRecursive for details on how mlir handles
+    // recursive types.
+    // The code below checks for situation where it will be unsafe to cache
+    // a type to avoid this problem. We do that in 2 situations.
+    // 1. If a member is record type, then its type would have been processed
+    // before reaching here. If it is not in the cache, it means that it was
+    // found to be unsafe to cache. So any type containing it will also not
+    // be cached
+    // 2. The type of the member is found in the cache but it is a place holder.
+    // In this case, its recID should match the recID of the type we are
+    // processing. This helps us to cache the following type.
+    // type t
+    //  type(t), allocatable :: p
+    // end type
+    mlir::Type baseTy = getDerivedType(fieldTy);
+    if (auto recTy = mlir::dyn_cast<fir::RecordType>(baseTy)) {
+      auto iter = typeCache.find(recTy);
+      if (iter == typeCache.end())
+        canCacheThisType = false;
+      else {
+        if (auto tyAttr =
+                mlir::dyn_cast<mlir::LLVM::DICompositeTypeAttr>(iter->second)) {
+          if (tyAttr.getIsRecSelf() && tyAttr.getRecId() != recId)
+            canCacheThisType = false;
+        }
+      }
+    }
   }
 
   auto finalAttr = mlir::LLVM::DICompositeTypeAttr::get(
@@ -414,7 +422,7 @@ mlir::LLVM::DITypeAttr DebugTypeGenerator::convertRecordType(
       /*baseType=*/nullptr, mlir::LLVM::DIFlags::Zero, offset * 8,
       /*alignInBits=*/0, elements, /*dataLocation=*/nullptr, /*rank=*/nullptr,
       /*allocated=*/nullptr, /*associated=*/nullptr);
-  if (canCacheThisType(finalAttr)) {
+  if (canCacheThisType) {
     typeCache[Ty] = finalAttr;
   } else {
     auto iter = typeCache.find(Ty);
diff --git a/flang/test/Integration/debug-cyclic-derived-type-3.f90 b/flang/test/Integration/debug-cyclic-derived-type-3.f90
new file mode 100644
index 00000000000000..ef9aed13cc514c
--- /dev/null
+++ b/flang/test/Integration/debug-cyclic-derived-type-3.f90
@@ -0,0 +1,32 @@
+! RUN: %flang_fc1 -emit-llvm -debug-info-kind=standalone %s -o -
+
+! mainly test that this program does not cause an assertion failure
+! testcase for issue 122024
+
+module m1
+  type t1
+    type(t2),pointer :: x1
+  end type
+  type t2
+    type(t3),pointer :: x2
+  end type
+  type t3
+    type(t1),pointer :: x3
+  end type
+end
+
+program test
+  use m1
+  type(t1),pointer :: foo
+  allocate(foo)
+  allocate(foo%x1)
+  allocate(foo%x1%x2)
+  allocate(foo%x1%x2%x3)
+  call sub1(foo%x1)
+  print *,'done'
+end program
+
+subroutine sub1(bar)
+  use m1
+  type(t2) :: bar
+end subroutine

jeanPerier

LGTM, thanks @abidh.

pawosm-arm

Works for me.

llvm-ci · 2025-01-20T12:07:18Z

LLVM Buildbot has detected a new failure on builder flang-aarch64-libcxx running on linaro-flang-aarch64-libcxx while building flang at step 5 "build-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/89/builds/14726

Here is the relevant piece of the build log for the reference

Step 5 (build-unified-tree) failure: build (failure)
...
87.125 [160/4/7180] Linking CXX shared library lib/libFIRAnalysis.so.20.0git
87.131 [159/4/7181] Creating library symlink lib/libFIRAnalysis.so
87.233 [158/4/7182] Linking CXX shared library lib/libFIRTestOpenACCInterfaces.so.20.0git
87.240 [157/4/7183] Creating library symlink lib/libFIRTestOpenACCInterfaces.so
87.524 [157/3/7184] Linking CXX shared library lib/libFIRCodeGen.so.20.0git
87.531 [156/3/7185] Creating library symlink lib/libFIRCodeGen.so
87.869 [155/3/7186] Linking CXX executable bin/fir-lsp-server
88.052 [155/2/7187] Linking CXX shared library lib/libFlangOpenMPTransforms.so.20.0git
88.058 [154/2/7188] Creating library symlink lib/libFlangOpenMPTransforms.so
90.135 [154/1/7189] Building CXX object tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o
FAILED: tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o 
/usr/local/bin/c++ -DFLANG_INCLUDE_TESTS=1 -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/tools/flang/lib/Optimizer/Transforms -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/lib/Optimizer/Transforms -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/include -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/tools/flang/include -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/include -I/home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/llvm/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/llvm/../mlir/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/tools/mlir/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/tools/clang/include -isystem /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/llvm/../clang/include -stdlib=libc++ -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion           -Wcovered-switch-default -Wno-nested-anon-types -O3 -DNDEBUG -std=c++17 -fPIC  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o -MF tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o.d -o tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/DebugTypeGenerator.cpp.o -c /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
../llvm-project/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp:262:31: error: function 'getUnderlyingType' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
  262 | static mlir::LLVM::DITypeAttr getUnderlyingType(mlir::LLVM::DITypeAttr Ty) {
      |                               ^~~~~~~~~~~~~~~~~
1 error generated.
ninja: build stopped: subcommand failed.

abidh · 2025-01-20T12:15:45Z

The build is failing due to a function that is unused now. I have opened #123602 to fix the issue and will merge it.

Since llvm#122770, we have seen that compile time have become extremely slow for cyclic derived types. In 122770, we made the criteria to cache a type very strict. As a result, some types which are safe to cache were also being re-generated every type they were required. This increased the compile time and also the size of the debug info. Please see the description of PR# 122770. We decided that when processing t1, the type generated for t2 and t3 were not safe to cached. But our algorithm also denied caching to t1 which as top level type was safe. type t1 type(t2), pointer :: p1 end type type t2 type(t3), pointer :: p2 end type type t3 type(t1), pointer :: p3 end type I have tinkered the check a bit so that top level type is always cached. To detect a top level type, we use a depth counter that get incremented before call to convertRecordType and decremented after it returns. After this change, the following file from Fujitsu get compiled under a minute. https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0394/0394_0031.f90 The smaller testcase present in issue 124049 takes around half second. I also added check to make sure that duplicate entries of the DICompositeType are not present in the IR. Fixes llvm#124049 and llvm#123960.

…s. (#124473) Since #122770, we have seen that compile time have become extremely slow for cyclic derived types. In #122770, we made the criteria to cache a derived type very strict. As a result, some types which are safe to cache were also being re-generated every type they were required. This increased the compile time and also the size of the debug info. Please see the description of PR# 122770. We decided that when processing `t1`, the type generated for `t2` and `t3` were not safe to cached. But our algorithm also denied caching to `t1` which as top level type was safe. ``` type t1 type(t2), pointer :: p1 end type type t2 type(t3), pointer :: p2 end type type t3 type(t1), pointer :: p3 end type ``` I have tinkered the check a bit so that top level type is always cached. To detect a top level type, we use a depth counter that get incremented before call to `convertRecordType` and decremented after it returns. After this change, the following [file](https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0394/0394_0031.f90) from Fujitsu get compiled around 40s which is same as it was before #122770. The smaller testcase present in issue #124049 takes less than half a second. I also added check to make sure that duplicate entries of the `DICompositeType` are not present in the IR. Fixes #124049 and #123960.

…erived types. (#124473) Since llvm/llvm-project#122770, we have seen that compile time have become extremely slow for cyclic derived types. In #122770, we made the criteria to cache a derived type very strict. As a result, some types which are safe to cache were also being re-generated every type they were required. This increased the compile time and also the size of the debug info. Please see the description of PR# 122770. We decided that when processing `t1`, the type generated for `t2` and `t3` were not safe to cached. But our algorithm also denied caching to `t1` which as top level type was safe. ``` type t1 type(t2), pointer :: p1 end type type t2 type(t3), pointer :: p2 end type type t3 type(t1), pointer :: p3 end type ``` I have tinkered the check a bit so that top level type is always cached. To detect a top level type, we use a depth counter that get incremented before call to `convertRecordType` and decremented after it returns. After this change, the following [file](https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0394/0394_0031.f90) from Fujitsu get compiled around 40s which is same as it was before #122770. The smaller testcase present in issue #124049 takes less than half a second. I also added check to make sure that duplicate entries of the `DICompositeType` are not present in the IR. Fixes #124049 and #123960.

abidh requested review from clementval, tblah, jeanPerier, dpalermo, pawosm-arm and kiranktp January 13, 2025 19:18

llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Jan 13, 2025

abidh mentioned this pull request Jan 13, 2025

[flang][debug] Unexpected Assertion `!attr.getIsRecSelf() && "unbound DI recursive self reference"' failure #122024

Closed

jeanPerier approved these changes Jan 16, 2025

View reviewed changes

pawosm-arm approved these changes Jan 19, 2025

View reviewed changes

abidh merged commit af91372 into llvm:main Jan 20, 2025
11 checks passed

eugeneepshteyn mentioned this pull request Jan 23, 2025

Fortran test with many self-referencing types takes "forever" to compile with -g #124049

Closed

abidh mentioned this pull request Jan 26, 2025

[flang][debug] Avoid redundant debug data generation for derived types. #124473

Merged

abidh deleted the 122024 branch February 5, 2025 18:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[flang][debug] Improve handling of cyclic derived types. #122770

[flang][debug] Improve handling of cyclic derived types. #122770

abidh commented Jan 13, 2025

llvmbot commented Jan 13, 2025

jeanPerier left a comment

pawosm-arm left a comment

llvm-ci commented Jan 20, 2025

abidh commented Jan 20, 2025

[flang][debug] Improve handling of cyclic derived types. #122770

[flang][debug] Improve handling of cyclic derived types. #122770

Conversation

abidh commented Jan 13, 2025

llvmbot commented Jan 13, 2025

jeanPerier left a comment

Choose a reason for hiding this comment

pawosm-arm left a comment

Choose a reason for hiding this comment

llvm-ci commented Jan 20, 2025

abidh commented Jan 20, 2025