Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Coverage] Improve performance of propagating Counter of Expansions #122589

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

chapuni
Copy link
Contributor

@chapuni chapuni commented Jan 11, 2025

This improves clang++ -dump-coverage-mapping (939,498 lines) for RISCVInstructionSelector.cpp from 30m to 1m18s and also improves llvm-cov for check-llvm from 33m to 24s.

The current implementation behaved O(N^2) order with hundreds thousands of Expansions.

This assumes:

  • Records are partitioned by FileID.
    • ExpandedFileID doesn't point FileID==0, since it is the root.
  • The Count in Expansion is propagated from 1st Record in ExpandedFileID.

Therefore another fact below can be assumed.

  • Propagation chain consists of Expansions at each 1st Record.

This scans the Record at most a few times. O(N) is expected.

Fixes #113173

This improves `-dump-coverage-mapping` (939,498 lines) for
`RISCVInstructionSelector.cpp` from 30m to 1m18s and also improves
`llvm-cov` for `check-llvm` from 33m to 24s.

The current implementation behaved O(N^2) order with hundreds
thousands of Expansions.

This assumes:
  - Records are partitioned by FileID.
    - ExpandedFileID doesn't point FileID==0, since it is the root.
  - The Count in Expansion is propagated from 1st Record in
    ExpandedFileID.

Therefore another fact below can be assumed.
  - Propagation chain consists of Expansions at each 1st Record.

This scans the Record at most a few times. O(N) is expected.

Fixes #113173
@chapuni chapuni requested review from MaskRay and ornata January 11, 2025 09:50
@llvmbot llvmbot added the PGO Profile Guided Optimizations label Jan 11, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 11, 2025

@llvm/pr-subscribers-pgo

Author: NAKAMURA Takumi (chapuni)

Changes

This improves clang++ -dump-coverage-mapping (939,498 lines) for RISCVInstructionSelector.cpp from 30m to 1m18s and also improves llvm-cov for check-llvm from 33m to 24s.

The current implementation behaved O(N^2) order with hundreds thousands of Expansions.

This assumes:

  • Records are partitioned by FileID.
    • ExpandedFileID doesn't point FileID==0, since it is the root.
  • The Count in Expansion is propagated from 1st Record in ExpandedFileID.

Therefore another fact below can be assumed.

  • Propagation chain consists of Expansions at each 1st Record.

This scans the Record at most a few times. O(N) is expected.

Fixes #113173


Full diff: https://github.com/llvm/llvm-project/pull/122589.diff

1 Files Affected:

  • (modified) llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp (+84-16)
diff --git a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
index cdf4412c6477a1..15c7004c45e7d6 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
@@ -440,26 +440,94 @@ Error RawCoverageMappingReader::read() {
   // Set the counters for the expansion regions.
   // i.e. Counter of expansion region = counter of the first region
   // from the expanded file.
-  // Perform multiple passes to correctly propagate the counters through
-  // all the nested expansion regions.
-  SmallVector<CounterMappingRegion *, 8> FileIDExpansionRegionMapping;
-  FileIDExpansionRegionMapping.resize(VirtualFileMapping.size(), nullptr);
-  for (unsigned Pass = 1, S = VirtualFileMapping.size(); Pass < S; ++Pass) {
-    for (auto &R : MappingRegions) {
-      if (R.Kind != CounterMappingRegion::ExpansionRegion)
-        continue;
-      assert(!FileIDExpansionRegionMapping[R.ExpandedFileID]);
-      FileIDExpansionRegionMapping[R.ExpandedFileID] = &R;
+  // This assumes:
+  //   - Records are partitioned by FileID.
+  //     - ExpandedFileID doesn't point FileID==0, since it is the root.
+  //   - The Count in Expansion is propagated from 1st Record in
+  //     ExpandedFileID.
+  // Therefore another fact below can be assumed.
+  //   - Propagation chain consists of Expansions at each 1st Record.
+
+  /// Node per File.
+  struct FileNode {
+    /// 1st Record in the File.
+    CounterMappingRegion *FirstR = nullptr;
+    /// The Expansion Record out of this File.
+    CounterMappingRegion *ExpanderR = nullptr;
+    /// Count hasn't been propagated yet (for Expansion)
+    bool ExpansionPending = false;
+  };
+  SmallVector<FileNode> FileIDExpansionRegionMapping(VirtualFileMapping.size());
+  assert(VirtualFileMapping.size() == FileIDExpansionRegionMapping.size());
+
+  // Construct the tree.
+  unsigned PrevFileID = 0;
+  for (auto &R : MappingRegions) {
+    if (R.Kind == CounterMappingRegion::ExpansionRegion) {
+      // The File that contains Expansion may be a node.
+      assert(R.ExpandedFileID != 0 &&
+             "ExpandedFileID shouldn't point the root");
+      assert(!FileIDExpansionRegionMapping[R.ExpandedFileID].ExpanderR);
+      FileIDExpansionRegionMapping[R.ExpandedFileID].ExpanderR = &R;
+
+      // It will be the node if (isExpansion and ExpansionPending).
+      FileIDExpansionRegionMapping[R.ExpandedFileID].ExpansionPending = true;
     }
-    for (auto &R : MappingRegions) {
-      if (FileIDExpansionRegionMapping[R.FileID]) {
-        FileIDExpansionRegionMapping[R.FileID]->Count = R.Count;
-        FileIDExpansionRegionMapping[R.FileID] = nullptr;
-      }
+
+    // FileID==0 is not handled here but don't care since it's the root.
+    if (PrevFileID != R.FileID) {
+      // Record 1st Record for each File.
+      assert(!FileIDExpansionRegionMapping[R.FileID].FirstR);
+      FileIDExpansionRegionMapping[R.FileID].FirstR = &R;
+      PrevFileID = R.FileID;
     }
   }
 
-  return Error::success();
+  // Make sure the root [0] doesn't point others.
+  assert((FileIDExpansionRegionMapping.empty() ||
+          (!FileIDExpansionRegionMapping[0].ExpanderR &&
+           !FileIDExpansionRegionMapping[0].FirstR)) &&
+         "The root shouldn't point other nodes");
+
+  auto Propagate = [&](FileNode &X) {
+    // Skip already processed node.
+    if (!X.ExpanderR)
+      return false;
+    // Skip Pending Expansion node.
+    // Process non-Expansion Record and non-Pending Expansion.
+    if (X.ExpansionPending &&
+        X.FirstR->Kind == CounterMappingRegion::ExpansionRegion)
+      return false;
+
+    // Propagate.
+    X.ExpanderR->Count = X.FirstR->Count;
+    // Mark the destination ready.
+    FileIDExpansionRegionMapping[X.ExpanderR->FileID].ExpansionPending = false;
+    // Mark this processed.
+    X.ExpanderR = nullptr;
+    return true;
+  };
+
+  // This won't iterate in most cases but iterates finitely for
+  // unusual cases.
+  for (unsigned I = 0, E = FileIDExpansionRegionMapping.size(); I != E; ++I) {
+    // Propagate Descending.  All Files will be processed if each
+    // ExpandedFileID is greater than FileID.
+    for (auto &X : reverse(FileIDExpansionRegionMapping))
+      Propagate(X);
+
+    // Check whether all Files are processed (or propagate Ascending,
+    // not possible in the usual assumptions).
+    bool Changed = false;
+    for (auto &X : FileIDExpansionRegionMapping)
+      Changed = (Propagate(X) || Changed);
+
+    if (!Changed)
+      return Error::success();
+  }
+
+  return make_error<CoverageMapError>(coveragemap_error::malformed,
+                                      "unexpected Expansions");
 }
 
 Expected<bool> RawCoverageMappingDummyChecker::isDummy() {

@chapuni chapuni requested a review from hyp January 11, 2025 10:12
assert(!FileIDExpansionRegionMapping[R.ExpandedFileID]);
FileIDExpansionRegionMapping[R.ExpandedFileID] = &R;
// This assumes:
// - Records are partitioned by FileID.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. no record is a part of two different files?

FileIDExpansionRegionMapping[R.ExpandedFileID] = &R;
// This assumes:
// - Records are partitioned by FileID.
// - The Count in Expansion is propagated from 1st Record in
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g.

nth record's count == n-1th count + n-2th count ... 0th count?

FileIDExpansionRegionMapping[R.FileID] = nullptr;
}

if (PrevFileID != R.FileID) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can reduce indentation here.

// Have we found the first record for this file already?
if (PrevFileID == R.FileID) {
  // We've seen this FileID before, so we have found the
  // first record. Move on to the next region.
  continue;
}
// We have not. Mark the current record as the first record
// for this file.
assert(!FileIDExpansionRegionMapping[R.FileID].FirstR && "Already found a record for an ID we haven't seen before?!");
FileIDExpansionRegionMapping[R.FileID].FirstR = &R;
// Remember that we've processed this file already.
PrevFileID = R.FileID;

for (auto &R : MappingRegions) {
if (R.Kind == CounterMappingRegion::ExpansionRegion) {
// The File that contains Expansion may be a node.
assert(!FileIDExpansionRegionMapping[R.ExpandedFileID].ExpanderR);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this checking that there is only one expansion region?

if (R.Kind == CounterMappingRegion::ExpansionRegion) {
// The File that contains Expansion may be a node.
assert(!FileIDExpansionRegionMapping[R.ExpandedFileID].ExpanderR);
FileIDExpansionRegionMapping[R.ExpandedFileID].ExpanderR = &R;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be slightly cleaner to pull this out into a reference variable?

E.g.

FileNode &N = FileIDExpansionRegionMapping[R.ExpandedFileID];
assert(!N.ExpanderR && "Already found an expansion region for file node?!");
N.ExpanderR = &R;
N.ExpansionPending = true;

/// The Expansion Record out of this File.
CounterMappingRegion *ExpanderR = nullptr;
/// Count hasn't been propagated yet (for Expansion)
bool ExpansionPending = false;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool ExpansionPending = false;
bool ExpansionCountPropagated = false;

Maybe more descriptive?

/// Count hasn't been propagated yet (for Expansion)
bool ExpansionPending = false;
};
SmallVector<FileNode> FileIDExpansionRegionMapping(VirtualFileMapping.size());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SmallVector<FileNode> FileIDExpansionRegionMapping(VirtualFileMapping.size());
// We will construct a tree with one node per file in the virtual file mapping.
SmallVector<FileNode> FileIDExpansionRegionMapping(VirtualFileMapping.size());

Add a comment explaining what we're going to do with this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what will the edges in the tree be? That's not clear to me right now.

}
}

return Error::success();
auto Propagate = [&](FileNode &X) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a lambda and not a function?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PGO Profile Guided Optimizations tools:llvm-cov
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CoverageMappingReader behaves O(N^2) slowly with hundreds thousand of Expansions
3 participants