Artifact conflict check: rearrange cases

Flip two cases in a conditional. Since the cases are in disjunctive normal form (OR of ANDs) and all checks are idempotent, rearranging them is safe. The new arrangement is more logical because it takes increasingly more complex scenarios to trigger them. For background: the artifact conflict checker is responsible for finding non-equal actions that generate the same output, so they can't be built together. This check is expensive (currently it takes 150ms for Bazel), so we try to avoid it in incremental builds. The logic to avoid the check is complicated and nuanced. Other than rearranging the cases, this CL adds comments that explain elements of the check, and give examples of build scenarios that each check catches. The comments also show a couple scenarios where the conflict checker could be avoided but currently isn't. Motivation for looking at the conflict checker is to assess how we could do the check inside a SkyFunction. PiperOrigin-RevId: 290583572
bazelbuild · Jan 20, 2020 · 044264e · 044264e
1 parent 3e5ece3
commit 044264e
Showing 1 changed file with 61 additions and 5 deletions.
diff --git a/src/main/java/com/google/devtools/build/lib/skyframe/SkyframeBuildView.java b/src/main/java/com/google/devtools/build/lib/skyframe/SkyframeBuildView.java
@@ -426,11 +426,7 @@ public SkyframeAnalysisResult configureTargets(
               .addAll(ctKeys)
               .addAll(aspectKeys)
               .build();
-      if (someConfiguredTargetEvaluated
-          || anyConfiguredTargetDeleted
-          || !dirtiedConfiguredTargetKeys.isEmpty()
-          || !largestTopLevelKeySetCheckedForConflicts.containsAll(newKeys)
-          || !skyframeActionExecutor.badActions().isEmpty()) {
+      if (checkForConflicts(newKeys)) {
         largestTopLevelKeySetCheckedForConflicts = newKeys;
         // This operation is somewhat expensive, so we only do it if the graph might have changed in
         // some way -- either we analyzed a new target or we invalidated an old one or are building
@@ -523,6 +519,66 @@ public SkyframeAnalysisResult configureTargets(
         packageRoots);
   }
 
+  private boolean checkForConflicts(ImmutableSet<SkyKey> newKeys) {
+    if (someConfiguredTargetEvaluated) {
+      // A top-level target was added and may introduce a conflict, or a top-level target was
+      // recomputed and may introduce or resolve a conflict.
+      return true;
+    }
+
+    if (anyConfiguredTargetDeleted) {
+      // No target was (re)computed, but some were deleted, which may resolve a conflict.
+      // TODO(laszlocsomor): get to understand this condition and give an example for it.
+      return true;
+    }
+
+    if (!dirtiedConfiguredTargetKeys.isEmpty()) {
+      // No target was (re)computed and none was deleted, but at least one was dirtied.
+      // Example: (//:x //foo:y) are built, and in conflict (//:x creates foo/C and //foo:y
+      // creates C). Then y is removed from foo/BUILD and only //:x is built, so //foo:y is
+      // dirtied but not recomputed, and no other nodes are recomputed (and none are deleted).
+      // Still we must do the conflict checking because previously there was a conflict but now
+      // there isn't.
+      return true;
+    }
+
+    if (!skyframeActionExecutor.badActions().isEmpty()) {
+      // Example sequence:
+      // 1.  Build (x y z), and there is a conflict. We store (x y z) as the largest checked key
+      //     set, and record the fact that there were bad actions.
+      // 2.  Null-build (x z), so we don't evaluate or dirty or delete anything, but because we know
+      //     there was some conflict last time but don't know exactly which targets conflicted, it
+      //     could have been (x z), so we now check again.
+      return true;
+    }
+
+    if (!largestTopLevelKeySetCheckedForConflicts.containsAll(newKeys)) {
+      // Example sequence:
+      // 1.  Build (x y z), and there is a conflict. We store (x y z) as the largest checked key
+      //     set, and record the fact that there were bad actions.
+      // 2.  Null-build (x z), so we don't evaluate or dirty or delete anything, but because we know
+      //     there was some conflict last time but don't know exactly which targets conflicted, it
+      //     could have been (x z), so we now check again, and store (x z) as the largest checked
+      //     key set.
+      // 3.  Null-build (y z), so again we don't evaluate or dirty or delete anything, and the
+      //     previous build had no conflicts, so no other condition is true. But because (y z) is no
+      //     subset of (x z) and we only keep the most recent largest checked key set, we don't know
+      //     if (y z) are conflict free, so we check.
+      return true;
+    }
+
+    // We believe the conditions above are correct in the sense that we always check for conflicts
+    // when we have to. But they are incomplete, so we sometimes check for conflicts even if we
+    // wouldn't have to. For example:
+    // - if no target was evaluated (nor dirtied, nor deleted), and build sequence is (x y)
+    //   [no conflict], (z), where z is in the transitive closure of (x y), then we shouldn't check.
+    // - if no target was evaluated (nor dirtied, nor deleted), and build sequence is (x y)
+    //   [no conflict], (z), (x), then the last build shouldn't conflict-check because (x y) was
+    //   checked earlier. But it does, since after the second build we store (z) as the largest
+    //   checked set of which (x) is no subset.
+    return false;
+  }
+
   /**
    * Process errors encountered during analysis, and return a {@link Pair} indicating the existence
    * of a loading-phase error, if any, and an exception to be thrown to halt the build, if {@code