opt: don't add reordered join with extra filters to original memo group

The `JoinOrderBuilder` builds reordered join plans from the bottom up. It expects filters to be pushed down as far as possible at each step, and expects transitive closure to have been calculated over Inner Join equality filters (e.g. `a=b` and `b=c` => `a=c`). It also reuses the original matched joins when possible, adding to the original memo groups to avoid duplicate work.

Previously, this could cause filters to be dropped when the original join tree did not compute transitive closure and push filters down as far as possible. More specifically, the `JoinOrderBuilder` could synthesize new filters, push them down as far as possible, and add the resulting reordered joins to an original memo group that didn't have one of those filters. Subsequent joins would then expect the filter to be part of the memo group, and so wouldn't add it later on in the plan. In the rare case when the expression without the filter was chosen, this could manifest as a dropped filter in the final plan. This was rare because dropping a filter usually does not produce a lower-cost plan.

As an example, take this original join tree:
```
(xy join ab on true) join uv on x = u and a = u;
```
Here it is possible to synthesize and push down an `x = a` filter, and so the `JoinOrderBuilder` would do this and add it to the group:
```
group (xy join ab on true), (xy join ab on x = a)
```
Later joins would use this group as an input, and expect the `x = a` filter to be present. If costing happened to choose the first expression in the group, we would end up choosing a plan like this:
```
(xy join ab on true) join uv on x = u
```
Here the `a = u` filter isn't included in the top-level join because it would be redundant to add it when `x = u` and `x = a` are already present. This is a bit of a simplification, but is essentially correct.

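
The reasoning in the example can be checked with a small standalone sketch. It is not the optimizer's code: `equivSet` and `closureImplies` are hypothetical helpers written for this illustration (a plain union-find over column names), and they only model equality filters, not memo groups or costing.

```go
package main

import "fmt"

// equivSet is a hypothetical, simplified equivalence-class tracker over column
// names. It is a stand-in for illustration, not the optimizer's props.EquivSet.
type equivSet struct {
	parent map[string]string
}

func newEquivSet() *equivSet {
	return &equivSet{parent: map[string]string{}}
}

// find returns the representative of c's equivalence class, registering c as
// its own class if it has not been seen before.
func (s *equivSet) find(c string) string {
	p, ok := s.parent[c]
	if !ok {
		s.parent[c] = c
		return c
	}
	if p == c {
		return c
	}
	root := s.find(p)
	s.parent[c] = root // path compression
	return root
}

// addEquality records the filter a = b by merging the two classes.
func (s *equivSet) addEquality(a, b string) {
	ra, rb := s.find(a), s.find(b)
	s.parent[ra] = rb
}

// areEquiv reports whether a = b follows from the recorded equalities.
func (s *equivSet) areEquiv(a, b string) bool {
	return s.find(a) == s.find(b)
}

// closureImplies reports whether a = b follows from the given equality filters
// by transitive closure.
func closureImplies(filters [][2]string, a, b string) bool {
	s := newEquivSet()
	for _, f := range filters {
		s.addEquality(f[0], f[1])
	}
	return s.areEquiv(a, b)
}

func main() {
	// Top-level filters from the example: x = u and a = u imply x = a.
	fmt.Println(closureImplies([][2]string{{"x", "u"}, {"a", "u"}}, "x", "a"))

	// A later join that assumes x = a is enforced below it sees a = u as
	// redundant once x = u is present.
	fmt.Println(closureImplies([][2]string{{"x", "a"}, {"x", "u"}}, "a", "u"))

	// If costing picks (xy join ab on true), only x = u is actually enforced,
	// so a = u is no longer implied: the filter has effectively been dropped.
	fmt.Println(closureImplies([][2]string{{"x", "u"}}, "a", "u"))
}
```

Under these assumptions the three checks print `true`, `true`, and `false`: the redundancy argument only holds if `x = a` is really applied by whichever expression is chosen from the lower group.
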
This commit adds a check to the `JoinOrderBuilder` to identify cases where filters (including ones synthesized from the transitive closure) weren't pushed all the way down in the original join tree. When this is true, none of the originally matched joins except the root join can be reused when reordered joins are built. This solution may perform some duplicate work when filters aren't pushed down, but it shouldn't matter because this case is rare (and should be avoided whenever possible).

Fixes cockroachdb#88659

Release note (bug fix): Fixed a bug introduced in 20.2 that could cause filters to be dropped from a query plan with many joins in rare cases.
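
As a rough illustration of the fix, the pruning step (forget every reusable plan except the base relations and the root join) might look like the sketch below. The `vertexSet` bitmask, the string-valued `plans` map, and the `prunePlans` and `examplePlans` helpers are assumptions made for this example; the real builder stores memo expressions and has more state.

```go
package main

import (
	"fmt"
	"math/bits"
)

// vertexSet is a simplified bitmask of base relations, assumed for this
// sketch: bit i set means relation i is part of the set.
type vertexSet uint64

// isSingleton reports whether the set holds exactly one base relation.
func (s vertexSet) isSingleton() bool {
	return bits.OnesCount64(uint64(s)) == 1
}

// prunePlans removes every plan that is neither a base relation nor the root
// join, so that reordered joins are never added to the intermediate groups.
func prunePlans(plans map[vertexSet]string, root vertexSet) {
	for s := range plans {
		if s.isSingleton() || s == root {
			continue
		}
		// Deleting entries while ranging over a map is safe in Go.
		delete(plans, s)
	}
}

// examplePlans models the join tree from the commit message:
// (xy join ab on true) join uv on x = u and a = u.
func examplePlans() map[vertexSet]string {
	const xy, ab, uv vertexSet = 1, 2, 4
	return map[vertexSet]string{
		xy:           "xy",
		ab:           "ab",
		uv:           "uv",
		xy | ab:      "(xy join ab on true)",
		xy | ab | uv: "(xy join ab on true) join uv on x = u and a = u",
	}
}

func main() {
	plans := examplePlans()
	prunePlans(plans, 7) // 7 = xy | ab | uv, the root join

	fmt.Println(len(plans)) // 4: three base relations plus the root
	_, kept := plans[3]     // 3 = xy | ab, the intermediate join
	fmt.Println(kept)       // false
}
```

In this model the intermediate `(xy join ab on true)` entry is gone after pruning, so a later reordered join over `xy` and `ab` has to be built fresh with its own filters instead of joining a group that lacks `x = a`.
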
DrewKimball · Sep 30, 2022 · 8d82367
1 parent aaca5ce
commit 8d82367
Showing 2 changed files with 263 additions and 85 deletions.
diff --git a/pkg/sql/opt/xform/join_order_builder.go b/pkg/sql/opt/xform/join_order_builder.go
@@ -316,6 +316,14 @@ type JoinOrderBuilder struct {
 	// assembling filters.
 	equivs props.EquivSet
 
+	// rebuildAllJoins is true when the filters in the original matched join tree
+	// were not pushed down as far as possible. When this is true, all joins
+	// except the root join need to be re-built, possibly with additional filters
+	// pushed down. While technically it is sufficient to only do this for the
+	// joins that would be changed by a successful push-down, it is simpler to
+	// handle things this way (and the problem is rare).
+	rebuildAllJoins bool
+
 	onReorderFunc OnReorderFunc
 
 	onAddJoinFunc OnAddJoinFunc
@@ -354,6 +362,12 @@ func (jb *JoinOrderBuilder) Reorder(join memo.RelExpr) {
 		// the best plan.
 		jb.ensureClosure(join)
 
+		// Ensure that the JoinOrderBuilder will not add reordered joins to the
+		// original memo groups (apart from the root) in the case when doing so
+		// would add filters that weren't present in the original joins. See the
+		// validateEdges comment for more information.
+		jb.validateEdges()
+
 		if jb.onReorderFunc != nil {
 			// Hook for testing purposes.
 			jb.callOnReorderFunc(join)
@@ -466,6 +480,72 @@ func (jb *JoinOrderBuilder) ensureClosure(join memo.RelExpr) {
 	}
 }
 
+// validateEdges checks whether each edge applies to its original join. If any
+// do not, normalization rules failed to synthesize and push a filter down as
+// far as possible, and it is not valid to add new reordered joins to the
+// original memo groups. When this is the case, all joins except for the root
+// join need to be removed from the plans map. This prevents cases where a join
+// is added to a memo group that isn't logically equivalent.
+//
+// This is necessary because the JoinOrderBuilder expects each join tree for a
+// given set of relations to contain all filters that apply to those relations.
+// When a new join is constructed, it doesn't contain "degenerate" filters -
+// filters that only refer to one side of the join. So if the original join tree
+// had an implicit filter that could have been synthesized and pushed down the
+// tree, but wasn't, using the original join group that *should* have that
+// filter when building a new join would cause a filter to be dropped.
+//
+// Take the following (simplified) example of a join tree where filter push-down
+// rules have failed:
+//
+//	(xy join ab on true) join uv on x = u and a = u
+//
+// Here, the JoinOrderBuilder will synthesize an 'x = a' filter that will be
+// used to join xy and ab. If it was added to the original group, we would have
+// a memo group that looks like this:
+//
+//	group: (xy join ab on true), (xy join ab on x = a)
+//
+// Later joins that are constructed using this group would expect the 'x = a'
+// filter to be present, and would avoid adding redundant filters. Therefore,
+// a join tree like the following would be added to the memo.
+//
+//	(xy join ab on true) join uv on x = u
+//
+// Notice how the 'a = u' filter has been dropped because it would be redundant
+// when 'x = u' and 'x = a' are already present. We prevent this from happening
+// by not reusing the original memo groups in the case when the JoinOrderBuilder
+// is able to synthesize and/or push down filters that weren't in the original
+// join tree.
+func (jb *JoinOrderBuilder) validateEdges() {
+	for i := range jb.edges {
+		if jb.rebuildAllJoins {
+			break
+		}
+		e := &jb.edges[i]
+		if e.op.joinType == opt.InnerJoinOp {
+			jb.rebuildAllJoins = !e.checkInnerJoin(e.op.leftVertexes, e.op.rightVertexes)
+		} else {
+			jb.rebuildAllJoins = !e.checkNonInnerJoin(e.op.leftVertexes, e.op.rightVertexes)
+		}
+	}
+	if jb.rebuildAllJoins {
+		for vertexes := range jb.plans {
+			if vertexes.isSingleton() || vertexes == jb.allVertexes() {
+				// Do not remove the plan if it is for a base relation (not a join) or
+				// it is the root join. Adding to the root join group is correct because
+				// the JoinOrderBuilder will only consider filters that were present
+				// (even if only implicitly) in the root join tree. It is also necessary
+				// because the purpose of the JoinOrderBuilder is to add equivalent join
+				// plans to the root join group - otherwise, any new joins would be
+				// disconnected from the main query plan.
+				continue
+			}
+			delete(jb.plans, vertexes)
+		}
+	}
+}
+
 // dpSube carries out the DPSube algorithm (citations: [8] figure 4). All
 // disjoint pairs of subsets of base relations are enumerated and checked for
 // validity. If valid, the pair of subsets is used along with the edges
@@ -526,9 +606,10 @@ func (jb *JoinOrderBuilder) addJoins(s1, s2 vertexSet) {
 				continue
 			}
 			if !joinIsRedundant {
-				// If this edge was originally part of a join between relation sets s1 and
-				// s2, any other edges that apply will also be part of that original join.
-				joinIsRedundant = e.joinIsRedundant(s1, s2)
+				// If this edge was originally part of a join between relation sets s1
+				// and s2, any other edges that apply will also be part of that original
+				// join.
+				joinIsRedundant = jb.joinIsRedundant(e, s1, s2)
 			}
 			for j := range e.filters {
 				jb.equivs.AddFromFDs(&e.filters[j].ScalarProps().FuncDeps)
@@ -549,7 +630,7 @@ func (jb *JoinOrderBuilder) addJoins(s1, s2 vertexSet) {
 			// Construct a non-inner join. If any inner join filters also apply to the
 			// pair of relationSets, construct a select on top of the join with the
 			// inner join filters.
-			jb.addJoin(e.op.joinType, s1, s2, e.filters, innerJoinFilters, e.joinIsRedundant(s1, s2))
+			jb.addJoin(e.op.joinType, s1, s2, e.filters, innerJoinFilters, jb.joinIsRedundant(e, s1, s2))
 			return
 		}
 		if e.checkNonInnerJoin(s2, s1) {
@@ -575,7 +656,7 @@ func (jb *JoinOrderBuilder) addJoins(s1, s2 vertexSet) {
 			// 010 on the right. 101 is larger than 111 / 2, so we will not enumerate
 			// this plan unless we consider a join with s2 on the left and s1 on the
 			// right.
-			jb.addJoin(e.op.joinType, s2, s1, e.filters, innerJoinFilters, e.joinIsRedundant(s2, s1))
+			jb.addJoin(e.op.joinType, s2, s1, e.filters, innerJoinFilters, jb.joinIsRedundant(e, s2, s1))
 			return
 		}
 	}
@@ -642,6 +723,19 @@ func (jb *JoinOrderBuilder) makeTransitiveEdge(col1, col2 opt.ColumnID) {
 		return
 	}
 
+	originalJoin, ok := jb.plans[op.leftVertexes.union(op.rightVertexes)]
+	if !ok {
+		panic(errors.AssertionFailedf("failed to find expected join plan"))
+	}
+	if !originalJoin.Relational().FuncDeps.AreColsEquiv(col1, col2) {
+		// This inferred filter was not pushed down as far as possible. All joins
+		// apart from the root will have to be rebuilt. We have to do this check
+		// here because we set the op for this edge to the join to which the filter
+		// *would* have been pushed down if it existed, so the applicable check will
+		// always succeed for that join.
+		jb.rebuildAllJoins = true
+	}
+
 	// Construct the edge.
 	var1 := jb.f.ConstructVariable(col1)
 	var2 := jb.f.ConstructVariable(col2)
@@ -754,12 +848,6 @@ func (jb *JoinOrderBuilder) addToGroup(
 ) {
 	if len(selectFilters) > 0 {
 		joinExpr := jb.memoize(op, left, right, on, nil)
-		if joinExpr.FirstExpr() == grp.FirstExpr() {
-			// In rare cases, the select filters may be redundant. In this case,
-			// adding a select to the group with the redundant filters would create a
-			// memo cycle (see #80901).
-			return
-		}
 		selectExpr := &memo.SelectExpr{
 			Input:   joinExpr,
 			Filters: selectFilters,
@@ -904,6 +992,15 @@ func (jb *JoinOrderBuilder) addBaseRelation(rel memo.RelExpr) {
 	jb.plans[relSet] = rel
 }
 
+// joinIsRedundant returns true if a join between the two sets of base relations
+// was already present in the original join tree. If so, enumerating this join
+// would be redundant, so it should be skipped.
+func (jb *JoinOrderBuilder) joinIsRedundant(e *edge, s1, s2 vertexSet) bool {
+	// The join is never redundant when rebuildAllJoins is true, because
+	// rebuildAllJoins indicates we don't want to reuse the original joins.
+	return !jb.rebuildAllJoins && e.op.leftVertexes == s1 && e.op.rightVertexes == s2
+}
+
 // checkSize panics if the number of relations is greater than or equal to
 // MaxReorderJoinsLimit. checkSize should be called before a vertex is added to
 // the join graph.
@@ -1353,13 +1450,6 @@ func (e *edge) checkRules(s1, s2 vertexSet) bool {
 	return true
 }
 
-// joinIsRedundant returns true if a join between the two sets of base relations
-// was already present in the original join tree. If so, enumerating this join
-// would be redundant, so it should be skipped.
-func (e *edge) joinIsRedundant(s1, s2 vertexSet) bool {
-	return e.op.leftVertexes == s1 && e.op.rightVertexes == s2
-}
-
 // commute returns true if the given join operator type is commutable.
 func commute(op opt.Operator) bool {
 	return op == opt.InnerJoinOp || op == opt.FullJoinOp