Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-7437][SQL] Fold "literal in (item1, item2, ..., literal, ...)" into true or false directly #5972

Closed
wants to merge 21 commits into from
Closed
Show file tree
Hide file tree
Changes from 20 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -110,6 +110,7 @@ case class InSet(value: Expression, hset: Set[Any])

override def children: Seq[Expression] = value :: Nil

override def foldable: Boolean = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that you want override def foldable: Boolean = value.foldable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, you are right

override def nullable: Boolean = true // TODO: Figure out correct nullability semantics of IN.
override def toString: String = s"$value INSET ${hset.mkString("(", ",", ")")}"

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -46,13 +46,13 @@ object DefaultOptimizer extends Optimizer {
CombineLimits) ::
Batch("ConstantFolding", FixedPoint(100),
NullPropagation,
OptimizeIn,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why move it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because it more like a transform, i think it should before ConstantFolding @yhuai

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of them are in the batch of "ConstantFolding". So, we do not really need to move them, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, but looks ok to me, here first do transform from in -> inset, then in ConstantFolding it will call InSet.eval (not In.eval) which should be more efficient.

ConstantFolding,
LikeSimplification,
BooleanSimplification,
SimplifyFilters,
SimplifyCasts,
SimplifyCaseConversionExpressions,
OptimizeIn) ::
SimplifyCaseConversionExpressions) ::
Batch("Decimal Optimizations", FixedPoint(100),
DecimalAggregates) ::
Batch("LocalRelation", FixedPoint(100),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ class ConstantFoldingSuite extends PlanTest {
Batch("AnalysisNodes", Once,
EliminateSubQueries) ::
Batch("ConstantFolding", Once,
OptimizeIn,
ConstantFolding,
BooleanSimplification) :: Nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marmbrus Seems we need to add all of the rules shown in Analyzer's "ConstantFolding" to here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial plan here was to only include the rules we are testing. I'm not sure if its true here, but sometimes including all the rules make it harder to test since then you need something that won't be optimized away by unrelated rules.

}
Expand Down Expand Up @@ -247,4 +248,36 @@ class ConstantFoldingSuite extends PlanTest {

comparePlans(optimized, correctAnswer)
}

test("Constant folding test: Fold In(v, list) into true or false") {
var originalQuery =
testRelation
.select('a)
.where(In(Literal(1), Seq(Literal(1), Literal(2))))

var optimized = Optimize.execute(originalQuery.analyze)

var correctAnswer =
testRelation
.select('a)
.where(Literal(true))
.analyze

comparePlans(optimized, correctAnswer)

originalQuery =
testRelation
.select('a)
.where(In(Literal(1), Seq(Literal(1), 'a.attr)))

optimized = Optimize.execute(originalQuery.analyze)

correctAnswer =
testRelation
.select('a)
.where(Literal(true))
.analyze

comparePlans(optimized, correctAnswer)
}
}