Skip to content

Commit

Permalink
[SPARK-4693] [SQL] PruningPredicates may be wrong if predicates conta…
Browse files Browse the repository at this point in the history
…ins an empty AttributeSet() references

The sql "select * from spark_test::for_test where abs(20141202) is not null" has predicates=List(IS NOT NULL HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFAbs(20141202)) and
partitionKeyIds=AttributeSet(). PruningPredicates is List(IS NOT NULL HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFAbs(20141202)). Then the exception "java.lang.IllegalArgumentException: requirement failed: Partition pruning predicates only supported for partitioned tables." is thrown.
The sql "select * from spark_test::for_test_partitioned_table where abs(20141202) is not null and type_id=11 and platform = 3" with partitioned key insert_date has predicates=List(IS NOT NULL HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFAbs(20141202), (type_id#12 = 11), (platform#8 = 3)) and partitionKeyIds=AttributeSet(insert_date#24). PruningPredicates is List(IS NOT NULL HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFAbs(20141202)).

Author: YanTangZhai <hakeemzhai@tencent.com>
Author: yantangzhai <tyz0303@163.com>

Closes apache#3556 from YanTangZhai/SPARK-4693 and squashes the following commits:

620ebe3 [yantangzhai] [SPARK-4693] [SQL] PruningPredicates may be wrong if predicates contains an empty AttributeSet() references
37cfdf5 [yantangzhai] [SPARK-4693] [SQL] PruningPredicates may be wrong if predicates contains an empty AttributeSet() references
70a3544 [yantangzhai] [SPARK-4693] [SQL] PruningPredicates may be wrong if predicates contains an empty AttributeSet() references
efa9b03 [YanTangZhai] Update HiveQuerySuite.scala
72accf1 [YanTangZhai] Update HiveQuerySuite.scala
e572b9a [YanTangZhai] Update HiveStrategies.scala
6e643f8 [YanTangZhai] Merge pull request alteryx#11 from apache/master
e249846 [YanTangZhai] Merge pull request alteryx#10 from apache/master
d26d982 [YanTangZhai] Merge pull request alteryx#9 from apache/master
76d4027 [YanTangZhai] Merge pull request alteryx#8 from apache/master
03b62b0 [YanTangZhai] Merge pull request alteryx#7 from apache/master
8a00106 [YanTangZhai] Merge pull request alteryx#6 from apache/master
cbcba66 [YanTangZhai] Merge pull request #3 from apache/master
cdef539 [YanTangZhai] Merge pull request #1 from apache/master
  • Loading branch information
YanTangZhai authored and marmbrus committed Dec 19, 2014
1 parent 22ddb6e commit e7de7e5
Show file tree
Hide file tree
Showing 3 changed files with 14 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -112,4 +112,6 @@ class AttributeSet private (val baseSet: Set[AttributeEquals])
override def toSeq: Seq[Attribute] = baseSet.map(_.a).toArray.toSeq

override def toString = "{" + baseSet.map(_.a).mkString(", ") + "}"

override def isEmpty: Boolean = baseSet.isEmpty
}
Original file line number Diff line number Diff line change
Expand Up @@ -210,8 +210,9 @@ private[hive] trait HiveStrategies {
// Filter out all predicates that only deal with partition keys, these are given to the
// hive table scan operator to be used for partition pruning.
val partitionKeyIds = AttributeSet(relation.partitionKeys)
val (pruningPredicates, otherPredicates) = predicates.partition {
_.references.subsetOf(partitionKeyIds)
val (pruningPredicates, otherPredicates) = predicates.partition { predicate =>
!predicate.references.isEmpty &&
predicate.references.subsetOf(partitionKeyIds)
}

pruneFilterProject(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -411,6 +411,15 @@ class HiveQuerySuite extends HiveComparisonTest with BeforeAndAfter {
createQueryTest("select null from table",
"SELECT null FROM src LIMIT 1")

test("predicates contains an empty AttributeSet() references") {
sql(
"""
|SELECT a FROM (
| SELECT 1 AS a FROM src LIMIT 1 ) table
|WHERE abs(20141202) is not null
""".stripMargin).collect()
}

test("implement identity function using case statement") {
val actual = sql("SELECT (CASE key WHEN key THEN key END) FROM src")
.map { case Row(i: Int) => i }
Expand Down

0 comments on commit e7de7e5

Please sign in to comment.