[SPARK-26677][BUILD] Update Parquet to 1.10.1 with notEq pushdown fix.
## What changes were proposed in this pull request?

Update to Parquet Java 1.10.1.

## How was this patch tested?

Added a test from HyukjinKwon that validates the notEq case from SPARK-26677.
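The following is a minimal, Spark-free sketch of the situation that test exercises. It is not Parquet's or Spark's code. It assumes a row-group check that consults only dictionary values, which never include nulls, and every name in it (`NotEqSketch`, `naiveCanDrop`, `nullAwareCanDrop`) is hypothetical.

```scala
// Hypothetical model of the test scenario; none of these names exist in Parquet or Spark.
object NotEqSketch {
  // None models SQL NULL.
  type Row = Option[String]

  // Null-safe equality (SQL `<=>`): NULL <=> NULL is true, NULL <=> 'A' is false.
  def nullSafeEq(a: Row, b: Row): Boolean = a == b

  // Rows that satisfy NOT (value <=> literal).
  def matching(rows: Seq[Row], literal: String): Seq[Row] =
    rows.filterNot(v => nullSafeEq(v, Some(literal)))

  // Assumption: a dictionary page holds only the distinct non-null values.
  def dictionary(rows: Seq[Row]): Set[String] = rows.flatten.toSet

  // Naive check: drop the row group for notEq(literal) when the dictionary
  // contains nothing but the literal. This ignores null rows.
  def naiveCanDrop(rows: Seq[Row], literal: String): Boolean =
    dictionary(rows) == Set(literal)

  // Null-aware check: a null row still matches notEq, so keep the row group.
  def nullAwareCanDrop(rows: Seq[Row], literal: String): Boolean =
    !rows.contains(None) && dictionary(rows) == Set(literal)
}
```

With the rows from the test, `Seq(Some("A"), Some("A"), None)`, the null row satisfies `NOT (value <=> 'A')`, so a check that looks only at the dictionary would discard a row group that still contains a match.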

Closes apache#23704 from rdblue/SPARK-26677-fix-noteq-parquet-bug.

Lead-authored-by: Ryan Blue <blue@apache.org>
Co-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Co-authored-by: Ryan Blue <rdblue@users.noreply.github.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
3 people authored and dongjoon-hyun committed Feb 2, 2019
1 parent a5427a0 commit f72d217
Showing 4 changed files with 26 additions and 11 deletions.
10 changes: 5 additions & 5 deletions dev/deps/spark-deps-hadoop-2.7
@@ -161,13 +161,13 @@ orc-shims-1.5.4.jar
 oro-2.0.8.jar
 osgi-resource-locator-1.0.1.jar
 paranamer-2.8.jar
-parquet-column-1.10.0.jar
-parquet-common-1.10.0.jar
-parquet-encoding-1.10.0.jar
+parquet-column-1.10.1.jar
+parquet-common-1.10.1.jar
+parquet-encoding-1.10.1.jar
 parquet-format-2.4.0.jar
-parquet-hadoop-1.10.0.jar
+parquet-hadoop-1.10.1.jar
 parquet-hadoop-bundle-1.6.0.jar
-parquet-jackson-1.10.0.jar
+parquet-jackson-1.10.1.jar
 protobuf-java-2.5.0.jar
 py4j-0.10.8.1.jar
 pyrolite-4.13.jar
10 changes: 5 additions & 5 deletions dev/deps/spark-deps-hadoop-3.1
@@ -178,13 +178,13 @@ orc-shims-1.5.4.jar
 oro-2.0.8.jar
 osgi-resource-locator-1.0.1.jar
 paranamer-2.8.jar
-parquet-column-1.10.0.jar
-parquet-common-1.10.0.jar
-parquet-encoding-1.10.0.jar
+parquet-column-1.10.1.jar
+parquet-common-1.10.1.jar
+parquet-encoding-1.10.1.jar
 parquet-format-2.4.0.jar
-parquet-hadoop-1.10.0.jar
+parquet-hadoop-1.10.1.jar
 parquet-hadoop-bundle-1.6.0.jar
-parquet-jackson-1.10.0.jar
+parquet-jackson-1.10.1.jar
 protobuf-java-2.5.0.jar
 py4j-0.10.8.1.jar
 pyrolite-4.13.jar
2 changes: 1 addition & 1 deletion pom.xml
@@ -132,7 +132,7 @@
     <!-- note that this should be compatible with Kafka brokers version 0.10 and up -->
     <kafka.version>2.1.0</kafka.version>
     <derby.version>10.12.1.1</derby.version>
-    <parquet.version>1.10.0</parquet.version>
+    <parquet.version>1.10.1</parquet.version>
     <orc.version>1.5.4</orc.version>
     <orc.classifier>nohive</orc.classifier>
     <hive.parquet.version>1.6.0</hive.parquet.version>
@@ -890,6 +890,21 @@ class ParquetQuerySuite extends QueryTest with ParquetTest with SharedSQLContext
       }
     }
   }
+
+  test("SPARK-26677: negated null-safe equality comparison should not filter matched row groups") {
+    (true :: false :: Nil).foreach { vectorized =>
+      withSQLConf(SQLConf.PARQUET_VECTORIZED_READER_ENABLED.key -> vectorized.toString) {
+        withTempPath { path =>
+          // Repeated values for dictionary encoding.
+          Seq(Some("A"), Some("A"), None).toDF.repartition(1)
+            .write.parquet(path.getAbsolutePath)
+          val df = spark.read.parquet(path.getAbsolutePath)
+          checkAnswer(stripSparkFilter(df.where("NOT (value <=> 'A')")), df)
+        }
+      }
+    }
+  }
+
 }
 
 object TestingUDT {
