[SPARK-26337][SQL][TEST] Add benchmark for LongToUnsafeRowMap #23284

viirya · 2018-12-11T10:01:32Z

What changes were proposed in this pull request?

SPARK-26155 reports a performance issue on TPC-DS. I think it is better to add a benchmark for LongToUnsafeRowMap, which is the root cause of the performance regression.

This makes it easier to show the performance difference between different metric implementations in LongToUnsafeRowMap.

How was this patch tested?

Ran the added benchmark manually.

viirya · 2018-12-11T10:03:09Z

Without metrics (master after PR 23269):

[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.13.6
[info] Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
[info] LongToUnsafeRowMap metrics:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------
[info] LongToUnsafeRowMap                             243 /  347          2.1         485.0       1.0X

Using LongAdder to compute metrics (by applying PR 23214):

[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.13.6
[info] Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
[info] LongToUnsafeRowMap metrics:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------
[info] LongToUnsafeRowMap                             401 /  460          1.2         802.2       1.0X

Using Long variables (master before PR 23269):

[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.13.6
[info] Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
[info] LongToUnsafeRowMap metrics:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------
[info] LongToUnsafeRowMap                            1220 / 1233          0.4        2439.1       1.0X
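
To make the three runs easier to compare, the per-row times can be expressed as ratios against the no-metrics run. A minimal sketch that uses only the numbers posted above; the object and method names are made up for illustration and are not part of this PR:

```scala
object MetricOverhead {
  // Per-row times (ns) copied from the three benchmark runs above.
  val perRowNs: Seq[(String, Double)] = Seq(
    "no metrics" -> 485.0,
    "LongAdder" -> 802.2,
    "Long variables" -> 2439.1)

  // Slowdown of each variant relative to the no-metrics baseline.
  def slowdowns: Seq[(String, Double)] = {
    val baseline = perRowNs.head._2
    perRowNs.map { case (name, ns) => name -> ns / baseline }
  }

  def main(args: Array[String]): Unit =
    slowdowns.foreach { case (name, ratio) => println(f"$name%-16s $ratio%.2fx") }
}
```

On these figures, LongAdder costs about 1.65x the baseline per row and plain Long variables about 5.03x.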

viirya · 2018-12-11T10:04:06Z

cc @cloud-fan @dongjoon-hyun

...src/test/scala/org/apache/spark/sql/execution/benchmark/HashedRelationMetricsBenchmark.scala

SparkQA · 2018-12-11T13:47:14Z

Test build #99964 has finished for PR 23284 at commit fa70205.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-12-11T14:42:01Z

Test build #99974 has finished for PR 23284 at commit cdbae0a.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2018-12-11T15:44:59Z

Can you update #23284 (comment)?

The revert PR is already merged, so we should revert the revert PR and run the benchmark again, and post the results in the comment.

cloud-fan · 2018-12-11T15:46:40Z

BTW I think this proves that, in Java, if a long is accessed by multiple threads, it will cause perf problems even without a lock. Maybe it's related to memory barriers.

cc @kiszk @dongjoon-hyun @JkSelf

SparkQA · 2018-12-11T16:43:29Z

Test build #99973 has finished for PR 23284 at commit ccb6fed.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2018-12-11T17:04:31Z

Thank you for pinging me, @viirya and @cloud-fan.

SparkQA · 2018-12-11T17:13:36Z

Test build #99979 has finished for PR 23284 at commit 723b27c.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

kiszk · 2018-12-11T17:32:20Z

retest this please

kiszk · 2018-12-11T17:59:52Z

To be honest, I cannot understand why the original performance degradation occurred.

I think that reading or writing a long value does not require any synchronization or memory barrier unless it is declared volatile.
In this PR, numKeyLookups and numProbes are non-volatile long variables. I confirmed this by decompiling a class file. However, I have no time to disassemble the generated code today or tomorrow.

Here is an article that addresses a similar topic regarding static long.

cc @rednaxelafx

srowen · 2018-12-11T20:19:52Z

There's no good reason why 64-bit reads/writes shouldn't be atomic on a 64-bit machine, and I assume everything we're testing on is 64-bit these days. It was an issue in the past, and yes as you note, the JLS seems to allow for it to be implementation-specific. No idea...

SparkQA · 2018-12-11T21:16:56Z

Test build #99985 has finished for PR 23284 at commit 723b27c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2018-12-11T21:33:49Z

sql/core/benchmarks/HashedRelationMetricsBenchmark-results.txt

@@ -6,6 +6,6 @@ Java HotSpot(TM) 64-Bit Server VM 1.8.0_181-b13 on Mac OS X 10.13.6
 Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
 LongToUnsafeRowMap metrics:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------
-LongToUnsafeRowMap                            1265 / 1336          0.4        2530.5       1.0X
+LongToUnsafeRowMap                             234 /  315          2.1         467.3       1.0X


Yep. It's a clear reason to have this benchmark (723b27c)

dongjoon-hyun · 2018-12-11T21:45:47Z

...src/test/scala/org/apache/spark/sql/execution/benchmark/HashedRelationMetricsBenchmark.scala

+        map.optimize()
+
+        val threads = (0 to 100).map { _ =>
+          val thread = new Thread {


So, is this the real difference from the AggregateBenchmark.LongToUnsafeRowMap benchmark case?

I think multi-threading is the key here.

Yea, here we focus on multiple threads reading the same LongToUnsafeRowMap.
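
The access pattern discussed in this thread, stripped of Spark, can be sketched as below. This is not the benchmark added in this PR: `PlainCounter`, `AdderCounter`, `runReaders`, and the thread and iteration counts are hypothetical stand-ins, and the sketch only illustrates many threads bumping one shared metric field on every lookup.

```scala
import java.util.concurrent.atomic.LongAdder

object SharedMetricSketch {
  // Hypothetical stand-ins for a map that bumps a lookup metric on every read.
  final class PlainCounter {
    var numLookups: Long = 0L                 // plain, non-volatile field
    def lookup(): Unit = numLookups += 1      // racy read-modify-write when shared
  }
  final class AdderCounter {
    val numLookups = new LongAdder            // striped counter from java.util.concurrent.atomic
    def lookup(): Unit = numLookups.increment()
  }

  // Start numThreads threads that each call body iters times; return elapsed ms.
  def runReaders(numThreads: Int, iters: Int)(body: () => Unit): Long = {
    val threads = (0 until numThreads).map { _ =>
      new Thread(new Runnable {
        override def run(): Unit = {
          var i = 0
          while (i < iters) { body(); i += 1 }
        }
      })
    }
    val start = System.nanoTime()
    threads.foreach(_.start())
    threads.foreach(_.join())
    (System.nanoTime() - start) / 1000000L
  }

  def main(args: Array[String]): Unit = {
    val plain = new PlainCounter
    val adder = new AdderCounter
    val plainMs = runReaders(8, 1000000)(() => plain.lookup())
    val adderMs = runReaders(8, 1000000)(() => adder.lookup())
    // The plain count may fall short of 8000000 because increments can be lost.
    println(s"plain: $plainMs ms, count=${plain.numLookups}")
    println(s"adder: $adderMs ms, count=${adder.numLookups.sum()}")
  }
}
```

Timings from a toy loop like this depend heavily on the JIT and the machine, so they may not reproduce the ordering reported for the real benchmark.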

cloud-fan · 2018-12-12T02:08:21Z

...src/test/scala/org/apache/spark/sql/execution/benchmark/HashedRelationMetricsBenchmark.scala

+ *      Results will be written to "benchmarks/HashedRelationMetricsBenchmark-results.txt".
+ * }}}
+ */
+object HashedRelationMetricsBenchmark extends SqlBasedBenchmark {


To match the real case, shall we benchmark BytesToBytesMap instead of LongToUnsafeRowMap? The real case is that we have one LongToUnsafeRowMap for each thread, but they share the same BytesToBytesMap.

UnsafeHashedRelation uses BytesToBytesMap. LongHashedRelation uses LongToUnsafeRowMap. The real case is we have one LongHashedRelation for each thread and they share the same LongToUnsafeRowMap.

oh yes I got messed up :P

viirya · 2018-12-12T02:21:56Z

> The revert PR is already merged, so we should revert the revert PR and run the benchmark again, and post the results in the comment.

I've posted the benchmarks for the original master, before the revert PR, in the previous comment. I think that is what you asked for?

cloud-fan · 2018-12-12T02:42:54Z

Future reviewers may not know the context. When they see "Using Long variables (master):", they will be confused, as the master branch before your PR is not that.

viirya · 2018-12-12T03:17:00Z

> Future reviewers may not know the context. When they see "Using Long variables (master):", they will be confused, as the master branch before your PR is not that.

If so, let me update the previous comment.

viirya · 2018-12-14T01:16:38Z

@cloud-fan Is this ready to merge?

cloud-fan · 2018-12-14T02:49:58Z

thanks, merging to master!

## What changes were proposed in this pull request?

Regarding the performance issue of SPARK-26155, it reports the issue on TPC-DS. I think it is better to add a benchmark for `LongToUnsafeRowMap` which is the root cause of performance regression.

It can be easier to show performance difference between different metric implementations in `LongToUnsafeRowMap`.

## How was this patch tested?

Manually run added benchmark.

Closes apache#23284 from viirya/SPARK-26337.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

Add benchmark for LongToUnsafeRowMap.

fa70205

cloud-fan reviewed Dec 11, 2018

...src/test/scala/org/apache/spark/sql/execution/benchmark/HashedRelationMetricsBenchmark.scala

Add benchmark result.

cdbae0a

viirya force-pushed the SPARK-26337 branch from ccb6fed to cdbae0a on December 11, 2018 13:10

viirya added 2 commits December 11, 2018 22:11

Merge remote-tracking branch 'upstream/master' into SPARK-26337

8ee6fc3

Update benchmark result.

723b27c

dongjoon-hyun reviewed Dec 11, 2018

cloud-fan reviewed Dec 12, 2018

asfgit closed this in 93139af Dec 14, 2018

viirya deleted the SPARK-26337 branch December 27, 2023 18:22
