-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-48025][SQL][TESTS] Fix org.apache.spark.sql.execution.benchmark.DateTimeBenchmark #46261
Conversation
…k.DateTimeBenchmark
@@ -75,7 +81,7 @@ object DateTimeBenchmark extends SqlBasedBenchmark { | |||
doBenchmark(N, s"$dt + interval 1 month 2 day") | |||
} | |||
benchmark.addCase("date + interval(m, d, ms)") { _ => | |||
doBenchmark(N, s"$dt + interval 1 month 2 day 5 hour") | |||
doBenchmarkAnsiOff(N, s"$dt + interval 1 month 2 day 5 hour") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
illegal hour portion for ansi date add
@@ -161,7 +167,7 @@ object DateTimeBenchmark extends SqlBasedBenchmark { | |||
} | |||
val dateExpr = "cast(timestamp_seconds(id) as date)" | |||
Seq("year", "yyyy", "yy", "mon", "month", "mm").foreach { level => | |||
run(N, s"trunc $level", s"trunc('$level', $dateExpr)") | |||
run(N, s"trunc $level", s"trunc($dateExpr, '$level')") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the parameter order is wrong
@@ -171,7 +177,7 @@ object DateTimeBenchmark extends SqlBasedBenchmark { | |||
run(n, "to timestamp str", timestampStrExpr) | |||
run(n, "to_timestamp", s"to_timestamp($timestampStrExpr, $pattern)") | |||
run(n, "to_unix_timestamp", s"to_unix_timestamp($timestampStrExpr, $pattern)") | |||
val dateStrExpr = "concat('2019-01-', lpad(mod(id, 25), 2, '0'))" | |||
val dateStrExpr = "concat('2019-01-', lpad(mod(id, 25) + 1, 2, '0'))" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid invalid date value 2019-01-00
trunc year wholestage off 866 867 2 11.5 86.6 1.0X | ||
trunc year wholestage on 886 888 2 11.3 88.6 1.0X | ||
|
||
OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure | ||
OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure | ||
AMD EPYC 7763 64-Core Processor | ||
trunc yyyy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative | ||
------------------------------------------------------------------------------------------------------------------------ | ||
trunc yyyy wholestage off 301 305 5 33.2 30.1 1.0X | ||
trunc yyyy wholestage on 271 275 3 36.9 27.1 1.1X | ||
trunc yyyy wholestage off 882 883 1 11.3 88.2 1.0X | ||
trunc yyyy wholestage on 881 887 5 11.4 88.1 1.0X | ||
|
||
OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure | ||
OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure | ||
AMD EPYC 7763 64-Core Processor | ||
trunc yy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative | ||
------------------------------------------------------------------------------------------------------------------------ | ||
trunc yy wholestage off 295 295 0 33.9 29.5 1.0X | ||
trunc yy wholestage on 270 277 4 37.0 27.0 1.1X | ||
trunc yy wholestage off 875 877 3 11.4 87.5 1.0X | ||
trunc yy wholestage on 881 890 9 11.3 88.1 1.0X | ||
|
||
OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure | ||
OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure | ||
AMD EPYC 7763 64-Core Processor | ||
trunc mon: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative | ||
------------------------------------------------------------------------------------------------------------------------ | ||
trunc mon wholestage off 296 297 1 33.8 29.6 1.0X | ||
trunc mon wholestage on 275 278 2 36.4 27.5 1.1X | ||
trunc mon wholestage off 838 839 2 11.9 83.8 1.0X | ||
trunc mon wholestage on 833 839 5 12.0 83.3 1.0X | ||
|
||
OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure | ||
OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure | ||
AMD EPYC 7763 64-Core Processor | ||
trunc month: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative | ||
------------------------------------------------------------------------------------------------------------------------ | ||
trunc month wholestage off 304 308 6 32.9 30.4 1.0X | ||
trunc month wholestage on 276 280 4 36.2 27.6 1.1X | ||
trunc month wholestage off 839 841 4 11.9 83.9 1.0X | ||
trunc month wholestage on 837 839 2 12.0 83.7 1.0X | ||
|
||
OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure | ||
OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure | ||
AMD EPYC 7763 64-Core Processor | ||
trunc mm: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative | ||
------------------------------------------------------------------------------------------------------------------------ | ||
trunc mm wholestage off 297 298 1 33.6 29.7 1.0X | ||
trunc mm wholestage on 274 276 2 36.5 27.4 1.1X | ||
trunc mm wholestage off 846 847 1 11.8 84.6 1.0X | ||
trunc mm wholestage on 834 845 11 12.0 83.4 1.0X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The results are corrected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine but do we have this issue in other banchmarks too?
Thank you @HyukjinKwon. I have dispatched a CI task to audit, https://github.com/yaooqinn/spark/actions/runs/8866497258 |
Merged to master |
…k.DateTimeBenchmark ### What changes were proposed in this pull request? This PR fixes several issues in org.apache.spark.sql.execution.benchmark.DateTimeBenchmark - Misuse `trunc` function, a.k.a, the parameter order is wrong in reverse order - Some benchmarks not compatible with ANSI by default ### Why are the changes needed? restore benchmark cases ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? benchmark ### Was this patch authored or co-authored using generative AI tooling? no Closes apache#46261 from yaooqinn/SPARK-48025. Authored-by: Kent Yao <yao@apache.org> Signed-off-by: Kent Yao <yao@apache.org>
What changes were proposed in this pull request?
This PR fixes several issues in org.apache.spark.sql.execution.benchmark.DateTimeBenchmark
trunc
function, a.k.a, the parameter order is wrong in reverse orderWhy are the changes needed?
restore benchmark cases
Does this PR introduce any user-facing change?
no
How was this patch tested?
benchmark
Was this patch authored or co-authored using generative AI tooling?
no