forked from apache/spark
[SQL][TEST] Re-run collation benchmark
### What changes were proposed in this pull request?
Re-running the collation benchmark with two modifications:
- UTF8_BINARY_LCASE has been renamed to UTF8_LCASE in apache#46924
- UTF8_BINARY should appear first in the collation benchmark results, so performance is relative to it

### Why are the changes needed?
We've changed the meaning of LCASE collation in Spark, and also modified how equality checks / hashing / expressions work with this collation, so we need to re-run the benchmarks and identify areas of improvement.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#47030 from uros-db/collation-benchmarks.

Authored-by: Uros Bojanic <157381213+uros-db@users.noreply.github.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
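To illustrate why the row order matters, here is a minimal sketch of how the `Relative` column can be reproduced. It assumes `Relative` is the first row's best time divided by each row's best time, which matches the numbers in this commit but is not stated in the PR text. The helper name `relative_to_first` is hypothetical; the timings are the `Best Time(ms)` values from the JDK 21 `equalsFunction` results, before and after the re-run.

```python
# Best Time(ms) values copied from the JDK 21 equalsFunction results in this commit.
# Dict order is the row order; the first row is treated as the baseline.
old_rows = {
    "UTF8_BINARY_LCASE": 2948,
    "UNICODE": 2040,
    "UTF8_BINARY": 2043,
    "UNICODE_CI": 16318,
}
new_rows = {
    "UTF8_BINARY": 1355,
    "UTF8_LCASE": 4983,
    "UNICODE": 18212,
    "UNICODE_CI": 17568,
}


def relative_to_first(rows):
    """Hypothetical helper: express each row as a multiple of the first row's speed.

    Assumption: Relative = baseline best time / row best time, printed with one decimal.
    """
    baseline = next(iter(rows.values()))
    return {name: f"{baseline / best_ms:.1f}X" for name, best_ms in rows.items()}


old_relative = relative_to_first(old_rows)
new_relative = relative_to_first(new_rows)

for name, rel in new_relative.items():
    print(f"{name:<12} {rel}")
```

With UTF8_BINARY_LCASE listed first, every other collation was measured against the lowercase collation; with UTF8_BINARY first, each value reads directly as a slowdown relative to plain binary comparison.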
1 parent b015d73, commit aaaf90d
Showing 3 changed files with 61 additions and 61 deletions.
@@ -1,54 +1,54 @@
-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - equalsFunction:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                      2948          2958         13        0.0      29483.6      1.0X
-UNICODE                                                2040          2042          3        0.0      20396.6      1.4X
-UTF8_BINARY                                            2043          2043          0        0.0      20426.3      1.4X
-UNICODE_CI                                            16318         16338         28        0.0     163178.4      0.2X
+UTF8_BINARY                                            1355          1358          4        0.1      13551.1      1.0X
+UTF8_LCASE                                             4983          4984          3        0.0      49826.4      0.3X
+UNICODE                                               18212         18220         12        0.0     182120.9      0.1X
+UNICODE_CI                                            17568         17577         14        0.0     175677.2      0.1X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - compareFunction:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                      3227          3228          1        0.0      32272.1      1.0X
-UNICODE                                               16637         16643          9        0.0     166367.7      0.2X
-UTF8_BINARY                                            3132          3137          7        0.0      31319.2      1.0X
-UNICODE_CI                                            17816         17829         18        0.0     178162.4      0.2X
+UTF8_BINARY                                            1772          1774          3        0.1      17722.3      1.0X
+UTF8_LCASE                                             4365          4365          0        0.0      43649.6      0.4X
+UNICODE                                               16538         16544          9        0.0     165375.5      0.1X
+UNICODE_CI                                            16296         16305         12        0.0     162961.9      0.1X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - hashFunction:     Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                      4824          4824          0        0.0      48243.7      1.0X
-UNICODE                                               69416         69475         84        0.0     694158.3      0.1X
-UTF8_BINARY                                            3806          3808          2        0.0      38062.8      1.3X
-UNICODE_CI                                            60943         60975         45        0.0     609426.2      0.1X
+UTF8_BINARY                                            7279          7280          1        0.0      72791.2      1.0X
+UTF8_LCASE                                            18538         18543          6        0.0     185381.0      0.4X
+UNICODE                                               71514         71520          8        0.0     715144.6      0.1X
+UNICODE_CI                                            60488         60488          0        0.0     604880.9      0.1X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - contains:         Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                     11979         11980          1        0.0     119790.4      1.0X
-UNICODE                                                6469          6474          7        0.0      64694.8      1.9X
-UTF8_BINARY                                            7253          7253          1        0.0      72528.3      1.7X
-UNICODE_CI                                           319124        319881       1070        0.0    3191244.0      0.0X
+UTF8_BINARY                                            7516          7519          4        0.0      75162.9      1.0X
+UTF8_LCASE                                           120330        120338         12        0.0    1203299.2      0.1X
+UNICODE                                              371784        371946        228        0.0    3717840.7      0.0X
+UNICODE_CI                                           427401        427547        207        0.0    4274009.0      0.0X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - startsWith:       Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                     11584         11595         15        0.0     115841.4      1.0X
-UNICODE                                                6155          6156          2        0.0      61548.7      1.9X
-UTF8_BINARY                                            6979          6982          5        0.0      69785.6      1.7X
-UNICODE_CI                                           318228        318726        705        0.0    3182275.2      0.0X
+UTF8_BINARY                                            6504          6507          3        0.0      65044.6      1.0X
+UTF8_LCASE                                            60331         60359         40        0.0     603313.9      0.1X
+UNICODE                                              369394        369404         13        0.0    3693943.0      0.0X
+UNICODE_CI                                           427382        427421         55        0.0    4273819.7      0.0X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - endsWith:         Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                     11655         11664         12        0.0     116552.8      1.0X
-UNICODE                                                6235          6239          5        0.0      62350.8      1.9X
-UTF8_BINARY                                            7066          7069          5        0.0      70658.1      1.6X
-UNICODE_CI                                           313515        313999        685        0.0    3135149.1      0.0X
+UTF8_BINARY                                            6600          6601          1        0.0      66002.7      1.0X
+UTF8_LCASE                                            58723         58751         39        0.0     587230.1      0.1X
+UNICODE                                              379668        379789        172        0.0    3796677.7      0.0X
+UNICODE_CI                                           437119        437194        106        0.0    4371189.5      0.0X

@@ -1,54 +1,54 @@
-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - equalsFunction:   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                      3571          3576          7        0.0      35708.8      1.0X
-UNICODE                                                2235          2240          7        0.0      22349.2      1.6X
-UTF8_BINARY                                            2237          2242          6        0.0      22371.7      1.6X
-UNICODE_CI                                            18733         18817        118        0.0     187333.8      0.2X
+UTF8_BINARY                                            1370          1370          1        0.1      13698.4      1.0X
+UTF8_LCASE                                             4836          4836          0        0.0      48359.5      0.3X
+UNICODE                                               19239         19271         45        0.0     192391.8      0.1X
+UNICODE_CI                                            18936         18954         25        0.0     189362.4      0.1X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - compareFunction:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                      4260          4290         41        0.0      42602.6      1.0X
-UNICODE                                               19536         19624        124        0.0     195360.2      0.2X
-UTF8_BINARY                                            3582          3612         43        0.0      35818.5      1.2X
-UNICODE_CI                                            20381         20454        103        0.0     203814.1      0.2X
+UTF8_BINARY                                            1726          1727          1        0.1      17260.4      1.0X
+UTF8_LCASE                                             6293          6304         16        0.0      62927.1      0.3X
+UNICODE                                               18677         18679          4        0.0     186768.3      0.1X
+UNICODE_CI                                            18488         18504         23        0.0     184879.6      0.1X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - hashFunction:     Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                      7347          7349          3        0.0      73467.1      1.0X
-UNICODE                                               73462         73608        206        0.0     734623.2      0.1X
-UTF8_BINARY                                            5775          5815         57        0.0      57746.0      1.3X
-UNICODE_CI                                            57543         57619        108        0.0     575425.2      0.1X
+UTF8_BINARY                                            3028          3029          1        0.0      30283.4      1.0X
+UTF8_LCASE                                            19773         19830         81        0.0     197726.4      0.2X
+UNICODE                                               68565         68594         41        0.0     685646.9      0.0X
+UNICODE_CI                                            53100         53101          2        0.0     530996.0      0.1X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - contains:         Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                     15415         15424         13        0.0     154147.1      1.0X
-UNICODE                                                8091          8108         25        0.0      80907.9      1.9X
-UTF8_BINARY                                            8964          8979         21        0.0      89643.5      1.7X
-UNICODE_CI                                           469123        474822       8060        0.0    4691227.7      0.0X
+UTF8_BINARY                                            7024          7026          3        0.0      70244.6      1.0X
+UTF8_LCASE                                           118693        118703         15        0.0    1186926.5      0.1X
+UNICODE                                              385409        386299       1257        0.0    3854093.7      0.0X
+UNICODE_CI                                           434618        435527       1285        0.0    4346181.0      0.0X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - startsWith:       Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                     13064         13080         23        0.0     130635.2      1.0X
-UNICODE                                                6836          6851         22        0.0      68360.1      1.9X
-UTF8_BINARY                                            7693          7719         36        0.0      76933.9      1.7X
-UNICODE_CI                                           488919        495530       9349        0.0    4889190.5      0.0X
+UTF8_BINARY                                            6069          6090         29        0.0      60691.9      1.0X
+UTF8_LCASE                                            61809         61828         27        0.0     618094.5      0.1X
+UNICODE                                              370523        371729       1705        0.0    3705229.7      0.0X
+UNICODE_CI                                           435805        436945       1612        0.0    4358051.5      0.0X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - endsWith:         Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ----------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                     13097         13112         21        0.0     130970.4      1.0X
-UNICODE                                                6960          6985         34        0.0      69603.9      1.9X
-UTF8_BINARY                                            7766          7768          3        0.0      77663.5      1.7X
-UNICODE_CI                                           456956        470733      19485        0.0    4569556.7      0.0X
+UTF8_BINARY                                            6725          6732         10        0.0      67247.9      1.0X
+UTF8_LCASE                                            54990         55010         28        0.0     549896.0      0.1X
+UNICODE                                              380872        383258       3375        0.0    3808722.0      0.0X
+UNICODE_CI                                           443911        444111        283        0.0    4439112.3      0.0X

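The `Rate(M/s)` column prints as `0.0` or `0.1` throughout, which can look like missing data. The sketch below shows how those values appear to follow from `Per Row(ns)`. It assumes each benchmark case processes 100,000 rows; that count is inferred from the ratio of `Best Time(ms)` to `Per Row(ns)` in the results above and is not stated in the commit. The constant `ASSUMED_ROWS` and both helper names are hypothetical.

```python
# Assumption: 100,000 rows per case, inferred from the tables
# (e.g. 13551.1 ns/row * 100,000 rows is about 1355 ms), not stated in the commit.
ASSUMED_ROWS = 100_000


def best_time_ms(per_row_ns, rows=ASSUMED_ROWS):
    """Total time in milliseconds implied by a per-row cost in nanoseconds."""
    return per_row_ns * rows / 1e6


def rate_millions_per_sec(per_row_ns):
    """Rows per second, in millions: 1e9 ns/s divided by ns/row, divided by 1e6."""
    return 1e3 / per_row_ns


# Per Row(ns) values from the new JDK 21 equalsFunction rows.
for name, per_row in [("UTF8_BINARY", 13551.1), ("UTF8_LCASE", 49826.4)]:
    print(f"{name:<12} {round(best_time_ms(per_row)):>6} ms  {rate_millions_per_sec(per_row):.1f} M/s")
```

Under this assumption, any case slower than 20,000 ns per row has a rate below 0.05 M/s and rounds to `0.0`, so `Per Row(ns)` and `Relative` are the columns to compare across collations.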