[SQL][TEST][FOLLOWUP] Re-run collation benchmark (NonASCII)
### What changes were proposed in this pull request?
Following up on #47030, re-running the collation benchmark for NonASCII.

### Why are the changes needed?
We've changed the meaning of the LCASE collation in Spark and modified how equality checks, hashing, and expressions work with this collation, so we need to re-run the benchmarks and identify areas for improvement.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #47054 from uros-db/collation-benchmarks-nonascii.

Authored-by: Uros Bojanic <157381213+uros-db@users.noreply.github.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
uros-db authored and cloud-fan committed Jun 24, 2024
1 parent 88cc153 commit 31fa9d8
Showing 2 changed files with 60 additions and 60 deletions.
60 changes: 30 additions & 30 deletions sql/core/benchmarks/CollationNonASCIIBenchmark-jdk21-results.txt
@@ -1,54 +1,54 @@
-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - equalsFunction: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 --------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 8454 8463 13 0.0 211345.8 1.0X
-UNICODE 380 383 3 0.1 9499.1 22.2X
-UTF8_BINARY 381 386 5 0.1 9536.4 22.2X
-UNICODE_CI 6425 6426 1 0.0 160615.2 1.3X
+UTF8_BINARY 177 178 2 0.2 4421.6 1.0X
+UTF8_LCASE 7165 7178 19 0.0 179129.7 0.0X
+UNICODE 5601 5607 8 0.0 140030.5 0.0X
+UNICODE_CI 5389 5402 19 0.0 134734.8 0.0X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - compareFunction: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ---------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 8498 8526 41 0.0 212443.7 1.0X
-UNICODE 6392 6393 1 0.0 159798.1 1.3X
-UTF8_BINARY 424 424 0 0.1 10607.1 20.0X
-UNICODE_CI 6766 6780 20 0.0 169149.1 1.3X
+UTF8_BINARY 307 310 4 0.1 7684.9 1.0X
+UTF8_LCASE 6668 6673 6 0.0 166712.0 0.0X
+UNICODE 5135 5138 4 0.0 128375.9 0.1X
+UNICODE_CI 5074 5079 7 0.0 126857.9 0.1X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - hashFunction: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 4125 4126 2 0.0 103121.0 1.0X
-UNICODE 15723 15761 54 0.0 393076.4 0.3X
-UTF8_BINARY 565 566 1 0.1 14136.1 7.3X
-UNICODE_CI 13050 13059 13 0.0 326244.9 0.3X
+UTF8_BINARY 382 383 1 0.1 9546.3 1.0X
+UTF8_LCASE 3302 3304 3 0.0 82540.3 0.1X
+UNICODE 15198 15221 33 0.0 379949.7 0.0X
+UNICODE_CI 11761 11763 3 0.0 294018.9 0.0X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - contains: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 9884 9894 15 0.0 247091.2 1.0X
-UNICODE 1346 1348 3 0.0 33638.2 7.3X
-UTF8_BINARY 1474 1482 11 0.0 36858.5 6.7X
-UNICODE_CI 63238 63273 50 0.0 1580950.4 0.2X
+UTF8_BINARY 1343 1344 1 0.0 33576.2 1.0X
+UTF8_LCASE 34362 34362 1 0.0 859049.8 0.0X
+UNICODE 70951 70968 24 0.0 1773767.5 0.0X
+UNICODE_CI 80623 80806 258 0.0 2015572.4 0.0X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - startsWith: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 9740 9748 12 0.0 243489.5 1.0X
-UNICODE 1106 1107 0 0.0 27661.7 8.8X
-UTF8_BINARY 1248 1251 4 0.0 31212.2 7.8X
-UNICODE_CI 66333 66480 209 0.0 1658317.1 0.1X
+UTF8_BINARY 1054 1065 15 0.0 26353.2 1.0X
+UTF8_LCASE 19162 19185 31 0.0 479061.4 0.1X
+UNICODE 70920 70969 69 0.0 1773010.7 0.0X
+UNICODE_CI 80608 80637 42 0.0 2015195.3 0.0X

-OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - endsWith: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 9694 9704 14 0.0 242354.0 1.0X
-UNICODE 1103 1105 2 0.0 27581.6 8.8X
-UTF8_BINARY 1259 1261 3 0.0 31471.6 7.7X
-UNICODE_CI 66582 68209 2301 0.0 1664562.5 0.1X
+UTF8_BINARY 1085 1085 1 0.0 27116.4 1.0X
+UTF8_LCASE 18171 18194 32 0.0 454278.5 0.1X
+UNICODE 76434 76440 8 0.0 1910849.8 0.0X
+UNICODE_CI 85673 85704 44 0.0 2141822.3 0.0X

60 changes: 30 additions & 30 deletions sql/core/benchmarks/CollationNonASCIIBenchmark-results.txt
@@ -1,54 +1,54 @@
-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - equalsFunction: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 --------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 8632 8634 2 0.0 215809.9 1.0X
-UNICODE 386 392 4 0.1 9650.9 22.4X
-UTF8_BINARY 387 394 5 0.1 9665.3 22.3X
-UNICODE_CI 6364 6368 6 0.0 159101.1 1.4X
+UTF8_BINARY 133 133 1 0.3 3320.1 1.0X
+UTF8_LCASE 7715 7735 28 0.0 192878.1 0.0X
+UNICODE 5509 5517 11 0.0 137725.8 0.0X
+UNICODE_CI 5585 5586 1 0.0 139631.8 0.0X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - compareFunction: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ---------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 8635 8645 14 0.0 215873.3 1.0X
-UNICODE 6335 6342 10 0.0 158365.8 1.4X
-UTF8_BINARY 401 402 0 0.1 10034.6 21.5X
-UNICODE_CI 6443 6444 1 0.0 161077.0 1.3X
+UTF8_BINARY 446 448 3 0.1 11161.3 1.0X
+UTF8_LCASE 7237 7250 17 0.0 180932.4 0.1X
+UNICODE 5734 5734 1 0.0 143338.0 0.1X
+UNICODE_CI 5699 5700 1 0.0 142483.5 0.1X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - hashFunction: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 4479 4487 12 0.0 111968.7 1.0X
-UNICODE 16070 16100 42 0.0 401752.8 0.3X
-UTF8_BINARY 828 830 2 0.0 20711.4 5.4X
-UNICODE_CI 13330 13344 20 0.0 333246.3 0.3X
+UTF8_BINARY 413 413 0 0.1 10316.4 1.0X
+UTF8_LCASE 3407 3407 0 0.0 85177.8 0.1X
+UNICODE 15026 15046 28 0.0 375646.1 0.0X
+UNICODE_CI 12181 12204 32 0.0 304526.9 0.0X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - contains: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 10200 10208 12 0.0 255002.0 1.0X
-UNICODE 1191 1193 3 0.0 29782.0 8.6X
-UTF8_BINARY 1326 1326 0 0.0 33160.9 7.7X
-UNICODE_CI 63362 63434 101 0.0 1584059.9 0.2X
+UTF8_BINARY 1217 1217 1 0.0 30413.5 1.0X
+UTF8_LCASE 34426 34438 17 0.0 860656.8 0.0X
+UNICODE 68095 68202 151 0.0 1702375.3 0.0X
+UNICODE_CI 77954 78229 388 0.0 1948859.6 0.0X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - startsWith: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 10467 10496 41 0.0 261666.3 1.0X
-UNICODE 1140 1141 1 0.0 28499.5 9.2X
-UTF8_BINARY 1260 1261 1 0.0 31496.2 8.3X
-UNICODE_CI 60515 60630 163 0.0 1512877.3 0.2X
+UTF8_BINARY 960 961 2 0.0 23990.6 1.0X
+UTF8_LCASE 18582 18599 24 0.0 464545.8 0.1X
+UNICODE 68340 68426 121 0.0 1708511.0 0.0X
+UNICODE_CI 79017 79051 48 0.0 1975424.4 0.0X

-OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1018-azure
+OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Linux 6.5.0-1022-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - endsWith: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE 10357 10357 0 0.0 258921.6 1.0X
-UNICODE 1107 1109 2 0.0 27681.9 9.4X
-UTF8_BINARY 1260 1262 3 0.0 31509.2 8.2X
-UNICODE_CI 67119 67120 1 0.0 1677968.5 0.2X
+UTF8_BINARY 1083 1083 0 0.0 27073.4 1.0X
+UTF8_LCASE 18868 18879 15 0.0 471690.7 0.1X
+UNICODE 73435 73580 205 0.0 1835881.8 0.0X
+UNICODE_CI 83324 83416 130 0.0 2083108.7 0.0X
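
A note on reading the updated tables: UTF8_BINARY is now the first row and therefore the 1.0X baseline, so most other collations show up as 0.0X or 0.1X in the Relative column, which hides how far apart they actually are. The sketch below is not part of this commit. It assumes Relative is the baseline's time divided by the row's time, rounded to one decimal (this matches the figures above), and recomputes an unrounded slowdown factor from the Per Row(ns) values of the JDK 21 equalsFunction results. The function name `slowdown_vs_baseline` is made up for illustration.

```python
# Per Row(ns) values copied from the updated JDK 21 equalsFunction table above.
per_row_ns = {
    "UTF8_BINARY": 4421.6,
    "UTF8_LCASE": 179129.7,
    "UNICODE": 140030.5,
    "UNICODE_CI": 134734.8,
}


def slowdown_vs_baseline(results, baseline="UTF8_BINARY"):
    """Return how many times slower each case is than the baseline case."""
    base = results[baseline]
    return {name: ns / base for name, ns in results.items()}


slowdowns = slowdown_vs_baseline(per_row_ns)
for name, factor in slowdowns.items():
    # Assumption: the Relative column is the inverse of this factor,
    # printed with one decimal place.
    relative = 1.0 / factor
    print(f"{name:12s} {factor:6.1f}x slower (Relative {relative:.1f}X)")
```

Under that assumption, any collation more than 20 times slower than the baseline is printed as 0.0X, so the Per Row(ns) column is the more useful one for comparing the non-baseline rows with each other.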
