[SPARK-47411][SQL] Support StringInstr & FindInSet functions to work with collated strings #45643
@cloud-fan @uros-db @mihailom-db can you take a look at these changes, please?
LGTM @cloud-fan please review
# Conflicts:
#   common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java
#   sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala
LGTM. Please change the PR title to match the functions you implemented and the JIRA ticket.
Thanks, merging to master!
…with collated strings

### What changes were proposed in this pull request?
Extend built-in string functions to support non-binary, non-lowercase collation for: instr & find_in_set.

### Why are the changes needed?
Update collation support for built-in string functions in Spark.

### Does this PR introduce _any_ user-facing change?
Yes, users should now be able to use COLLATE within arguments for built-in string functions INSTR and FIND_IN_SET in Spark SQL queries, using non-binary collations such as UNICODE_CI.

### How was this patch tested?
Unit tests for queries using "collate" (CollationSuite).

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#45643 from miland-db/miland-db/substr-functions.

Authored-by: Milan Dankovic <milan.dankovic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
Extend built-in string functions to support non-binary, non-lowercase collation for: instr & find_in_set.
Why are the changes needed?
Update collation support for built-in string functions in Spark.
Does this PR introduce any user-facing change?
Yes, users should now be able to use COLLATE within arguments for built-in string functions INSTR and FIND_IN_SET in Spark SQL queries, using non-binary collations such as UNICODE_CI.
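To make the user-facing change concrete, here is a minimal, self-contained Java sketch of what INSTR and FIND_IN_SET are expected to return when matching ignores case. It is not the code from this PR: the class name `CollationSketch` and both method names are hypothetical, and JDK case-insensitive comparison (`regionMatches`, `equalsIgnoreCase`) stands in for a collation such as UNICODE_CI. It assumes the usual semantics of the two functions: a 1-based result, and 0 when nothing matches.

```java
public final class CollationSketch {
    private CollationSketch() {}

    /**
     * Hypothetical stand-in for INSTR under a case-insensitive collation.
     * Returns the 1-based position of the first match of substr in str, or 0 if absent.
     * Positions are counted in Java chars, so this is only accurate for BMP text.
     */
    public static int instr(String str, String substr) {
        int last = str.length() - substr.length();
        for (int i = 0; i <= last; i++) {
            // regionMatches(true, ...) compares the two regions ignoring case.
            if (str.regionMatches(true, i, substr, 0, substr.length())) {
                return i + 1;
            }
        }
        return 0;
    }

    /**
     * Hypothetical stand-in for FIND_IN_SET under a case-insensitive collation.
     * Returns the 1-based index of word in the comma-delimited set, or 0 if it is
     * absent or if word itself contains a comma.
     */
    public static int findInSet(String word, String set) {
        if (word.indexOf(',') >= 0) {
            return 0;
        }
        // A limit of -1 keeps trailing empty items in the list.
        String[] items = set.split(",", -1);
        for (int i = 0; i < items.length; i++) {
            if (items[i].equalsIgnoreCase(word)) {
                return i + 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(instr("Spark SQL", "sql"));          // 7
        System.out.println(findInSet("AB", "abc,b,ab,c,def"));  // 3
    }
}
```

With a binary collation, both calls in `main` would find no match, because `sql` and `AB` differ in case from the text they are searched in.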
How was this patch tested?
Unit tests for queries using "collate" (CollationSuite).
Was this patch authored or co-authored using generative AI tooling?
No