[SPARK-47625][SQL] Addition of Indeterminate Collation Support #46004

Closed
wants to merge 19 commits into from

mihailom-db
Contributor

What changes were proposed in this pull request?

The INDETERMINATE_COLLATION error should only be thrown on comparison operations and when storing data in memory; we should be able to combine different implicit collations for certain operations, such as concat and possibly others in the future.
This is why we have to add another predefined collation id, INDETERMINATE_COLLATION_ID, which means that the result is a combination of conflicting non-default implicit collations. For now its value would be -1, so that it fails if it ever reaches the CollatorFactory.
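
The idea above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: the object name, `combineImplicit`, and `requireDeterminate` are hypothetical helpers, the only value taken from the description is the id -1, and the special handling of the default collation is deliberately not modeled.

```scala
// Illustrative sketch only: none of these names are Spark's actual API.
object IndeterminateCollationSketch {
  // The PR description states the indeterminate id would be -1.
  val IndeterminateCollationId: Int = -1

  // Hypothetical: derive the collation id of a concat result from the
  // implicit collation ids of its two inputs. Conflicting ids do not
  // fail here; they yield the indeterminate marker instead.
  def combineImplicit(left: Int, right: Int): Int =
    if (left == right) left else IndeterminateCollationId

  // Hypothetical: the check a comparison (or a store) would perform
  // before using the collation. Only at this point is the error raised.
  def requireDeterminate(collationId: Int): Int = {
    if (collationId == IndeterminateCollationId) {
      throw new IllegalStateException(
        "INDETERMINATE_COLLATION: operands have conflicting implicit collations")
    }
    collationId
  }
}
```

Under these assumptions, concatenating inputs with ids 1 and 2 succeeds and carries the id -1 forward, while passing that result to `requireDeterminate` throws. Deferring the failure this way is what lets concat accept mixed implicit collations.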

Why are the changes needed?

PostgreSQL supports concatenation between columns with different collations, and Spark should follow this behaviour.

Does this PR introduce any user-facing change?

Yes. It adds a new indeterminate collation error.

How was this patch tested?

Tests in CollationSuite.

Was this patch authored or co-authored using generative AI tooling?

No

# Conflicts:
#	sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala
# Conflicts:
#	sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala
#	sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala
#	sql/core/src/test/scala/org/apache/spark/sql/CollationSuite.scala
# Conflicts:
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala
# Conflicts:
#	sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala
#	sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala
#	sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Aug 23, 2024
@github-actions github-actions bot closed this Aug 24, 2024