Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-49962][SQL] Simplify AbstractStringTypes class hierarchy #48459

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

stefankandic
Copy link
Contributor

@stefankandic stefankandic commented Oct 14, 2024

What changes were proposed in this pull request?

Simplifying the AbstractStringType hierarchy.

Why are the changes needed?

The addition of trim-sensitive collation (#48336) highlighted the complexity of extending the existing AbstractStringType structure. Besides adding a new parameter to all types inheriting from AbstractStringType, it caused changing the logic of every subclass as well as changing the name of a derived class StringTypeAnyCollation into StringTypeWithCaseAccentSensitivity which could again be subject to change if we keep adding new specifiers.

Looking ahead, the introduction of support for indeterminate collation would further complicate these types. To address this, the proposed changes simplify the design by consolidating common logic into a single base class. This base class will handle core functionality such as trim or indeterminate collation, while a derived class, StringTypeWithCollation (previously awkwardly called StringTypeWithCaseAccentSensitivity), will manage collation specifiers.

This approach allows for easier future extensions: fundamental checks can be handled in the base class, while any new specifiers can be added as optional fields in StringTypeWithCollation.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

With existing tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Oct 14, 2024
*/
abstract class AbstractStringType(private[sql] val supportsTrimCollation: Boolean = false)
abstract class AbstractStringType(supportsTrimCollation: Boolean = false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you remove private[sql] val? let's keep it private unless there's a need

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it need to private? Looks like other classes inheriting from AbstractDataType like AbstractMapType don't have their members private either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we keep it private to SQL in all of them? no need to expose stuff unless there's a need to expose stuff

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sql/internal is already a private package so we won;t need to mark it as private[sql]. We should add a package.scala and document so, see https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/package.scala

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@uros-db uros-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the changes, left a couple of comments

otherwise lgtm, let's go for it

@uros-db
Copy link
Contributor

uros-db commented Oct 14, 2024

also, let's just mention in the PR description that we're doing some renaming (which is most of the code changes in this PR I'd say)

Copy link
Contributor

@jovanpavl-db jovanpavl-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants