-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-49962][SQL] Simplify AbstractStringTypes class hierarchy #48459
base: master
Are you sure you want to change the base?
Conversation
4218c95
to
4183e5c
Compare
sql/api/src/main/scala/org/apache/spark/sql/internal/types/AbstractStringType.scala
Outdated
Show resolved
Hide resolved
sql/api/src/main/scala/org/apache/spark/sql/internal/types/AbstractStringType.scala
Outdated
Show resolved
Hide resolved
*/ | ||
abstract class AbstractStringType(private[sql] val supportsTrimCollation: Boolean = false) | ||
abstract class AbstractStringType(supportsTrimCollation: Boolean = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you remove private[sql] val
? let's keep it private unless there's a need
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it need to private? Looks like other classes inheriting from AbstractDataType like AbstractMapType don't have their members private either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we keep it private to SQL in all of them? no need to expose stuff unless there's a need to expose stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sql/internal
is already a private package so we won;t need to mark it as private[sql]
. We should add a package.scala
and document so, see https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/package.scala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the changes, left a couple of comments
otherwise lgtm, let's go for it
also, let's just mention in the PR description that we're doing some renaming (which is most of the code changes in this PR I'd say) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks!
What changes were proposed in this pull request?
Simplifying the AbstractStringType hierarchy.
Why are the changes needed?
The addition of trim-sensitive collation (#48336) highlighted the complexity of extending the existing AbstractStringType structure. Besides adding a new parameter to all types inheriting from AbstractStringType, it caused changing the logic of every subclass as well as changing the name of a derived class StringTypeAnyCollation into StringTypeWithCaseAccentSensitivity which could again be subject to change if we keep adding new specifiers.
Looking ahead, the introduction of support for indeterminate collation would further complicate these types. To address this, the proposed changes simplify the design by consolidating common logic into a single base class. This base class will handle core functionality such as trim or indeterminate collation, while a derived class, StringTypeWithCollation (previously awkwardly called StringTypeWithCaseAccentSensitivity), will manage collation specifiers.
This approach allows for easier future extensions: fundamental checks can be handled in the base class, while any new specifiers can be added as optional fields in StringTypeWithCollation.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
With existing tests.
Was this patch authored or co-authored using generative AI tooling?
No.