Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core #29085
[SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core #29085
Changes from 4 commits
dfcec3c
e53744b
a693722
5bfa669
ec754e2
a2b12a1
c3dc66b
cb19b7b
ce8a0a5
d37ef86
fce25ff
f3e05c6
5c049b5
04684a8
6811721
fc96e1f
ed901af
a6f1e7d
e367c05
e74d04c
4ef4d76
22d223c
72b2155
a3628ac
e16c136
858f4e5
cfecc90
43d0f24
9e18fa8
9537d9b
5227441
670f21b
ce8184a
4615733
08d97c8
33923b6
f5ec656
7916d72
a769aa7
d93f7fa
be80c27
7f3cff8
03d3409
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the implementation of
processIterator
is pretty similar to the Hive one. Could we share the code between them more?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, working on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this class so big while it doesn't support hive serde?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this , I think we should change this after decide if need to implement serde in script of sql/core
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation here offers a very limited support for
ROW FORMAT DELIMITED
format - it does not rely on a Hive's SerDe class.A complete implementation (SerDes class for ROW FORMAT DELIMITED) can be added later and will live in the same folder.
#29085 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maropu For your confuse #29085 (comment),
CalenderIntervalType/ArrayType/MapType/StructType as input of hive default serde will throw error, won't throw error in spark default way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so, why does the tests pass in
HiveScriptTransformationSuite
?#29085 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it directly build SparkPlan, don't use sql parser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, we already have end-2-end tests for the unspported cases in the hive side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check this here? Seems like it has been checked in https://github.com/apache/spark/pull/29085/files#diff-9847f5cef7cf7fbc5830fbc6b779ee10R783-R784 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, don't need now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this condition
if ioschema.inputSerdeClass.nonEmpty || ioschema.outputSerdeClass.nonEmpty
now?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed and see #29085 (comment)