[SPARK-12188] [SQL] Code refactoring and comment correction in Dataset APIs #10184
ok to test
Test build #47298 has finished for PR 10184 at commit
Test build #47300 has finished for PR 10184 at commit
/**
 * The encoder where the expressions used to construct an object from an input row have been
 * bound to the ordinals of the given schema.
Nit: I'm going to change this to say this [[Dataset]]'s output schema
I see. Thank you!
oh, actually i forgot :(
Let me add the change in a follow-up PR. : )
Thanks, I'm going to merge this to master and 1.6.
… APIs

This PR contains the following updates:

- Created a new private variable `boundTEncoder` that can be shared by multiple functions, `RDD`, `select` and `collect`.
- Replaced all the `queryExecution.analyzed` by the function call `logicalPlan`
- A few API comments are using wrong class names (e.g., `DataFrame`) or parameter names (e.g., `n`)
- A few API descriptions are wrong. (e.g., `mapPartitions`)

marmbrus rxin cloud-fan Could you take a look and check if they are appropriate? Thank you!

Author: gatorsmile <gatorsmile@gmail.com>

Closes #10184 from gatorsmile/datasetClean.

(cherry picked from commit 5d96a71)
Signed-off-by: Michael Armbrust <michael@databricks.com>
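This PR contains the following updates:

- Created a new private variable `boundTEncoder` that can be shared by multiple functions, `RDD`, `select` and `collect`.
- Replaced all the `queryExecution.analyzed` by the function call `logicalPlan`
- A few API comments are using wrong class names (e.g., `DataFrame`) or parameter names (e.g., `n`)
- A few API descriptions are wrong. (e.g., `mapPartitions`)

@marmbrus @rxin @cloud-fan Could you take a look and check if they are appropriate? Thank you!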
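
For readers skimming the thread, the idea behind the first bullet (bind the encoder once, then share the bound instance across several methods) can be illustrated with a toy sketch. Everything below (`Row`, `UnboundDecoder`, `BoundDecoder`, `ToyDataset`) is a hypothetical stand-in written for illustration; it is not Spark's API and not the code changed in this PR.

```scala
// Hypothetical stand-ins for illustration only; none of these are Spark classes.
final case class Row(values: Vector[Any])

// A "bound" decoder has already resolved its field name to a column ordinal.
final class BoundDecoder(ordinal: Int) {
  def fromRow(row: Row): String = row.values(ordinal).toString
}

// An "unbound" decoder only knows the field name; binding looks up the ordinal.
final class UnboundDecoder(field: String) {
  def bind(schema: Seq[String]): BoundDecoder = {
    val ordinal = schema.indexOf(field)
    require(ordinal >= 0, s"field '$field' not found in schema")
    new BoundDecoder(ordinal)
  }
}

final class ToyDataset(schema: Seq[String], rows: Seq[Row], decoder: UnboundDecoder) {
  // Bound once at construction, then reused by every method below
  // instead of each method repeating the bind step.
  private val boundDecoder: BoundDecoder = decoder.bind(schema)

  def collect(): Seq[String] = rows.map(boundDecoder.fromRow)
  def take(n: Int): Seq[String] = rows.take(n).map(boundDecoder.fromRow)
}

object ToyDatasetDemo {
  def main(args: Array[String]): Unit = {
    val ds = new ToyDataset(
      schema = Seq("id", "name"),
      rows = Seq(Row(Vector(1, "a")), Row(Vector(2, "b"))),
      decoder = new UnboundDecoder("name")
    )
    println(ds.collect())
    println(ds.take(1))
  }
}
```

The design point is only that the binding step lives in one private field, so the methods that need it cannot drift apart in how they bind.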