[SPARK-12188] [SQL] Code refactoring and comment correction in Dataset APIs #10184

Closed
gatorsmile wants to merge 4 commits from gatorsmile/datasetClean

gatorsmile
Member

This PR contains the following updates:

  • Created a new private variable `boundTEncoder` that is shared by multiple functions: `RDD`, `select`, and `collect`.
  • Replaced all uses of `queryExecution.analyzed` with calls to the function `logicalPlan`.
  • Fixed a few API comments that used the wrong class names (e.g., `DataFrame`) or parameter names (e.g., `n`).
  • Corrected a few API descriptions that were wrong (e.g., `mapPartitions`).
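
As a rough illustration of the first two items (a toy sketch, not the actual patch), the idea is to bind the encoder once against the plan's output and to route every plan access through a single accessor. Everything below is a hypothetical stand-in: `ToyDataset`, `ToyQueryExecution`, `BoundPersonEncoder`, and `names` are invented for this example and are not Spark classes or methods; only the names `boundTEncoder`, `logicalPlan`, `queryExecution.analyzed`, and `collect` come from the description above.

```scala
// Toy model only: none of these types are Spark's real classes.
object BoundEncoderSketch {
  final case class Person(name: String, age: Int)

  // Hypothetical encoder whose field references are already resolved to ordinals.
  final class BoundPersonEncoder(nameOrdinal: Int, ageOrdinal: Int) {
    def fromRow(row: Seq[Any]): Person =
      Person(row(nameOrdinal).asInstanceOf[String], row(ageOrdinal).asInstanceOf[Int])
  }

  // Hypothetical stand-in for a query execution holding an analyzed plan;
  // here the "plan" is reduced to its output column names.
  final class ToyQueryExecution(val analyzed: Seq[String])

  final class ToyDataset(queryExecution: ToyQueryExecution, rows: Seq[Seq[Any]]) {
    // One accessor instead of repeating queryExecution.analyzed at each call site.
    private def logicalPlan: Seq[String] = queryExecution.analyzed

    // Bound once against the output columns, then reused by the methods below.
    private val boundTEncoder: BoundPersonEncoder =
      new BoundPersonEncoder(logicalPlan.indexOf("name"), logicalPlan.indexOf("age"))

    def collect(): Seq[Person] = rows.map(boundTEncoder.fromRow)

    // Hypothetical second consumer of the same bound encoder.
    def names(): Seq[String] = rows.map(row => boundTEncoder.fromRow(row).name)
  }

  def main(args: Array[String]): Unit = {
    val ds = new ToyDataset(
      new ToyQueryExecution(Seq("age", "name")),
      Seq(Seq(30, "Ann"), Seq(41, "Bob")))
    println(ds.collect())
    println(ds.names())
  }
}
```

Both `collect` and `names` decode rows through the same `boundTEncoder`, so the binding step is written and executed in one place rather than in each method.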

@marmbrus @rxin @cloud-fan Could you take a look and check whether these changes are appropriate? Thank you!

Contributor

marmbrus commented Dec 8, 2015

ok to test


SparkQA commented Dec 8, 2015

Test build #47298 has finished for PR 10184 at commit d510d6a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Dec 8, 2015

Test build #47300 has finished for PR 10184 at commit d510d6a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


/**
 * The encoder where the expressions used to construct an object from an input row have been
 * bound to the ordinals of the given schema.

Contributor

Nit: I'm going to change this to say this [[Dataset]]'s output schema

Member Author

I see. Thank you!

Contributor

oh, actually i forgot :(

Member Author

Let me add the change in a follow-up PR. : )

Contributor

marmbrus commented Dec 8, 2015

Thanks, I'm going to merge this to master and 1.6.

asfgit pushed a commit that referenced this pull request Dec 8, 2015
… APIs

This PR contains the following updates:

- Created a new private variable `boundTEncoder` that can be shared by multiple functions, `RDD`, `select` and `collect`.
- Replaced all the `queryExecution.analyzed` by the function call `logicalPlan`
- A few API comments are using wrong class names (e.g., `DataFrame`) or parameter names (e.g., `n`)
- A few API descriptions are wrong. (e.g., `mapPartitions`)

marmbrus rxin cloud-fan Could you take a look and check if they are appropriate? Thank you!

Author: gatorsmile <gatorsmile@gmail.com>

Closes #10184 from gatorsmile/datasetClean.

(cherry picked from commit 5d96a71)
Signed-off-by: Michael Armbrust <michael@databricks.com>
asfgit closed this in 5d96a71 on Dec 8, 2015
gatorsmile deleted the datasetClean branch on December 18, 2015 23:11