[SPARK-7133][SQL] Implement struct, array, and map field accessor #5744

Closed
cloud-fan wants to merge 12 commits

cloud-fan
Contributor

This is the first step: generalize UnresolvedGetField to support map, struct, and array types.
TODO: add `apply` in Scala and `__getitem__` in Python, and unify the `getItem` and `getField` methods into a single API (or should we keep them for compatibility?).
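
As a rough illustration of the TODO above, here is a minimal Python sketch of a single `__getitem__` entry point with `getItem` and `getField` kept as thin aliases. The `Column` and `UnresolvedGetField` classes below are hypothetical toy stand-ins written for this example, not the PySpark or Catalyst classes, and the alias approach is only one assumed way to preserve compatibility.

```python
class UnresolvedGetField:
    """Toy placeholder meaning 'extract `key` from `child`'; the concrete
    accessor would be chosen later, once the child's type is known."""

    def __init__(self, child, key):
        self.child = child
        self.key = key

    def __repr__(self):
        return f"{self.child!r}[{self.key!r}]"


class Column:
    """Hypothetical stand-in for a DataFrame column expression."""

    def __init__(self, expr):
        self.expr = expr

    def __getitem__(self, key):
        # One entry point for struct fields, array indices, and map keys.
        return Column(UnresolvedGetField(self.expr, key))

    # Older accessors kept as thin aliases so existing callers keep working.
    def getItem(self, key):
        return self[key]

    def getField(self, name):
        return self[name]

    def __repr__(self):
        return f"Column({self.expr!r})"


person = Column("person")
print(person["name"])          # same expression as person.getField("name")
print(person.getItem(0))       # same expression as person[0]
```

In this sketch all three spellings build the same unresolved expression, so choosing between struct, array, and map access is deferred to a later resolution step.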

@cloud-fan
Contributor Author

cc @rxin @marmbrus: it should be easy to add the `apply` method, so I'll do it after code review.

@AmplabJenkins

Can one of the admins verify this patch?

@liancheng
Contributor

ok to test

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Apr 28, 2015

Test build #31189 has started for PR 5744 at commit b9636f6.

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31189 has finished for PR 5744 at commit b9636f6.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class UnresolvedGetField(child: Expression, fieldExpr: Expression) extends UnaryExpression
    • trait GetField extends UnaryExpression
    • abstract class StructGetField extends GetField
    • abstract class OrdinalGetField extends GetField
    • case class SimpleStructGetField(child: Expression, field: StructField, ordinal: Int)
    • case class ArrayStructGetField(
    • case class ArrayOrdinalGetField(child: Expression, ordinal: Expression)
    • case class MapOrdinalGetField(child: Expression, ordinal: Expression)
  • This patch does not change any dependencies.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31189/
Test FAILed.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31204 has started for PR 5744 at commit 0ec5b96.

@rxin
Contributor

rxin commented Apr 29, 2015

We should definitely keep the old function for compatibility.

@@ -194,7 +194,7 @@ case class ResolvedStar(expressions: Seq[NamedExpression]) extends Star {
   override def toString: String = expressions.mkString("ResolvedStar(", ", ", ")")
 }

-case class UnresolvedGetField(child: Expression, fieldName: String) extends UnaryExpression {
+case class UnresolvedGetField(child: Expression, fieldExpr: Expression) extends UnaryExpression {
Can you add some Javadoc for this class? At the very least, we should say that it can be used to get a field out of a struct, map, or array.

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31204 has finished for PR 5744 at commit 0ec5b96.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class UnresolvedGetField(child: Expression, fieldExpr: Expression) extends UnaryExpression
    • trait GetField extends UnaryExpression
    • abstract class StructGetField extends GetField
    • abstract class OrdinalGetField extends GetField
    • case class SimpleStructGetField(child: Expression, field: StructField, ordinal: Int)
    • case class ArrayStructGetField(
    • case class ArrayOrdinalGetField(child: Expression, ordinal: Expression)
    • case class MapOrdinalGetField(child: Expression, ordinal: Expression)
  • This patch does not change any dependencies.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31204/
Test FAILed.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31236 has started for PR 5744 at commit d72e47e.

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31236 has finished for PR 5744 at commit d72e47e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class UnresolvedGetField(child: Expression, fieldExpr: Expression) extends UnaryExpression
    • trait GetField extends UnaryExpression
    • abstract class StructGetField extends GetField
    • abstract class OrdinalGetField extends GetField
    • case class SimpleStructGetField(child: Expression, field: StructField, ordinal: Int)
    • case class ArrayStructGetField(
    • case class ArrayOrdinalGetField(child: Expression, ordinal: Expression)
    • case class MapOrdinalGetField(child: Expression, ordinal: Expression)
  • This patch does not change any dependencies.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31236/
Test FAILed.

@cloud-fan
Contributor Author

The failed test cases look unrelated. Retest?

@rxin
Contributor

rxin commented Apr 29, 2015

Jenkins, retest this please.

@rxin
Contributor

rxin commented Apr 29, 2015

Jenkins, add to whitelist.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Apr 29, 2015

Test build #31251 has started for PR 5744 at commit d72e47e.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 8, 2015

Test build #32210 has started for PR 5744 at commit 7274041.

@SparkQA

SparkQA commented May 8, 2015

Test build #32210 has finished for PR 5744 at commit 7274041.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class UnresolvedExtractValue(child: Expression, extraction: Expression)
    • trait ExtractValue extends UnaryExpression
    • case class GetStructField(child: Expression, field: StructField, ordinal: Int)
    • case class GetArrayStructFields(
    • abstract class ExtractValueWithOrdinal extends ExtractValue
    • case class GetArrayItem(child: Expression, ordinal: Expression)
    • case class GetMapValue(child: Expression, ordinal: Expression)
    • case class CreateTableAsSelect(

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32210/
Test FAILed.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 8, 2015

Test build #32213 has started for PR 5744 at commit 715c589.

@SparkQA

SparkQA commented May 8, 2015

Test build #32213 has finished for PR 5744 at commit 715c589.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class UnresolvedExtractValue(child: Expression, extraction: Expression)
    • trait ExtractValue extends UnaryExpression
    • case class GetStructField(child: Expression, field: StructField, ordinal: Int)
    • case class GetArrayStructFields(
    • abstract class ExtractValueWithOrdinal extends ExtractValue
    • case class GetArrayItem(child: Expression, ordinal: Expression)
    • case class GetMapValue(child: Expression, ordinal: Expression)
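
The class names in the list above suggest that an unresolved extraction is turned into one of several concrete accessors depending on the child's data type. The following Python sketch shows one way such a dispatch could work. Everything in it is a toy model written for illustration: the type classes, the `resolve_extract_value` helper, and the dispatch rules are assumptions, and the fields given to `GetArrayStructFields` are invented because its signature is truncated in the output above.

```python
from dataclasses import dataclass
from typing import Any


# Toy stand-ins for data types (hypothetical, for illustration only).
@dataclass(frozen=True)
class StructType:
    field_names: tuple

@dataclass(frozen=True)
class ArrayType:
    element_type: Any

@dataclass(frozen=True)
class MapType:
    key_type: Any
    value_type: Any


# Toy resolved accessors, named after the classes in the list above.
@dataclass(frozen=True)
class GetStructField:
    child: str
    field: str
    ordinal: int

@dataclass(frozen=True)
class GetArrayStructFields:  # fields assumed; the real signature is truncated
    child: str
    field: str
    ordinal: int

@dataclass(frozen=True)
class GetArrayItem:
    child: str
    ordinal: Any

@dataclass(frozen=True)
class GetMapValue:
    child: str
    ordinal: Any  # the list above also names the map key parameter `ordinal`


def resolve_extract_value(child, child_type, extraction):
    """Pick a concrete accessor from the child's type and the extraction value."""
    if isinstance(child_type, StructType) and isinstance(extraction, str):
        ordinal = child_type.field_names.index(extraction)
        return GetStructField(child, extraction, ordinal)
    if isinstance(child_type, ArrayType):
        elem = child_type.element_type
        if isinstance(elem, StructType) and isinstance(extraction, str):
            # A field name applied to an array of structs selects that
            # field from every element.
            ordinal = elem.field_names.index(extraction)
            return GetArrayStructFields(child, extraction, ordinal)
        if isinstance(extraction, int):
            return GetArrayItem(child, extraction)
    if isinstance(child_type, MapType):
        return GetMapValue(child, extraction)
    raise TypeError(f"cannot extract {extraction!r} from {child_type!r}")


print(resolve_extract_value("person", StructType(("name", "age")), "age"))
print(resolve_extract_value("scores", ArrayType("int"), 0))
print(resolve_extract_value("attrs", MapType("string", "int"), "k"))
```

Keeping the unresolved node separate from the concrete accessors means the caller can write the same indexing syntax for all three container kinds and leave the choice to the resolution step.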

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32213/
Test PASSed.

@marmbrus
Contributor

marmbrus commented May 8, 2015

Thanks! Merged to master and 1.4.

asfgit pushed a commit that referenced this pull request May 8, 2015
It's the first step: generalize UnresolvedGetField to support all map, struct, and array
TODO: add `apply` in Scala and `__getitem__` in Python, and unify the `getItem` and `getField` methods to one single API(or should we keep them for compatibility?).

Author: Wenchen Fan <cloud0fan@outlook.com>

Closes #5744 from cloud-fan/generalize and squashes the following commits:

715c589 [Wenchen Fan] address comments
7ea5b31 [Wenchen Fan] fix python test
4f0833a [Wenchen Fan] add python test
f515d69 [Wenchen Fan] add apply method and test cases
8df6199 [Wenchen Fan] fix python test
239730c [Wenchen Fan] fix test compile
2a70526 [Wenchen Fan] use _bin_op in dataframe.py
6bf72bc [Wenchen Fan] address comments
3f880c3 [Wenchen Fan] add java doc
ab35ab5 [Wenchen Fan] fix python test
b5961a9 [Wenchen Fan] fix style
c9d85f5 [Wenchen Fan] generalize UnresolvedGetField to support all map, struct, and array

(cherry picked from commit 2d05f32)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@asfgit asfgit closed this in 2d05f32 May 8, 2015
@cloud-fan cloud-fan deleted the generalize branch May 9, 2015 03:35
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015