Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-4508] [SQL] build native date type to conform behavior to Hive #3381

Closed
wants to merge 4 commits into from

Conversation

adrian-wang
Copy link
Contributor

Store daysSinceEpoch as an Int value(4 bytes) to represent DateType, instead of using java.sql.Date(8 bytes as Long) in catalyst row. This ensures the same comparison behavior of Hive and Catalyst.

@SparkQA
Copy link

SparkQA commented Nov 20, 2014

Test build #23664 has started for PR 3381 at commit f7c704a.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 20, 2014

Test build #23664 has finished for PR 3381 at commit f7c704a.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • final class Date extends Ordered[Date] with Serializable

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23664/
Test FAILed.

@adrian-wang
Copy link
Contributor Author

retest this please.

@adrian-wang
Copy link
Contributor Author

This is a weird build error. @shaneknapp

@SparkQA
Copy link

SparkQA commented Nov 20, 2014

Test build #23665 has started for PR 3381 at commit f7c704a.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 20, 2014

Test build #23665 has finished for PR 3381 at commit f7c704a.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • final class Date extends Ordered[Date] with Serializable

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23665/
Test FAILed.

@shaneknapp
Copy link
Contributor

looks like a compilation error, and the reason why no unit test results
were stored was that none were run (unless i'm missing something).

On Wed, Nov 19, 2014 at 11:56 PM, Daoyuan Wang notifications@github.com
wrote:

This is a weird build error. @shaneknapp https://github.com/shaneknapp


Reply to this email directly or view it on GitHub
#3381 (comment).

@shaneknapp
Copy link
Contributor

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Nov 20, 2014

Test build #23676 has started for PR 3381 at commit f7c704a.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 20, 2014

Test build #23676 has finished for PR 3381 at commit f7c704a.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • final class Date extends Ordered[Date] with Serializable

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23676/
Test FAILed.

@adrian-wang
Copy link
Contributor Author

This builds successfully locally, and the build error is very confusing, since I never changed anything related to that.

@adrian-wang
Copy link
Contributor Author

@marmbrus @rxin @liancheng Can you help verify this build error? I cloned this branch separately and the build is successful.

@liancheng
Copy link
Contributor

The history of this build has already timed out...

@liancheng
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Nov 25, 2014

Test build #23816 has started for PR 3381 at commit f7c704a.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 25, 2014

Test build #23816 has finished for PR 3381 at commit f7c704a.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • final class Date extends Ordered[Date] with Serializable

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23816/
Test FAILed.

@adrian-wang
Copy link
Contributor Author

@liancheng can you reproduce the build error locally? I think it is a bug of Jenkins.

@shaneknapp
Copy link
Contributor

i'll have the time to take a closer look at this tomorrow.

On Mon, Nov 24, 2014 at 9:59 PM, Daoyuan Wang notifications@github.com
wrote:

@liancheng https://github.com/liancheng can you reproduce the build
error locally? I think it is a bug of Jenkins.


Reply to this email directly or view it on GitHub
#3381 (comment).

@liancheng
Copy link
Contributor

@adrian-wang Actually I can reproduce this compilation error with this:

./sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-0.12.0 -Phive-thriftserver clean compile

Those profiles used by Jenkins are unfortunately not printed when compiling against Hive 0.12.0. You may refer to dev/run-tests for details.

@liancheng
Copy link
Contributor

And the compilation failure is caused by this line https://github.com/apache/spark/pull/3381/files#diff-ff50aea397a607b79df9bec6f2a841dbL23

@adrian-wang
Copy link
Contributor Author

@liancheng thanks for your help! My local compilation with -Phive-0.13.1 is totally OK......
Anyway, it is ok now, with that import line back again.:)

@SparkQA
Copy link

SparkQA commented Nov 25, 2014

Test build #23827 has started for PR 3381 at commit 0c6cab6.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 25, 2014

Test build #23827 has finished for PR 3381 at commit 0c6cab6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • final class Date extends Ordered[Date] with Serializable

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23827/
Test PASSed.

@liancheng
Copy link
Contributor

@adrian-wang Yea, Jenkins compiles Spark SQL with both Hive 0.12.0 and 0.13.1, and then runs SQL tests against 0.13.1.

@marmbrus
Copy link
Contributor

Hi @adrian-wang, thanks for working on this and I'm very sorry for letting it fall out of date. In general this seems like a reasonable change to make, but we do need to be very careful that we don't change the output types as those are part of the public API. I could be wrong, but it doesn't seem like we are doing that now.

So I propose the following:

  • Close this issue for now, and reopen it once it has been updated. (we are trying very hard to keep the PR queue small and limited to active PRs so that things don't fall through the cracks).
  • Add tests to scala/java/python to make sure we are not changing the types that are output.

What do you think?

@asfgit asfgit closed this in ca12608 Dec 17, 2014
@adrian-wang
Copy link
Contributor Author

Thanks for comments! Sorry for letting it out of date. I have no authority to reopen PRs, so I'll start a new session for this.

@marmbrus
Copy link
Contributor

Oh, sorry I thought you could reopen your own PRs. I guess that is not the case. Either way, please open a new one when ready.

asfgit pushed a commit that referenced this pull request Feb 2, 2015
Store daysSinceEpoch as an Int value(4 bytes) to represent DateType, instead of using java.sql.Date(8 bytes as Long) in catalyst row. This ensures the same comparison behavior of Hive and Catalyst.
Subsumes #3381
I thinks there are already some tests in JavaSQLSuite, and for python it will not affect python's datetime class.

Author: Daoyuan Wang <daoyuan.wang@intel.com>

Closes #3732 from adrian-wang/datenative and squashes the following commits:

0ed0fdc [Daoyuan Wang] fix test data
a2fdd4e [Daoyuan Wang] getDate
c37832b [Daoyuan Wang] row to catalyst
f0005b1 [Daoyuan Wang] add date in sql parser and java type conversion
024c9a6 [Daoyuan Wang] clean some import order
d6715fc [Daoyuan Wang] refactoring Date as Primitive Int internally
374abd5 [Daoyuan Wang] spark native date type support
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants