Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scalding updates for Scala 2.12 #1646

Merged
merged 28 commits into from
Mar 23, 2017

Conversation

piyushnarang
Copy link
Collaborator

Based on @sritchie's PR (#1630) so that we can try and get this in and then cut the 0.17.0 release. Putting this up so that we can have travis catch any test failures (2.12 seems to compile ok on my machine, didn't run all tests though).

@johnynek
Copy link
Collaborator

Looks like some errors in the test package.

Thanks for pushing on this.

@piyushnarang
Copy link
Collaborator Author

@johnynek yeah was checking out some of the failures. Will update the PR with fixes and ping you / others when the build is all green.

@piyushnarang
Copy link
Collaborator Author

@johnynek / @isnotinvain / @pankajroark question for you folks. It looks like there's no version of the scrooge-sbt-plugin that supports both 2.10 and 2.12. Support for 2.10 has been dropped on their end a while back. Can we do the same as well? Drop support for 2.10?

@@ -16,7 +16,6 @@ addSbtPlugin("com.typesafe" % "sbt-mima-plugin" % "0.1.12")
addSbtPlugin("com.typesafe.sbt" % "sbt-ghpages" % "0.5.4")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mima 0.1.14 is also out with some bug fixes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll pick it up as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@johnynek
Copy link
Collaborator

johnynek commented Mar 1, 2017

if it is not too expensive can we do the same 2.10 strategy we did in chill?

I am not opposed to dropping 2.10, I guess, but honestly, if it is pretty easy to support, maybe not.

That said, if you are on 2.10 you probably aren't updating much, and we are both on 2.11, so maybe 2.11 is fine?

@piyushnarang
Copy link
Collaborator Author

Yeah if we want to continue supporting 2.10, we'll have to drop the scalding-parquet and scalding-parquet-scrooge projects (as they / their tests depend on scrooge). I'm more in favor of dropping 2.10 support. This change is going to be part of a major release (0.17.0) so we can make some substantial updates (like dropping 2.10). 2.10 seems to have been EOLed around 2 years back (https://www.scala-lang.org/news/2016-schedule.html). It seems like we're going to keep running into issues from time to time if we continue to support it. Seems like it's not worth the support burden when we have 2.11.x and with this change 2.12.x.

@johnynek
Copy link
Collaborator

johnynek commented Mar 1, 2017

sounds good to me. Also testing is slow if we have 3 versions since our tests are already crazy slow.

@piyushnarang
Copy link
Collaborator Author

@johnynek yeah I initially thought that my travis build for this PR might have been down yesterday due to this 3 version explosion :-). But it turned out to be s3's fault.

@johnynek
Copy link
Collaborator

johnynek commented Mar 1, 2017

maybe you brought down s3?

;)

@oscar-stripe
Copy link

looks like this line fails in 2.12:
https://github.com/twitter/scalding/blob/develop/scalding-core/src/main/scala/com/twitter/scalding/LineNumber.scala#L75

we need to catch the exception for lambdas.

@gkk-stripe
Copy link

gkk-stripe commented Mar 17, 2017

I found this problem with Box.scala when trying to upgrade scalding-core to Scala 2.12: https://issues.scala-lang.org/browse/SI-10232

The work-around is to remove half of boxes from Box.scala. Once I've done it, I got all tests in core to pass.

@johnynek
Copy link
Collaborator

Note that @gkk-stripe found this issue we are hitting with boxed:

https://issues.scala-lang.org/browse/SI-10232

@johnynek
Copy link
Collaborator

:) race...

@piyushnarang
Copy link
Collaborator Author

Ah thanks @gkk-stripe that's really cool. I was breaking my head against that one for a while :-). I'll try out your suggestion in the next couple of days (or if you want to push a commit on this PR so that you get credit for it, feel free to do so).

Copy link
Collaborator

@johnynek johnynek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those may need to be on different objects and we may need to import. Can we make those objects private?

@piyushnarang
Copy link
Collaborator Author

@johnynek yeah I can try that. I did try running the scalding-core tests and they worked locally with these changes. Seems to be failing on travis though.

@johnynek
Copy link
Collaborator

johnynek commented Mar 17, 2017 via email

@piyushnarang
Copy link
Collaborator Author

Trying to figure out why the travis build is failing. Tried building a couple of the projects on my machine and they seem to succeed:
./sbt ++2.12.1 clean scalding-thrift-macros/test same with scalding-core/test.

@johnynek
Copy link
Collaborator

I am seeing:

  at com.twitter.scalding.LineNumber$.$anonfun$tryNonScaldingCaller$2(LineNumber.scala:75)
[info]   at com.twitter.scalding.LineNumber$.$anonfun$tryNonScaldingCaller$2$adapted(LineNumber.scala:74)
[info]   at com.twitter.scalding.LineNumber$$$Lambda$401/869703951.apply(Unknown Source)
[info]   at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:502)
[info]   at com.twitter.scalding.LineNumber$.headOption$1(LineNumber.scala:68)
[info]   at com.twitter.scalding.LineNumber$.tryNonScaldingCaller(LineNumber```

Which I mentioned as a problem above: 
https://github.com/twitter/scalding/pull/1646#issuecomment-287209798

we need to handle line numbers differently for 2.12 since lambdas might not be able to reflect like that?

@piyushnarang
Copy link
Collaborator Author

@johnynek yeah I pinged you on Gitter. So this seems to be an issue only on travis. I’ve tried a bunch of combinations of trying to repro this locally (directly calling ./sbt ++2.12.1 scalding-thrift-macros/test, invoking it via our run_test.sh script, switching to java 8_31 as thats what travis uses). None of those seem to trigger it. Not sure if I’m missing something there, it seems to consistently succeed on my machine (also got Timur to test on his machine).

@piyushnarang
Copy link
Collaborator Author

Debugging this locally, I don't see the 'com.twitter.scalding.typed.ReduceStep$$Lambda$275/727580562' showing up in the list of StackTraceElements passed to the method which is why it seems like I don't hit this error on my machine. Not sure why / how they're present on the travis CI build.

@johnynek
Copy link
Collaborator

johnynek commented Mar 18, 2017 via email

@piyushnarang
Copy link
Collaborator Author

@johnynek while debugging this in Intellij I updated my scalaVersion in the build.sbt file to 2.12.1. While trying to repro before I ran the commands above (and Timur also tried one set) on my command line.

@johnynek
Copy link
Collaborator

Well, i mean, in any case Class.forName might throw. We could put a try/catch and handle that in a decent way. Even if we can't repro, I can see how it may be a bug, I just don't know in what cases it is triggered. The only risk to such an approach is that the bug is elsewhere and we are hiding it with a fix like this. But if we can get the tests green locally and on Travis, that seems okay to me (if we don't have to change the tests to do it).

@piyushnarang
Copy link
Collaborator Author

@johnynek yeah I had the changes to catch the CNF handy but I was also worried about the masking of the issue elsewhere which is why I was trying out all those combinations to repro :-). I can push that upstream so that we can see if that makes Travis happy.

@piyushnarang
Copy link
Collaborator Author

@johnynek ok the build is green now :-). Do take a look when you get the time. I'll also sanity check this and clean up stuff when I get in tomorrow.

val cascadingAvroVersion = "2.1.2"
val chillVersion = "0.8.3"
val chillVersion = "0.8.4"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure we want to bump to 0.8.x vs 0.7.7? Seems summingbird on storm 1.0 has still not landed, so we might want to have one last summingbird with 2.12? (I'm working a lot with summingbird and that may help me).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnynek I can switch this to 0.7.7. If I understand correctly the plan would be:

  1. Merge this PR with chill 0.7.7 (might need to revert the test fix here: https://github.com/twitter/scalding/pull/1634/files).
  2. Cut a 0.17.0 release
  3. You can use that to cut a SB 2.12 rel on chill 0.7.x
  4. Switch back to chill 0.8.4
  5. Cut a 0.17.1 rel.
  6. We can use this for our SB 2.12 rel on chill 0.8.x

@ttim - does this sound reasonable to you as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnynek summingbird with storm 1.0 support wasn't released because I'm waiting on scalding/storehaus to be released with 0.8.x chill.

In order to release SB with 0.9 storm and 2.12 scala support we also need to release tormenta with the same. Latest tormenta's release uses storm 1.0 (which is not good for above) and supports 2.12.

So I see two possibilities:

  1. If we really want to release SB with 0.9 storm and 2.12 scala:
  • release old version of tormenta with 2.12 support only, without 1.0 storm
  • release scalding 0.17.0 as @piyushnarang said
  • release storehaus & SB with 2.12 & 0.9 storm.

And iterate again with scalding/storehaus/SB releases to get SB with 2.12 & 1.0 storm.

  1. Or if we are ok without SB for 2.12 and 0.9 storm:
  • release scalding with 0.8.x chill & 2.12 support
  • release storehaus/SB with 2.12 & 1.0 storm

One more note - we use SB with 1.0 storm at Twitter for last couple of months and it works well. Transition to new api wasn't painful.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think moving forward with 0.8 chill seems fine then. Let's move forward and go ahead to get on 1.0 storm.

override def equals(other: Any): Boolean = {
//removed from Scala 2.12
def _equals(x: Product, y: Any): Boolean = y match {
case y: Product if x.productArity == y.productArity => x.productIterator sameElements y
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we wrap this line:

 case y: Product if x.productArity == y.productArity =>
   x.productIterator sameElements y.productIterator

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, seems cleaner will do.

.map { case (k, _) => ((), Batched(cms.create(Bytes(serialize(k))))) }
.sumByLocalKeys
.map {
case (_, batched) => batched.sum
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like some kind of bad merge. I don't think this is part of this. Can you rebase develop and see if this goes away?

@@ -106,6 +125,24 @@ class ReducerEstimatorTest extends WordSpec with Matchers with HadoopSharedPlatf
.run()
}

"run with correct number of reducers when we have a glob pattern in path" in {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like another merge issue.

@@ -124,7 +124,7 @@ trait BaseScaldingShell extends MainGenericRunner {
* @param args from the command line.
*/
def main(args: Array[String]): Unit = {
val retVal = process(args)
val retVal = process(ExpandLibJarsGlobs(args))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another merge issue.

@@ -86,14 +86,6 @@ struct StructWithReorderedOptionalFields {
1: optional i32 fieldOne,
}

struct TestPersonWithRequiredPhone {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did this need to get nuked?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scrooge generator was cribbing cause this was being declared twice. The old version seem to not mind.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh. Cool.

@johnynek
Copy link
Collaborator

👍

@johnynek
Copy link
Collaborator

(when we merge, please squash merge this).

@johnynek
Copy link
Collaborator

(which github's merge button supports).

Copy link
Collaborator

@ttim ttim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment


override def equals(other: Any): Boolean = {
//removed from Scala 2.12
def _equals(x: Product, y: Any): Boolean = y match {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to have this method inlined and without the comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, will do.

build.sbt Outdated
val slf4jVersion = "1.6.6"
val thriftVersion = "0.5.0"
val junitVersion = "4.10"
val macroCompatVersion = "1.1.1"
val jlineVersion = "2.14.3"

def scrooge(scalaVersion: String) = "com.twitter" %% "scrooge-serializer" % scroogeVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this method? It doesn't use the scalaVersion parameter, maybe it should be a val?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was old stuff from when we were still supporting scala 2.10 (which scrooge stopped supporting).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I can drop this. Didn't clean it up after a couple of iterations.

@@ -560,7 +562,7 @@ object Boxed {
* using `Any` here instead of parametrizing everything over a type parameter
* `K`.
*/
private[this] val allBoxes = List(
val boxes1 = List(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could these lists be private[serialization]?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to as private as possible. Good catch.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, let me try that.

@piyushnarang
Copy link
Collaborator Author

@ttim / @fwbrasil - updated with your comments

case y: Product if x.productArity == y.productArity =>
x.productIterator sameElements y.productIterator
val productEquals = other match {
case p: Product if self.productArity == p.productArity =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I guess hashCode check was made before _equals call for perf reasons. Therefore this code has slightly worse performance if it matters.

I recommend just remove separate hashCode == other.hashCode check and put it into if part.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this. Not sure how large those products get. I moved the hashCode into the case body.

@fwbrasil
Copy link
Contributor

👍 LGTM

@piyushnarang piyushnarang merged commit 12aaa7e into twitter:develop Mar 23, 2017
@johnynek johnynek mentioned this pull request Jun 15, 2017
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants