-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-29292][STREAMING][SQL][BUILD] Get streaming, catalyst, sql compiling for Scala 2.13 #29078
Conversation
@@ -46,7 +46,7 @@ class HadoopFileLinesReader( | |||
|
|||
def this(file: PartitionedFile, conf: Configuration) = this(file, None, conf) | |||
|
|||
private val iterator = { | |||
private val _iterator = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to resolve a weird compile error:
[ERROR] [Error] /Users/seanowen/Documents/spark_2.13/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFileLinesReader.scala:49: weaker access privileges in overriding
final def iterator: Iterator[org.apache.hadoop.io.Text] (defined in trait Iterator)
override should not be private
[ERROR] [Error] /Users/seanowen/Documents/spark_2.13/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFileWholeTextReader.scala:38: weaker access privileges in overriding
final def iterator: Iterator[org.apache.hadoop.io.Text] (defined in trait Iterator)
override should not be private
Not sure, but easy to avoid the name conflict entirely.
@@ -223,16 +223,6 @@ class DatasetPrimitiveSuite extends QueryTest with SharedSparkSession { | |||
checkDataset(Seq(Queue(true)).toDS(), Queue(true)) | |||
checkDataset(Seq(Queue("test")).toDS(), Queue("test")) | |||
checkDataset(Seq(Queue(Tuple1(1))).toDS(), Queue(Tuple1(1))) | |||
|
|||
checkDataset(Seq(ArrayBuffer(1)).toDS(), ArrayBuffer(1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For complex reasons, I don't think these tests / use cases are possible as-is in Scala 2.13. They are niche, so I jsut removed them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although this means the removal of test coverage in Scala 2.12, I'm +1 for now. We can add back later after we finished everything in Scala 2.13.
Thank you, @srowen . |
Test build #125735 has finished for PR 29078 at commit
|
Test build #125738 has finished for PR 29078 at commit
|
Test build #125742 has finished for PR 29078 at commit
|
Jenkins retest this please |
Test build #125745 has started for PR 29078 at commit |
Retest this please |
Test build #125779 has finished for PR 29078 at commit
|
Weird one:
It's consistent, so I believe this patch causes it, but not yet seeing any obvious reason. I'll try to reproduce and isolate locally. |
@srowen . Interestingly, the root cause of the failure is the following inside private val records = Seq.fill(numPartitions)(new ListBuffer[UnsafeRow])
- private val recordEndpoint = new ContinuousRecordEndpoint(records, this)
+ private val recordEndpoint = new ContinuousRecordEndpoint(records.map(_.toSeq), this) |
Oh good catch, I think that's it. The issue is that This is the kind of bug to look out for, to be sure - even in 2.12, I suppose, given this. I am slightly worried there's a case that tests don't catch. I manually reviewed this PR vs occurrences of ListBuffer and don't think there are more instances here, but will examine the previous PR too. I suspect this kind of thing will come up in a few places in the 2.13 build as .toSeq will in these cases most definitely make a copy of a mutable collection. Which is even good in a way, as we should update the code to be clearer about the fact that it's specifically a mutable Seq being passed -- but, will probably require more debugging in 2.13 later. |
Thank you for the fix, @srowen ! |
Test build #125790 has finished for PR 29078 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @srowen and @HyukjinKwon .
Merged to master for Apache Spark 3.1.0 on December 2020.
… for Scala 2.13 compilation ### What changes were proposed in this pull request? Same as #29078 and #28971 . This makes the rest of the default modules (i.e. those you get without specifying `-Pyarn` etc) compile under Scala 2.13. It does not close the JIRA, as a result. this also of course does not demonstrate that tests pass yet in 2.13. Note, this does not fix the `repl` module; that's separate. ### Why are the changes needed? Eventually, we need to support a Scala 2.13 build, perhaps in Spark 3.1. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests. (2.13 was not tested; this is about getting it to compile without breaking 2.12) Closes #29111 from srowen/SPARK-29292.3. Authored-by: Sean Owen <srowen@gmail.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
Continuation of #28971 which lets streaming, catalyst and sql compile for 2.13. Same idea.
Why are the changes needed?
Eventually, we need to support a Scala 2.13 build, perhaps in Spark 3.1.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Existing tests. (2.13 was not tested; this is about getting it to compile without breaking 2.12)