[GRAPHX] Spark 3789 - Python Bindings for GraphX #4205
Conflicts: core/src/main/scala/org/apache/spark/rdd/RDD.scala
Conflicts: graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala
…DImpl class hierarchy. Fixed compile issues.
Updated to "[GRAPHX] Spark 3789 - Python Bindings for GraphX" On Mon, Jan 26, 2015 at 9:35 AM, Mark Hamstra notifications@github.com
import scala.language.implicitConversions
import scala.reflect.ClassTag

trait JavaVertexRDDLike[VD, This <: JavaVertexRDDLike[VD, This, R],
I'd remove this trait and fold all of this code directly into the JavaVertexRDD class. The *RDDLike pattern in the Java API wasn't a great design and I'd like to avoid mimicking it in new code.
Josh, then how will we handle type bounds? Is there a new design for Java API?
What do you mean by "handle type bounds"? In this PR's current code, it looks like JavaVertexRDDLike is only extended by a single class and isn't used as part of any method signatures, return types, etc., so unless I'm overlooking something I don't see why it can't be removed. Inheriting implementations from generic traits has bitten us in the past via https://issues.scala-lang.org/browse/SI-8905 (see https://issues.apache.org/jira/browse/SPARK-3266), so if this trait isn't necessary then we shouldn't have it.
The JavaRDDLike traits in the Spark Core Java API are an unfortunate holdover from an earlier design and exist primarily for code re-use purposes. We can't remove them now because that would break binary compatibility.
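To make the two shapes being compared concrete, here is a minimal sketch, not code from this PR or from Spark. All names (VertexBagLike, VertexBagA, VertexBagB) are hypothetical stand-ins, and plain Seq is assumed in place of an RDD so the snippet compiles on its own:

```scala
// Hypothetical stand-ins for illustration only; none of these names exist in Spark.

// Shape 1: an F-bounded "*Like" trait that carries the shared implementation.
trait VertexBagLike[VD, This <: VertexBagLike[VD, This]] {
  def values: Seq[VD]
  protected def wrap(vs: Seq[VD]): This
  // Implemented once in the generic trait and inherited by subclasses.
  def filter(p: VD => Boolean): This = wrap(values.filter(p))
}

final class VertexBagA[VD](val values: Seq[VD])
    extends VertexBagLike[VD, VertexBagA[VD]] {
  protected def wrap(vs: Seq[VD]): VertexBagA[VD] = new VertexBagA(vs)
}

// Shape 2: the same behaviour folded directly into the class, with no trait.
final class VertexBagB[VD](val values: Seq[VD]) {
  def filter(p: VD => Boolean): VertexBagB[VD] = new VertexBagB(values.filter(p))
}

object TraitVersusClassDemo {
  def main(args: Array[String]): Unit = {
    val a = new VertexBagA(Seq(1, 2, 3, 4)).filter(_ % 2 == 0)
    val b = new VertexBagB(Seq(1, 2, 3, 4)).filter(_ % 2 == 0)
    // Both shapes return their own concrete type and the same contents.
    println(a.values == b.values)
  }
}
```

When only one class extends the trait, as in this sketch, the second shape gives callers the same return types without the extra type parameter.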
Got it. I will change this.
I left an initial pass of comments. I haven't really dug into the details very much yet, but a couple of high-level comments:
Now that we're increasingly seeing Spark libraries being written in one JVM language and used from another (e.g. a Spark library written against the Java API and called from Scala), it might be nice to try to extend GraphX's Scala API to expose Java-friendly methods instead of adding a new Java API. This is a major departure from how we've handled Java APIs up until now, but it might be a better long-term decision for new code. I think @rxin may be able to chime in here with more details. GraphX might be a nice context to explore this idea since it's a much smaller API than Spark as a whole.
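As a rough illustration of what "Java-friendly methods on a Scala API" could look like, here is a minimal sketch. It is not GraphX's API: the class TinyGraph and its methods outDegrees and outDegreesAsJava are hypothetical, and a plain Seq of edges is assumed instead of an RDD so the example is self-contained:

```scala
import java.util.{HashMap => JHashMap, Map => JMap}

// Hypothetical toy class for illustration; this is not part of GraphX.
final class TinyGraph(val edges: Seq[(Long, Long)]) {

  // Scala-idiomatic method: returns an immutable Scala Map.
  def outDegrees: Map[Long, Int] =
    edges.groupBy(_._1).map { case (src, es) => src -> es.size }

  // Java-friendly variant on the same class: returns java.util.Map with
  // boxed types, so Java callers do not need to deal with Scala collections.
  def outDegreesAsJava: JMap[java.lang.Long, java.lang.Integer] = {
    val result = new JHashMap[java.lang.Long, java.lang.Integer]()
    outDegrees.foreach { case (v, d) => result.put(Long.box(v), Int.box(d)) }
    result
  }
}

object TinyGraphDemo {
  def main(args: Array[String]): Unit = {
    val g = new TinyGraph(Seq((1L, 2L), (1L, 3L), (2L, 3L)))
    println(g.outDegrees)
    println(g.outDegreesAsJava)
  }
}
```

The design choice sketched here is a single class serving both languages, rather than a parallel wrapper hierarchy maintained alongside the Scala one.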
I like the idea of extending Scala APIs for Java, instead of having On Mon, Jan 26, 2015 at 11:58 AM, Josh Rosen notifications@github.com
Hey all, I'd like to close this issue and defer to a design doc before there is a lot of commenting on this. This pulls in another patch that is itself not merged, and major portions of the PySpark API are copy/pasted. For instance, it might be good to wait until #3234 is merged before asking for a lot of community review here.
@kdatta thanks for working on this. I also commented on JIRA. For such a massive change, can you write a high-level design document and attach it to the JIRA ticket?
There's duplication of effort between #3234 and #3789. Should I wait for
Hi All, Here's what I suggest we do:
On Mon, Jan 26, 2015 at 1:58 PM, Patrick Wendell notifications@github.com
First pull request for PyGraphX. The following code is added: