[SPARK-13035] [ML] [PySpark] PySpark ml.clustering support export/import #10999
Test build #50461 has finished for PR 10999 at commit
@@ -69,6 +70,25 @@ class KMeans(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIter, HasTol
True
>>> rows[2].prediction == rows[3].prediction
True
>>> import os, tempfile
This is maybe a bit much for a doctest, since in general doctests are supposed to be example-like. Maybe this should be in the tests file instead? Just a suggestion, though.
Hmm... Here we combine the test and example functions. I do not have a strong preference about whether this should live here or in the tests file. @jkbradley
Sure thing, will do.
On Thursday, February 11, 2016, Xiangrui Meng notifications@github.com wrote:
In python/pyspark/ml/clustering.py #10999 (comment):
@@ -69,6 +70,25 @@ class KMeans(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIter, HasTol
True
>>> rows[2].prediction == rows[3].prediction
True
>>> import os, tempfile
I agree with @holdenk. This may be too verbose for a doctest. We can move the temp directory setup into test preparation (where we initialize sqlContext) and clean up. We can do that in a separate PR. @holdenk Could you create a JIRA for it? Thanks!
LGTM. Merged into master. Thanks!
… outside of the doctests
Some of the new doctests in ml/clustering.py have a lot of setup code. Move the setup code to the general test init to keep the doctests more example-like. In part this is a follow-up to #10999. Note that the same pattern is followed in regression & recommendation, so we might as well clean up all three at the same time.
Author: Holden Karau <holden@us.ibm.com>
Closes #11197 from holdenk/SPARK-13302-cleanup-doctests-in-ml-clustering.
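For readers following along, here is a minimal sketch of the pattern discussed above: the temp directory is created once in the test init and handed to the doctests through their globals, so the docstring itself stays example-like. This is not the code from either PR. It uses only the standard library, and `save_and_load`, `run_doctests`, and the `temp_path` global are hypothetical names chosen for illustration, with a JSON round trip standing in for a model's save/load.

```python
import doctest
import json
import os
import shutil
import tempfile


def save_and_load(params, path):
    """Write params to a JSON file and read them back.

    Hypothetical stand-in for a model save/load round trip.
    `temp_path` is injected by the test runner, so no setup code is needed here.

    >>> path = os.path.join(temp_path, "kmeans")
    >>> save_and_load({"k": 2}, path) == {"k": 2}
    True
    """
    with open(path, "w") as f:
        json.dump(params, f)
    with open(path) as f:
        return json.load(f)


def run_doctests():
    """Create the temp directory once, pass it to the doctests, then clean up."""
    temp_path = tempfile.mkdtemp()
    # Everything the docstring examples reference must be present in globs.
    globs = {"os": os, "save_and_load": save_and_load, "temp_path": temp_path}
    try:
        finder = doctest.DocTestFinder()
        runner = doctest.DocTestRunner(verbose=False)
        for test in finder.find(save_and_load, "save_and_load", globs=globs):
            runner.run(test)
        return runner.summarize(verbose=False)
    finally:
        # Cleanup lives with the setup, outside the docstring.
        shutil.rmtree(temp_path, ignore_errors=True)
```

Calling `run_doctests()` returns doctest's results object, whose `failed` and `attempted` counts can be checked by the caller. The design choice is the one raised in review: setup and cleanup belong in the test init, and the doctest only shows usage.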
PySpark ml.clustering support export/import.