Model save support #15

Open
davis-varghese opened this issue Aug 23, 2016 · 6 comments

Comments

@davis-varghese

I saved a model (KNNClassificationModel) using Java serialization, and when I use it later I always get the following exception on the DataFrame output of model.transform(inputDataFrame):
java.lang.IllegalArgumentException: Flat hash tables cannot contain null elements.

Is there a better way of saving and reusing a model, such as support for the MLWritable/Saveable traits? In our use case, we create a model and use it later.

@mindcrusher11

I am also looking for a way to save the model using Scala and Spark.

@Sambor123

I also have this problem. Is there any solution for it?

@rachmaninovquartet

I tried it like this:
sc.parallelize(Seq(knnModel), 1).saveAsObjectFile("/user/you/knnTest/" + "KNN")
val model = sc.objectFile[KNNClassificationModel]("/user/you/knnTest/" + "KNN").first()

but the model loaded back no longer seems to work, which is strange since this approach has worked for all my other models.

@wzjmail

wzjmail commented Jul 3, 2020

I also encountered this problem. I attempted to serialize this model and load it again, but the RDD[Tree] cannot be deserialized correctly. It looks like metricTree has some problem. If you have a solution, please comment.
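
One possible reading of the failure described above, offered as an assumption rather than a confirmed diagnosis of this library: Spark's RDD keeps its SparkContext reference in a @transient field, so an object that holds an RDD and is written with plain Java serialization comes back with a handle that is no longer attached to a context. The sketch below uses no Spark at all; Context, Handle, and Model are hypothetical stand-ins that only illustrate how a transient field is dropped in a serialization round trip.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for a live context (such as a SparkContext); deliberately not Serializable.
class Context(val name: String)

// Hypothetical stand-in for an RDD-like handle: the context is @transient,
// so Java serialization skips it.
class Handle(@transient val ctx: Context, val id: Int) extends Serializable

// Hypothetical stand-in for a model that holds such a handle.
class Model(val handle: Handle) extends Serializable

object TransientRoundTrip {
  // Serialize a value to bytes and read it back, as plain Java serialization would.
  def roundTrip[T](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val original = new Model(new Handle(new Context("live"), 1))
    val restored = roundTrip(original)
    println(restored.handle.id)          // the ordinary field survives
    println(restored.handle.ctx == null) // the transient field is null after the round trip
  }
}
```

If the model does hold its trees in an RDD, this would suggest persisting the RDD's contents separately instead of serializing the model object as a whole; whether that applies here is not confirmed in this thread.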

@alexnb

alexnb commented Aug 11, 2020

Saving with PipelineModel.save() does not work either:

Caused by: java.lang.UnsupportedOperationException: Pipeline write will fail on this Pipeline because it contains a stage which does not implement Writable. Non-Writable stage: knnc_95d9ce15f990 of type class org.apache.spark.ml.classification.KNNClassificationModel
at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$validateStages$1.apply(Pipeline.scala:231)
at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$validateStages$1.apply(Pipeline.scala:228)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.ml.Pipeline$SharedReadWrite$.validateStages(Pipeline.scala:228)
at org.apache.spark.ml.PipelineModel$PipelineModelWriter.<init>(Pipeline.scala:336)
at org.apache.spark.ml.PipelineModel.write(Pipeline.scala:320)
at org.apache.spark.ml.util.MLWritable$class.save(ReadWrite.scala:306)
at org.apache.spark.ml.PipelineModel.save(Pipeline.scala:293)
... 16 more
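
The trace above shows the pipeline writer rejecting a stage that does not implement the writable trait. A minimal sketch of checking for such stages before calling save(), assuming Spark ML is on the classpath; WritableCheck and nonWritableStages are hypothetical helper names, not part of Spark or of this library:

```scala
import org.apache.spark.ml.{PipelineModel, PipelineStage}
import org.apache.spark.ml.util.MLWritable

object WritableCheck {
  // Stages that do not implement MLWritable; the pipeline writer refuses to save these.
  def nonWritableStages(stages: Seq[PipelineStage]): Seq[PipelineStage] =
    stages.filterNot(_.isInstanceOf[MLWritable])

  // Convenience overload for an already fitted pipeline.
  def nonWritableStages(model: PipelineModel): Seq[PipelineStage] =
    nonWritableStages(model.stages.toSeq)
}
```

Calling WritableCheck.nonWritableStages(fittedPipeline).foreach(s => println(s"${s.uid}: ${s.getClass.getName}")) on a hypothetical fittedPipeline lists the offending stages up front. This only detects the problem; the stage still has to be saved by some other means.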

@githubthunder

Hi, has this problem been solved yet?
