Scala-CP is a Scala implementation of the Conformal Prediction (CP) framework, introduced by Vovk et al. in the book Algorithmic Learning in a Random World. When assigning confidence to machine learning models, CP is a nice alternative to cross-validation. Instead of predicting a single value for a given feature vector, a conformal predictor outputs a prediction set (classification) or region (regression) that contains the correct prediction with probability at least 1-𝜺, where 𝜺 is a user-defined significance level. The choice of significance level influences the size of the prediction set/region: a lower 𝜺 gives higher confidence but a larger set or region. Alternatively, CP can be used to predict object-specific p-values for unseen examples.
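To make the idea concrete, here is a minimal, self-contained sketch of how p-values and prediction sets can be computed in inductive conformal classification. It does not use the Scala-CP API: the names `ConformalSketch`, `pValue` and `predictionSet` are hypothetical, and the nonconformity scores are assumed to have been computed already by some underlying model on a held-out calibration set.

```scala
// Illustrative sketch only; not part of the Scala-CP API.
object ConformalSketch {

  /** p-value of a candidate label: the fraction of nonconformity scores
    * (calibration scores plus the test score itself) that are at least
    * as large as the test score.
    */
  def pValue(calibrationScores: Seq[Double], testScore: Double): Double = {
    val atLeastAsStrange = calibrationScores.count(_ >= testScore)
    (atLeastAsStrange + 1).toDouble / (calibrationScores.size + 1)
  }

  /** Prediction set at the given significance level: every label whose
    * p-value exceeds the significance level is kept.
    */
  def predictionSet[L](pValues: Map[L, Double], significance: Double): Set[L] =
    pValues.collect { case (label, p) if p > significance => label }.toSet
}
```

For example, with calibration scores `Seq(0.1, 0.2, 0.3, 0.4)` and a test score of `0.25`, two calibration scores are at least as large as the test score, so `pValue` returns (2 + 1) / (4 + 1) = 0.6. Lowering the significance level passed to `predictionSet` can only keep the set the same or make it larger.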
Scala-CP can be used along with any Scala/Java machine learning library and algorithm. All you have to do is to add the Scala-CP dependency to your pom.xml file:
<dependencies>
  ...
  <dependency>
    <groupId>se.uu.it</groupId>
    <artifactId>cp</artifactId>
    <version>0.1.0</version>
  </dependency>
  ...
</dependencies>
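If you build with sbt rather than Maven, the same coordinates can be expressed as shown here. This is a direct translation of the Maven snippet and assumes that the artifact resolves from the repositories configured in your build; it is not taken from the project's own documentation.

```scala
// build.sbt (assumed equivalent of the Maven dependency)
libraryDependencies += "se.uu.it" % "cp" % "0.1.0"
```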
The API documentation is available at: https://mcapuccini.github.io/scala-cp/scaladocs/.
For some usage examples please refer to the unit tests:
- Classification
- Regression
You can also refer to these Apache Zeppelin notebooks for more examples:
- Classification
- Regression
Publications:
- M. Capuccini, L. Carlsson, U. Norinder and O. Spjuth, "Conformal Prediction in Spark: Large-Scale Machine Learning with Confidence," 2015 IEEE/ACM 2nd International Symposium on Big Data Computing (BDC), Limassol, 2015, pp. 61-67.
- Ahmed, L., Georgiev, V., Capuccini, M., Toor, S., Schaal, W., Laure, E., & Spjuth, O. (2018). Efficient iterative virtual screening with Apache Spark and conformal prediction. Journal of Cheminformatics, 10(1), 1-8.