Classification
Classification is the problem of identifying which of a set of categories a new observation belongs to, on the basis of a training set of data.
Logistic regression is a type of probabilistic statistical classification model.[2] It predicts the outcome of a categorical dependent variable (i.e., a class label), most commonly a binary response, based on one or more predictor variables (features). That is, it is used in estimating the parameters of a qualitative response model.
- method: LogisticRegressionWithSGD, LogisticRegressionWithLBFGS
- model: LogisticRegressionModel
- ruby: classification/logistic_regression.rb
data = [
LabeledPoint.new(0.0, [0.0, 1.0]),
LabeledPoint.new(1.0, [1.0, 0.0]),
]
lrm = LogisticRegressionWithSGD.train($sc.parallelize(data))
lrm.predict([1.0, 0.0])
# => 1
lrm.predict([0.0, 1.0])
# => 0
lrm.clear_threshold
lrm.predict([0.0, 1.0])
# => 0.123...
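To illustrate what the threshold does in the example above, here is a minimal plain-Ruby sketch of how a logistic model could turn a feature vector into a probability and then into a label. The helper names `sigmoid`, `logistic_score` and `logistic_predict`, the weights, the intercept and the 0.5 cut-off are all assumptions made for illustration; none of them come from ruby-spark.

```ruby
# Hypothetical helpers for illustration only; not part of ruby-spark.

# Logistic (sigmoid) function: maps any real number into (0, 1).
def sigmoid(z)
  1.0 / (1.0 + Math.exp(-z))
end

# Raw score: estimated probability that the features belong to class 1.
def logistic_score(weights, intercept, features)
  margin = weights.zip(features).sum { |w, x| w * x } + intercept
  sigmoid(margin)
end

# With a threshold, the probability is turned into a 0/1 label.
def logistic_predict(weights, intercept, features, threshold = 0.5)
  logistic_score(weights, intercept, features) > threshold ? 1 : 0
end

weights   = [2.0, -2.0] # assumed values, not the result of training
intercept = 0.0

logistic_predict(weights, intercept, [1.0, 0.0]) # thresholded: a class label
logistic_score(weights, intercept, [0.0, 1.0])   # no threshold: a probability
```

In this sketch, skipping the threshold step plays the same role as `clear_threshold` in the example: the caller gets the raw score instead of a label.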
Support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.
- method: SVMWithSGD
- model: SVMModel
- ruby: classification/svm.rb
data = [
LabeledPoint.new(0.0, [0.0]),
LabeledPoint.new(1.0, [1.0]),
LabeledPoint.new(1.0, [2.0]),
LabeledPoint.new(1.0, [3.0])
]
svm = SVMWithSGD.train($sc.parallelize(data))
svm.predict([1.0])
# => 1
svm.clear_threshold
svm.predict([1.0])
# => 1.25...
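The difference between a thresholded and an unthresholded SVM prediction can be sketched in plain Ruby. This is only an illustration of a linear decision function with a hinge loss; the helper names `svm_margin`, `svm_predict` and `hinge_loss`, the weight value and the 0.0 cut-off are assumptions and are not taken from ruby-spark.

```ruby
# Hypothetical helpers for illustration only; not part of ruby-spark.

# Signed distance-like score of a point from the separating hyperplane.
def svm_margin(weights, intercept, features)
  weights.zip(features).sum { |w, x| w * x } + intercept
end

# With a threshold, the margin is turned into a 0/1 label.
def svm_predict(weights, intercept, features, threshold = 0.0)
  svm_margin(weights, intercept, features) > threshold ? 1 : 0
end

# Hinge loss for a label in {0, 1}: zero once the point is on the
# correct side of the hyperplane with a margin of at least 1.
def hinge_loss(weights, intercept, features, label)
  sign = 2.0 * label - 1.0 # map {0, 1} to {-1, +1}
  [0.0, 1.0 - sign * svm_margin(weights, intercept, features)].max
end

weights   = [1.25] # assumed value, not the result of training
intercept = 0.0

svm_predict(weights, intercept, [1.0])     # thresholded: a class label
svm_margin(weights, intercept, [1.0])      # no threshold: the raw margin
hinge_loss(weights, intercept, [1.0], 1.0) # loss a trainer would minimize
```

Unlike the logistic score, the raw margin is not a probability, which is why the SVM is described as a non-probabilistic classifier.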
Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
- method: NaiveBayes
- model: NaiveBayesModel
- ruby: classification/naive_bayes.rb
data = [
LabeledPoint.new(0.0, [0.0, 0.0]),
LabeledPoint.new(0.0, [0.0, 1.0]),
LabeledPoint.new(1.0, [1.0, 0.0])
]
model = NaiveBayes.train($sc.parallelize(data))
model.predict([0.0, 1.0])
# => 0.0
model.predict([1.0, 0.0])
# => 1.0
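The independence assumption can be made concrete with a small plain-Ruby sketch. It assumes a multinomial variant with additive (Laplace) smoothing, which may differ from what `NaiveBayes.train` does internally; `nb_train`, `nb_predict` and the `smoothing` parameter are hypothetical names, and training rows are plain `[label, features]` pairs rather than `LabeledPoint` objects.

```ruby
# Hypothetical multinomial Naive Bayes sketch; not the ruby-spark implementation.

# Estimate a log prior per class and a log likelihood per feature and class.
def nb_train(data, smoothing = 1.0)
  num_features = data.first[1].size
  by_label = data.group_by { |label, _| label }

  by_label.map do |label, rows|
    log_prior = Math.log((rows.size + smoothing) /
                         (data.size + smoothing * by_label.size))
    # Sum each feature over all rows of this class.
    sums  = rows.map { |_, features| features }.transpose.map(&:sum)
    total = sums.sum
    log_theta = sums.map do |s|
      Math.log((s + smoothing) / (total + smoothing * num_features))
    end
    [label, { log_prior: log_prior, log_theta: log_theta }]
  end.to_h
end

# Features are treated as independent given the class, so their log
# likelihoods simply add up. The class with the highest total wins.
def nb_predict(model, features)
  best = model.max_by do |_, params|
    params[:log_prior] +
      params[:log_theta].zip(features).sum { |t, x| t * x }
  end
  best.first
end

data = [
  [0.0, [0.0, 0.0]],
  [0.0, [0.0, 1.0]],
  [1.0, [1.0, 0.0]]
]

model = nb_train(data)
nb_predict(model, [0.0, 1.0])
nb_predict(model, [1.0, 0.0])
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow for longer feature vectors.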