Cl-Online-Learning

A collection of machine learning algorithms for online linear classification written in Common Lisp.

Implemented algorithms

Binary classifier

  • Perceptron
  • AROW (Crammer, Koby, Alex Kulesza, and Mark Dredze. “Adaptive regularization of weight vectors.” Advances in neural information processing systems. 2009.)
  • SCW-I (Soft Confidence Weighted) (Wang, Jialei, Peilin Zhao, and Steven C. Hoi. “Exact Soft Confidence-Weighted Learning.” Proceedings of the 29th International Conference on Machine Learning (ICML-12). 2012.)
  • Logistic Regression with SGD or ADAM optimizer (Kingma, Diederik, and Jimmy Ba. “Adam: A method for stochastic optimization.” ICLR 2015)

Multiclass classifier

  • one-vs-rest (requires K binary classifiers)
  • one-vs-one (requires K*(K-1)/2 binary classifiers)
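
To get a feel for how the two schemes scale with the number of classes K, the counts above can be computed directly. This is only an arithmetic sketch; the function names are made up for illustration and are not part of cl-online-learning.

```lisp
;; Hypothetical helpers (not part of the library): number of binary
;; classifiers each multiclass scheme needs for K classes.
(defun one-vs-rest-count (k)
  "One binary classifier per class."
  k)

(defun one-vs-one-count (k)
  "One binary classifier per unordered pair of classes: K*(K-1)/2."
  (/ (* k (1- k)) 2))

;; Print both counts for a few values of K.
(loop for k in '(3 5 10)
      do (format t "K=~2d  one-vs-rest=~2d  one-vs-one=~2d~%"
                 k (one-vs-rest-count k) (one-vs-one-count k)))
```

one-vs-one grows quadratically in K, so for many classes one-vs-rest keeps the number of learners much smaller.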

Command line tools

Installation

cl-online-learning is available from Quicklisp.

(ql:quickload :cl-online-learning)

To install from the GitHub repository:

cd ~/quicklisp/local-projects/
git clone https://github.com/masatoi/cl-online-learning.git

To install with Roswell:

ros install masatoi/cl-online-learning

Usage

Prepare dataset

A data point is a pair of a class label (+1 or -1) and an input vector. Both the label and the elements of the input vector must be single-float.
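
A data point can also be built by hand. The sketch below assumes the pair is a cons cell of a single-float label and a single-float vector, as in the printed example in this section; MAKE-DATA-POINT is a hypothetical helper, not a function exported by the library.

```lisp
;; Hypothetical helper (not part of the library): build one data point
;; as (label . vector), coercing every number to SINGLE-FLOAT.
(defun make-data-point (label features)
  "Return (label . vector) from a numeric LABEL and a sequence FEATURES."
  (cons (coerce label 'single-float)
        (map '(simple-array single-float (*))
             (lambda (x) (coerce x 'single-float))
             features)))

;; A positive example with three features.
(defparameter *point* (make-data-point 1 '(0 0.5 1)))

(car *point*) ; the label
(cdr *point*) ; the input vector
```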

A dataset is represented as a sequence of data points. The READ-DATA function builds a dataset from the sparse format used in LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/). It requires the number of features in the dataset.

;; Number of features
(defparameter a1a-dim 123)

;; Read dataset from file
(defparameter a1a
  (clol.utils:read-data
   (merge-pathnames #P"t/dataset/a1a" (asdf:system-source-directory :cl-online-learning))
   a1a-dim))

;; A data point
(car a1a)

; (-1.0
;  . #(0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0
;     1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
;     0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
;     1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
;     1.0 0.0 1.0 1.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
;     0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
;     0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0))
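
For reference, each line of a LIBSVM sparse file has the form `<label> <index>:<value> ...` with 1-based feature indices. The following is a minimal, self-contained sketch of turning one such line into a data point of the shape shown above. It is not the library's READ-DATA implementation; all function names here are made up for illustration.

```lisp
;; Hypothetical helpers (not part of the library).
(defun split-on-spaces (line)
  "Split LINE into a list of non-empty tokens separated by spaces."
  (loop with start = 0
        for end = (position #\Space line :start start)
        for token = (subseq line start end)
        unless (string= token "") collect token
        while end
        do (setf start (1+ end))))

(defun parse-single-float (token)
  "Read TOKEN as a number and coerce it to SINGLE-FLOAT."
  (let ((*read-eval* nil)) ; never evaluate #. forms from data files
    (coerce (read-from-string token) 'single-float)))

(defun parse-libsvm-line (line dim)
  "Return (label . vector) for one LIBSVM-format LINE with DIM features."
  (let* ((tokens (split-on-spaces line))
         (label (parse-single-float (first tokens)))
         (input (make-array dim :element-type 'single-float
                                :initial-element 0.0f0)))
    (dolist (token (rest tokens))
      (let* ((colon (position #\: token))
             (index (parse-integer token :end colon))
             (value (parse-single-float (subseq token (1+ colon)))))
        ;; LIBSVM indices are 1-based; Lisp arrays are 0-based.
        (setf (aref input (1- index)) value)))
    (cons label input)))

;; Features 3, 11 and 14 are set; every other feature stays 0.0.
(defparameter *parsed* (parse-libsvm-line "-1 3:1 11:1 14:1" 16))
```

Features that do not appear on the line stay at 0.0, which is why the number of features has to be supplied up front.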

Define learner

A learner object is just a struct, so it is created with the constructor for its type.

(defparameter arow-learner (clol:make-arow a1a-dim 10))

Update and Train

To update the model destructively with one data point, use an update function corresponding to the model type.

(clol:arow-update arow-learner
                  (cdar a1a)  ; input
                  (caar a1a)) ; label

The TRAIN function trains the learner on an entire dataset in one call.

(clol:train arow-learner a1a)

It may be necessary to call this function several times until learning converges. A convergence test has not been implemented yet.
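
Since there is no built-in convergence test, one option is to repeat the training pass until an epoch produces no mistakes. The sketch below illustrates that idea with a plain perceptron written from scratch, so that it runs without the library. Every name prefixed with TOY- is hypothetical, and the code is not how cl-online-learning implements its learners.

```lisp
;; Toy perceptron, independent of the library (all names are hypothetical).
(defstruct toy-perceptron weight (bias 0.0f0))

(defun new-toy-perceptron (dim)
  "Create a perceptron with DIM zero-initialized weights."
  (make-toy-perceptron
   :weight (make-array dim :element-type 'single-float
                           :initial-element 0.0f0)))

(defun dot (w x)
  "Inner product of two vectors of equal length."
  (loop for wi across w
        for xi across x
        sum (* wi xi)))

(defun toy-predict (learner input)
  "Return +1.0 if the linear score is positive, otherwise -1.0."
  (if (> (+ (dot (toy-perceptron-weight learner) input)
            (toy-perceptron-bias learner))
         0)
      1.0f0
      -1.0f0))

(defun toy-update (learner input label)
  "Destructively update LEARNER with one data point.
Return T if the point was misclassified, NIL otherwise."
  (unless (= (toy-predict learner input) label)
    (let ((w (toy-perceptron-weight learner)))
      (dotimes (i (length w))
        (incf (aref w i) (* label (aref input i))))
      (incf (toy-perceptron-bias learner) label))
    t))

(defun toy-train (learner dataset)
  "One pass over DATASET (a list of (label . input)); return the mistake count."
  (loop for (label . input) in dataset
        count (toy-update learner input label)))

(defun toy-train-until-converged (learner dataset &key (max-epochs 100))
  "Repeat TOY-TRAIN until an epoch makes no mistakes.
Return the number of epochs used, or NIL if MAX-EPOCHS was not enough."
  (loop for epoch from 1 to max-epochs
        when (zerop (toy-train learner dataset))
          return epoch))

;; Logical AND as a small linearly separable dataset.
(defparameter *and-data*
  (list (cons  1.0f0 (vector 1.0f0 1.0f0))
        (cons -1.0f0 (vector 0.0f0 0.0f0))
        (cons -1.0f0 (vector 1.0f0 0.0f0))
        (cons -1.0f0 (vector 0.0f0 1.0f0))))

(defparameter *toy* (new-toy-perceptron 2))
(defparameter *epochs* (toy-train-until-converged *toy* *and-data*))
```

A zero-mistake epoch is only a meaningful stopping rule for linearly separable data. On noisy data it may never occur, which is why the sketch caps the number of epochs with MAX-EPOCHS.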

Predict and Test

(clol:arow-predict arow-