The Simulated Complexity Library (SCoL) is an R package that simulates a set of complexity measures for classification problems. Instead of computing each measure directly, it predicts the measures with models induced by a meta-learning approach, which lowers the asymptotic computational cost of the original measures. The models take as input simple and efficient meta-features extracted with the mfe package [1]. The simulated complexity measures quantify the linearity of the data, the presence of informative features, and the sparsity and dimensionality of the dataset, all at a low computational cost.
The measures were originally proposed by Ho and Basu [2] and extended by many other works, including the ECoL library [3]. They fall into six groups: feature overlapping, neighborhood, linearity, dimensionality, class balance, and network measures. These measures are simulated by models induced with the Random Forest and Support Vector Machine algorithms.
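The idea of simulating a costly measure from cheap meta-features can be sketched in base R. This is only an illustration under stated assumptions, not the models or meta-features that SCoL ships: the helpers make_data, n3, and meta_features are hypothetical, the two meta-features are hand-picked rather than taken from mfe, and a linear model (lm) stands in for Random Forest or Support Vector Machines to keep the example dependency-free. The target is N3, the leave-one-out error of the 1-nearest-neighbor classifier, which needs all pairwise distances.

```r
set.seed(42)

# Hypothetical toy generator: two Gaussian classes in 2D,
# `sep` shifts the second class away from the first.
make_data <- function(n, sep) {
  y <- factor(rep(c("a", "b"), each = n / 2))
  x <- matrix(rnorm(2 * n), ncol = 2)
  x[y == "b", ] <- x[y == "b", ] + sep
  list(x = x, y = y)
}

# Costly target measure: leave-one-out 1-NN error (N3),
# computed from the full pairwise distance matrix.
n3 <- function(d) {
  dist_mat <- as.matrix(dist(d$x))
  diag(dist_mat) <- Inf                      # a point is not its own neighbor
  nearest <- apply(dist_mat, 1, which.min)
  mean(d$y[nearest] != d$y)
}

# Cheap meta-features (assumed for this sketch, not taken from mfe):
# distance between class means and average feature standard deviation.
meta_features <- function(d) {
  mu_a <- colMeans(d$x[d$y == "a", , drop = FALSE])
  mu_b <- colMeans(d$x[d$y == "b", , drop = FALSE])
  c(mean_gap = sqrt(sum((mu_a - mu_b)^2)),
    mean_sd  = mean(apply(d$x, 2, sd)))
}

# Meta-dataset: one row per training dataset, with the true N3 as target.
seps  <- runif(60, 0, 4)
train <- lapply(seps, function(s) make_data(100, s))
meta  <- as.data.frame(t(sapply(train, meta_features)))
meta$n3 <- sapply(train, n3)

# Meta-model: lm is a stand-in for Random Forest / SVM.
model <- lm(n3 ~ mean_gap + mean_sd, data = meta)

# Simulate N3 for an unseen dataset from its meta-features only.
new_data     <- make_data(100, 2)
new_meta     <- as.data.frame(t(meta_features(new_data)))
simulated_n3 <- predict(model, new_meta)
```

Once the meta-model is trained, estimating the measure for a new dataset costs only the meta-feature extraction plus one prediction, which is the source of the savings described above.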
The installation process using devtools is:
if (!require("devtools")) {
  install.packages("devtools")
}
devtools::install_github("lpfgarcia/SCoL")
library("SCoL")
The simplest way to compute the simulated complexity measures is to use the simulated method. It can be called with a symbolic description of the model (a formula) or with a data frame. The parameters are the dataset and the measures to be extracted; by default, all measures are extracted. A simple example is given next:
## Extract all complexity measures
simulated(Species ~ ., iris)
## Extract all complexity measures using a data frame
simulated(iris[,1:4], iris[,5])
## Extract only the F2 measure
simulated(Species ~ ., iris, features="F2")
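The relation between the formula call and the data frame call can be illustrated with base R. The helper split_formula below is hypothetical and not part of SCoL; it only shows how a formula and a dataset map onto the predictors and the class column that the data frame interface expects. The calls to simulated run only if SCoL is installed, and the assumption that both describe the same problem comes from the examples above rather than from inspecting the package internals.

```r
# Hypothetical helper (not part of SCoL): split a formula and a dataset
# into the predictors `x` and the class labels `y`.
split_formula <- function(formula, data) {
  mf <- model.frame(formula, data)           # response is the first column
  list(x = mf[, -1, drop = FALSE], y = mf[, 1])
}

parts <- split_formula(Species ~ ., iris)
names(parts$x)    # predictor columns selected by the formula
levels(parts$y)   # class labels

# Only attempted when SCoL is available.
if (requireNamespace("SCoL", quietly = TRUE)) {
  by_formula <- SCoL::simulated(Species ~ ., iris)
  by_frame   <- SCoL::simulated(parts$x, parts$y)
}
```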
To report bugs or request features, use the project issues page.
[1] Rivolli, A., Garcia, L. P. F., Soares, C., Vanschoren, J., and de Carvalho, A. C. P. L. F. (2018). Towards Reproducible Empirical Research in Meta-Learning. arXiv:1808.10406
[2] Ho, T., and Basu, M. (2002). Complexity measures of supervised classification problems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3):289-300.
[3] Lorena, A. C., Garcia, L. P. F., Lehmann, J., de Souto, M. C. P., and Ho, T. K. (2018). How Complex is your classification problem? A survey on measuring classification complexity. arXiv:1808.03591