GitHub - vicontek/maxvol: Code for the "MAXVOL for machine learning" project

Motivation

In machine learning we often lack the computing power or memory to process an entire dataset that has too many features and/or samples. A dataset is represented as a matrix. If we can find a good subset of features, we can train a machine learning algorithm using fewer computational resources. We can pursue the same goal by keeping only the most representative samples. Feature selection and sample selection both amount to choosing a submatrix. For both purposes we can try to choose a submatrix with large volume, where volume is the absolute value of the determinant for square matrices and is generalized to rectangular matrices.
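To make the notion of volume concrete, here is a minimal sketch using only numpy. It assumes one common generalization of volume to rectangular matrices, the product of singular values, which reduces to the absolute value of the determinant in the square case. The helper names `volume` and `greedy_rows` are hypothetical, and the greedy row selection is a simple illustrative heuristic, not the maxvol algorithm from maxvolpy and not necessarily the procedure used in the experiments.

```python
import numpy as np


def volume(a):
    """Volume as the product of singular values (an assumed generalization).

    For a square matrix this equals abs(det(a)).
    """
    return float(np.prod(np.linalg.svd(a, compute_uv=False)))


def greedy_rows(a, k):
    """Greedily pick k rows that span a large volume (illustrative heuristic).

    At each step, take the row with the largest residual norm, then remove
    its direction from all rows. Assumes k does not exceed the rank of a.
    """
    residual = np.array(a, dtype=float)
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(residual, axis=1)
        i = int(np.argmax(norms))
        chosen.append(i)
        direction = residual[i] / norms[i]
        # Project every row onto the orthogonal complement of the chosen row.
        residual = residual - np.outer(residual @ direction, direction)
    return chosen


# Four samples with two features; we keep the two most "spread out" samples.
samples = np.array([[1.0, 0.0], [0.0, 1.0], [3.0, 0.0], [0.0, 2.0]])
rows = greedy_rows(samples, 2)
print(rows, volume(samples[rows]))  # selects rows 2 and 3, volume close to 6
```

The same idea applies to feature selection by running the selection on the transposed matrix, so that columns are chosen instead of rows.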

Experiments

We ran four main experiments, each on a different dataset:

ARCENE
MNIST
Housing prices
Synthetic dataset

More details can be found in maxvol_report.pdf.

Team

The project was carried out by a team of four people:

Philip Blagoveschensky @philip-bl
Ivan Golovatskikh @vicontek
Maria Sindeeva @lapsya
Mirfarid Musavian @mirfaridmusavian

Prerequisites

In addition to common numerical and ML packages such as scipy and sklearn, you need to install the maxvolpy package:

pip3 install maxvolpy

Folders and files

arcene
arcene_experiment
datasets
housing_experiment
mnist_experiment
synthetic_experiment
.gitignore
README.md
maxvol_report.pdf
