Wikimedia image classification and suggestions for article authors.
-
Install these dependencies by using your system's package manager if you don't have them already.
| Dependency | Apt | Pacman | Homebrew |
| --- | --- | --- | --- |
| Python 3 | python3 | python | |
| Cython | cython3 | cython | |
| Pip | python3-pip | python-pip | |
| Virtualenv | virtualenv | python-virtualenv | |
| Fortran | gfortran | gcc-fortran | |
| Blas | libblas-dev | blas | |
| Lapack | liblapack-dev | lapack | |
| PNG | libpng-dev | libpng | |
| JPEG | libjpeg8-dev | libjpeg-turbo | |
| Freetype | libfreetype6-dev | freetype2 | |
| Cairo | libcairo2-dev | cairo | |
| FFI | libffi-dev | | |
-
Create a virtual environment inside the repository root by running
virtualenv .
or, if you have multiple Python versions installed,
virtualenv -p python3 .
-
Activate your virtual environment using
source bin/activate
Make sure that the repository name now appears in front of your shell prompt.
-
Install dependencies inside your virtual environment
pip install -r requirements.txt
-
Install OpenCV 3.0 with bindings for Python 3 by running
chmod +x tool/setup-opencv.sh
tool/setup-opencv.sh
-
UTF-8 is required, so you may need to add these lines to your ~/.bash_profile:
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
Then apply the changes with
source ~/.bash_profile
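To check whether the locale settings took effect, a small Python check along the following lines may help. This is only a sketch: the helper name `is_utf8` is made up for illustration and is not part of this repository.

```python
import codecs
import locale
import sys


def is_utf8(encoding_name):
    """Return True if the given encoding name is an alias of UTF-8."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        # Unknown or empty encoding names are treated as "not UTF-8".
        return False


if __name__ == "__main__":
    # Encoding Python derives from the locale (LC_ALL / LANG) for text I/O.
    preferred = locale.getpreferredencoding(False)
    # Encoding used for standard output; may be missing if stdout is replaced.
    stdout_encoding = getattr(sys.stdout, "encoding", None) or ""
    print("preferred encoding:", preferred, "->", is_utf8(preferred))
    print("stdout encoding:", stdout_encoding, "->", is_utf8(stdout_encoding))
```

If either line reports `False`, the shell that started Python probably did not pick up the exported variables yet.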
-
On Windows, create a virtual environment inside the repository root by running
virtualenv .
or, if you have multiple Python versions installed,
virtualenv -p C:\Python34\python.exe .
-
Activate your virtual environment using
Scripts\activate
Make sure that the repository name now appears in front of your shell prompt.
-
Download these dependencies. If in doubt, use the second-to-last link in each list. Run
pip install <path-to-file>
on each downloaded file.
-
Install the remaining dependencies inside your virtual environment using
pip install -r requirements.txt
.
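Besides looking at the shell prompt, the active virtual environment from the steps above can be confirmed from Python itself. The snippet below is a minimal sketch; the function name `in_virtualenv` is hypothetical and not part of this repository.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.base_prefix differ from sys.prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

When the environment is active, the printed prefix should point to the repository root rather than to the system-wide Python installation.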
- Download DBpedia dump
- Extract list of image names
- Fetch images and metadata of random entries
- Manually label data
- Balance the number of images per class
- Preprocess data set
- Extract image and text based features
- Train classifier
- Get user search term
- Query DBpedia for related images based on description
- Fetch images and metadata of the first results
- Extract image and text based features
- Use trained classifier to predict class
- Store results in DBpedia
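The class-balancing step listed above can be sketched as follows. This example assumes balancing by random downsampling to the size of the smallest class; the project may use a different strategy. The function name `balance_classes` and the labels `photo` and `diagram` are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def balance_classes(samples, seed=0):
    """Downsample every class to the size of the smallest class.

    `samples` is a list of (item, label) pairs. The result is a new list
    in which each label occurs equally often.
    """
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append((item, label))
    if not by_label:
        return []

    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    balanced = []
    for label in sorted(by_label, key=str):
        balanced.extend(rng.sample(by_label[label], target))
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    # Hypothetical labelled file names: three photos, one diagram.
    data = [
        ("a.jpg", "photo"),
        ("b.jpg", "photo"),
        ("c.jpg", "photo"),
        ("d.png", "diagram"),
    ]
    print(balance_classes(data))
```

Downsampling discards data from the larger classes, which is simple but wasteful for small data sets; oversampling or class weights are common alternatives.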
-h
can be passed as a parameter to any of the scripts listed below to get a comprehensive list of its parameters.
fetch_[source].py
files can be used to download images from [source] to build a training or test set for the classifier.
extraction.py
can be used to extract textual and visual features from images.
classifier.py
can be used to classify images.
performance.py
trains different classifiers and measures their individual performance.
evaluation.py
can be used to measure the performance of individual features.
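The `-h` behaviour mentioned above is what Python's standard `argparse` module provides automatically. Whether the scripts in this repository use `argparse` is an assumption, and the options `--input` and `--limit` below are hypothetical placeholders, not actual parameters of any script listed here.

```python
import argparse


def build_parser():
    """Build a parser with two hypothetical options, for illustration only."""
    parser = argparse.ArgumentParser(
        description="Hypothetical example script (not part of this repository)."
    )
    parser.add_argument(
        "--input", default="data/", help="hypothetical input directory"
    )
    parser.add_argument(
        "--limit", type=int, default=100,
        help="hypothetical maximum number of images",
    )
    return parser


if __name__ == "__main__":
    # Passing -h on the command line prints the generated help and exits.
    args = build_parser().parse_args()
    print(args.input, args.limit)
```

With a parser like this, the help text is generated from the `help` strings, so it stays in sync with the options the script actually accepts.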