VCAT is a system for creating machine learning image datasets for computer vision. It uses a Django backend, React frontend, and CNN-based based visual search engine to facilitate creating new datasets and forensic image research.
VCAT's front end can also be used for visual investigations. Searching 10M keyframes takes only 0.15 seconds. VCAT designed to be used with the VFRAME computer vision processing tools and is primarily intended for human rights researchers and technologists.
This project is under daily development and installations steps may change significantly between Oct 2018 - May 2019
- Ubuntu 16.04 with 16GB RAM
- A conda/virtualenv virtual environment running Python 3.6+
- node v8.5.0 / npm v6.0.0 (suggest installing with nvm)
- MySQL2 (apt install libmysqlclient-dev)
sudo apt install libmysqlclient-dev
Run mysql -u root
then make a new user and database:
CREATE USER 'vframe'@'localhost' IDENTIFIED BY 'password';
CREATE DATABASE vframe;
GRANT ALL PRIVILEGES ON vframe.* TO 'vframe'@'localhost';
Copy the settings file and edit appropriately (or ask jules for dev config):
cp sample-env .env
source activate vcat
pip install -r requirements.txt
python manage.py migrate
python manage.py createsuperuser
npm install
The FAISS-based image search engine lives in ~/vcat/vsearch/
directory. This is a flask server which runs separately from the main Django app. Instructions for installing everything are there, including another requirements.txt. This may be moved into its own repo at some point. Instructions for setting up vsearch can be found in that folder's readme.
If using vsearch with vcat, please run its fixtures:
python manage.py migrate vsearch zero
python manage.py migrate
python manage.py loaddata document_tag
python manage.py import_metadata
python manage.py import_metadata --unverified
Run these commands in separate tabs:
python manage.py runserver
npm run watch
Note, if developing on Linux you may need to increase the number of filesystem watchers:
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
Put static images in backend/api/static/
Production bundles should be built remotely. Run npm run reload
on the remote server.
Services are set up in /etc/init.d
. Find sample init.d files in ./bin/init.d/
. If there's a problem do service vcat restart
or service sis restart
.
For now you can curl using Basic Auth to hit endpoints like so -
curl -u username:password https://syrianarchive.vframe.io/api/hierarchy/1/full
Autocomplete visual taxonomy for quick annotation
Toggle on/off labels for scenes with many items
Display labels and verify annotations
Search for similar images using content based image retrieval
Use search results to create a new training dataset
Selection of annotated training images for aircraft
The VFRAME is currently exploring how 3D modeling can be used to generated synthetic datasets to augment existing training data for illegal munitions.