Robotoff ANN

This project is archived: approximate nearest neighbor (ANN) search is now performed with Elasticsearch. An external service to serve index files is no longer needed, and everything related to ANN is done directly in Robotoff.

This project helps Robotoff categorize logos. Bug tracking is mostly done on the main Robotoff repository.

Tangible results

You can see all the generated crops that are up for manual annotation in Hunger Games, our gamified annotation engine.
Robotoff posts new crops and annotations to the #robotoff-alerts-annotations Slack channel.

Contributing

To set up the project, you must have a recent version of docker and docker-compose installed.

Then run make dev.

make quality runs the linters and tests.

Models used in production are published in releases of openfoodfacts-ai.

See the Makefile for more.

Architecture

Logos are first extracted from product images (logo detection itself lives in Robotoff). Each logo is then embedded in a metric space using a dedicated model¹.

We then use approximate nearest neighbors ² in this metric space to try to classify the logos from known examples, in the manner of a k-nearest neighbors (KNN) classifier.

The classified logos then help apply labels to Open Food Facts products.
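
The classification idea described above can be illustrated with a minimal sketch. It uses an exact (brute-force) search rather than an approximate index, and the embeddings, labels and function names are hypothetical, chosen only for the example; it is not the project's actual code.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, examples, k=3):
    """Return the majority label among the k labelled examples closest to query.

    examples is a list of (embedding, label) pairs.
    """
    nearest = sorted(examples, key=lambda ex: euclidean(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D embeddings with hypothetical labels.
known = [
    ((0.0, 0.1), "organic"),
    ((0.1, 0.0), "organic"),
    ((0.9, 1.0), "fair-trade"),
    ((1.0, 0.9), "fair-trade"),
    ((1.0, 1.0), "fair-trade"),
]
predicted = knn_classify((0.95, 0.95), known, k=3)
```

An approximate index trades a little accuracy for speed on large collections; the voting step stays the same.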

The main entry point is the API, which returns nearest neighbors for either a logo id ³ or an embedding vector ⁴, and can add a new logo from an image ⁵.

Note that the approximate nearest neighbors index is only regenerated when a specific command is run ⁶.
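
One plausible consequence of regenerating the index only on demand is that logos added since the last rebuild are not returned by searches. The toy model below illustrates that behavior under this assumption; the class and method names are hypothetical and do not come from the project.

```python
class SnapshotIndex:
    """Toy index: searches only see logos present at the last rebuild."""

    def __init__(self):
        self._stored = {}   # logo_id -> embedding, updated on every add
        self._indexed = {}  # frozen copy used for searches

    def add(self, logo_id, embedding):
        """Store a new embedding without touching the search index."""
        self._stored[logo_id] = embedding

    def rebuild(self):
        """Regenerate the search index from everything stored so far."""
        self._indexed = dict(self._stored)

    def nearest(self, query, count=1):
        """Return the ids of the count indexed logos closest to query."""
        def sq_dist(item):
            return sum((x - y) ** 2 for x, y in zip(query, item[1]))

        ranked = sorted(self._indexed.items(), key=sq_dist)
        return [logo_id for logo_id, _ in ranked[:count]]


index = SnapshotIndex()
index.add(1, (0.0, 0.0))
index.rebuild()
index.add(2, (1.0, 1.0))             # stored, but not yet searchable
before = index.nearest((1.0, 1.0))   # only logo 1 is in the index
index.rebuild()
after = index.nearest((1.0, 1.0))    # logo 2 is now found
```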

Preliminary research

The FaceNet paper (from Google): https://arxiv.org/abs/1503.03832
A blog post explaining the “triplet mining” introduced by that paper: https://omoindrot.github.io/triplet-loss
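
For readers unfamiliar with the triplet loss discussed in these references, here is a minimal sketch of the loss for a single triplet, using squared Euclidean distances. The margin value and the sample vectors are illustrative assumptions, and nothing here is taken from this project's training code.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)


anchor, positive = (0.0, 0.0), (0.1, 0.0)
# The negative is already far away: the triplet contributes no loss.
easy = triplet_loss(anchor, positive, (1.0, 0.0))
# The negative is almost as close as the positive: the loss is positive.
hard = triplet_loss(anchor, positive, (0.2, 0.0))
```

Triplet mining consists of selecting triplets like the second one, which still produce a non-zero loss and therefore a useful training signal.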

How does it work?

The ANN /add endpoint works as follows:
From the raw image and the detected bounding boxes, we crop the image to get all detected logos.
Each logo is provided as input to the neural network (here an EfficientNet) to get an embedding for each logo.
The embeddings are saved locally in an HDF5 file (see save_embeddings in robotoff-ann/embeddings.py, line 72 at commit 6abaee7).
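
The three steps above can be sketched in plain Python. Everything in this example is a stand-in: the bounding box layout (relative y_min, x_min, y_max, x_max) is an assumption, toy_embedding replaces the neural network, and a list replaces the HDF5 file.

```python
def crop(image, box):
    """Crop a row-major image (list of rows) to a relative bounding box.

    box is (y_min, x_min, y_max, x_max) with values in [0, 1]; this layout
    is an assumption made for the example.
    """
    height, width = len(image), len(image[0])
    y_min, x_min, y_max, x_max = box
    top, bottom = int(y_min * height), int(y_max * height)
    left, right = int(x_min * width), int(x_max * width)
    return [row[left:right] for row in image[top:bottom]]


def toy_embedding(logo):
    """Stand-in for the neural network: (mean pixel value, pixel count)."""
    pixels = [p for row in logo for p in row]
    return (sum(pixels) / len(pixels), float(len(pixels)))


def add_logos(image, boxes, store):
    """Crop every detected logo, embed it, and append the embedding to store."""
    for box in boxes:
        store.append(toy_embedding(crop(image, box)))
    return store


# 4x4 grayscale image whose pixel value is 10 * row + column.
image = [[10 * r + c for c in range(4)] for r in range(4)]
boxes = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 1.0, 1.0)]
store = add_logos(image, boxes, [])
```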

Annotation

https://wiki.openfoodfacts.org/Logo_Annotation_Guidelines
https://annotate.openfoodfacts.org
Based on opencv/cvat: Powerful and efficient Computer Vision Annotation Tool (CVAT)
Currently 502 Bad Gateway: openfoodfacts/openfoodfacts-infrastructure#49
Documentation: https://annotate.openfoodfacts.org/documentation/user_guide.html#creating-an-annotation-task
We trained a "universal" logo / label detector, with good results.
For each image crop resulting from the annotations (bounding box), we generated embeddings with a pre-trained network (ResNet50).
This made it possible to verify that this approach is the right one for classifying crops.
The results are in the presentation "3 photos et c'est à peu près tout" ("3 photos and that's about it").

Annotation guidelines

There should be as little space as possible between the bounding box and the object. Conversely, the whole object must be included in the bounding box.
If the object is partially hidden, mark it as "occluded" (click on the "profile" icon on the object in question, in the right panel).
For best results, similar objects must be annotated in the same way (especially regarding the extent of the object). Several scales of annotation are sometimes possible (cf. the question discussed above of the "to recycle" pictograms); what matters most is that the annotations are consistent.
Several very similar images, or images of the same product, follow one another in the dataset. For the next campaign, it will be better to shuffle the dataset to get as much diversity as possible.
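
Shuffling the dataset before an annotation campaign, as suggested in the last guideline, can be done reproducibly with a fixed seed. This is a minimal sketch; the file names and the helper name are hypothetical.

```python
import random


def shuffled(items, seed=0):
    """Return a shuffled copy of items; a fixed seed keeps the order reproducible."""
    result = list(items)
    random.Random(seed).shuffle(result)
    return result


# Hypothetical file names: consecutive images often show the same product.
images = ["product_a_1.jpg", "product_a_2.jpg", "product_b_1.jpg", "product_b_2.jpg"]
campaign = shuffled(images, seed=42)
```

Using a dedicated random.Random instance leaves the global random state untouched, and the original list is not modified.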

Colab notebooks

(Accessible by Pierre) https://colab.research.google.com/drive/1G-6OELcz8l1u1_53_0a2DAKRryIfB9CE
The predictions on the validation set: output_images

Pipeline on colab

Data preprocessing: https://colab.research.google.com/drive/1cxi_aITHEFo4IZRsbiwm39CFgDKMG8LZ
Model training: https://colab.research.google.com/drive/1qGz2tNC29IRqji4hKebmu249WaUP7u0_
Visualization of results: https://colab.research.google.com/drive/1etqj-OgPEHi6ypjCixGSBTEW7muM6bc0

Roadmap

API routes

ANNResource: Allows you to do XYZ

/api/v1/ann/{logo_id:int}

ANNResource: Allows you to do XYZ

/api/v1/ann

ANNBatchResource: Allows you to do XYZ

/api/v1/ann/batch

ANNEmbeddingResource: Allows you to do XYZ

/api/v1/ann/from_embedding

AddLogoResource: Allows you to do XYZ

/api/v1/ann/add

ANNCountResource: Allows you to do XYZ

/api/v1/ann/count

ANNStoredLogoResource: Allows you to do XYZ

/api/v1/ann/stored
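
As a convenience, the route paths listed above can be turned into full URLs with a small helper. This is only a sketch: the base URL is hypothetical, and the HTTP methods, query parameters and response formats of each route are not documented here, so none are assumed.

```python
from urllib.parse import urljoin

# Hypothetical base URL; replace it with wherever the service is deployed.
BASE_URL = "http://localhost:8000"


def ann_url(logo_id=None):
    """URL of the ANN route, optionally for a specific integer logo id."""
    path = "/api/v1/ann" if logo_id is None else f"/api/v1/ann/{int(logo_id)}"
    return urljoin(BASE_URL, path)


def route_url(name):
    """URL of one of the named sub-routes listed above."""
    routes = {"batch", "from_embedding", "add", "count", "stored"}
    if name not in routes:
        raise ValueError(f"unknown route: {name}")
    return urljoin(BASE_URL, f"/api/v1/ann/{name}")


logo_url = ann_url(42)
count_url = route_url("count")
```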

Datasets

Footnotes

¹ See embeddings.generate_embeddings and settings.DEFAULT_MODEL.
² See api.ANNIndex, which currently relies on Annoy.
³ See api.ANNResource and api.ANNBatchResource.
⁴ See api.ANNEmbeddingResource.
⁵ See api.AddLogoResource.
⁶ See manage.generate_index.

