This project is archived, as we now use Elasticsearch to perform approximate nearest neighbor (ANN) search. We no longer need an external service to serve index files; everything related to ANN is done in Robotoff directly.
This project helps Robotoff categorize logos. Bug tracking is mostly done on the main Robotoff repository.
- You can see all the crops generated and up for manual annotation in Hunger Games, our gamified annotation engine.
- Robotoff pings new crops and annotations in the #robotoff-alerts-annotations Slack channel.
To set up the project, you must have recent versions of Docker and docker-compose installed.
Use make dev.
make quality will run linters and tests.
Models used in production are published in releases of openfoodfacts-ai.
See more in Makefile.
From images, we extract logos (logo detection is done in Robotoff). Those logos are embedded in a metric space using a specific model.
We then use approximate nearest neighbors (ANN) search in this metric space to try to classify the logos from known examples (k-nearest neighbors, KNN).
Those logos will then help apply labels to Open Food Facts products.
The main entry point is an API that returns nearest neighbors, either for a logo id or for an embedding vector, and that can add new logos from an image.
Note that the approximate nearest neighbors index is only regenerated when a specific command is run.
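The classification step described above can be sketched as follows. This is a minimal illustration only, assuming toy 2-dimensional embeddings and made-up label names; it uses an exact brute-force search rather than the approximate index the project relied on, and `knn_classify` is a hypothetical helper, not a function from this repository.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two vectors of the same length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, examples, k=3):
    """Return the majority label among the k known examples closest to query.

    examples is a list of (embedding, label) pairs (hypothetical format).
    """
    nearest = sorted(examples, key=lambda ex: euclidean(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data: real embeddings have many more dimensions.
known = [
    ([0.0, 0.1], "label-a"),
    ([0.1, 0.0], "label-a"),
    ([1.0, 1.0], "label-b"),
    ([0.9, 1.1], "label-b"),
]
prediction = knn_classify([0.05, 0.05], known, k=3)
```

An approximate index trades a little accuracy for speed: instead of comparing the query with every known logo, it only visits a subset of candidates.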
- The FaceNet paper (from Google, not Facebook): https://arxiv.org/abs/1503.03832
- A blog post that explains the “triplet mining” introduced by the above paper: https://omoindrot.github.io/triplet-loss
- The ANN /add endpoint works as follows:
  - From the raw image and detected bounding boxes, we crop the image to get all detected logos.
  - Each logo is provided as input to the neural network (here an EfficientNet), to get an embedding for each logo.
  - The embedding is saved locally in an HDF5 file (line 72 in commit 6abaee7).
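The cropping step of the /add endpoint can be sketched as below. This is an assumption-laden illustration: the image is a plain nested list, the bounding boxes are assumed to be relative `(y_min, x_min, y_max, x_max)` tuples in the range 0 to 1, and `crop_logos` is a hypothetical helper. The embedding and HDF5 storage steps are left out because they depend on the model and storage code of the project.

```python
def crop_logos(image, boxes):
    """Crop one sub-image per bounding box.

    image: list of rows, each row a list of pixels (hypothetical format).
    boxes: iterable of (y_min, x_min, y_max, x_max), relative to image size.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    crops = []
    for y_min, x_min, y_max, x_max in boxes:
        # Convert relative coordinates to pixel indices.
        top, bottom = int(y_min * height), int(y_max * height)
        left, right = int(x_min * width), int(x_max * width)
        crops.append([row[left:right] for row in image[top:bottom]])
    return crops


# A 4x4 toy image whose pixels record their own (row, column) position.
image = [[(r, c) for c in range(4)] for r in range(4)]
crops = crop_logos(image, [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 1.0, 1.0)])
```

Each crop would then be passed to the network to obtain one embedding per logo.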
- https://wiki.openfoodfacts.org/Logo_Annotation_Guidelines
- https://annotate.openfoodfacts.org
- Based on opencv/cvat: Powerful and efficient Computer Vision Annotation Tool (CVAT)
- Currently 502 Bad Gateway: openfoodfacts/openfoodfacts-infrastructure#49
- Documentation: https://annotate.openfoodfacts.org/documentation/user_guide.html#creating-an-annotation-task
- train a "universal" logo / label detector, with good results
- for each image crop resulting from the annotations (bounding box), generate embeddings with a pre-trained network (ResNet50).
- This made it possible to verify that this approach is the right one for the classification of crops
- the results are in the presentation "3 photos et c'est à peu près tout" ("3 photos and that's about it")
- there should be as little space as possible between the bounding box and the object. Conversely, the whole object must be included in the bounding box.
- if the object is partially hidden, indicate the object as "occluded" (click on the "profile" icon on the object in question, in the right panel)
- for best results, similar objects must be annotated in the same way (especially concerning the extent of the object). There are sometimes several scales of annotation (cf. the question discussed above about "to recycle" pictograms); the most important thing is that the annotations are consistent.
- several very similar images, or images of the same product, follow one another in the dataset. For the next campaign, it will be better to shuffle the dataset to have as much diversity as possible.
- (Accessible by Pierre) https://colab.research.google.com/drive/1G-6OELcz8l1u1_53_0a2DAKRryIfB9CE
- The predictions on the validation set: output_images
- Data preprocessing: https://colab.research.google.com/drive/1cxi_aITHEFo4IZRsbiwm39CFgDKMG8LZ
- Model training: https://colab.research.google.com/drive/1qGz2tNC29IRqji4hKebmu249WaUP7u0_
- Visualization of results: https://colab.research.google.com/drive/1etqj-OgPEHi6ypjCixGSBTEW7muM6bc0
/api/v1/ann/{logo_id:int}
/api/v1/ann
/api/v1/ann/batch
/api/v1/ann/from_embedding
/api/v1/ann/add
/api/v1/ann/count
/api/v1/ann/stored
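As a rough illustration of how a client might address the routes listed above, the sketch below only builds request URLs. The host in `BASE_URL`, the `ann_url` helper, and the `count` query parameter are assumptions made for the example; the source does not document the parameters of these routes, and the service itself is archived.

```python
from urllib.parse import urlencode

# Hypothetical host: the real deployment address is not given in this README.
BASE_URL = "http://localhost:8000"


def ann_url(logo_id=None, **params):
    """Build the URL of /api/v1/ann or /api/v1/ann/{logo_id:int}.

    Extra keyword arguments are sent as query parameters; their names are
    illustrative only.
    """
    path = "/api/v1/ann" if logo_id is None else f"/api/v1/ann/{int(logo_id)}"
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}{path}{query}"


url_for_logo = ann_url(42, count=10)
```

The `int(logo_id)` conversion mirrors the `{logo_id:int}` constraint in the route, so a non-numeric id fails on the client side before any request is sent.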
- https://openfoodfacts.slack.com/files/UN6TMCYA2/FNGUPS00H/off_barcode.csv
- Photo archive: Archive.zip (995 MB): https://drive.google.com/file/d/1-N79A9jpVzR-al8aNFJNWvs59wM_M6sx/view
- https://openfoodfacts.slack.com/files/UN6TMCYA2/FQWLFNPF0/relabeled_labels_brands.tfrecord
- https://openfoodfacts.slack.com/files/UN6TMCYA2/FQWA1AZ9V/labels_brands.tfrecord
- https://openfoodfacts.slack.com/files/UN6TMCYA2/FQFLJKYG2/relabel_tfrecord.ipynb