ANN&DL Challenges - Polimi - a.y. 2020-2021: Image Classification, Image Segmentation, Visual Question Answering

LorenzoMainetti/artificial-neural-networks-and-deep-learning-challenges-2020-2021

Artificial Neural Networks and Deep Learning Competition

This repository contains the Jupyter Notebooks that we created for the competition hosted by the Artificial Neural Networks and Deep Learning course at Politecnico di Milano in the academic year 2020-2021.

The competition was divided into three challenges, each covering a different topic of the course:

  • Image Classification
  • Image Segmentation
  • Visual Question Answering

Image Classification

Kaggle

The goal of the challenge is to classify images of people wearing masks into one of three classes:

  • Everyone in the image is wearing a mask
  • No one in the image is wearing a mask
  • Someone in the image is not wearing a mask

Dataset: 5614 images in the training set, 450 images in the test set

Evaluation: Multiclass Accuracy 96%
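
As a reference for the metric, multiclass accuracy is the fraction of images whose predicted class matches the true class. The sketch below is a minimal illustration, not the code from our notebooks: the function name `multiclass_accuracy` and the encoding of the three classes as integers 0, 1 and 2 are assumptions made for the example.

```python
import numpy as np


def multiclass_accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.mean(y_true == y_pred))


# Hypothetical labels: the 0/1/2 class encoding is an assumption.
y_true = [0, 1, 2, 2, 1]
y_pred = [0, 1, 2, 1, 1]
acc = multiclass_accuracy(y_true, y_pred)  # 4 of 5 correct
print(f"accuracy = {acc:.2f}")
```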

Here is a complete description of how we approached the challenge and how we got our best model.

Image Segmentation

CodaLab

The goal of the challenge is to perform precise automatic segmentation of crops and weeds for the agricultural sector.
The images show one of two crop types: Mais (maize) or Haricot (bean). Each segmented object belongs to one of three classes:

  • Background: label 0, RGB pixel values [0, 0, 0] and [254, 124, 18]
  • Crop: label 1, RGB pixel value [255, 255, 255]
  • Weed: label 2, RGB pixel value [216, 67, 82]
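
The color coding above can be turned into an integer label map before training. The following is a minimal sketch using NumPy, not necessarily how our notebooks do it: the names `COLOR_TO_LABEL` and `rgb_mask_to_labels` are hypothetical, and treating any unlisted color as background is an assumption.

```python
import numpy as np

# RGB value -> class label, taken from the list above.
COLOR_TO_LABEL = {
    (0, 0, 0): 0,         # background
    (254, 124, 18): 0,    # background (second color)
    (255, 255, 255): 1,   # crop
    (216, 67, 82): 2,     # weed
}


def rgb_mask_to_labels(mask_rgb):
    """Convert an (H, W, 3) RGB mask into an (H, W) array of class labels.

    Assumption: pixels whose color is not listed fall back to background (0).
    """
    mask_rgb = np.asarray(mask_rgb)
    labels = np.zeros(mask_rgb.shape[:2], dtype=np.uint8)
    for color, label in COLOR_TO_LABEL.items():
        # True where all three channels equal the reference color.
        matches = np.all(mask_rgb == np.array(color), axis=-1)
        labels[matches] = label
    return labels


# Tiny 2x2 example mask covering all four colors.
mask = np.array(
    [[[0, 0, 0], [255, 255, 255]],
     [[216, 67, 82], [254, 124, 18]]],
    dtype=np.uint8,
)
labels = rgb_mask_to_labels(mask)
print(labels)
```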

Datasets: a combination of four widely different datasets of pictures and masks from the ROSE challenge

Evaluation: Intersection over Union 50.83%
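
For reference, the Intersection over Union of a class is the number of pixels that both the prediction and the target assign to that class, divided by the number of pixels that either of them assigns to it. The sketch below computes it per class on label maps; the function name `class_iou` is hypothetical, and averaging over the crop and weed classes only is an assumption about the aggregation, not a statement of the official scoring script.

```python
import numpy as np


def class_iou(y_true, y_pred, label):
    """Intersection over Union for one class on integer label maps."""
    true_mask = np.asarray(y_true) == label
    pred_mask = np.asarray(y_pred) == label
    union = np.logical_or(true_mask, pred_mask).sum()
    if union == 0:
        # Class absent from both maps: IoU is undefined.
        return float("nan")
    intersection = np.logical_and(true_mask, pred_mask).sum()
    return float(intersection / union)


# Hypothetical 2x3 label maps (0 = background, 1 = crop, 2 = weed).
y_true = np.array([[0, 1, 1],
                   [2, 2, 0]])
y_pred = np.array([[0, 1, 0],
                   [2, 1, 0]])

crop_iou = class_iou(y_true, y_pred, 1)  # 1 shared pixel / 3 in the union
weed_iou = class_iou(y_true, y_pred, 2)  # 1 shared pixel / 2 in the union
mean_iou = (crop_iou + weed_iou) / 2     # assumption: background excluded
print(crop_iou, weed_iou, mean_iou)
```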

[Figure: an input image and its target mask]

Here is a complete description of how we approached the challenge and how we got our best model.

Visual Question Answering

Kaggle

The goal of the challenge is to answer a question about an image. The input is an image and an associated question, and the output is an answer belonging to one of three categories: 'yes/no', 'counting' (from 0 to 5), and 'other' (e.g. colors, locations).

Dataset: 58832 questions in the training set, 29333 total images (size: 400x700), 6372 questions for testing

Evaluation: Multiclass Accuracy 62.34%

Here is a complete description of how we approached the challenge and how we got our best model.

Group Members
