A ViT-based image classification model that determines whether a person is wearing a mask properly.
Follow this link here
The data comes from Kaggle, from a dataset created by KUCEV ROMAN. The dataset used has 40,000 images, with 10,000 images for each of the four labels.
The labels are:
- TYPE 1 - There is no mask on the face.
- TYPE 2 - The mask is on, but does not cover the nose or mouth.
- TYPE 3 - The mask covers the mouth, but does not cover the nose.
- TYPE 4 - The mask is worn correctly, covers the nose and mouth.
This is a ViT-based model pretrained on the ImageNet dataset, with an image size of 224 and a patch size of 16. The model was fine-tuned via transfer learning on the dataset provided by KUCEV ROMAN. The overall test accuracy is 92.8%. Because of the large amount of data and limited resources, the model was trained for only one epoch; however, one epoch proved sufficient for the model to distinguish the different labels.
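To make the image and patch specification concrete, here is a small arithmetic sketch. It assumes that the "16" in the specification is the patch side length in pixels (as in the common ViT patch-16 configurations) and that one classification token is prepended, as in the standard ViT design; neither detail is confirmed by this README. The names `patch_grid`, `IMAGE_SIZE`, `PATCH_SIZE`, and `NUM_LABELS` are illustrative only.

```python
# Patch-grid arithmetic for a ViT with 224x224 inputs.
IMAGE_SIZE = 224  # image size stated in the model description
PATCH_SIZE = 16   # assumption: patch side length in pixels
NUM_LABELS = 4    # TYPE 1 .. TYPE 4


def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be divisible by patch size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


per_side, num_patches = patch_grid(IMAGE_SIZE, PATCH_SIZE)
# Assumption: one extra classification token, as in the standard ViT.
sequence_length = num_patches + 1

print(per_side, num_patches, sequence_length)  # 14 196 197
```

Under these assumptions, each image is split into a 14 x 14 grid of patches, and transfer learning only needs to replace the final classification layer so that it outputs `NUM_LABELS` scores instead of the ImageNet classes.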
After preprocessing the dataset, I realised that the files listed below are corrupted for some reason and cannot be read by Pillow. I tested the files one by one and manually removed the ones that could not be read.
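The check described above could also be automated. The following is a minimal sketch, not the procedure actually used for this model: `find_unreadable` and `pillow_load` are hypothetical helper names, and the dataset folder and file extension in the usage comment are placeholders. The loader is passed in as a parameter so the scan logic can be tried without Pillow installed.

```python
from pathlib import Path
from typing import Callable, Iterable


def pillow_load(path: Path) -> None:
    """Fully decode an image with Pillow; raises if the file is unreadable."""
    from PIL import Image  # imported lazily; only needed for the default loader

    with Image.open(path) as img:
        img.load()  # force a full decode rather than a header-only read


def find_unreadable(
    paths: Iterable[Path],
    load: Callable[[Path], None] = pillow_load,
) -> list[Path]:
    """Return the paths for which the loader raises an exception."""
    bad = []
    for path in paths:
        try:
            load(path)
        except Exception:
            # Any decoding failure marks the file as unreadable.
            bad.append(path)
    return bad


# Hypothetical usage (folder name and extension are placeholders):
# corrupted = find_unreadable(sorted(Path("dataset").rglob("*.png")))
# for path in corrupted:
#     print(path)
```

Collecting the paths first, instead of deleting files inside the loop, leaves room to review the list before removing anything.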
These are the corrupted images that I found.