
A comparative study of pre-trained CNNs for age, ethnicity and gender classification, alongside Grad-CAM for an in-depth look at how the models work.


kwanit1142/Comparitive-Study-Of-Age-Ethinicity-and-Gender-Classification-via-Faces


Comparative Study Of Age, Ethnicity and Gender Classification via Faces

About

This project is a comparative study of four state-of-the-art (SOTA) models on human facial images covering different ages, genders and ethnicities. The models used are ResNet50, VGG19, MobileNetV2 and AlexNet. They are compared in terms of classification performance and heatmap generation. The study therefore provides explainability for the model predictions in addition to measuring how well each model performs on the classification task.

Dataset

The dataset contains the following label columns, with their possible values:-

  1. Age :- Discretized integer quantity, ranging from 1 into the 100s.

  2. Gender :- String-based classes, either Male or Female.

  3. Ethnicity :- String-based classes; White, Black, Indian, Asian or Hispanic.

Models

This repository focuses on the internal workings of pre-trained Convolutional Neural Networks (CNNs) with the following architectures:-

  1. AlexNet :- It has 8 layers with learnable parameters: 5 convolutional layers, some of them followed by max pooling, and then 3 fully connected layers. ReLU activation is used after every layer except the output layer.

  2. VGG-19 :- A 19-layer network proposed by Karen Simonyan and Andrew Zisserman in 2014 in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition".

  3. MobileNetV2 :- A lightweight architecture built from depthwise convolutions (one spatial filter per channel) and pointwise 1x1 convolutions (which mix channels), so that features are extracted with far fewer parameters than standard convolutions need.

  4. ResNet50 :- A convolutional neural network that is 50 layers deep. It is built from residual blocks made of convolutional layers, batch normalization layers and ReLU activation functions, with a skip connection around each block. We used the pre-trained ResNet50 model to extract features from the human faces.
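To make the depthwise/pointwise idea behind MobileNetV2 more concrete, the following is a minimal sketch that only counts weights. It is not code from this repository; the function names and the example channel sizes (32 in, 64 out, 3x3 kernel) are assumptions chosen for illustration, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Weights in a depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
print(standard_conv_params(32, 64))        # 18432
print(depthwise_separable_params(32, 64))  # 2336
```

For these example sizes the separable version needs roughly one eighth of the weights, which is why this style of layer suits a compact model.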

End-To-End-Pipeline

The following fine-tuning configuration was used:-

  1. The original image size was 48x48. Images were resized to 224x224, since most of these neural network architectures expect that input size.

  2. The input data was normalized using the mean and standard deviation required by the pipeline, so that the values are centered (approximating a zero-centered, single-peaked Gaussian distribution).

  3. The random seed was set to 129, the number of training epochs to 10, and the train-test split to 80-20.

  4. A batch size of 64 was used if a GPU was enabled, otherwise 32.
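The configuration above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the repository's actual code: all function and constant names are hypothetical, nearest-neighbor interpolation is assumed for the resize (the interpolation method is not stated), and the mean and standard deviation are left as parameters because their values are not given here.

```python
import random

SEED = 129            # random seed from the configuration
EPOCHS = 10           # number of training epochs
TRAIN_FRACTION = 0.8  # 80-20 train-test split


def pick_batch_size(gpu_available):
    """Batch size of 64 with a GPU, otherwise 32."""
    return 64 if gpu_available else 32


def train_test_split(items, train_fraction=TRAIN_FRACTION, seed=SEED):
    """Shuffle reproducibly with the given seed, then cut into train and test parts."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def resize_nearest(image, size=224):
    """Nearest-neighbor resize of a 2D image given as a list of rows (assumed method)."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def normalize(image, mean, std):
    """Shift and scale pixel values: (pixel - mean) / std."""
    return [[(p - mean) / std for p in row] for row in image]


# Usage with a dummy 48x48 image and 100 dummy sample indices.
dummy = [[float(r + c) for c in range(48)] for r in range(48)]
resized = resize_nearest(dummy)
train_ids, test_ids = train_test_split(range(100))
print(len(resized), len(resized[0]), len(train_ids), len(test_ids))
```

Fixing the seed inside the split function keeps the 80-20 partition identical across runs, which matters when several models are compared on the same data.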

Grad-CAM


Grad-CAM is one of the best-known methods for inspecting the internal workings of pre-trained Convolutional Neural Networks. It backpropagates the signal of a chosen class label to a convolutional layer and uses the resulting gradients to weight that layer's activations. The weighted activations, which reflect the combined effect of components such as convolution kernels, poolings and activation functions, are then projected onto the input data as a probabilistic heatmap (RGB, in decreasing order of attention).
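The core Grad-CAM computation can be sketched in plain Python as below. This is a simplified illustration, not the repository's implementation: the function name `grad_cam` is hypothetical, and the activations and gradients of the chosen convolutional layer are assumed to be already available as nested lists of shape [channels][height][width] (in practice they would be captured from the deep learning framework during the forward and backward pass).

```python
def grad_cam(activations, gradients):
    """Combine layer activations and class-score gradients into a heatmap.

    Both arguments are nested lists of shape [K][H][W].
    Returns an H x W map scaled to the range [0, 1].
    """
    k_count = len(activations)
    h, w = len(activations[0]), len(activations[0][0])

    # Channel weights: global average of the gradients of each feature map.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]

    # Weighted sum over channels, followed by ReLU to keep positive evidence only.
    cam = [[max(0.0, sum(weights[k] * activations[k][i][j] for k in range(k_count)))
            for j in range(w)]
           for i in range(h)]

    # Scale to [0, 1] so the map can be color-coded and overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy example with 2 channels of size 2x2 (values are made up).
acts = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-2.0, -2.0], [-2.0, -2.0]]]
print(grad_cam(acts, grads))  # [[0.0, 0.0], [0.5, 1.0]]
```

The resulting low-resolution map is normally upsampled to the input size (224x224 here) before being overlaid on the face image.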

Contributors

  1. Ronak Singhvi
  2. Misaal Khan
  3. Kwanit Gupta (me)
