This repository has been archived by the owner on Sep 9, 2024. It is now read-only.

A tool for classifying an image into a disaster type, utilizing Python

tariqshaban/disaster-classification-with-xai

Conducting Explainable AI (XAI) on Disasters Image Classification Model Augmented with Pretrained CNN

This is the implementation of the paper Natural disasters detection using explainable deep learning.

It contains the code needed to build a CNN model for disaster image classification, together with XAI visualizations for selected images.

The Disaster Image Classification and MEDIC datasets are used.

Getting Started

Clone the project from GitHub

$ git clone https://github.com/tariqshaban/disaster-classification-with-xai.git

TensorFlow with Keras needs to be installed (preferably with CUDA-enabled GPU acceleration).

No further configuration is required.

Usage

Run the notebook in any IPython-compatible environment (for example, Jupyter).

Methodology

The main operations conducted in this repository are as follows:

  • Modify the global variables section:
    • Generic seed
    • Epochs
    • Learning rate:
      • 0.01
      • 0.001
      • 0.0001
    • Pretrained model (base model)
      • ResNet50
      • InceptionV3
      • VGG19
      • EfficientNetB0
      • EfficientNetB7
      • EfficientNetV2B0
      • EfficientNetV2L
      • ViT-B-32
    • Preprocessing method (matching the selected pretrained model)
    • Optimization algorithm:
      • Root Mean Squared Propagation (RMSProp)
      • Adam, a replacement for stochastic gradient descent
  • Read and decode the dataset into an array of pairs, each holding an image's true label and its file name.
  • Randomly partition the dataset into training, validation, and test sets (70%, 20%, and 10%).
  • Build a CNN model with the following characteristics:
    • Hyperparameters:
      • The specified number of epochs
      • The specified learning rate
      • The specified optimizer
    • Layers:
      • The selected base model
      • Identity layer, added because the base model cannot be accessed directly for Grad-CAM (non-ViT models only)
      • GlobalAveragePooling2D layer (Non-ViT models only)
      • Multiple dropout layers: 20%
      • Rescaling layer: Part of the images' preprocessing (ViT models only)
      • Multiple dense layers with softmax activation function
  • Plot the model's performance:
    • Training accuracy
    • Validation accuracy
    • Training loss
    • Validation loss
    • Testing confusion matrix (applicable since the model performs multi-class classification)
  • Visualize image samples:
    • Display the original image
    • Display the image augmented with LIME explainer
    • Display the image augmented with Grad-CAM explainer (Non-ViT models only)
    • Display the image augmented with Grad-CAM++ explainer (Non-ViT models only)
  • Modify the global variables based on the observed results.
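
The 70% / 20% / 10% partitioning step listed above could look roughly like the following. This is a minimal standard-library sketch, not the notebook's code: the helper name `partition_dataset`, its default seed, and the sample file names are assumptions made purely for illustration.

```python
import random


def partition_dataset(pairs, seed=42, train_frac=0.7, val_frac=0.2):
    """Shuffle (label, image_name) pairs and split them into train/val/test.

    Hypothetical helper: the name, signature, and default seed are
    assumptions for illustration, not taken from the notebook.
    """
    shuffled = list(pairs)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains (about 10%)
    return train, val, test


# Made-up sample data: 100 pairs spread over two labels.
pairs = [("wild_fire" if i % 2 else "sea", f"img_{i:03d}.png") for i in range(100)]
train, val, test = partition_dataset(pairs)
print(len(train), len(val), len(test))
```

Fixing the seed keeps the three subsets identical across runs, which matters when comparing learning rates, optimizers, and base models on the same test set.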

Note: Classical machine learning classifiers are included as baselines to assess the effectiveness of the deep learning models; however, XAI is not applied to them. These classifiers are:

  • Bagging
  • Decision tree
  • Random forest
  • K-nearest neighbors
  • SVM
  • Linear SVM (with SGD training)
  • Logistic regression (with SGD training)

HOG (Histogram of Oriented Gradients) was used as a feature descriptor to extract the edge orientations of the images. The resulting descriptor was then flattened and used to train these models.
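
To make the idea behind HOG concrete, here is a deliberately simplified sketch in pure Python. It treats the whole image as a single cell and builds one magnitude-weighted histogram of gradient orientations. A full HOG implementation additionally divides the image into cells and normalizes over blocks, so this is an illustration of the principle only, not the descriptor used in the notebook. The function name `orientation_histogram` and the toy image are assumptions.

```python
import math


def orientation_histogram(image, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    Simplified illustration of the idea behind HOG: the whole image is
    one cell and there is no block normalization (hypothetical helper).
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # horizontal central difference
            gy = image[y + 1][x] - image[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            # Clamp guards against an angle that rounds to exactly 180.0.
            hist[min(int(angle // bin_width), bins - 1)] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy 6x6 grayscale image with a vertical edge (dark left, bright right).
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
features = orientation_histogram(image)
print(features)
```

With several cells, the per-cell histograms would be concatenated into one flat vector, which is the form a classical classifier such as an SVM expects as input.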

The following methods should be invoked to build and evaluate the model, as well as to implement XAI techniques:

# Download and filter the dataset
load_dataset()

# Ready the dataset and partition it into training, validation, and testing
prime_dataset()

# Build the model, and optionally plot performance measurements
model = build_model(measure_performance=True)

# Fetch a single image from a specified URL as a matrix (nested list)
img = url_to_image('https://www.enr.com/ext/resources/News/2016/September/north_carolina_hurricane_matthew.jpg')

# Conduct XAI methods for an image on a predefined model; XAI methods include LIME, Grad-CAM, and Grad-CAM++
plot_XAI(img, model)

# Predict the image's class based on a predefined model
predict_image_class(img, model)

# Fetch a single image directly from the dataset as a matrix (nested list)
img = path_to_image('05_01_1225.png')

# Conduct XAI methods for an image on a predefined model; XAI methods include LIME, Grad-CAM, and Grad-CAM++
plot_XAI(img, model)

# Predict the image's class based on a predefined model
predict_image_class(img, model)

Findings

| Machine Learning Model | Disaster Image Classification Dataset | MEDIC Dataset |
| --- | --- | --- |
| Bagging | 61.83% | 43.22% |
| Decision Tree | 44.98% | 33.55% |
| Random Forest | 64.10% | 46.02% |
| K-Nearest Neighbors | 35.67% | 41.86% |
| SVM | ✅ 72.52% | ✅ 54.46% |
| Linear SVM (with SGD training) | 66.08% | 43.66% |
| Logistic Regression (with SGD training) | 65.49% | 43.27% |

Disaster Image Classification Dataset (LR = learning rate)

| Pretrained model | Optimizer | Accuracy (LR 0.01) | Loss (LR 0.01) | Accuracy (LR 0.001) | Loss (LR 0.001) | Accuracy (LR 0.0001) | Loss (LR 0.0001) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet50 | RMSProp | 91.86% | 0.4534 | 94.13% | 0.3170 | 94.43% | 0.2185 |
| ResNet50 | Adam | 90.40% | 0.5933 | 94.43% | 0.3208 | 94.65% | 0.1973 |
| InceptionV3 | RMSProp | 85.64% | 0.4963 | 90.76% | 0.3915 | 91.94% | 0.3172 |
| InceptionV3 | Adam | 65.01% | 0.8476 | 90.25% | 0.4804 | 90.98% | 0.3253 |
| VGG19 | RMSProp | 90.98% | 0.5177 | 92.45% | 0.2995 | 93.18% | 0.3274 |
| VGG19 | Adam | 91.13% | 0.5561 | 92.60% | 0.3322 | 92.67% | 0.2999 |
| EfficientNetB0 | RMSProp | 93.99% | 0.5230 | 93.99% | 0.3112 | 94.43% | 0.2093 |
| EfficientNetB0 | Adam | 93.55% | 0.4254 | 93.99% | 0.3295 | 94.21% | 0.2096 |
| EfficientNetB7 | RMSProp | 92.08% | 0.7207 | 92.52% | 0.4363 | 93.04% | 0.3009 |
| EfficientNetB7 | Adam | 91.50% | 0.6972 | 92.45% | 0.4202 | 92.96% | 0.2996 |
| EfficientNetV2B0 | RMSProp | 95.09% | 0.4872 | 94.87% | 0.2668 | 95.16% | 0.1838 |
| EfficientNetV2B0 | Adam | 94.13% | 0.5253 | 94.57% | 0.2977 | 95.16% | ✅ 0.1834 |
| EfficientNetV2L | RMSProp | 91.72% | 0.4893 | 92.45% | 0.3260 | 93.62% | 0.2588 |
| EfficientNetV2L | Adam | 91.50% | 0.5071 | 92.23% | 0.3382 | 93.11% | 0.2658 |
| ViT-B-32 | RMSProp | 93.84% | 1.3044 | 95.01% | 0.5274 | ✅ 95.23% | 0.2551 |
| ViT-B-32 | Adam | 94.21% | 1.1693 | 94.21% | 0.5438 | 95.09% | 0.2557 |

MEDIC Dataset (LR = learning rate)

| Pretrained model | Optimizer | Accuracy (LR 0.01) | Loss (LR 0.01) | Accuracy (LR 0.001) | Loss (LR 0.001) | Accuracy (LR 0.0001) | Loss (LR 0.0001) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet50 | RMSProp | 69.77% | 0.9564 | 74.99% | 0.8523 | 75.17% | 0.7892 |
| ResNet50 | Adam | 71.79% | 0.9097 | 74.57% | 0.8770 | 74.99% | 0.7892 |
| InceptionV3 | RMSProp | 56.90% | 1.2042 | 69.14% | 1.0372 | 71.76% | 0.8921 |
| InceptionV3 | Adam | 66.00% | 1.1068 | 71.29% | 0.9361 | 71.87% | 0.7187 |
| VGG19 | RMSProp | 70.79% | 1.0027 | 73.10% | 0.8454 | 74.99% | 0.7925 |
| VGG19 | Adam | 71.06% | 0.9752 | 72.26% | 0.8487 | 74.41% | 0.7893 |
| EfficientNetB0 | RMSProp | 74.62% | 0.8699 | 75.72% | 0.8556 | 76.82% | 0.7424 |
| EfficientNetB0 | Adam | 73.81% | 0.8858 | 75.64% | 0.8607 | 76.51% | 0.7434 |
| EfficientNetB7 | RMSProp | 71.45% | 0.9280 | 74.04% | 0.9616 | 75.36% | 0.7886 |
| EfficientNetB7 | Adam | 72.84% | 0.9325 | 73.78% | 0.9565 | 75.36% | 0.7891 |
| EfficientNetV2B0 | RMSProp | 72.26% | 0.8460 | 76.46% | 0.8137 | 77.06% | 0.7157 |
| EfficientNetV2B0 | Adam | 75.15% | 0.8510 | 76.38% | 0.7877 | 77.22% | ✅ 0.7140 |
| EfficientNetV2L | RMSProp | 73.68% | 0.8792 | 74.65% | 0.8268 | 75.88% | 0.7582 |
| EfficientNetV2L | Adam | 72.94% | 0.9182 | 75.15% | 0.8221 | 75.93% | 0.7572 |
| ViT-B-32 | RMSProp | 73.78% | 1.2433 | 76.43% | 1.5416 | ✅ 76.93% | 0.8988 |
| ViT-B-32 | Adam | 74.83% | 1.4110 | 76.27% | 1.7522 | 76.85% | 0.9353 |

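Based on the tables, ViT-B-32 (RMSProp) at a learning rate of 0.0001 returned the highest accuracy on the Disaster Image Classification Dataset (95.23%), while EfficientNetV2B0 (Adam) at a learning rate of 0.0001 returned the lowest loss on both datasets (0.1834 and 0.7140). On the MEDIC Dataset, the tabulated accuracy of EfficientNetV2B0 with Adam (77.22%) is slightly higher than that of ViT-B-32 with RMSProp (76.93%).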
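
The comparison can be reproduced from the tabulated numbers. The snippet below is only a sketch: it copies the accuracy and loss values at a learning rate of 0.0001 from the Disaster Image Classification Dataset table and picks the best configuration for each metric. The dictionary layout and variable names are assumptions.

```python
# (model, optimizer) -> (accuracy %, loss) at a learning rate of 0.0001,
# copied from the Disaster Image Classification Dataset table.
results = {
    ("ResNet50", "RMSProp"): (94.43, 0.2185),
    ("ResNet50", "Adam"): (94.65, 0.1973),
    ("InceptionV3", "RMSProp"): (91.94, 0.3172),
    ("InceptionV3", "Adam"): (90.98, 0.3253),
    ("VGG19", "RMSProp"): (93.18, 0.3274),
    ("VGG19", "Adam"): (92.67, 0.2999),
    ("EfficientNetB0", "RMSProp"): (94.43, 0.2093),
    ("EfficientNetB0", "Adam"): (94.21, 0.2096),
    ("EfficientNetB7", "RMSProp"): (93.04, 0.3009),
    ("EfficientNetB7", "Adam"): (92.96, 0.2996),
    ("EfficientNetV2B0", "RMSProp"): (95.16, 0.1838),
    ("EfficientNetV2B0", "Adam"): (95.16, 0.1834),
    ("EfficientNetV2L", "RMSProp"): (93.62, 0.2588),
    ("EfficientNetV2L", "Adam"): (93.11, 0.2658),
    ("ViT-B-32", "RMSProp"): (95.23, 0.2551),
    ("ViT-B-32", "Adam"): (95.09, 0.2557),
}

# Highest accuracy and lowest loss are selected independently.
best_accuracy = max(results, key=lambda k: results[k][0])
lowest_loss = min(results, key=lambda k: results[k][1])
print("Highest accuracy:", best_accuracy, results[best_accuracy][0])
print("Lowest loss:", lowest_loss, results[lowest_loss][1])
```

Note that the two metrics do not select the same configuration, so the choice of "best model" depends on which metric is prioritized.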

Model Performance

The following images are the result of using ViT-B-32 with the RMSProp optimizer at a learning rate of 0.0001 (for the Disaster Image Classification Dataset).

[Figures: accuracy.png, loss.png]

Note that the model started converging at the 8th epoch, since the pretrained model's weights expedited the learning process.

[Figure: confusion.png]

Regardless of the hyperparameters used, all models show a relatively higher error rate when distinguishing between urban fire and wildfire, and between infrastructure damage and landslide. This behaviour seems logical given the characteristics these classes share.
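
Such class confusions can be read off the off-diagonal entries of a confusion matrix. The following standard-library sketch shows one way to count them. The helper names are hypothetical, the class names are assumed from the figure file names in this README, and the labels are made up for illustration; they are not results from the notebook.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs (hypothetical helper)."""
    return Counter(zip(y_true, y_pred))


def most_confused_pairs(counts, top=2):
    """Return the most frequent misclassifications, ignoring the diagonal."""
    errors = Counter({pair: n for pair, n in counts.items() if pair[0] != pair[1]})
    return errors.most_common(top)


# Made-up labels for illustration only.
y_true = ["urban_fire", "urban_fire", "urban_fire", "wild_fire",
          "wild_fire", "land_slide", "sea"]
y_pred = ["wild_fire", "wild_fire", "urban_fire", "urban_fire",
          "wild_fire", "infrastructure", "sea"]

counts = confusion_counts(y_true, y_pred)
print(most_confused_pairs(counts, top=1))
```

Entries where the true and predicted labels match form the diagonal; everything else is an error attributed to a specific pair of classes.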


XAI Results

The following are the XAI interpretations of random image samples, taken either from the dataset itself or from external sources.

ResNet50 was used for the XAI results instead of the best model (ViT-B-32), since Grad-CAM and Grad-CAM++ require a 2D layer, which is only available in the pretrained CNN models.
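
The dependence on a 2D layer can be seen from the Grad-CAM computation itself: each channel's weight is the gradient averaged over the spatial positions, and the heatmap is indexed by those same positions. The pure-Python sketch below uses tiny hand-written values; the function name and the inputs are assumptions, and in practice the gradients would come from automatic differentiation on the model rather than being typed in. Grad-CAM++ uses a different channel weighting and is not shown.

```python
def grad_cam(feature_maps, gradients):
    """Combine K feature maps of shape (H, W) into one (H, W) heatmap.

    feature_maps[k][i][j]: activation of channel k at position (i, j).
    gradients[k][i][j]: gradient of the class score w.r.t. that activation.
    Hypothetical helper operating on nested lists, for illustration only.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])

    # Channel weights: gradients averaged over the two spatial dimensions.
    weights = [sum(map(sum, g)) / (height * width) for g in gradients]

    # Weighted sum over channels, followed by ReLU.
    heatmap = [
        [max(0.0, sum(w * a[i][j] for w, a in zip(weights, feature_maps)))
         for j in range(width)]
        for i in range(height)
    ]

    # Scale to [0, 1] for display; leave an all-zero map unchanged.
    peak = max(max(row) for row in heatmap)
    return [[v / peak for v in row] for row in heatmap] if peak > 0 else heatmap


# Two 2x2 channels with made-up activations and gradients.
feature_maps = [[[1.0, 0.0], [0.0, 2.0]],
                [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]],
             [[-1.0, -1.0], [-1.0, -1.0]]]

heatmap = grad_cam(feature_maps, gradients)
print(heatmap)
```

The second channel has a negative weight, so the positions where only it is active are zeroed by the ReLU and do not appear in the heatmap.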

[Figures: xai_infrastructure.png, xai_infrastructure_url.png, xai_land_slide.png, xai_non_damage_buildings_street.png, xai_non_damage_wildlife_forest.png, xai_sea.png, xai_urban_fire.png, xai_water_disaster.png, xai_wild_fire.png]

All the images were correctly classified with their true labels.

Notes

  • It appears that some of the provided true labels are incorrect. A fair number of images are also unrefined: some contain banners or even watermarks that might hinder the model's performance.

Citation

Ahmad M. Mustafa, Rand Agha, Lujain Ghazalat, Tariq Sha’ban, Natural disasters detection using explainable deep learning, Intelligent Systems with Applications, 2024, 200430, ISSN 2667-3053.

@article{MUSTAFA2024200430,
    title = {Natural disasters detection using explainable deep learning},
    journal = {Intelligent Systems with Applications},
    pages = {200430},
    year = {2024},
    issn = {2667-3053},
    doi = {https://doi.org/10.1016/j.iswa.2024.200430},
    url = {https://www.sciencedirect.com/science/article/pii/S2667305324001042},
    author = {Ahmad M. Mustafa and Rand Agha and Lujain Ghazalat and Tariq Sha’ban},
}
