A repo using machine learning to classify chest x-ray images using CNNs. The dataset is organized into 3 folders (train, test, val) and contains subfolders for each image category (Pneumonia/Normal). There are 5,863 X-Ray images (JPEG) and 2 categories (Pneumonia/Normal).

dataeducator/image_classification_with_deep_learning

Image Classification with Deep Learning


Detecting Pneumonia with Deep Learning

Introduction:

Disclaimer:

This work is intended solely for educational purposes. The included business case and the results of the deep learning models should not be interpreted as medical advice and have not received endorsement or approval from any professional or medical organization. The models and outcomes presented here are for illustrative purposes only and should not be utilized for making real-world decisions without consulting appropriate domain experts and medical professionals. Any actions taken based on the information in this notebook are at the user's own risk. The author and contributors of this notebook disclaim any liability for the information's accuracy, completeness, or efficacy.

Business Understanding:

  • Stakeholder: Zephyr Health
  • Business Case: I am a new data analyst on the Data Analytics team and have been tasked with building a model to classify whether a given patient has pneumonia given a chest x-ray.

According to a 2022 report by Johns Hopkins (full report), over 700 thousand children under 5 die from pneumonia each year.

Objectives

The main objectives of this project are:

  • Develop a robust and efficient system for early childhood pneumonia detection using a Convolutional Neural Network (CNN) that detects the presence of pneumonia with high precision.
  • Generate a system that can be validated and deployed across various healthcare settings to reach underserved populations.

Metrics for evaluation

Our task is a binary classification problem that uses chest x-ray images as input. Our model will predict whether an image depicts PNEUMONIA or a NORMAL chest x-ray. We evaluate the model with the F1-score, the harmonic mean of precision and recall, and set a goal of an F1-score above 0.90.

$$ F1\text{-}Score = \frac{{2 \cdot \text{True Positive}}}{{2 \cdot \text{True Positive} + \text{False Positive} + \text{False Negative}}} $$

This formula represents the F1-score, which is a metric used to evaluate the performance of a binary classification model. It balances the trade-off between precision and recall.

Our model will be successful where:

$$ F1\text{-}Score > .90 $$
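As a minimal sketch of the formula above, the snippet below computes the F1-score from raw counts and checks that it equals the harmonic mean of precision and recall. The function names and the counts are hypothetical and chosen only for illustration; they are not results from this project.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical counts for illustration only (not taken from this project).
tp, fp, fn = 90, 10, 5

precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that are found

f1 = f1_from_counts(tp, fp, fn)

# Both forms of the formula agree up to floating-point error.
assert abs(f1 - f1_from_precision_recall(precision, recall)) < 1e-12

# Success criterion stated above: F1-score greater than 0.90.
meets_goal = f1 > 0.90
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} meets_goal={meets_goal}")
```

Writing the score in terms of counts makes it clear that true negatives never enter the F1-score, which is why it is reported per class.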

Source of Data

The dataset used in this exploration is publicly available on Mendeley Data. Citation: Kermany, Daniel; Zhang, Kang; Goldbaum, Michael (2018), “Large Dataset of Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images,” Mendeley Data, V3, doi: 10.17632/rscbjbr9sj.3

| Attribute | Details |
| --- | --- |
| Image Type | Chest X-ray images |
| Patient Age | 1-5 years |
| Source | Pediatric Patients at Guangzhou Women and Children's Medical Center |
| Location | Guangzhou, China during routine |
| Quality Control | Initial screening + grading by 2 experts |

| Set | Class | Number of Images |
| --- | --- | --- |
| Training | Normal | 1341 |
| Training | Pneumonia | 3875 |
| Testing | Normal | 234 |
| Testing | Pneumonia | 390 |
| Validation | Normal | 8 |
| Validation | Pneumonia | 8 |
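Counts like these can be reproduced by walking the dataset folders. The sketch below assumes the layout described for this repo (one folder per split, each with one subfolder per class) and JPEG files; the function name `count_images` and the `chest_xray` root path are hypothetical, so adjust them to wherever the data is downloaded.

```python
from pathlib import Path


def count_images(root, extensions=(".jpeg", ".jpg")):
    """Return {(split, class_name): n_images} for a root/split/class/file layout."""
    counts = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            # Count only files whose extension looks like a JPEG image.
            counts[(split_dir.name, class_dir.name)] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in extensions
            )
    return counts


if __name__ == "__main__":
    data_root = Path("chest_xray")  # hypothetical download location
    if data_root.is_dir():
        for (split, label), n in sorted(count_images(data_root).items()):
            print(f"{split:<6} {label:<10} {n}")
```

Split and class names are discovered from the folder names rather than hard-coded, so the same function works whether the class folders are named `Normal` or `NORMAL`.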

Class Distribution across sets

These images fall into two distinct categories:

  • Pneumonia
  • Normal

The training data was imbalanced: the PNEUMONIA class had more than twice as many images as the NORMAL class (3875 vs. 1341). To rebalance it, the PNEUMONIA class was randomly undersampled to match the number of NORMAL images in the training set.
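The rebalancing step can be sketched as random undersampling without replacement. This is an illustration under stated assumptions, not necessarily the exact code used in the notebook: the file names are placeholders, the `undersample` helper and the seed are hypothetical, and only the class sizes (1341 and 3875) come from the training set counts above.

```python
import random


def undersample(majority, target_size, seed=42):
    """Return `target_size` items drawn at random from `majority` without replacement."""
    if target_size > len(majority):
        raise ValueError("target_size cannot exceed the size of the majority class")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rng.sample(majority, target_size)


# Placeholder file names; only the class sizes match the training set.
normal_files = [f"normal_{i}.jpeg" for i in range(1341)]
pneumonia_files = [f"pneumonia_{i}.jpeg" for i in range(3875)]

balanced_pneumonia = undersample(pneumonia_files, len(normal_files))
print(len(normal_files), len(balanced_pneumonia))
```

Undersampling discards 2534 PNEUMONIA images, which trades training data for a balanced class prior.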

Class Distribution rebalanced training set

Exploration

Pixel Intensities of Sample Images

Modeling

In this project, I will use the OSEMiN pipeline to:

  • Obtain → Import the data.
  • Scrub → Manage the datatypes and resolve missing data or duplicates.
  • Explore → Identify patterns within the relationships between variables in the data.
  • Model → Create a set of predictive models.
  • iNterpret → Identify insights and create visualizations of findings.

For our model, we could prioritize:

  • accuracy - the proportion of correctly predicted labels among all samples in the testing dataset.
  • precision - the proportion of predicted positives that are true positives.
  • recall - the proportion of actual positives that the model correctly identifies.
  • f1-score - the harmonic mean of precision and recall, used when we want to limit both false positives and false negatives.

Our best model uses sequential, convolutional, dense, and dropout layers and is illustrated below: Model

Evaluation Metrics

We included a comparison of our baseline model and our best revised CNN with evaluation metrics in the table below:

| Model | Class | Precision | Recall | F1-Score |
| --- | --- | --- | --- | --- |
| Revised CNN | NORMAL | 0.94 | 0.76 | 0.84 |
| Revised CNN | PNEUMONIA | 0.87 | 0.97 | 0.92 |
| Baseline CNN | NORMAL | 0.40 | 0.19 | 0.26 |
| Baseline CNN | PNEUMONIA | 0.63 | 0.83 | 0.72 |

Insights

  • For the "NORMAL" class, the revised model exhibits significantly higher precision, recall, and F1-score values, indicating a more accurate identification of normal cases.
  • For the "PNEUMONIA" class, the revised model demonstrates improved performance across all metrics, suggesting it is more effective at detecting pneumonia cases than the baseline model.
  • Overall, the revised model achieves a substantially higher accuracy of 89% compared to the baseline model's 59%, highlighting its superior performance.
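The reported figures can be cross-checked with a little arithmetic. The sketch below recomputes each F1-score as the harmonic mean of the tabulated precision and recall, and estimates accuracy as the support-weighted average of per-class recall. It assumes the metrics were computed on the full test set (234 NORMAL and 390 PNEUMONIA images); because the inputs are rounded to two decimals, the results are approximate.

```python
# (precision, recall, reported F1) per class, copied from the evaluation table.
reported = {
    "Revised CNN": {"NORMAL": (0.94, 0.76, 0.84), "PNEUMONIA": (0.87, 0.97, 0.92)},
    "Baseline CNN": {"NORMAL": (0.40, 0.19, 0.26), "PNEUMONIA": (0.63, 0.83, 0.72)},
}

# Assumption: evaluation used every image in the test split.
test_support = {"NORMAL": 234, "PNEUMONIA": 390}


def harmonic_mean(precision, recall):
    """F1-score expressed as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def accuracy_from_recall(per_class, support):
    """Accuracy = correctly classified samples / all samples.

    Each class contributes recall * support correctly classified samples.
    """
    correct = sum(per_class[label][1] * support[label] for label in support)
    return correct / sum(support.values())


for model, per_class in reported.items():
    for label, (precision, recall, f1) in per_class.items():
        recomputed = harmonic_mean(precision, recall)
        print(f"{model:<12} {label:<9} reported F1={f1:.2f} recomputed={recomputed:.2f}")
    accuracy = accuracy_from_recall(per_class, test_support)
    print(f"{model:<12} estimated accuracy={accuracy:.2f}")
```

Under that assumption, the estimates agree with the 89% and 59% accuracies quoted in the insights, and only the revised model's PNEUMONIA class clears the F1-score goal of 0.90.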

The Confusion Matrix for our Test set is shown here: Confusion Matrix best model

We also compare the F1-Scores of the different models we tested out as we attempted this proof of concept task: F1-scores different models

We also experimented with LIME to visualize the features our model treats as important. Here is an example image with its true classification label, its predicted classification label, and a segmented image that highlights the "features" most important to our current best model.

Lime Output

Recommendations:

Expert-Evaluated Data for Model Enhancement: Using images that have been expert-evaluated is crucial for improving the accuracy and performance of our model. This can be achieved through the following steps:

  • Collaboration with Radiology Experts: Establish partnerships with radiologists and other medical professionals to evaluate X-ray images. Their expertise will contribute to a high-quality annotated dataset.
  • Continuous Feedback Loop: Implement a feedback mechanism to incorporate expert evaluations into the training pipeline, ensuring the model learns from expert insights.

Optimal Scanner Placements: To maximize the effectiveness of our pneumonia detection system, we recommend running a pilot that collects more expert-verifiable data by deploying X-ray scanners and computer systems with our model in the following locations:

  • Pediatric Wards and Clinics: Ensure accessibility to children in healthcare facilities where pneumonia cases are most frequently diagnosed and treated.
  • High-Risk Areas and Communities: Identify regions with elevated incidences of childhood pneumonia and establish scanning facilities near these communities.

Future Work:

In the pursuit of refining our Pneumonia detection system, several avenues for further research and development present themselves:

  • Model Testing and Comparison: Evaluate additional pre-trained models, such as ResNet50, VGG19, and InceptionV3, which have demonstrated effectiveness on large and diverse datasets.
  • Data Augmentation and Collection: Augment the dataset to acquire more diverse and representative samples of Normal chest X-ray images. This will enhance the model's ability to distinguish between normal and abnormal cases accurately.
  • Three-Class Classification Model: Expand the task scope to include a three-class model that identifies the underlying cause of pneumonia (virus, bacteria, normal). This distinction is crucial because treatment strategies for bacterial and viral pneumonia vary significantly.

These future steps will contribute to the continued improvement and versatility of our pneumonia detection system. Please review my full analysis in my Jupyter notebook or my presentation. If you have any questions, feel free to contact me, Tenicka Norwood, at tenicka.norwood@gmail.com.

Repository Structure


   .
   └──image_classification_with_deep_learning/
      ├── README.md                                            Overview for project reviewers  
      ├── image_classification_with_deep_learning.ipynb        Documentation of Full Analysis in Jupyter Notebook
      ├── presentation.pdf                                     PDF version of Full Analysis shown in a slide deck
      ├── notebook.pdf                                         PDF version of Full Analysis shown in Jupyter notebook
      ├── setup.yml                                            Includes instructions to obtain the dataset
      └── .gitignore                                           Specifies intentionally untracked files
