- Introduction
- My own algorithm and the resulting scores
- Multi-model and multi-crop statistics
- Getting Started
In this project, I have designed an algorithm that can visually diagnose melanoma, the deadliest form of skin cancer. In particular, the algorithm distinguishes this malignant skin tumor from two types of benign lesions (nevi and seborrheic keratoses).
The data and objective are taken from the 2017 ISIC Challenge on Skin Lesion Analysis Towards Melanoma Detection. As part of the challenge, participants were tasked with designing an algorithm to diagnose skin lesion images as one of three different skin diseases (melanoma, nevus, or seborrheic keratosis).
Open my Jupyter Notebook dermatologist-ai.ipynb to see how I trained multiple Convolutional Neural Networks to classify the three skin diseases and reached a Mean ROC AUC score of 0.944 (see the ROC curves for melanoma and seborrheic keratosis below). That score would have ranked first in the challenge (see the scores in Evaluation), where the winner's score is 0.911. That is very satisfying given what I set out to achieve. 😃
But much more than the score, I learned a lot, sometimes the hard way, and had a lot of fun. 😅
In particular, I share how I turned the many mistakes I made while designing the model into positive learning experiences!
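To make the Mean ROC AUC figure concrete, here is a minimal sketch of how such a score can be computed, assuming it is the average of two binary ROC AUCs (melanoma vs. the rest, and seborrheic keratosis vs. the rest). The function names and the toy data are hypothetical and are not taken from the notebook; in practice a library routine such as scikit-learn's `roc_auc_score` would normally replace the hand-written `roc_auc`.

```python
def roc_auc(labels, scores):
    """ROC AUC for one binary task, using the pairwise-ranking formulation.

    AUC equals the probability that a random positive sample gets a higher
    score than a random negative sample (ties count as one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_roc_auc(melanoma_labels, melanoma_scores, sk_labels, sk_scores):
    """Average of the two binary AUCs (assumed definition of the mean score)."""
    return (roc_auc(melanoma_labels, melanoma_scores)
            + roc_auc(sk_labels, sk_scores)) / 2


# Toy data, made up for illustration only (not results from the notebook).
melanoma_labels = [1, 0, 0, 1, 0]
melanoma_scores = [0.9, 0.2, 0.4, 0.6, 0.7]
sk_labels = [0, 1, 0, 0, 1]
sk_scores = [0.1, 0.8, 0.3, 0.2, 0.9]

score = mean_roc_auc(melanoma_labels, melanoma_scores, sk_labels, sk_scores)
print(f"Mean ROC AUC on toy data: {score:.3f}")
```

The pairwise version is quadratic in the number of samples, which is fine for a small illustration but too slow for large test sets.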
I also found it very interesting to compute some statistics on multi-crop / multi-model scores. A picture is worth a thousand words! Here are the distributions of ROC AUC with respect to the number of models:
It is interesting to see that multi-model gives me a ≃3.25% return on investment, whereas multi-crop "only" gives ≃0.6%.
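As a rough illustration of what multi-model (or multi-crop) scoring means, the sketch below averages the predicted probabilities of several models for each image. Plain averaging is an assumption made for this example; the notebook may combine predictions differently, and the names `average_predictions`, `model_a`, `model_b`, and `model_c` are hypothetical.

```python
def average_predictions(predictions):
    """Average per-image probabilities over several models (or crops).

    predictions: one list of probabilities per model or crop,
    all of the same length and in the same image order.
    """
    if not predictions:
        raise ValueError("need at least one prediction list")
    n_images = len(predictions[0])
    if any(len(p) != n_images for p in predictions):
        raise ValueError("all prediction lists must have the same length")
    # zip(*predictions) groups the scores that belong to the same image.
    return [sum(per_image) / len(predictions) for per_image in zip(*predictions)]


# Made-up melanoma probabilities from three hypothetical models, three images.
model_a = [0.9, 0.2, 0.6]
model_b = [0.7, 0.4, 0.2]
model_c = [0.8, 0.3, 0.1]

ensemble = average_predictions([model_a, model_b, model_c])
print([round(p, 3) for p in ensemble])
```

The same function covers the multi-crop case: pass one list per crop of the same images instead of one list per model.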
Click the link below to open and execute my notebook in Google Colab:
Ensure the GPU is enabled: go to Edit > Notebook settings or Runtime > Change runtime type and select GPU as Hardware accelerator.
In the notebook's Getting Started section, change the settings according to your needs:
Then press Ctrl+F9 to execute all cells in the notebook.
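If you want to confirm that the GPU setting from the steps above took effect before running everything, a quick check along these lines can be pasted into a cell. This is a minimal sketch that assumes the `nvidia-smi` tool is on the PATH of a GPU runtime; the helper name `gpu_visible` is hypothetical and is not part of the notebook.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if `nvidia-smi -L` runs and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Tool not installed: most likely a CPU-only runtime.
        return False
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU".
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU detected" if gpu_visible() else "No GPU detected, check the runtime type")
```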
Be patient: it takes about 3* minutes (if download_images=False) or about 12 minutes (if download_images=True) to execute all the notebook cells, including:
- 5' to download and 3' to extract default images
- 1' to download and extract additional images
- 1' to download results for all pretrained models
- 1' to build the best team
- 1' to execute all the other cells
* Timings are of course approximate and were estimated on an NVIDIA Tesla T4 GPU, which is faster than the older NVIDIA Tesla K80 GPU my models were trained on.
Optionally, add about 16 minutes if you want to skip loading results and force testing, since the very first run has to resize all images. Add 1 more minute to test DenseNet, and up to 4 minutes for NasNetALarge...
Optionally, add many hours if you want to train your own model... 😊
Note that my models were trained on an NVIDIA Tesla K80 GPU.
If you want to know more details about the challenge itself, or create your own project from scratch, read the original README.md.