Abstract: this repo contains a Catalyst-based pipeline for training UNet with different encoders on the steel defect detection problem. Weights for the trained models are provided; the results are:
- UNet with ResNet-50 - IoU 0.413
- UNet with EfficientNet-B3 - IoU 0.541
- UNet with EfficientNet-B4 - IoU 0.592
Important: the dataset, balanced over the defect classes, contains 1000 images, roughly 250 per class. Training on the whole dataset might yield better metrics.
I have not included EDA here; in general the data seems clean (given that our new dataset is balanced).
First, let's choose the main architecture. UNet is a better fit for this problem than Mask R-CNN: semantic segmentation is enough to complete the task, so there is no need for a more complex instance segmentation model like Mask R-CNN. I reviewed several Kaggle kernels and papers from sources such as arxiv.org before deciding.
So:
- Architecture: UNet
- Encoder: EfficientNet-B3,B4; ResNet-50
- Loss function: DiceBCELoss, TverskyLoss (alpha=0.1, beta=0.9)
- Optimizer: Adam (learning rate 1e-3 for the encoder, 1e-2 for the decoder), since the encoder is much deeper and pretrained
- Learning rate scheduler: ReduceLROnPlateau(factor=0.15, patience=2)
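The per-module learning rates above can be expressed with PyTorch parameter groups. A minimal sketch, using a stand-in encoder/decoder pair instead of the real UNet (with segmentation_models.pytorch the same idea applies to `model.encoder` and `model.decoder`):

```python
import torch
import torch.nn as nn

# Stand-in model: in the real pipeline the encoder is a pretrained
# ResNet/EfficientNet and the decoder is the UNet upsampling path.
model = nn.ModuleDict({
    "encoder": nn.Conv2d(3, 16, 3, padding=1),
    "decoder": nn.Conv2d(16, 4, 3, padding=1),
})

# Lower LR for the pretrained encoder, higher LR for the
# randomly initialised decoder, as described above.
optimizer = torch.optim.Adam([
    {"params": model["encoder"].parameters(), "lr": 1e-3},
    {"params": model["decoder"].parameters(), "lr": 1e-2},
])

# LR is multiplied by 0.15 after 2 epochs without improvement;
# call scheduler.step(val_loss) once per epoch.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.15, patience=2
)
```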
Note that the dataset is quite imbalanced with respect to the defect/no_defect classes (true positives vs. true negatives), so picking an appropriate loss is important. I tried DiceBCELoss and Tversky loss (alpha=0.1, beta=0.9); the best results were obtained with DiceBCELoss.
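A common way to implement DiceBCELoss is the sum of binary cross-entropy on logits and a soft Dice loss; a minimal sketch (not necessarily identical to the version in utils/losses.py):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiceBCELoss(nn.Module):
    """BCE (on logits) + soft Dice loss for binary masks."""

    def __init__(self, smooth: float = 1.0):
        super().__init__()
        self.smooth = smooth

    def forward(self, logits, targets):
        bce = F.binary_cross_entropy_with_logits(logits, targets)
        probs = torch.sigmoid(logits).reshape(-1)
        targets = targets.reshape(-1)
        intersection = (probs * targets).sum()
        dice = (2.0 * intersection + self.smooth) / (
            probs.sum() + targets.sum() + self.smooth
        )
        return bce + (1.0 - dice)

# Example: loss on a random batch of 4-channel masks (one per defect class)
criterion = DiceBCELoss()
logits = torch.randn(2, 4, 64, 64)
targets = torch.randint(0, 2, (2, 4, 64, 64)).float()
loss = criterion(logits, targets)
```

The BCE term gives well-behaved per-pixel gradients, while the Dice term directly optimizes region overlap, which matters when defect pixels are rare.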
All encoders were pretrained on ImageNet. However, I believe one more trick could be fruitful: fine-tuning the encoders on the whole dataset as a defect/no_defect classifier first. This could improve the results, but train.csv contains no images of the no_defect class at all.
Also, given the class imbalance, it would make sense to train only on images that actually contain defects (true positives); but, as noted above, train.csv contains no other images anyway.
We could also try multi-scale training, increasing the image resolution from small to large, but I haven't done that.
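One hypothetical way to set this up (not implemented in this repo) is progressive resizing: train a few epochs at a low resolution, then continue at larger ones. A sketch that resizes batches with `F.interpolate`:

```python
import torch
import torch.nn.functional as F

def resize_batch(images, masks, size):
    """Resize images bilinearly and masks with nearest-neighbour,
    so mask values stay binary."""
    images = F.interpolate(images, size=size, mode="bilinear",
                           align_corners=False)
    masks = F.interpolate(masks, size=size, mode="nearest")
    return images, masks

# Hypothetical schedule: small -> large resolution
stages = [(128, 384), (192, 576), (256, 768)]
images = torch.randn(2, 3, 256, 1600)
masks = torch.randint(0, 2, (2, 4, 256, 1600)).float()
for size in stages:
    imgs, msks = resize_batch(images, masks, size)
    # ... train for a few epochs at this resolution ...
```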
I should add that I was limited by CUDA memory capacity, so I basically could not try bigger encoders with batch size > 8.
Encoder | IoU | DiceBCELoss | Mask Resolution | Epochs |
---|---|---|---|---|
ResNet-50 | 0.4132 | | (256, 1600) | |
EfficientNet-B3 | 0.513 | 0.444 | (256, 768) | 11 |
EfficientNet-B4 | 0.597 | 0.36 | (256, 768) | 37 |
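For reference, the IoU in the table is the intersection over union of the predicted and ground-truth binary masks; a minimal NumPy sketch of the metric (the pipeline's own implementation may differ):

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """IoU for binary masks (arrays of 0/1)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty
    return float((intersection + eps) / (union + eps))

# Toy example: one overlapping pixel out of two marked pixels -> IoU 0.5
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = iou(pred, target)
```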
Link to TensorBoard for EfficientNet-B4: tap here
Example inferences on validation data:
- EfficientNet-B4
Required libraries are catalyst, segmentation_models.pytorch and albumentations.
P.S. I've used segmentation_models for fast prototyping.
Installation:
!pip install git+https://github.com/qubvel/segmentation_models.pytorch
!pip install -U git+https://github.com/albu/albumentations
!pip install catalyst
The directory tree should be:
```
├── Predict_masks.py
├── Train.py
├── config.py
├── data
│   ├── results            # results
│   ├── test.csv
│   ├── test_images        # download test images here
│   ├── train.csv
│   ├── train_balanced.csv
│   └── train_images       # download train images here
├── images
│   └── readme.md
├── utils
│   ├── losses.py
│   └── utils.py
└── weights
    ├── UnetEfficientNetB4_IoU_059.pth
    └── UnetResNet50_IoU_043.pth
```
There is a Predict_masks.py script which can be used to evaluate the model and predict masks for the test dataset (from test.csv). The weights are stored in the ./weights directory.
Pictures with predicted masks and source images will be stored in data/results folder.
Important: masks for ResNet-50 are (256, 1600) px and masks for EfficientNet-B3/B4 are (256, 768) px. The free Colab tier doesn't provide more CUDA memory :(
Usage example:
python3 Predict_masks.py -dir /Users/user/Documents/steel_defect_detection/data/ -weights_dir /Users/user/Documents/steel_defect_detection/data/weights
-dir : Pass the full path of a directory containing the folder "test_images" and "test.csv".
-num_of_images : Number of test images from test.csv to segment.
-weights_dir : Pass a weights directory.
Predict_masks.py doesn't save binary masks; it saves pictures combining the source image and the predicted mask for better presentation.
The model is meant to be trained on the dataset from the Kaggle competition. You can choose which encoder to use and the batch size; the default is EfficientNet-B4. The mask size is set to (256, 768) in config.py; you can set your own.
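The exact contents of config.py aren't shown here; a hypothetical fragment of the settings mentioned above (the actual variable names in the repo may differ) might look like:

```python
# Hypothetical config.py fragment -- names are illustrative only.
ENCODER = "efficientnet-b4"   # or "efficientnet-b3", "resnet50"
BATCH_SIZE = 8
MASK_SIZE = (256, 768)        # (height, width); (256, 1600) was used for ResNet-50
```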
You must point the script to the directory where the train folder and train.csv are stored.
Usage example:
python3 Train.py -dir /Users/user/Documents/steel_defect_detection/data/ -num_of_workers 4
-dir : Pass the full path of a directory containing the folder "train_images" and "train.csv".
-encoder : Backbone to use as encoder for UNet, default='efficientnet-b3'.
-batch_size : Batch size for training, default=8.
-num_of_workers : Number of workers for training, default=0.