In this project, we use the active fire dataset from https://github.com/pereira-gha/activefire and try to improve on its published results. We train two Vision Transformer (ViT) networks, Swin-Unet and TransUNet, and one CNN-based UNet. We show that a ViT can outperform a well-trained, specialized CNN at detecting wildfires on a previously published dataset of Landsat-8 imagery (Pereira et al.): Swin-Unet beats the baseline U-Net (10c) by 0.93 percentage points in IoU (89.93% vs. 89.00%). However, our own implementation of the CNN-based UNet performs best on every metric among the three models we trained, which shows the sustained utility of CNNs in image tasks. Overall, ViTs are comparable to CNNs at detecting wildfires, though a well-tuned CNN remains the strongest approach: our UNet reaches an IoU of 93.58%, which is 4.58 percentage points above the baseline UNet.
- UNet.py: Contains the PyTorch code for the UNet model.
- evaluate.py: Takes in the model name and evaluates the saved checkpoint on 4 metrics: precision, recall, f-score, and IoU.
- generator.py: Data generator code.
- models.py: Returns the instances of different models used in this work.
- predict.py: Saves the inference results from a checkpoint file.
- train.py: Code to train a model.
- transform.py: Image transforms for data augmentation.
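
The four metrics reported by evaluate.py can be illustrated with a minimal sketch. This is not the code from evaluate.py: the function name `segmentation_metrics`, the flat-list mask format, and the choice to return 0.0 when a denominator is empty are all assumptions made for illustration.

```python
def segmentation_metrics(pred, target):
    """Compute precision, recall, F-score and IoU for two flat binary masks.

    Hypothetical helper: `pred` and `target` are equal-length sequences of
    0/1 pixel labels, where 1 marks a fire pixel.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)

    # Assumption: an empty denominator yields 0.0 rather than an error.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall,
            "f_score": f_score, "iou": iou}


# Toy masks with 2 true positives, 1 false positive and 1 false negative.
pred = [1, 1, 0, 1, 0, 0]
target = [1, 0, 0, 1, 1, 0]
print(segmentation_metrics(pred, target))
```

Whether the scores are computed per image and averaged or aggregated over the whole test set is not stated here; the sketch computes them for a single pair of masks.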
# Train
python train.py <model-name>
## Example
python train.py unet
# Evaluate
python evaluate.py <model-name>
## Example
python evaluate.py unet
# Save predictions
python predict.py <model-name> <image-path>
## Example
python predict.py unet predictions/unet/
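
All three scripts take the model name as their first positional argument. A minimal sketch of how such an argument might be parsed is shown here. It is not the repository's code: `parse_args` and `MODEL_NAMES` are hypothetical, and only `unet` is confirmed as a valid name by the examples above. The other two names are placeholders.

```python
import argparse

# Hypothetical list of accepted names. Only "unet" appears in this README;
# "transunet" and "swinunet" are assumed spellings for illustration.
MODEL_NAMES = ("unet", "transunet", "swinunet")


def parse_args(argv=None):
    """Parse a single positional model name, rejecting unknown names."""
    parser = argparse.ArgumentParser(description="Select a segmentation model.")
    parser.add_argument("model_name", choices=MODEL_NAMES,
                        help="name of the model to train or evaluate")
    return parser.parse_args(argv)


# Equivalent to running the script with the single argument "unet".
args = parse_args(["unet"])
print(args.model_name)
```

Restricting the argument with `choices` makes a misspelled model name fail immediately with a usage message instead of failing later when the checkpoint is loaded.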
Method | Precision (%) | Recall (%) | F-score (%) | IoU (%) |
---|---|---|---|---|
U-Net (10c) | 92.90 | 95.50 | 94.20 | 89.00 |
U-Net (3c) | 91.90 | 95.30 | 93.60 | 87.90 |
U-Net-Light (3c) | 90.20 | 96.50 | 93.20 | 87.30 |
TransUNet | 88.46 | 86.88 | 87.66 | 87.49 |
Swin-Unet | 88.28 | 92.30 | 90.24 | 89.93 |
Our UNet | 93.37 | 93.96 | 93.67 | 93.58 |
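
The IoU gaps relative to the baseline can be recomputed from the table with a short sketch. It assumes that U-Net (10c) is the baseline, which is inferred from its IoU of 89.00; the helper name `gain_over_baseline` is hypothetical.

```python
# IoU values in percent, copied from the results table.
iou = {
    "U-Net (10c)": 89.00,
    "U-Net (3c)": 87.90,
    "U-Net-Light (3c)": 87.30,
    "TransUNet": 87.49,
    "Swin-Unet": 89.93,
    "Our UNet": 93.58,
}


def gain_over_baseline(scores, baseline_name):
    """Return each model's score minus the baseline score, in percentage points."""
    base = scores[baseline_name]
    return {name: round(value - base, 2)
            for name, value in scores.items()
            if name != baseline_name}


# Assumption: U-Net (10c) is the baseline used for comparison.
gains = gain_over_baseline(iou, "U-Net (10c)")
for name, gain in gains.items():
    print(f"{name}: {gain:+.2f} percentage points")
```

For Our UNet this gives the 4.58-point IoU improvement over the baseline.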