aschneuw/road-segmentation-unet

U-Net for Road Segmentation

We built and trained a convolutional neural network in TensorFlow to segment roads in satellite images, i.e. to assign a label (road=1, background=0) to each pixel. We implemented a convolution/deconvolution U-Net (Ronneberger et al., 2015) with dilated layers. Data augmentation was a key element for improving performance: we applied rotations, mirroring extension and vertical/horizontal flips.

The training and evaluation sets contain 100 and 50 images of size 400x400 and 604x604, respectively. These are some samples of our predictions on evaluation images:

[Images: predictions for evaluation images 003, 022 and 048]

Consult our report for further information.

This project was part of the Machine Learning course taught at EPFL by Prof. Urbanke and Prof. Jaggi.

Contributors

Setup Environment

Our setup requires a standard Unix environment (Ubuntu 16.04 or Mac OS X) with Python 3.5 or 3.6 installed. Our implementation also requires the command sha256sum, which ships with Ubuntu out of the box. If you use OS X or Windows, install the command or perform the model verification by computing the SHA256 hash with the appropriate tools. (Read more about model verification below.)
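On systems without sha256sum, the hash can also be computed with Python's standard hashlib module. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and the path and expected hash in the usage comment are placeholders for the values that apply to your model file.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (placeholders, replace with your own values):
#   ok = verify_file("<path to model file>", "<expected sha256 hex digest>")
#   print("model verified" if ok else "hash mismatch")
```

The digest returned by `sha256_of_file` is the same hex string that `sha256sum <file>` prints in its first column, so both tools can be used interchangeably.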

The necessary dependencies can easily be set up using pip and the provided requirements.txt.

   pip install -r requirements.txt

Run

To generate our final Kaggle submission execute:

./run.py

The predictions and the submission file are saved in the folder prediction/.

Train a new model

To train a new model, you may run:

python3 src/tf_aerial_images.py \
     --train_data_dir ./data/training/ \
     --save_path=./runs \
     --logdir=./logdir \
     --num_epoch=25 \
     --batch_size=1 \
     --patch_size=388 \
     --gpu=1 \
     --eval_every=1000 \
     --stride=12 \
     --train_score_every=10000 \
     --image_augmentation \
     --rotation_angles 15,30,45,60,75 \
     --ensemble_prediction \
     --dilated_layers \
     --num_layers=6 \
     --dropout=1.0

To inspect the model during and after training, use tensorboard:

tensorboard --logdir=./logdir

Flags

When running tf_aerial_images.py, the following flags may be set to control the application behavior:

Flag                 Description
batch_size           Batch size of training instances
dilated_layers       Add dilated CNN layers
dropout              Probability to keep an input
ensemble_prediction  Enable ensemble prediction
eval_data_dir        Directory containing eval images
eval_every           Number of steps between evaluations
eval_train           Evaluate training data
gpu                  GPU to run the model on
image_augmentation   Augment training set of images with transformations
interactive          Spawn interactive TensorFlow session
logdir               Directory in which to write log files
lr                   Initial learning rate
model_path           Restore exact model path
momentum             Momentum
num_epoch            Number of passes over the dataset during training
num_eval_images      Number of images to predict for an evaluation
num_gpu              Number of available GPUs to run the model on
num_layers           Number of layers of the U-Net
patch_size           Size of the prediction image
pred_batch_size      Batch size of batchwise prediction
restore_date         Restore the model from a specific date
restore_epoch        Restore the model from a specific epoch
restore_model        Restore the model from a previous checkpoint
root_size            Number of filters of the first U-Net layer
rotation_angles      Rotation angles
save_path            Directory in which to write checkpoints, overlays and submissions
seed                 Random seed for reproducibility
stride               Sliding delta for patches
train_data_dir       Directory containing the training images/ and groundtruth/ subdirectories
train_score_every    Compute training score after the given number of iterations
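To give a feeling for how patch_size and stride interact, the sketch below enumerates sliding-window positions on a square image. It is an illustration only: the function names are hypothetical, and it assumes patches are cut from the unpadded image with the stride as the step between window origins, which may differ from how the repository handles borders.

```python
def patch_origins(image_size, patch_size, stride):
    """Top-left offsets of a sliding window along one image axis."""
    if patch_size > image_size:
        raise ValueError("patch_size must not exceed image_size")
    return list(range(0, image_size - patch_size + 1, stride))


def patch_grid(image_size, patch_size, stride):
    """All (row, column) origins of square patches on a square image."""
    offsets = patch_origins(image_size, patch_size, stride)
    return [(y, x) for y in offsets for x in offsets]


if __name__ == "__main__":
    # Values from the training command: 400x400 images, patch_size=388, stride=12.
    print(patch_origins(400, 388, 12))    # [0, 12]
    print(len(patch_grid(400, 388, 12)))  # 4
```

Under these assumptions, each 400x400 training image yields four 388x388 patches, and a smaller stride would produce more, heavily overlapping patches.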
