If you use this code, please cite one of the following papers:
-
[1] Francesco Visin, Kyle Kastner, Kyunghyun Cho, Matteo Matteucci, Aaron Courville, Yoshua Bengio - ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks (BibTeX)
-
[2] Francesco Visin, Marco Ciccone, Adriana Romero, Kyle Kastner, Kyunghyun Cho, Yoshua Bengio, Matteo Matteucci, Aaron Courville - ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation (BibTeX)
Download Theano and make sure it's working properly. All the information you need can be found by following this link: http://deeplearning.net/software/theano/
This software relies on some amazing third-party software libraries.
You can install them with pip:
pip install <--user> lasagne matplotlib Pillow progressbar2 pydot-ng retrying scikit-image scikit-learn tabulate
(Use the --user
option if you don't want to install them globally or you
don't have sudo privileges on your machine.)
Download the CamVid dataset from http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/
Resize the images to 480X360 resolution. The program expects to find the
dataset data in ./datasets/camvid/
. You can change this path modifying
camvid.py
if you want.
Download the VGG weights for Lasagne from: https://s3.amazonaws.com/lasagne/recipes/pretrained/imagenet/vgg16.pkl
Once downloaded, rename them as w_vgg16.pkl
and put them in the root
directory of this code.
To reproduce the results of the ReSeg paper run python evaluate_camvid.py
(or
python evaluate_camvid_with_cb.py
to reproduce the experiment with class
balancing). Make sure to set the appropriate THEANO_FLAGS to run the model on
your machine (most probably export THEANO_FLAGS=device=gpu,floatX=float32
)
The program will output some metrics on the current minibatch iteration during training:
Epoch 0/5000 Up 367 Cost 270034.031250, DD 0.000046, UD 0.848205
(None, 360, 480, 3)
More in detail, it will show the current epoch, the incremental update counter
(i.e. number of minibatches seen), the cost of the current iteration, the time
(in seconds) required to load the data DD
and to train and update the
network's parameters DD
. Finally, it will print the size of the currently
processed minibatch. None
will be displayed on variable-sized dimensions.
At the end of each epoch, it will validate the performances on the training, validation and test set and save some sample images for each set in a segmentations directory inside the root directory of the script.
At the end of the training get_info_model.py
can be used to show some
information on the trained model. Run python get_info_model.py -h
for a
list of the arguments and their explanation.
Note: In case you want to modify this code to reproduce the results of "Combining the best of convolutional layers and recurrent layers: A hybrid network for semantic segmentation" please let us know!
Many people contributed in different ways to this project. We are extremely thankful to the Theano developers and to many people at MILA for their support and for the many insightful discussions. We also thank the developer of Lasagne, a powerful yet light framework on top of Theano. I wish I discovered it at the beginning of this project! :)
Finally, our gratitude goes to the developers of all the great libraries we used in this project, to all the people who got involved with the project at any level and to our generous sponsors.