Skip to content

mchancan/deepseqslam

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

55 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DeepSeqSLAM - the deep learning framework for robot place recognition

This repository contains the official PyTorch implementation of the papers:

[1] Sequential Place Learning: Heuristic-Free High-Performance Long-Term Place Recognition. Marvin Chancán, Michael Milford. [ArXiv]  [Website]

[2] DeepSeqSLAM: A Trainable CNN+RNN for Joint Global Description and Sequence-based Place Recognition. Marvin Chancán, Michael Milford. NeurIPS 2020 Workshop on Machine Learning for Autonomous Driving (ML4AD).  [ArXiv] [Website] [YouTube Video]

Both papers introduce DeepSeqSLAM, a CNN+LSTM baseline architecture for state-of-the-art route-based place recognition. DeepSeqSLAM leverages visual and positional time-series data for joint global description and sequential place inference in the context of simultaneous localization and mapping (SLAM) and autonomous driving research. Contrary to classical two-stage pipelines, e.g., match-then-temporally-filter, this codebase is orders of magnitud faster, scalable and learns from a single traversal of a route, while accurately generalizing to multiple traversals of the same route under very different environmental conditions.

   
DeepSeqSLAM: The baseline architecture for Sequential Place Learning

News

  • (May 10, 2021) Fixed and uploaded the Gardens Point dataset (.zip) file on Zenodo (version 2). You can also find it on Google Drive. Thanks everyone!
  • (Apr 30, 2021) The Gardens Point dataset link on Zenodo (version 1) has some errors when trying to unzip. Here is an alternative Google Drive link. Thanks for the interest in the code!
  • (Mar 4, 2021) Contributions welcome!
  • (Mar 3, 2021) Archive ML4AD release and update this README.md with new Gardens Point dataset link.
  • (Mar 1, 2021) Paper Sequential Place Learning submitted to RSS 2021.
  • (Oct 30, 2020) Paper DeepSeqSLAM accepted at the NeurIPS 2020 Workshop on ML4AD.

BibTex Citation

If you find any of the tools provided here useful for your research or report our results in a publication, please consider citing both Sequential Place Learning and DeepSeqSLAM papers:

@article{chancan2021spl,
  title = {Sequential Place Learning: Heuristic-Free High-Performance Long-Term Place Recognition},
  author = {Marvin Chanc{\'a}n and Michael Milford},
  journal = {arXiv preprint arXiv:2103.02074},
  year = {2021}
}

@article{chancan2020deepseqslam,
  title = {DeepSeqSLAM: A Trainable CNN+RNN for Joint Global Description and Sequence-based Place Recognition},
  author = {Marvin Chanc{\'a}n and Michael Milford},
  journal = {arXiv preprint arXiv:2011.08518},
  year = {2020}
}

Getting Started

You just need Python v3.6+ with standard scientific packages, PyTorch v1.1+, and TorchVision v0.3.0+.

git clone https://github.com/mchancan/deepseqslam

Training Data

The challenging Gardens Point Walking dataset consists of three folders with 200 images each. The image name indicates correspondence in location between each of the three route traversals. Download the dataset, unzip, and place the day_left, day_right, and night_right image folders in the datasets/GardensPointWalking directory of DeepSeqSLAM.

Single Node Training (CPU, GPU or Multi-GPUs)

In this release, we provide an implementation of DeepSeqSLAM for evaluation on the Gardens Point dataset with challenging day-night changing conditions. We also provide normalized (synthetic) positional data for end-to-end training and deployment.

Run the demo on the Gardens Point dataset

sh demo_deepseqslam.sh

You can run this demo using one of these pre-trained models: alexnet, resnet18, vgg16, squeezenet1_0, densenet161, or easily configure the run.py script for training with any other PyTorch's model from torchvision.

Commands example

SEQ_LENGHT=10

BATCH_SIZE=16

EPOCHS=100

NGPUS=1

SEQ1='day_left'

SEQ2='day_right'

SEQ3='night_right'

CNN='resnet18'

MODEL_NAME="gp_${CNN}_lstm"

python run.py train \
    --model_name $MODEL_NAME \
    --ngpus $NGPUS \
    --batch_size $BATCH_SIZE \
    --seq_len $SEQ_LENGHT \
    --epochs $EPOCHS \
    --val_set $SEQ2 \
    --cnn_arch $CNN

for i in $SEQ1 $SEQ2 $SEQ3
do
    python run.py val \
    --model_name $MODEL_NAME \
    --ngpus $NGPUS \
    --batch_size $BATCH_SIZE \
    --seq_len $SEQ_LENGHT \
    --val_set $i \
    --cnn_arch $CNN
done

Help

usage: run.py [-h] [--data_path DATA_PATH] [-o OUTPUT_PATH]
              [--model_name MODEL_NAME] [-a ARCH] [--pretrained PRETRAINED]
              [--val_set VAL_SET] [--ngpus NGPUS] [-j WORKERS]
              [--epochs EPOCHS] [--batch_size BATCH_SIZE] [--lr LR]
              [--load LOAD] [--nimgs NIMGS] [--seq_len SEQ_LEN]
              [--nclasses NCLASSES] [--img_size IMG_SIZE]

Gardens Point Training

optional arguments:
  -h, --help            show this help message and exit
  --data_path DATA_PATH
                        path to dataset folder that contains preprocessed
                        train and val *npy image files
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        path for storing model checkpoints
  --model_name MODEL_NAME
                        checkpoint model name (default:
                        deepseqslam_resnet18_lstm)
  -a ARCH, --cnn_arch ARCH
                        model architecture: alexnet | densenet121 |
                        densenet161 | densenet169 | densenet201 | googlenet |
                        inception_v3 | mobilenet_v2 | resnet101 | resnet152 |
                        resnet18 | resnet34 | resnet50 | resnext101_32x8d |
                        resnext50_32x4d | shufflenet_v2_x0_5 |
                        shufflenet_v2_x1_0 | shufflenet_v2_x1_5 |
                        shufflenet_v2_x2_0 | squeezenet1_0 | squeezenet1_1 |
                        vgg11 | vgg11_bn | vgg13 | vgg13_bn | vgg16 | vgg16_bn
                        | vgg19 | vgg19_bn (default: resnet18)
  --pretrained PRETRAINED
                        use pre-trained CNN model (default: True)
  --val_set VAL_SET     validation_set (default: day_right)
  --ngpus NGPUS         number of GPUs for training; 0 if you want to run on
                        CPU (default: 2)
  -j WORKERS, --workers WORKERS
                        number of data loading workers (default: 4)
  --epochs EPOCHS       number of total epochs to run (default: 200)
  --batch_size BATCH_SIZE
                        mini-batch size: 2^n (default: 32)
  --lr LR, --learning_rate LR
                        initial learning rate (default: 1e-3)
  --load LOAD           restart training from last checkpoint
  --nimgs NIMGS         number of images (default: 200)
  --seq_len SEQ_LEN     sequence length: ds (default: 10)
  --nclasses NCLASSES   number of classes = nimgs - seq_len (default: 190)
  --img_size IMG_SIZE   image size (default: 224)

Multiple Nodes

For training on multiple nodes, you should use the NCCL backend for multi-processing distributed training since it currently provides the best distributed training performance. Please refer to ImageNet training in PyTorch for additional information on this.

Contributions welcome

You are welcome to contribute with features that might be valuable:

  • training/testing using pre-computed (e.g. NetVLAD) global descriptors (reference.npy/query.npy)
  • add more CNN models for global description from raw images (e.g. NetVLAD)
  • supporting multiple datasets (e.g. Oxford RobotCar, Nordland)
  • standardize positional encoding inputs (mean=0, variance=1)
  • deployment visualizations (e.g. raw image sequences, features, top-k matches)

Acknowledgements

This code has been largely inspired by the following projects:

License

GNU GPL 3+

Created and maintained by Marvin Chancán.