Nighttime Scene Understanding with Label Transfer Scene Parser

This repository is the official implementation of the paper Nighttime Scene Understanding with Label Transfer Scene Parser, published in Image and Vision Computing, September 2024.
Authors: Thanh-Danh Nguyen, Nguyen Phan, Tam V. Nguyen*, Vinh-Tiep Nguyen, and Minh-Triet Tran.

[Paper] [Code] [Project Page]

1. Environment Setup

Download and install Anaconda; we recommend the release Anaconda3-2019.03-Linux-x86_64.sh from the Anaconda homepage:

git clone https://github.com/danhntd/Label_Transfer_Scene_Parser.git
cd Label_Transfer_Scene_Parser
curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh
bash Anaconda3-2019.03-Linux-x86_64.sh

After completing the installation, create and activate the workspace with the specific versions below. The experiments were conducted on a Linux server with a single GeForce RTX 2080Ti GPU, CUDA 10.2, and Torch 1.7.

conda create --name LTSP python=3
conda activate LTSP
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.2 -c pytorch
conda env update -f enviroment.yml --prune
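
To verify the environment before moving on, a quick sanity check (on a correctly configured GPU machine this should print 1.7.0, 10.2, and True):

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"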

2. Data Preparation

In this work, we mainly use the NEXET and Cityscapes datasets to train the whole framework.

2.1. Image Domain Translation

Our image domain translation builds on UNIT. Please refer to that work for further details.

We provide a prepared version of the NEXET dataset at this link, organized with the following structure (a sketch for regenerating the listing .txt files follows the tree).

NEXET
|---dataset
    |---trainA
        |---*.jpg
    |---trainB
        |---*.jpg
    |---testA
        |---*.jpg
    |---testB
        |---*.jpg
    # name of images in corresponding folders
    |---trainA.txt
    |---trainB.txt
    |---testA.txt
    |---testB.txt
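
If you need to regenerate the listing files yourself, a minimal sketch (assuming each .txt file holds one image filename per line, which is our reading of the layout above):

cd NEXET/dataset
for split in trainA trainB testA testB; do
    ls "$split" > "$split.txt"
done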

2.2. Semantic Scene Parser

Cityscapes is our main dataset for semantic segmentation training and validation. We also use Nighttime Driving Test as our testing set. Other segmentation datasets are suitable as long as they follow the data structure and labels of Cityscapes; readers can refer to the original Cityscapes publication for details. Please follow the folder structure below to prepare the data (a note on generating the labelTrainIds files follows the tree):

Cityscapes
|---leftImg8bit
    |---train
        |---cityA
            |---*.png
    |---val
    |---test
|---gtFine_trainvaltest
    |---train
        |---gtFine
            |---cityA
                |---*_gtFine_color.png
                |---*_gtFine_labelIds.png
                |---*_gtFine_labelTrainIds.png
                |---*_gtFine_polygons.json
    |---val
    |---test
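
Note that the official Cityscapes download does not include the *_gtFine_labelTrainIds.png files; the cityscapesScripts toolkit can generate them. A minimal sketch, assuming the standard Cityscapes layout (adjust the paths if you keep the gtFine_trainvaltest structure shown above):

pip install cityscapesscripts
CITYSCAPES_DATASET=<root_path>/Cityscapes/ python -m cityscapesscripts.preparation.createTrainIdLabelImgs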

In ./Semantic_Segmentor/mypath.py, adjust the path to the Cityscapes dataset: <root_path>/Cityscapes/
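
To locate the entry to edit, a quick check (assuming the dataset root appears as a string literal in mypath.py):

grep -n "Cityscapes" ./Semantic_Segmentor/mypath.py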

3. Training Pipeline

Our proposed Label Transfer Scene Parser consists of a five-step pipeline:

3.1. Image Domain Translation Training

cd <your_root>/Label_Transfer_Scene_Parser/Domain_Translator/
python train.py \
                --trainer UNIT \
                --config configs/<path_to_config_file>.yaml
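
Training can take a long time. If a run is interrupted, the upstream UNIT train.py exposes a --resume flag to continue from the latest snapshot; this is an assumption inherited from the upstream codebase, so verify it against this repository's copy of train.py:

python train.py \
                --trainer UNIT \
                --config configs/<path_to_config_file>.yaml \
                --resume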

3.2. Synthetic Nighttime Inference

python test_batch.py \
                --trainer UNIT \
                --config /path_to_unit_day2night_folder_add_vgg_loss.yaml \
                --input_folder path_to_folder_testA/ \
                --output_folder /output_testA/  \
                --checkpoint /path_to_ckpt_day2night_gen_00330000.pt \
                --a2b 1 \
                --output_only
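
The same script can translate in the reverse direction (nighttime to daytime) by flipping the direction flag; in the upstream UNIT test_batch.py, --a2b 0 selects the b-to-a generator (again an assumption inherited from upstream, and the paths below are placeholders):

python test_batch.py \
                --trainer UNIT \
                --config /path_to_unit_day2night_folder_add_vgg_loss.yaml \
                --input_folder path_to_folder_testB/ \
                --output_folder /output_testB/ \
                --checkpoint /path_to_ckpt_day2night_gen_00330000.pt \
                --a2b 0 \
                --output_only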

The full set of script commands can be found in ./Domain_Translator/scripts/run_nightime2daytime.sh.

We provide the pretrained checkpoint of our Nighttime Translation Model at this link.

3.3. Semantic Scene Parser Training

cd <your_root>/Label_Transfer_Scene_Parser/Semantic_Segmentor/
CUDA_VISIBLE_DEVICES=0 python train_val_CL_CE6FL4_stage1_cosine_UNIT.py \
                --dataset Cityscapes \
                --save_dir <path/to/save/result>/run_stage1_combine_CE6FL4/

3.4. Inference on Unlabeled Nighttime Data

CUDA_VISIBLE_DEVICES=0 python infer.py \
                --experiment_dir ./Semantic_Segmentor/run_stage1_combine_CE6FL4 \
                --path_to_unlabel_set /path/to/unlabel/set/ \
                --path_to_save /path/to/dataset/Cityscapes/

3.5. Semantic Scene Parser Re-training

CUDA_VISIBLE_DEVICES=0 python train_val_CL_CE6FL4_stage2_cosine_UNIT.py \
                --dataset Cityscapes \
                --save_dir <path/to/save/result>/run_stage2_combine_CE6FL4/ \
                --checkpoint ./Semantic_Segmentor/saved_checkpoints/run_stage1_combine_CE6FL4/Cityscapes/fpn-resnet101/model_best.pth.tar

For testing on the trained models:

CUDA_VISIBLE_DEVICES=0 python test.py \
                --dataset Cityscapes \
                --experiment_dir ./Semantic_Segmentor/run_stage2_combine_CE6FL4

For inference on test images:

CUDA_VISIBLE_DEVICES=0 python predict.py \
                --experiment_dir ./Semantic_Segmentor/run_stage2_combine_CE6FL4 \
                --path_to_save /path/to/destination/folder/ \
                --path_to_test_set /path/to/dataset/Cityscapes/

The full set of script commands can be found in ./Semantic_Segmentor/scripts.sh.

Released checkpoints and results:

We provide checkpoints of our final segmentation model for both stages: S1_CE6FL4 and S2_CE6FL4.

Download and place the checkpoints at the corresponding paths, or re-train the models yourself:

./Semantic_Segmentor/saved_checkpoints/run_stage1_combine_CE6FL4/Cityscapes/fpn-resnet101/stage1_model_best.pth.tar
./Semantic_Segmentor/saved_checkpoints/run_stage2_combine_CE6FL4/Cityscapes/fpn-resnet101/stage2_model_best.pth.tar
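
For example, to create the expected directories before placing the downloaded checkpoints (paths taken from above):

mkdir -p ./Semantic_Segmentor/saved_checkpoints/run_stage1_combine_CE6FL4/Cityscapes/fpn-resnet101/
mkdir -p ./Semantic_Segmentor/saved_checkpoints/run_stage2_combine_CE6FL4/Cityscapes/fpn-resnet101/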

4. Visualization

We provide sample translation results at ./Semantic_Segmentor/Domain_Translator/results.

Our prediction results on the Nighttime Driving dataset are available at this link.

Citation

Please use the following BibTeX entry to cite this repository:

@article{nguyen2024nighttime,
  title={Nighttime scene understanding with label transfer scene parser},
  author={Nguyen, Thanh-Danh and Phan, Nguyen and Nguyen, Tam V and Nguyen, Vinh-Tiep and Tran, Minh-Triet},
  journal={Image and Vision Computing},
  volume={151},
  pages={105257},
  year={2024},
  publisher={Elsevier}
}

Acknowledgements

FPN-Semantic-Segmentation, FCN-Pytorch, Pytorch-Deeplab-Xception, Pytorch-FPN, FPN.Pytorch, UNIT, CycleGAN