A Simple Recipe for Language-guided Domain Generalized Segmentation

Mohammad Fahes1, Tuan-Hung Vu1,2, Andrei Bursuc1,2, Patrick Pérez3, Raoul de Charette1
1 Inria, 2 valeo.ai, 3 Kyutai

Project page: https://astra-vision.github.io/FAMix/
Paper: https://arxiv.org/abs/2311.17922

TL;DR: FAMix (for Freeze, Augment, and Mix) is a simple method for domain generalized semantic segmentation, based on minimal fine-tuning, language-driven patch-wise style augmentation, and patch-wise mixing of the original and augmented styles.

Citation

@InProceedings{fahes2024simple,
  title={A Simple Recipe for Language-guided Domain Generalized Segmentation},
  author={Fahes, Mohammad and Vu, Tuan-Hung and Bursuc, Andrei and P{\'e}rez, Patrick and de Charette, Raoul},
  booktitle={CVPR},
  year={2024}
}

Demo

• Test on unseen YouTube videos in different cities
• Training dataset: GTA5
• Backbone: ResNet-50
• Segmenter: DeepLabv3+

Watch the full video on YouTube

Table of Contents

• Installation
• Datasets
• Trained models
• Running FAMix
• License
• Acknowledgement

Installation

Dependencies

First create a new conda environment with the required packages:

conda env create --file environment.yml

Then activate the environment:

conda activate famix_env
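
Optionally, a quick check such as the sketch below can confirm that PyTorch is installed and sees a GPU (this assumes environment.yml provides PyTorch, which is not verified here):

# Optional sanity check (assumes environment.yml installs PyTorch; adjust if your setup differs).
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))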

Datasets

  • ACDC: Download ACDC images and labels from ACDC. Please follow the dataset directory structure:

    <ACDC_DIR>/                   % ACDC dataset root
    ├── rgb_anon/                 % input image (rgb_anon_trainvaltest.zip)
    └── gt/                       % semantic segmentation labels (gt_trainval.zip)
  • BDD100K: Download BDD100K images and labels from BDD100K. Please follow the dataset directory structure:

    <BDD100K_DIR>/              % BDD100K dataset root
    ├── images/                 % input image
    └── labels/                 % semantic segmentation labels
  • Cityscapes: Follow the instructions in Cityscapes to download the images and semantic segmentation labels. Please follow the dataset directory structure:

    <CITYSCAPES_DIR>/             % Cityscapes dataset root
    ├── leftImg8bit/              % input image (leftImg8bit_trainvaltest.zip)
    └── gtFine/                   % semantic segmentation labels (gtFine_trainvaltest.zip)
  • GTA5: Download GTA5 images and labels from GTA5. Please follow the dataset directory structure:

    <GTA5_DIR>/                   % GTA5 dataset root
    ├── images/                   % input image 
    └── labels/                   % semantic segmentation labels
  • Mapillary: Download Mapillary images and labels from Mapillary. Please follow the dataset directory structure:

    <MAPILLARY_DIR>/              % Mapillary dataset root
    ├── training/                 % training subset
    │   ├── images/               % input image
    │   └── labels/               % semantic segmentation labels
    └── validation/               % validation subset
        ├── images/               % input image
        └── labels/               % semantic segmentation labels
  • Synthia: Download Synthia images and labels from SYNTHIA-RAND-CITYSCAPES and split it following SPLIT-DATA. Please follow the dataset directory structure:

    <SYNTHIA>/                 % Synthia dataset root
    ├── RGB/                   % input image 
    └── GT/                    % semantic segmentation labels
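
Before training, it can help to verify that each dataset root matches the layout above. The sketch below only checks the top-level folders; the placeholder paths and folder names are copied from the trees above, not read from the FAMix data loaders, so adapt it to your setup.

# Hypothetical layout check: folder names come from the trees above, not from the code.
import os

EXPECTED = {
    "<ACDC_DIR>":       ["rgb_anon", "gt"],
    "<BDD100K_DIR>":    ["images", "labels"],
    "<CITYSCAPES_DIR>": ["leftImg8bit", "gtFine"],
    "<GTA5_DIR>":       ["images", "labels"],
    "<MAPILLARY_DIR>":  ["training", "validation"],
    "<SYNTHIA>":        ["RGB", "GT"],
}

for root, subdirs in EXPECTED.items():
    missing = [d for d in subdirs if not os.path.isdir(os.path.join(root, d))]
    status = "OK" if not missing else f"missing {missing}"
    print(f"{root}: {status}")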

Trained models

The trained models are available here.

Running FAMix

Style mining

python3 patch_PIN.py \
  --dataset <dataset_name> \
  --data_root <dataset_root> \
  --resize_feat \
  --save_dir <path_for_learnt_parameters_saving>
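
For intuition only, here is a deliberately simplified, image-level sketch of the idea behind language-driven style mining: learnable per-channel statistics are optimized so that the stylized patch's CLIP embedding moves toward a target text prompt. patch_PIN.py operates on frozen backbone features with its own prompts and optimizer settings, so the CLIP variant, prompt, and hyperparameters below are arbitrary assumptions, not the FAMix implementation. It assumes the OpenAI CLIP package is installed.

# Toy, image-level illustration of prompt-driven style mining (NOT the feature-level
# optimization performed by patch_PIN.py). CLIP's input normalization is omitted for brevity.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/16", device=device)   # arbitrary CLIP variant for this sketch
model = model.float().eval()

# Target style described in natural language (hypothetical prompt).
tokens = clip.tokenize(["driving at night"]).to(device)
with torch.no_grad():
    text_feat = model.encode_text(tokens)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

# A 224x224 patch (random here so the snippet is self-contained).
patch = torch.rand(1, 3, 224, 224, device=device)

# Learnable per-channel style statistics (shift and scale).
mu = torch.zeros(1, 3, 1, 1, device=device, requires_grad=True)
sigma = torch.ones(1, 3, 1, 1, device=device, requires_grad=True)
optimizer = torch.optim.SGD([mu, sigma], lr=0.1)

for step in range(50):
    stylized = (patch * sigma + mu).clamp(0, 1)
    img_feat = model.encode_image(stylized)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    loss = 1.0 - (img_feat * text_feat).sum()     # cosine distance to the prompt
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("mined style:", mu.detach().flatten().tolist(), sigma.detach().flatten().tolist())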

Training

python3 main.py \
  --dataset <dataset_name> \
  --data_root <dataset_root> \
  --total_itrs 40000 \
  --batch_size 8 \
  --val_interval 750 \
  --transfer \
  --data_aug \
  --ckpts_path <path_to_save_checkpoints> \
  --path_for_stats <path_for_mined_styles>
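
During training, FAMix mixes the statistics of the original features with the mined ones, patch by patch. The snippet below is an illustrative AdaIN-style sketch of such a mixing step, not the FAMix implementation: the grid size, the random convex combination, and the tensor shapes are all assumptions made for the example.

# Illustrative patch-wise style mixing (AdaIN-style sketch, not the FAMix code).
import torch

def mix_patch_styles(feat, mined_mu, mined_sigma, grid=2, eps=1e-6):
    """feat: (B, C, H, W); mined_mu / mined_sigma: (B, C) mined style statistics."""
    B, C, H, W = feat.shape
    ph, pw = H // grid, W // grid
    out = feat.clone()
    for i in range(grid):
        for j in range(grid):
            patch = feat[:, :, i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
            mu = patch.mean(dim=(2, 3), keepdim=True)
            sigma = patch.std(dim=(2, 3), keepdim=True) + eps
            # Random convex combination of original and mined statistics (assumed rule).
            a = torch.rand(B, 1, 1, 1, device=feat.device)
            new_mu = a * mu + (1 - a) * mined_mu.view(B, C, 1, 1)
            new_sigma = a * sigma + (1 - a) * mined_sigma.view(B, C, 1, 1)
            normalized = (patch - mu) / sigma
            out[:, :, i * ph:(i + 1) * ph, j * pw:(j + 1) * pw] = normalized * new_sigma + new_mu
    return out

# Toy usage with random features and random "mined" statistics.
feat = torch.randn(2, 64, 32, 32)
mixed = mix_patch_styles(feat, torch.randn(2, 64), torch.rand(2, 64) + 0.5)
print(mixed.shape)  # torch.Size([2, 64, 32, 32])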

Evaluation

python3 main.py \
  --dataset <dataset_name> \
  --data_root <dataset_root> \
  --ckpt <path_to_tested_model> \
  --test_only \
  --ACDC_sub <ACDC_subset_if_tested_on_ACDC>

Inference & Visualization

To test any model on any image and visualize the output, add the images to the predict_test directory and run:

python3 predict.py \
  --ckpt <ckpt_path> \
  --save_val_results_to <directory_for_saved_output_images>
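
If a prediction is stored as a single-channel map of train IDs rather than a color image (not verified here; predict.py may already write color outputs), it can be colorized with the standard 19-class Cityscapes palette, for example:

# Colorize a train-ID prediction with the Cityscapes 19-class palette (optional helper).
import numpy as np
from PIL import Image

PALETTE = np.array([
    (128, 64, 128), (244, 35, 232), (70, 70, 70), (102, 102, 156),
    (190, 153, 153), (153, 153, 153), (250, 170, 30), (220, 220, 0),
    (107, 142, 35), (152, 251, 152), (70, 130, 180), (220, 20, 60),
    (255, 0, 0), (0, 0, 142), (0, 0, 70), (0, 60, 100),
    (0, 80, 100), (0, 0, 230), (119, 11, 32),
], dtype=np.uint8)

def colorize(label_path, out_path):
    label = np.array(Image.open(label_path))           # (H, W) train IDs in [0, 18]
    color = np.zeros((*label.shape, 3), dtype=np.uint8)
    valid = label < len(PALETTE)
    color[valid] = PALETTE[label[valid]]
    Image.fromarray(color).save(out_path)

colorize("prediction.png", "prediction_color.png")     # hypothetical file names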

License

FAMix is released under the Apache 2.0 license.

Acknowledgement

The code is based on this implementation of DeepLabv3+, and uses code from CLIP, PODA and RobustNet.

