RETAB: Weak-shot Semantic Segmentation by Transferring Semantic Affinity and Boundary

Official PyTorch Implementation for RETAB (Region Expansion by Transferring semantic Affinity and Boundary).

Weak-shot Semantic Segmentation by Transferring Semantic Affinity and Boundary [arXiv]

Siyuan Zhou, Li Niu*, Jianlou Si, Chen Qian, Liqing Zhang
Accepted by BMVC2022.

Introduction

In this paper, we show that existing fully-annotated base categories can help segment objects of novel categories that have only image-level labels, even if the base and novel categories do not overlap. We refer to this task as weak-shot semantic segmentation, which can also be treated as weakly supervised semantic segmentation (WSSS) with auxiliary fully-annotated categories. Based on the observation that semantic affinity and boundary are class-agnostic, we propose a method called RETAB under the WSSS framework to transfer semantic affinity and boundary from base to novel categories. As a result, we find that pixel-level annotation of base categories can facilitate affinity learning and propagation, leading to higher-quality CAMs of novel categories.

This repository takes the initial response (CAM) in PSA as an example to illustrate the usage of our RETAB model. RETAB can be applied to any type of initial response. Since the usage of other initial responses is similar to that of CAM, we omit them here.

Model Zoo

Fold	Backbone	Train all-/base-/novel-mIoU of CAM	Train all-/base-/novel-mIoU of CAM+RETAB	Weights of RETAB
0	ResNet-38	48.0/51.4/37.4	71.2/74.0/62.5	psa_ourbest_fold0_affnet.pth
1	ResNet-38	48.0/47.8/48.8	71.3/71.2/71.6	psa_ourbest_fold1_affnet.pth
2	ResNet-38	48.0/47.2/50.7	70.9/70.2/73.3	psa_ourbest_fold2_affnet.pth
3	ResNet-38	48.0/47.6/49.4	70.1/72.4/62.8	psa_ourbest_fold3_affnet.pth

We plan to include more models in the future.
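
To make the three numbers in each table cell concrete, here is a minimal sketch of how mIoU over all, base, and novel categories could be computed from a single confusion matrix. It is an illustration under assumptions, not the evaluation code of this repository: the function names, the toy 3-category matrix, and the choice to skip categories that appear in neither prediction nor ground truth are all hypothetical.

```python
def iou_per_class(confusion):
    """Per-category IoU from a square confusion matrix.

    confusion[g][p] is the number of pixels whose ground truth is g
    and whose prediction is p.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        gt_pixels = sum(confusion[c])
        pred_pixels = sum(confusion[r][c] for r in range(n))
        union = gt_pixels + pred_pixels - tp
        # Assumption: a category absent from both maps gets NaN and is skipped later.
        ious.append(tp / union if union > 0 else float("nan"))
    return ious


def mean_iou(ious, class_ids):
    """Average IoU over a subset of categories, ignoring NaN entries."""
    vals = [ious[c] for c in class_ids if ious[c] == ious[c]]  # NaN != NaN
    return sum(vals) / len(vals) if vals else float("nan")


# Toy example: category 0 is background, 1 is a base category, 2 is a novel one.
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
ious = iou_per_class(confusion)
print("all  :", mean_iou(ious, [0, 1, 2]))
print("base :", mean_iou(ious, [0, 1]))  # background counted as base
print("novel:", mean_iou(ious, [2]))
```

Counting background among the base categories follows the fold description in this README; whether the reported numbers are computed exactly this way is not confirmed here.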

Usage

We provide instructions on how to install dependencies via conda. First, clone the repository locally:

git clone https://github.com/bcmi/RETAB-Weak-Shot-Semantic-Segmentation.git

Then, create a virtual environment with PyTorch 1.8.1 (requires CUDA >= 11.1):

conda env create -f environment.yaml
conda activate retab

Data preparation

Download the PASCAL VOC 2012 development kit and the extra annotations from SBD. We expect the directory structure of the dataset (denoted by ${VOC12HOME}) to be:

<VOC12HOME>
  Annotations/
  ImageSets/
  JPEGImages/
  SegmentationClass/
  SegmentationClassAug/
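
Before running the later steps, it can help to confirm that the layout above is in place. The following is a small sketch, assuming only the five subdirectories listed above; the helper name `missing_voc_dirs` is hypothetical and not part of this repository.

```python
import os
from pathlib import Path

# Subdirectories taken from the expected layout above.
EXPECTED_SUBDIRS = (
    "Annotations",
    "ImageSets",
    "JPEGImages",
    "SegmentationClass",
    "SegmentationClassAug",
)


def missing_voc_dirs(voc_root):
    """Return the expected subdirectories that do not exist under voc_root."""
    root = Path(voc_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumption: VOC12HOME is exported in the shell; falls back to the current directory.
    root = os.environ.get("VOC12HOME", ".")
    missing = missing_voc_dirs(root)
    if missing:
        print(f"Missing under {root}: {', '.join(missing)}")
    else:
        print(f"{root} has all expected subdirectories.")
```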

Then, run the following preprocessing steps:

cp VOC2012_supp/* ${VOC12HOME}/ImageSets/SegmentationAug/
cd psa && ln -s ${VOC12HOME} VOC2012 && cd ..
cd RETAB && ln -s ${VOC12HOME} VOC2012 && cd ..

Following the category split rule in PASCAL-5i, which is commonly used in few-shot segmentation, we evenly divide the 20 foreground categories into four folds (Folds 0, 1, 2, and 3). For a given fold, its 5 categories are regarded as novel categories, and the remaining 16 categories (including background) are regarded as base categories. We further divide the 10,582 training samples into base samples and novel samples for each fold. The lists of base samples and novel samples can be found at RETAB/voc12/trainaug_fold*_base.txt and RETAB/voc12/trainaug_fold*_novel.txt, respectively.
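
The category split described above can be sketched as follows. This assumes the usual PASCAL-5i convention, in which fold i holds foreground categories 5i+1 to 5i+5 in the standard VOC order; the function name `split_fold` is hypothetical, and the list files shipped in RETAB/voc12/ remain the authoritative source for the split actually used.

```python
# The 20 PASCAL VOC foreground categories in their standard order (indices 1..20).
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def split_fold(fold, num_folds=4):
    """Return (base_ids, novel_ids) for a fold; index 0 is background."""
    if not 0 <= fold < num_folds:
        raise ValueError(f"fold must be in [0, {num_folds}), got {fold}")
    per_fold = len(VOC_CLASSES) // num_folds  # 5 categories per fold
    novel = list(range(fold * per_fold + 1, (fold + 1) * per_fold + 1))
    # Background (0) is counted among the base categories, giving 16 in total.
    base = [0] + [c for c in range(1, len(VOC_CLASSES) + 1) if c not in novel]
    return base, novel


for fold in range(4):
    base, novel = split_fold(fold)
    names = [VOC_CLASSES[c - 1] for c in novel]
    print(f"Fold {fold}: {len(base)} base, novel = {names}")
```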

Use PSA to generate & evaluate initial CAM

Download the AffinityNet weights (ResNet-38) and put them under psa/best/ so that the file is located at psa/best/res38_cls.pth. Then, run:

cd psa && sh run_psa.sh && cd ..

You can find more details in psa/run_psa.sh.

Train & infer RETAB to generate pseudo labels, and evaluate them

Download the MXNet ResNet-38 pretrained weights and put them under RETAB/pretrained_model/ so that the file is located at RETAB/pretrained_model/ilsvrc-cls_rna-a1_cls1000_ep-0001.params. Then, prepare the inputs:

cd RETAB
mkdir psa_initcam && cd psa_initcam && ln -s ../../psa/result/psa_trainaug_cam psa_trainaug_cam && cd ..
mkdir psa_afflabel && cd psa_afflabel && ln -s ../../psa/result/psa_trainaug_crf_4.0 psa_trainaug_crf_4.0 && ln -s ../../psa/result/psa_trainaug_crf_32.0 psa_trainaug_crf_32.0 && cd ..

and execute:

sh run_retab.sh

You can find more details in RETAB/run_retab.sh. The default setting is Fold 0. To try another fold, replace the fourth line of RETAB/run_retab.sh with FOLD=1, FOLD=2, or FOLD=3.

Perform mixed-supervised segmentation

Use the ground-truth labels of base samples and the generated pseudo labels of novel samples to train a segmentation network in a mixed-supervised manner. In our implementation, we adopt ResNet-38 as our final segmentation network.
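
One way to assemble the label files for this step is sketched below. It rests on several assumptions that this README does not confirm: that each line of the base and novel list files starts with an image ID or an image path, that labels are stored as `<image_id>.png`, and that the directory passed as `pseudo_dir` holds the generated labels. The helper names `read_ids` and `build_label_map` are hypothetical.

```python
from pathlib import Path


def read_ids(list_file):
    """Read image IDs from a list file, one sample per line.

    Assumption: the first whitespace-separated token of each line is either
    a bare ID or a path whose file stem is the ID.
    """
    ids = []
    with open(list_file) as f:
        for line in f:
            line = line.strip()
            if line:
                ids.append(Path(line.split()[0]).stem)
    return ids


def build_label_map(base_ids, novel_ids, gt_dir, pseudo_dir):
    """Map each image ID to the label file used for training.

    Base samples use ground-truth masks; novel samples use pseudo labels.
    """
    overlap = set(base_ids) & set(novel_ids)
    if overlap:
        raise ValueError(f"{len(overlap)} IDs are listed as both base and novel")
    labels = {i: Path(gt_dir) / f"{i}.png" for i in base_ids}
    labels.update({i: Path(pseudo_dir) / f"{i}.png" for i in novel_ids})
    return labels
```

For example, `build_label_map(read_ids(base_list), read_ids(novel_list), gt_dir, pseudo_dir)` returns a dictionary that a dataset class could consult when loading the target mask for each training image.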

Acknowledgements

Some of the evaluation code in this repo is borrowed and modified from PSA and SEAM. We thank the authors for their great work.
