This repository contains the official implementation of our paper: Novel Class Discovery in Semantic Segmentation.
Abstract: We introduce a new setting of Novel Class Discovery in Semantic Segmentation (NCDSS), which aims at segmenting unlabeled images containing new classes given prior knowledge from a labeled set of disjoint classes. In contrast to existing approaches that look at novel class discovery in image classification, we focus on the more challenging semantic segmentation. In NCDSS, we need to distinguish the objects and background, and to handle the existence of multiple classes within an image, which increases the difficulty in using the unlabeled data. To tackle this new setting, we leverage the labeled base data and a saliency model to coarsely cluster novel classes for model training in our basic framework. Additionally, we propose the Entropy-based Uncertainty Modeling and Self-training (EUMS) framework to overcome noisy pseudo-labels, further improving the model performance on the novel classes. Our EUMS utilizes an entropy ranking technique and a dynamic reassignment to distill clean labels, thereby making full use of the noisy data via self-supervised learning. We build the NCDSS benchmark on the PASCAL-5^i and COCO-20^i datasets. Extensive experiments demonstrate the feasibility of the basic framework (achieving an average mIoU of 49.81% on PASCAL-5^i) and the effectiveness of the EUMS framework (outperforming the basic framework by 9.28% mIoU on PASCAL-5^i).
Illustration of Novel Class Discovery in Semantic Segmentation (NCDSS).
- [Jan 3 2023] 🔔 Training and evaluation code for the COCO-20i dataset has been released.
- Python = 3.7
- PyTorch = 1.8.0
- CUDA = 11.1
- Install the other packages listed in requirements.txt
We follow MaskContrast to prepare the data.
Download PASCAL VOC 2012. Unzip the dataset and ensure the file structure is as follows:
VOCSegmentation
├── images
├── SegmentationClassAug
├── saliency_supervised_model
└── sets
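For orientation, the PASCAL-5^i benchmark named in the abstract divides the 20 VOC object classes into four folds of five classes each. The sketch below assumes the standard contiguous split (fold i holds class ids 5i+1 to 5i+5) and treats the remaining 15 classes as base classes; `VOC_CLASSES` and `split_fold` are illustrative names, not part of this repository, and the split actually used here is defined in the repository's own data code.

```python
# VOC 2012 object classes in their standard index order (ids 1..20; 0 is background).
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def split_fold(fold, num_folds=4):
    """Return (base_ids, novel_ids) for one fold, using 1-based class ids.

    Assumption: contiguous PASCAL-5^i folds, i.e. fold i is novel and
    all other classes are base.
    """
    if not 0 <= fold < num_folds:
        raise ValueError(f"fold must be in [0, {num_folds}), got {fold}")
    per_fold = len(VOC_CLASSES) // num_folds
    novel = list(range(fold * per_fold + 1, (fold + 1) * per_fold + 1))
    base = [c for c in range(1, len(VOC_CLASSES) + 1) if c not in novel]
    return base, novel


if __name__ == "__main__":
    base_ids, novel_ids = split_fold(0)
    print("novel:", [VOC_CLASSES[c - 1] for c in novel_ids])
    print("base: ", [VOC_CLASSES[c - 1] for c in base_ids])
```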
Download COCO 2014. Unzip the dataset and preprocess it with the following commands:
cd data/data_preprocess
python coco.py train2014
python coco.py val2014
Download the saliency maps from Google Drive. The saliency maps are estimated with BASNet: we download the pre-trained BASNet model and run inference on the COCO dataset.
The file structure is as follows:
coco
├── train2014
├── val2014
├── masks_train2014
├── masks_val2014
├── saliency_supervised_model
├── val2014.txt
└── train2014.txt
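Before training, it can help to confirm that the unzipped data matches the two layouts listed in this section. The following is a small sketch using only the standard library; `EXPECTED` and `missing_entries` are hypothetical helpers written for this README, and the root paths in the usage note are placeholders for wherever you store the datasets.

```python
from pathlib import Path

# Entries copied from the directory trees in this README.
EXPECTED = {
    "VOCSegmentation": [
        "images",
        "SegmentationClassAug",
        "saliency_supervised_model",
        "sets",
    ],
    "coco": [
        "train2014",
        "val2014",
        "masks_train2014",
        "masks_val2014",
        "saliency_supervised_model",
        "val2014.txt",
        "train2014.txt",
    ],
}


def missing_entries(root, dataset):
    """Return the expected files or folders that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED[dataset] if not (root / name).exists()]
```

For example, `missing_entries("/path/to/coco", "coco")` returns an empty list when every expected entry is present, and the names of the missing ones otherwise.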
You can download the pre-trained models used in this paper from Google Drive. Then run the following command:
sh scripts/eval.sh
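The results in the paper are reported as mIoU. As a reference for how that metric is commonly computed, here is a minimal NumPy sketch based on a confusion matrix. It is not the evaluation code behind `scripts/eval.sh`; the function name `mean_iou` and the `ignore_index=255` default (the usual VOC "void" label) are assumptions made for illustration.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over the classes that occur.

    pred and target are integer label maps of the same shape, with
    predictions in [0, num_classes). Pixels whose target equals
    ignore_index are skipped (assumed VOC-style void label).
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)
    keep = target != ignore_index

    # confusion[t, p] counts pixels with ground truth t predicted as p.
    index = target[keep] * num_classes + pred[keep]
    confusion = np.bincount(index, minlength=num_classes ** 2)
    confusion = confusion.reshape(num_classes, num_classes)

    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    valid = union > 0  # ignore classes absent from both pred and target
    return float((intersection[valid] / union[valid]).mean())
```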
- Base Training.
sh scripts/base_train.sh
- Clustering Pseudo-labeling.
sh scripts/clustering_cmd.sh
- Novel Fine-tuning.
The pseudo-labels generated in the Clustering Pseudo-labeling stage are used in the Novel Fine-tuning stage. To ensure reproducibility, you can directly download our generated clustering pseudo-labels from Google Drive.
- Basic framework.
sh scripts/finetune_basic.sh
- Entropy ranking.
sh scripts/entropy_ranking.sh
The clean and unclean splits are also provided on Google Drive.
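To illustrate the idea behind the clean and unclean splits, the sketch below scores each image by the mean per-pixel entropy of its softmax output and treats the lowest-entropy images as clean. This is only an assumption-laden illustration of entropy ranking in general: `image_entropy`, `split_by_entropy` and the `clean_ratio` parameter are hypothetical, and `scripts/entropy_ranking.sh` may rank per class or apply a different criterion.

```python
import numpy as np


def image_entropy(probs, eps=1e-12):
    """Mean per-pixel entropy of a (C, H, W) array of class probabilities."""
    probs = np.asarray(probs, dtype=np.float64)
    # Entropy over the class axis; eps avoids log(0).
    pixel_entropy = -(probs * np.log(probs + eps)).sum(axis=0)
    return float(pixel_entropy.mean())


def split_by_entropy(scores, clean_ratio=0.5):
    """Rank images by ascending entropy and return (clean, unclean) indices.

    clean_ratio is a hypothetical knob: the fraction of images, taken
    from the low-entropy end of the ranking, that is treated as clean.
    """
    order = np.argsort(scores, kind="stable")
    num_clean = int(round(len(order) * clean_ratio))
    return order[:num_clean].tolist(), order[num_clean:].tolist()
```

Low entropy means the model is confident about its prediction, which is why the low-entropy end of the ranking is the natural candidate for the clean split in this sketch.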
- EUMS framework.
sh scripts/finetune_eums.sh
- The training and evaluation scripts for the COCO-20i dataset are available at:
scripts/coco/*.sh
This project is based on the following open-source projects. We thank their authors for making the source code publicly available.
We hope you find our work useful. If you would like to acknowledge it in your project, please use the following citation:
@inproceedings{zhao2022ncdss,
title={Novel Class Discovery in Semantic Segmentation},
author={Zhao, Yuyang and Zhong, Zhun and Sebe, Nicu and Lee, Gim Hee},
booktitle={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022}}
If you have any questions about this code, please do not hesitate to contact me.