MCTformer (CVPR2022)

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation.

Fig.1 - Overview of MCTformer

🚩 Updates

2023-08-08: MCTformer+ is available on arXiv.

Environment Setup

Ubuntu 18.04 with Python 3.6 and the following Python dependencies:

pip install -r requirements.txt

Data Preparation

PASCAL VOC 2012

Download the PASCAL VOC 2012 development kit.

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_11-May-2012.tar

Download the augmented annotations SegmentationClassAug.zip from the SBD dataset via this link.

Arrange your data directory as follows:

VOCdevkit/
└── VOC2012
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    ├── SegmentationClassAug
    └── SegmentationObject
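
Before training, the directory tree above can be verified with a short script. The following is a minimal sketch, not part of the repository: the function name missing_voc_dirs and its devkit_root argument are hypothetical, and the only thing it assumes is the layout listed above.

```python
from pathlib import Path

# Sub-directories listed in the tree above.
EXPECTED_SUBDIRS = (
    "Annotations",
    "ImageSets",
    "JPEGImages",
    "SegmentationClass",
    "SegmentationClassAug",
    "SegmentationObject",
)


def missing_voc_dirs(devkit_root):
    """Return the expected VOC2012 sub-directories that are absent.

    `devkit_root` is the path to the VOCdevkit folder (a hypothetical
    argument name; the repository scripts take their own path options).
    """
    voc_root = Path(devkit_root) / "VOC2012"
    return [name for name in EXPECTED_SUBDIRS if not (voc_root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_voc_dirs("VOCdevkit")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("VOC2012 layout looks complete.")
```

A missing SegmentationClassAug entry usually means the augmented annotations have not yet been unpacked into VOC2012.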

MS COCO 2014

Download the MS COCO 2014 dataset.

wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip

Usage

Train MCTformer+

bash run_mct_plus.sh

Step 1: Run the run.sh script to train MCTformer and to visualize and evaluate the generated class-specific localization maps.

bash run.sh
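
Localization maps are commonly evaluated with mean intersection-over-union (mIoU) against the ground-truth masks. The sketch below illustrates that metric only; it is not the repository's evaluation.py. The function name mean_iou, its arguments, and the ignore label of 255 (the usual VOC convention for boundary pixels) are assumptions made for illustration.

```python
import numpy as np


def mean_iou(pred, gt, num_classes, ignore_index=255):
    """Mean IoU between two integer label maps of the same shape.

    Pixels whose ground-truth label equals `ignore_index` are skipped,
    and classes that appear in neither map are left out of the mean.
    """
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    valid = gt != ignore_index
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    index = gt[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    confusion = np.bincount(index, minlength=num_classes ** 2)
    confusion = confusion.reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    present = union > 0
    return float((intersection[present] / union[present]).mean())
```

For PASCAL VOC, num_classes would be 21 (20 object classes plus background). In practice the confusion matrix is accumulated over the whole evaluation set before the IoU is computed, rather than averaged per image.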

PASCAL VOC 2012 dataset

Model	Backbone	Google drive
MCTformer-V1	DeiT-small	Weights
MCTformer-V2	DeiT-small	Weights

Step 2: Run the run_psa.sh script, which uses PSA to post-process the seeds (i.e., the class-specific localization maps) into pseudo ground-truth segmentation masks. PSA training is initialized with the pre-trained classification weights.

bash run_psa.sh
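
To make the notion of a seed concrete, the sketch below shows one common way to turn class-specific localization maps into a single label map: mask out classes absent from the image, add a constant background score, and take the per-pixel argmax. This is a generic heuristic shown under stated assumptions, not the procedure implemented in run_psa.sh. The function name maps_to_seed, the array shapes, and the background threshold of 0.3 are all illustrative.

```python
import numpy as np


def maps_to_seed(cams, image_labels, bg_threshold=0.3):
    """Turn per-class localization maps into a single seed label map.

    cams: float array (num_classes, H, W), each map scaled to [0, 1].
    image_labels: 0/1 array (num_classes,), the image-level labels.
    Returns an int array (H, W): 0 is background, k + 1 is foreground class k.
    """
    cams = np.asarray(cams, dtype=np.float32)
    image_labels = np.asarray(image_labels).astype(bool)
    # Suppress maps of classes that are absent from the image.
    cams = cams * image_labels[:, None, None]
    # Prepend a constant background score, then take the per-pixel argmax.
    background = np.full((1,) + cams.shape[1:], bg_threshold, dtype=np.float32)
    return np.argmax(np.concatenate([background, cams], axis=0), axis=0)
```

A pixel is assigned to background whenever no present class scores above bg_threshold, so raising the threshold yields smaller, more conservative seeds.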

Step 3: Run the run_seg.sh script to train and test the segmentation model. When training on VOC, the model is initialized with classification weights pre-trained on VOC.

bash run_seg.sh

MS COCO 2014 dataset

Run run_coco.sh to train MCTformer and generate class-specific localization maps. The class label numpy file can be downloaded here. The trained MCTformer-V2 model is here.

bash run_coco.sh

Contact

If you have any questions, please create an issue or contact me by email at lian.xu@uwa.edu.au.

Citation

Please consider citing our paper if the code is helpful in your research and development.

@inproceedings{xu2022multi,
  title={Multi-class Token Transformer for Weakly Supervised Semantic Segmentation},
  author={Xu, Lian and Ouyang, Wanli and Bennamoun, Mohammed and Boussaid, Farid and Xu, Dan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4310--4319},
  year={2022}
}
