[arXiv](https://arxiv.org/abs/2205.03892)
This repository contains the implementation of ConvMAE transfer learning for object detection on COCO.
For ImageNet pretraining and the pretrained checkpoints, please refer to ConvMAE.
| Name | pre-train | pre-train epochs | fine-tune epochs | box AP | mask AP | model | log |
|---|---|---|---|---|---|---|---|
| ViTDet, ViT-B | IN1K, MAE | 1600 | 100 | 51.6 | 45.9 | - | - |
| ViTDet, ConvMAE-B | IN1K, ConvMAE | 1600 | 25 | 53.9 | 47.6 | model | log |
## Installation

Please follow Installation to install detectron2. Then link the COCO dataset under `datasets/`:

```bash
cd datasets
ln -s /path/to/coco coco
```
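Before launching training, it can be worth sanity-checking that the symlink points at a COCO copy in the standard detectron2 layout (`annotations/instances_{train,val}2017.json`, `train2017/`, `val2017/`). The helper below is an illustrative sketch for that check, not part of this repository.

```python
# Illustrative sketch (not part of this repo): verify that `datasets/coco`
# contains the subpaths detectron2's builtin COCO dataset expects.
from pathlib import Path

EXPECTED = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
]

def check_coco_layout(root: str) -> list:
    """Return the expected COCO subpaths that are missing under `root`."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]
```

Usage: `check_coco_layout("datasets/coco")` returns an empty list when the layout matches, otherwise the missing subpaths to fix before training.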
## Training

```bash
python tools/lazyconfig_train_net.py --num-gpus 8 --config-file \
    projects/ConvMAEDet/configs/COCO/mask_rcnn_vitdet_convmae_b_25ep.py \
    train.init_checkpoint=path/to/pretrained_model
```
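The trailing `train.init_checkpoint=...` argument is a lazyconfig-style key=value override. As a simplified illustration only (not detectron2's actual `LazyConfig` implementation, which also parses value types), such overrides amount to dotted-path assignments into a nested config:

```python
# Hypothetical, simplified illustration of dotted key=value overrides:
# each override string splits on the first '=' and the key's dotted path
# is walked through the nested config dict before assigning the value.
def apply_overrides(cfg: dict, overrides: list) -> dict:
    for item in overrides:
        key, value = item.split("=", 1)
        *parents, leaf = key.split(".")
        node = cfg
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value  # values stay plain strings in this sketch
    return cfg

cfg = {"train": {"init_checkpoint": "", "max_iter": 0}}
apply_overrides(cfg, ["train.init_checkpoint=path/to/pretrained_model"])
```

This is why any config field, not just the checkpoint path, can be overridden from the command line without editing the config file.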
## Evaluation

```bash
python tools/lazyconfig_train_net.py --num-gpus 8 --eval-only --config-file \
    projects/ConvMAEDet/configs/COCO/mask_rcnn_vitdet_convmae_b_25ep.py \
    train.init_checkpoint=path/to/model_checkpoint
```
## Acknowledgement

This project is based on Detectron2 and ViTDet. Thanks for their wonderful work.
## Citation

```
@article{gao2022convmae,
  title={ConvMAE: Masked Convolution Meets Masked Autoencoders},
  author={Gao, Peng and Ma, Teli and Li, Hongsheng and Dai, Jifeng and Qiao, Yu},
  journal={arXiv preprint arXiv:2205.03892},
  year={2022}
}
```