This is a fork of the original MoCo v3 repository. The purpose of this fork is hardcoded training of MoCo v3 on the MNIST dataset.
In this fork, we also corrected some bugs that appear in non-distributed training.
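Since MoCo-style training needs two augmented views of every image, the MNIST data pipeline has to be wrapped with a two-crop transform. Below is a minimal sketch of such a setup, assuming torchvision (for illustration only, not necessarily this fork's exact code):

import torchvision
import torchvision.transforms as transforms

class TwoCropsTransform:
    """Return two independently augmented views of the same image."""
    def __init__(self, base_transform):
        self.base_transform = base_transform

    def __call__(self, x):
        return [self.base_transform(x), self.base_transform(x)]

# Simple MNIST augmentation; the crop scale mirrors the --crop-min=0.7 flag used below.
augmentation = transforms.Compose([
    transforms.RandomResizedCrop(28, scale=(0.7, 1.0)),
    transforms.Grayscale(num_output_channels=3),  # ResNet/ViT encoders expect 3 channels
    transforms.ToTensor(),
])

train_dataset = torchvision.datasets.MNIST(
    root='./data', train=True, download=True,
    transform=TwoCropsTransform(augmentation))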
This is a PyTorch implementation of MoCo v3 for self-supervised ResNet and ViT.
Install PyTorch.
For ViT models, install timm (timm==0.4.9).
The code has been tested with CUDA 10.2/CuDNN 7.6.5, PyTorch 1.9.0 and timm 0.4.9.
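A quick environment sanity check (a convenience snippet only, not part of the repository):

import torch
import timm

print(torch.__version__)          # tested with 1.9.0
print(timm.__version__)           # tested with 0.4.9
print(torch.cuda.is_available())  # should be True if CUDA/CuDNN are set up correctly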
Below is an example for MoCo v3 training.
Run:
python main_moco.py \
--arch=resnet50 \
--workers=4 \
--epochs=10 \
--batch-size=32 \
--learning-rate=1e-4 \
--moco-dim=16 \
--moco-mlp-dim=1024 \
--crop-min=0.7 \
--print-freq=10
Using a smaller batch size gives more stable results (see the paper), but at lower speed. Using a large batch size is critical for good speed on TPUs (as we did in the paper).
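The batch size matters because, in the MoCo v3 objective, the negatives for each query are the keys of the other samples in the same batch. Below is a minimal single-GPU sketch of this InfoNCE-style loss (a simplified illustration; the repository's implementation additionally gathers keys across GPUs in the distributed case):

import torch
import torch.nn.functional as F

def contrastive_loss(q, k, tau=1.0):
    # q: query features from the base encoder, k: key features from the
    # momentum encoder; both are [N, dim] for a batch of N samples.
    q = F.normalize(q, dim=1)
    k = F.normalize(k, dim=1)
    logits = q @ k.t() / tau                           # [N, N] pairwise similarities
    labels = torch.arange(q.size(0), device=q.device)  # positives lie on the diagonal
    return F.cross_entropy(logits, labels) * (2 * tau)

The full objective is symmetrized over the two augmented views, i.e. contrastive_loss(q1, k2) + contrastive_loss(q2, k1).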
See the commands listed in CONFIG.md for specific model configs, including our recommended hyper-parameters and pre-trained reference models.
This project is under the CC-BY-NC 4.0 license. See LICENSE for details.
@Article{chen2021mocov3,
author = {Xinlei Chen* and Saining Xie* and Kaiming He},
title = {An Empirical Study of Training Self-Supervised Vision Transformers},
journal = {arXiv preprint arXiv:2104.02057},
year = {2021},
}