arXiv link for the SimCLR paper: A Simple Framework for Contrastive Learning of Visual Representations
This repository contains my implementation of the SimCLR paper in PyTorch. I've written a blog post about it here. SimCLR presents a simple framework for learning representations from unlabeled image data using contrastive learning. A composition of two data augmentation operations, namely random cropping and color jittering, produces two views of each image; within a batch, views of the same image form positive pairs and views of different images form negative pairs.
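For intuition, here is a minimal sketch of that two-view augmentation using torchvision. The repository's actual pipeline lives in `utils/transforms.py` and may differ; the jitter strengths and crop size below are illustrative, not this repo's exact values.

```python
import torchvision.transforms as T

# Compose the two augmentations named above: random crop (with resize)
# followed by color jittering. Parameter values are illustrative.
simclr_transform = T.Compose([
    T.RandomResizedCrop(size=224),
    T.RandomApply([T.ColorJitter(0.8, 0.8, 0.8, 0.2)], p=0.8),
    T.ToTensor(),
])

class TwoViews:
    """Apply the same stochastic transform twice to get a positive pair."""
    def __init__(self, transform):
        self.transform = transform

    def __call__(self, img):
        return self.transform(img), self.transform(img)
```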
The recommended Python version for running the experiments is Python 3. The results using this code have been generated with:
- torch 1.4.0
- torchvision 0.5.0
- scikit-learn 0.23.2
- numpy 1.19.1
- matplotlib 3.3.2
- seaborn 0.11.0
The skeletal overview of this project is as follows:

```
.
├── utils/
│   ├── __init__.py
│   ├── model.py
│   ├── ntxent.py
│   ├── plotfuncs.py
│   └── transforms.py
├── results/
│   ├── model/
│   │   ├── lossesfile.npz
│   │   ├── model.pth
│   │   └── optimizer.pth
│   └── plots/
│       └── training_losses.png
├── linear_evaluation.py
├── main.py
├── simclr.py
└── README.md
```
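`utils/ntxent.py` implements the NT-Xent (normalized temperature-scaled cross-entropy) loss from the paper. Below is a minimal sketch of that loss, assuming `z1` and `z2` are the `(N, d)` projections of the two views of the same `N` images; the repository's implementation may differ in details.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss for a batch of N positive pairs (2N views in total).

    temperature is the tau from the paper; 0.5 is a common choice.
    """
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, d), unit norm
    sim = z @ z.t() / temperature                       # scaled cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))          # exclude self-similarity
    # The positive for index i is i + n (first half) or i - n (second half).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```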
You can pass command-line arguments to main.py for SimCLR training and to linear_evaluation.py for linear-classifier evaluation on top of the learned representations (a sketch of how these flags might be wired with argparse follows the argument lists below).
Arguments for `main.py`:
- datapath: Path to the data root folder, which contains the train and test folders
- respath: Path to the results directory where the saved model and evaluation graphs will be stored
- -bs: Batch size for self-supervised training (default: 250)
- -nw: Number of workers for data loading (default: 2)
- -c: If present, use CUDA
- --multiple_gpus: Use multiple GPUs, if available
Arguments for `linear_evaluation.py`:
- datapath: Path to the data root folder, which contains the train and test folders
- modelpath: Path to the trained self-supervised model
- respath: Path to the results directory where the saved model and evaluation graphs will be stored
- -bs: Batch size for linear evaluation (default: 250)
- -nw: Number of workers for data loading (default: 2)
- -c: If present, use CUDA
- --multiple_gpus: Use multiple GPUs, if available
- --remove_top_layers: Number of top layers to remove from the overall network; these layers form the projection head
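Here is a sketch of how the `main.py` flags above might be parsed; the actual parser in the scripts may differ, and `linear_evaluation.py` would add `modelpath` and `--remove_top_layers` in the same style.

```python
import argparse

# Hypothetical wiring of the documented flags; names and defaults follow
# the argument list above, not necessarily the script's exact code.
parser = argparse.ArgumentParser(description='SimCLR self-supervised training')
parser.add_argument('datapath', help='data root containing train/ and test/ folders')
parser.add_argument('respath', help='directory for the saved model and graphs')
parser.add_argument('-bs', type=int, default=250, help='batch size')
parser.add_argument('-nw', type=int, default=2, help='number of data-loading workers')
parser.add_argument('-c', action='store_true', help='use CUDA')
parser.add_argument('--multiple_gpus', action='store_true', help='use multiple GPUs if available')
args = parser.parse_args()
```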
Example usage:

The following command runs self-supervised training on the dataset at `../milli_imagenet` and stores the results in the `results` directory, with a training batch size of 250, using CUDA and multiple GPUs:

```bash
python main.py '../milli_imagenet' 'results' -bs 250 -c --multiple_gpus &
```

The following command runs the linear evaluator with the dataset at `../milli_imagenet` and the saved self-supervised model at `results/model/model.pth`, writing its results (if any) to `results`, using CUDA and multiple GPUs with a batch size of 125:

```bash
python linear_evaluation.py '../milli_imagenet/' 'results/model/model.pth' 'results' -c --multiple_gpus -bs 125
```
We used the imagenet-5-categories dataset, which has a total of 1250 train and 250 test images; we used this version of the dataset. For linear evaluation, we used 250 train images (i.e. 20% of the train set).
| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| car | 0.8600 | 0.8600 | 0.8600 | 50 |
| airplane | 0.7679 | 0.8600 | 0.8113 | 50 |
| elephant | 0.7500 | 0.7200 | 0.7347 | 50 |
| dog | 0.5345 | 0.6200 | 0.5741 | 50 |
| cat | 0.7632 | 0.5800 | 0.6591 | 50 |
Test accuracy: 72.80% (with a projection head of 2 layers)
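The linear-evaluation idea itself is simple: freeze the pretrained network, strip the projection head (the --remove_top_layers option above), and train a linear classifier on the remaining features. A minimal sketch, assuming model.pth stores the full nn.Module with the head already removed and a 512-dimensional feature output (both assumptions; if it is a state dict, load it into the model class from utils/model.py first):

```python
import torch
import torch.nn as nn

NUM_CLASSES = 5    # car, airplane, elephant, dog, cat
FEATURE_DIM = 512  # ASSUMPTION: width of the frozen encoder's output

# ASSUMPTION: model.pth holds the whole module (projection head removed),
# not just a state dict.
encoder = torch.load('results/model/model.pth', map_location='cpu')
encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False            # freeze all pretrained weights

classifier = nn.Linear(FEATURE_DIM, NUM_CLASSES)
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    with torch.no_grad():
        feats = encoder(images)        # frozen representations
    loss = criterion(classifier(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```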
A pretrained model (trained for 1000 epochs), along with the optimizer state and loss file, is available here.
SimCLR paper by Chen et al.:

```bibtex
@misc{chen2020simple,
  title={A Simple Framework for Contrastive Learning of Visual Representations},
  author={Ting Chen and Simon Kornblith and Mohammad Norouzi and Geoffrey Hinton},
  year={2020},
  eprint={2002.05709},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```