Add default training parameters to classification reference README #1998

Merged 2 commits on Mar 20, 2020
26 changes: 25 additions & 1 deletion references/classification/README.md
@@ -4,7 +4,31 @@ This folder contains reference training scripts for image classification.
They serve as a log of how to train specific models and provide baseline
training and evaluation scripts to quickly bootstrap research.

Unless otherwise noted, all models have been trained on 8x V100 GPUs with
the following parameters:

| Parameter                | Value  |
| ------------------------ | ------ |
| `--batch-size`           | `32`   |
| `--epochs` | `90` |
| `--lr` | `0.1` |
| `--momentum` | `0.9` |
| `--wd`, `--weight-decay` | `1e-4` |
| `--lr-step-size` | `30` |
| `--lr-gamma` | `0.1` |
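
The names `--lr-step-size` and `--lr-gamma` suggest a step-decay schedule, in which the learning rate is multiplied by gamma once every `step_size` epochs. The following is a minimal sketch of the schedule these defaults would produce under that assumption. The helper `step_lr` is hypothetical and is not part of the training script.

```python
def step_lr(epoch, base_lr=0.1, step_size=30, gamma=0.1):
    """Learning rate at a 0-based epoch, assuming step decay.

    Defaults mirror the table: --lr 0.1, --lr-step-size 30, --lr-gamma 0.1.
    """
    return base_lr * gamma ** (epoch // step_size)


# Sample the schedule at the first and last epoch of each 30-epoch stage
# of the default 90-epoch run.
schedule = {epoch: step_lr(epoch) for epoch in (0, 29, 30, 59, 60, 89)}

for epoch, lr in schedule.items():
    print(f"epoch {epoch:2d}: lr = {lr:.4f}")
```

Under this assumption, the 90-epoch run is split into three stages of 30 epochs, each using a learning rate ten times smaller than the previous one.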

### AlexNet and VGG

Since `AlexNet` and the original `VGG` architectures do not include batch
normalization, the default initial learning rate `--lr 0.1` is too high. Use
`--lr 1e-2` instead:

```
python main.py --model $MODEL --lr 1e-2
```

Here `$MODEL` is one of `alexnet`, `vgg11`, `vgg13`, `vgg16` or `vgg19`. Note
that `vgg11_bn`, `vgg13_bn`, `vgg16_bn`, and `vgg19_bn` include batch
normalization and thus are trained with the default parameters.
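
When launching several models from one script, the rule above can be encoded as a small lookup. This is only a sketch: `initial_lr` and `build_command` are hypothetical helpers, and the command simply mirrors the `python main.py` invocation shown in this section.

```python
# Models listed above as lacking batch normalization.
NO_BATCH_NORM = {"alexnet", "vgg11", "vgg13", "vgg16", "vgg19"}

DEFAULT_LR = 0.1   # default --lr from the parameter table
REDUCED_LR = 1e-2  # value used for AlexNet and the original VGG models


def initial_lr(model):
    """Return the initial learning rate to use for a given model name."""
    return REDUCED_LR if model in NO_BATCH_NORM else DEFAULT_LR


def build_command(model):
    """Build the argument list, adding --lr only when it differs from the default."""
    cmd = ["python", "main.py", "--model", model]
    lr = initial_lr(model)
    if lr != DEFAULT_LR:
        cmd += ["--lr", str(lr)]
    return cmd


for name in ("vgg16", "vgg16_bn"):
    print(" ".join(build_command(name)))
```

Keeping the exception list explicit means that the `_bn` variants, and any model not named here, fall through to the default parameters.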

### ResNeXt-50 32x4d
```