This project aims to help you implement a deep learning algorithm quickly.
A deep learning algorithm needs four parts: a dataset, a network, a trainer, and an evaluator.
- dataset: provides the data for training or testing.
- network: the architecture of the algorithm.
- trainer: defines the loss functions for training.
- evaluator: defines the metrics for evaluation.
With this project, you only need to implement these four parts; the training and testing pipeline is already implemented.
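As a rough mental model of how the provided pipeline wires these parts together (all names below are illustrative placeholders, not the project's real API):

```python
# Conceptual sketch of the pipeline this project already provides.
# All names below are illustrative placeholders, not the project's real API.

def run_training(network, trainer, evaluator, train_loader, test_loader, num_epochs):
    for epoch in range(num_epochs):
        # trainer: consumes batches from the dataset and optimizes the network
        for batch in train_loader:
            trainer.train_step(batch)
        # evaluator: measures the trained network on the test dataset
        for batch in test_loader:
            evaluator.evaluate(network(batch['inp']), batch)
        print(epoch, evaluator.summarize())
```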
$task is the name of this algorithm. The metadata of a dataset, such as its $dataset_id, is registered in lib/datasets/dataset_catalog.py.
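For illustration, a catalog entry might map a dataset name to its id and file paths. The fields below are made up for the sketch and are not the actual contents of lib/datasets/dataset_catalog.py:

```python
# lib/datasets/dataset_catalog.py (sketch) -- the real file may use different fields.
class DatasetCatalog:
    dataset_attrs = {
        'MyDatasetTrain': {
            'id': 'my_dataset',                       # stands in for $dataset_id
            'data_root': 'data/my_dataset',
            'ann_file': 'data/my_dataset/train.json',
            'split': 'train',
        },
        'MyDatasetTest': {
            'id': 'my_dataset',
            'data_root': 'data/my_dataset',
            'ann_file': 'data/my_dataset/test.json',
            'split': 'test',
        },
    }

    @staticmethod
    def get(name):
        return DatasetCatalog.dataset_attrs[name].copy()
```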
dataset:
lib/datasets/
├── dataset_catalog.py
└── $dataset_id/
    └── $task.py
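A dataset module typically wraps a torch.utils.data.Dataset that yields one example per index. The sketch below assumes JSON annotations and an image-style input; class and key names are illustrative:

```python
# lib/datasets/$dataset_id/$task.py (sketch)
import json

import numpy as np
import torch.utils.data as data
from PIL import Image


class Dataset(data.Dataset):
    def __init__(self, ann_file, data_root, split):
        self.data_root = data_root
        self.split = split
        with open(ann_file) as f:
            self.anns = json.load(f)  # assumed JSON list of annotations

    def __getitem__(self, index):
        ann = self.anns[index]
        img = np.asarray(Image.open(ann['image_path']), dtype=np.float32)
        img = img.transpose(2, 0, 1) / 255.0  # HWC -> CHW, scaled to [0, 1]
        # return whatever the network and trainer expect for one example
        return {'inp': img, 'label': ann['label']}

    def __len__(self):
        return len(self.anns)
```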
Define the networks for this task under lib/networks/$task/.
network:
lib/networks/
└── $task/
    ├── __init__.py
    └── resnet.py
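The network package usually exposes the model plus a small factory that builds it from the configuration. A minimal sketch, assuming a torchvision ResNet backbone and a get_network factory (both are illustrative choices, not the project's required API):

```python
# lib/networks/$task/resnet.py (sketch)
import torch.nn as nn
import torchvision.models as models


class Network(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        backbone = models.resnet18()  # illustrative backbone choice, no pretrained weights
        backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)
        self.backbone = backbone

    def forward(self, inp):
        # inp: float tensor of shape [batch, 3, H, W]
        return self.backbone(inp)


# lib/networks/$task/__init__.py (sketch)
def get_network(cfg):
    return Network(num_classes=cfg.num_classes)
```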
A trainer computes the losses for the algorithm. Define a trainer for this task in lib/train/trainers/$task.py.
trainer:
lib/train/
└── trainers/
    └── $task.py
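A trainer wraps the network and turns one batch into scalar losses. A minimal sketch, assuming a cross-entropy classification loss; the wrapper and method names are illustrative:

```python
# lib/train/trainers/$task.py (sketch)
import torch.nn as nn


class NetworkWrapper(nn.Module):
    """Wraps the network and computes the losses for one batch (illustrative)."""

    def __init__(self, net):
        super().__init__()
        self.net = net
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, batch):
        output = self.net(batch['inp'])
        loss = self.criterion(output, batch['label'])
        # loss_stats is the kind of dictionary that gets logged to data/record
        loss_stats = {'loss': loss}
        return output, loss, loss_stats
```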
Defining an evaluator is essential: it provides feedback on our training strategies. An evaluator measures an algorithm $task on a dataset registered with $dataset_id.
evaluator:
lib/evaluators/
└── $dataset_id/
    └── $task.py
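An evaluator accumulates per-batch results and summarizes them into dataset metrics. A minimal sketch with accuracy as a stand-in metric; method names are illustrative:

```python
# lib/evaluators/$dataset_id/$task.py (sketch)
import numpy as np


class Evaluator:
    def __init__(self):
        self.results = []

    def evaluate(self, output, batch):
        # compare the network output with the ground truth for one batch
        pred = output.argmax(dim=1).cpu().numpy()
        gt = batch['label'].cpu().numpy()
        self.results.append((pred == gt).mean())

    def summarize(self):
        metrics = {'accuracy': float(np.mean(self.results))}
        self.results = []
        return metrics
```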
Use a visualizer to inspect the outputs of the network or other intermediate results. Define a visualizer under lib/visualizers/.
visualizer:
lib/visualizers/
└── $task.py
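A minimal sketch of such a visualizer, assuming matplotlib and the batch layout used in the sketches above; everything here is illustrative:

```python
# lib/visualizers/$task.py (sketch)
import matplotlib.pyplot as plt


class Visualizer:
    def visualize(self, output, batch):
        # show the first image of the batch together with its predicted label
        img = batch['inp'][0].permute(1, 2, 0).cpu().numpy()
        pred = output[0].argmax().item()
        plt.imshow(img)
        plt.title('prediction: {}'.format(pred))
        plt.show()
```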
Some variables in this project:
- $task: denotes the algorithm.
- $dataset_name: denotes the dataset used for training or testing, which is registered in lib/datasets/dataset_catalog.py.
- $model_name: the model with a specific configuration.
Test a model of our algorithm on a dataset:
python run.py --type evaluate test.dataset $dataset_name resume True model $model_name task $task
Some variables during training:
- $test_dataset: the dataset used for evaluation during training.
- $train_dataset: the dataset used for training.
- $network_name: a specific network architecture for our algorithm $task.
Train a model of our algorithm on a dataset for 140 epochs and evaluate it every 5 epochs:
python train_net.py test.dataset $test_dataset train.dataset $train_dataset resume False model $model_name task $task network $network_name train.epoch 140 eval_ep 5
Some configurations that are frequently revised for training:
- train.milestones and train.gamma: multiply the learning rate by gamma at the specified epochs (see the scheduler sketch after this list).
- train.batch_size: the training batch size.
- train.lr: the initial learning rate.
- train.optim: the optimizer.
- train.warmup: whether to use warmup training.
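For reference, train.milestones and train.gamma describe the same schedule as PyTorch's MultiStepLR; the sketch below uses made-up values to show how the learning rate decays:

```python
import torch
from torch.optim.lr_scheduler import MultiStepLR

# Dummy parameter and optimizer just to show the schedule; the values are
# illustrative, not project defaults.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.Adam(params, lr=1e-3)                        # train.lr, train.optim
scheduler = MultiStepLR(optimizer, milestones=[80, 120], gamma=0.5)  # train.milestones, train.gamma

for epoch in range(140):                                             # train.epoch
    # ... one epoch of training would go here ...
    scheduler.step()  # lr drops to 5e-4 after epoch 80 and to 2.5e-4 after epoch 120
```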
More training configurations can be found in lib/config/config.py.
You can set configurations through terminal commands or YAML files under configs/.
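As a sketch of how such a config module is often organized (assuming a yacs-style CfgNode; the real lib/config/config.py may differ in structure and defaults):

```python
# lib/config/config.py (sketch, assuming yacs)
from yacs.config import CfgNode as CN

cfg = CN()
cfg.task = ''
cfg.model = ''
cfg.network = ''

cfg.train = CN()
cfg.train.dataset = ''
cfg.train.lr = 1e-3
cfg.train.batch_size = 32
cfg.train.optim = 'adam'
cfg.train.warmup = False
cfg.train.milestones = [80, 120]
cfg.train.gamma = 0.5
cfg.train.epoch = 140

cfg.test = CN()
cfg.test.dataset = ''


def make_cfg(args):
    # args.cfg_file and args.opts are illustrative argparse attributes:
    # a YAML file under configs/ and "key value" pairs from the terminal
    # both override the defaults above.
    cfg.merge_from_file(args.cfg_file)
    cfg.merge_from_list(args.opts)
    return cfg
```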
During training, we can monitor the training losses, testing losses, and dataset metrics:
cd data/record
tensorboard --logdir $task
The losses are defined in the trainer, and the metrics are defined in the evaluator.
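The scalars that tensorboard displays are presumably written with something like torch.utils.tensorboard.SummaryWriter; a minimal illustrative sketch of how losses and metrics could end up under data/record:

```python
# Sketch of how loss_stats and metrics might end up in data/record.
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter('data/record/my_task')  # 'my_task' stands in for $task
for step, loss in enumerate([0.9, 0.7, 0.5]):  # dummy loss values
    writer.add_scalar('train/loss', loss, step)
writer.add_scalar('val/accuracy', 0.8, 0)
writer.close()
```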