Memory-Efficient Implementation of DenseNets, supporting both the DenseNet and DenseNet-BC series.
Environments: Linux CPU/GPU, Python 3, PyTorch 1.0
Check the implementation correctness with `python -m utils.gradient_checking`, trying the different settings in `utils/gradient_checking.py` (CPU, single GPU, multiple GPUs).
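The check boils down to running a weight-shared efficient/non-efficient pair on the same input and comparing outputs and gradients. A minimal sketch of that idea (the constructor arguments mirror those used elsewhere in this README; the tolerances and `num_classes=10` are illustrative assumptions, not the script's actual values):

```python
import torch
from models import DenseNet

def check(device='cpu'):
    # Small illustrative configuration; any consistent setting works.
    kwargs = dict(num_init_features=24, block_config=(6, 6, 6), compression=1,
                  input_size=32, bn_size=None, num_classes=10)
    model_eff = DenseNet(efficient=True, **kwargs).to(device)
    model_ref = DenseNet(efficient=False, **kwargs).to(device)
    model_ref.load_state_dict(model_eff.state_dict())  # share identical weights

    x = torch.randn(4, 3, 32, 32, device=device)
    out_eff, out_ref = model_eff(x), model_ref(x)
    assert torch.allclose(out_eff, out_ref, atol=1e-5), 'forward mismatch'

    out_eff.sum().backward()
    out_ref.sum().backward()
    for (name, p_eff), (_, p_ref) in zip(model_eff.named_parameters(),
                                         model_ref.named_parameters()):
        assert torch.allclose(p_eff.grad, p_ref.grad, atol=1e-4), name
```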
Benchmark the forward/backward passes of the efficient and non-efficient DenseNet with `python -m utils.benckmark_effi` (CPU, single GPU, multiple GPUs). The following results were obtained on a Linux system with 40 Intel(R) Xeon(R) E5-2630 v4 CPUs @ 2.20GHz and NVIDIA GTX 1080 Ti GPUs.
Model setting: `num_init_features=24, block_config=(12, 12, 12), compression=1, input_size=32, bn_size=None`, batch size 128.
| Model | CPU | 1 GPU | 2 GPUs | 4 GPUs |
|---|---|---|---|---|
| Efficient | F=15849, B=36738, R=2.3 | F=38.0, B=103.5, R=2.72 | F=38.5, B=64.0, R=1.66 | F=63.3, B=77.5, R=1.23 |
| Non-efficient | F=24889, B=36732, R=1.5 | F=38.0, B=77.9, R=2.05 | F=36.2, B=42.1, R=1.16 | F=56.2, B=31.8, R=0.57 |
F is the average forward time (ms), B is the average backward time (ms), and R = B/F.
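For reference, F and B can be measured with a synchronized timer along the following lines (a sketch only, assuming a CUDA device; the actual measurement logic lives in `utils/benckmark_effi.py`):

```python
import time
import torch

def benchmark(model, x, n_iters=100):
    # Warm-up iterations so cuDNN autotuning does not skew the timings.
    for _ in range(10):
        model(x).sum().backward()
        model.zero_grad()
    torch.cuda.synchronize()
    fwd = bwd = 0.0
    for _ in range(n_iters):
        t0 = time.time()
        out = model(x).sum()
        torch.cuda.synchronize()  # wait for the forward kernels to finish
        t1 = time.time()
        out.backward()
        torch.cuda.synchronize()  # wait for the backward kernels to finish
        t2 = time.time()
        model.zero_grad()
        fwd += t1 - t0
        bwd += t2 - t1
    print('F=%.1f ms, B=%.1f ms, R=%.2f'
          % (fwd / n_iters * 1e3, bwd / n_iters * 1e3, bwd / fwd))
```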
The efficient version can fit a batch size of up to 1450 on a single GPU (~12 GB), compared with 350 for the non-efficient version. That is, the efficient version is roughly 4x as memory-efficient as the non-efficient one.
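To reproduce the memory comparison roughly, run one forward/backward pass at a large batch size and read the allocator's peak (a sketch, assuming a single CUDA device; `num_classes=10` is a hypothetical CIFAR-style head):

```python
import torch
from models import DenseNet

model = DenseNet(num_init_features=24, block_config=(12, 12, 12), compression=1,
                 input_size=32, bn_size=None, num_classes=10, efficient=True).cuda()
x = torch.randn(1450, 3, 32, 32, device='cuda')
model(x).sum().backward()
# Peak memory held by tensors during the pass, in GB.
print('peak memory: %.2f GB' % (torch.cuda.max_memory_allocated() / 1024 ** 3))
```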
Converting a pretrained (non-efficient) checkpoint to the efficient format is simple. Take DenseNet-121 as an example.
First, download the checkpoint:

```bash
wget https://download.pytorch.org/models/densenet121-a639ec97.pth
```

Then run:

```bash
cd utils
python convert.py --to efficient --checkpoint densenet121-a639ec97.pth --output densenet121_effi.pth
```

Done.
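Conceptually, the conversion only renames the entries of the state dict; the weight tensors themselves are untouched. A hedged sketch of that idea (the real key mapping lives in `utils/convert.py`; `rename_key` below is a hypothetical placeholder):

```python
import torch

def rename_key(key):
    # Hypothetical placeholder: map a torchvision DenseNet parameter name
    # to the naming scheme expected by the efficient model.
    return key

src = torch.load('densenet121-a639ec97.pth', map_location='cpu')
dst = {rename_key(k): v for k, v in src.items()}
torch.save(dst, 'densenet121_effi.pth')
```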
You can now load the state dict saved in `densenet121_effi.pth` into the efficient model:
```python
import torch
from models import DenseNet

# DenseNet-121 configuration with the memory-efficient implementation enabled.
model = DenseNet(num_init_features=64, block_config=(6, 12, 24, 16), compression=0.5,
                 input_size=224, bn_size=4, num_classes=1000, efficient=True)
state_dict = torch.load('densenet121_effi.pth')
model.load_state_dict(state_dict, strict=True)
```
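As a quick sanity check (a sketch), one forward pass on a dummy ImageNet-sized input should produce logits of shape `(1, 1000)`:

```python
model.eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 1000])
```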
Easy configuration and run:
- Install the requirements via `pip install -r requirements.txt`.
- Prepare the ImageNet dataset following the installation instructions; the shell scripts in `datasets/imagenet_pre` should be helpful.
- Configure the experiment settings in `config.yaml`.
- Run a command like `./run.sh 0,1,2,3 config.yaml`.
You will find that 4 GPUs are more than enough for training the efficient model with a batch size of 256!
CIFAR training is also provided and easy to configure :-)
Pleiss, Geoff, et al. "Memory-Efficient Implementation of DenseNets." arXiv preprint arXiv:1707.06990 (2017).