zlwangustc/SWA_paddle

Stochastic Weight Averaging (SWA) - Paddle Version

This repository contains a Paddle implementation of the Stochastic Weight Averaging (SWA) training method

by Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, and Andrew Gordon Wilson.

Introduction

Deep neural networks are typically trained by optimizing a loss function with an SGD variant and a decaying learning rate until convergence. However, simple averaging of multiple points along the SGD trajectory, with a cyclical or constant learning rate, leads to better generalization than conventional training. This procedure is called Stochastic Weight Averaging (SWA).

SWA is extremely easy to implement, improves generalization, and has almost no computational overhead. The experimental results reported in the paper are summarized below.

We implement the SWA method in Paddle and test it with a VGG-16 model. The results are close to the original paper's on the CIFAR-10 dataset.
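At its core, SWA maintains a running average of the weight snapshots collected along the SGD trajectory: after n models have been averaged, a new snapshot w is folded in as w_swa ← (w_swa · n + w) / (n + 1). A minimal pure-Python sketch of this update (toy weight vectors, independent of the Paddle code in this repository):

```python
# Running average of SGD weight snapshots, the core of SWA:
#   w_swa <- (w_swa * n + w) / (n + 1)
def swa_update(w_swa, w, n):
    """Fold a new weight vector w into the running average w_swa.

    n is the number of snapshots already averaged into w_swa.
    """
    return [(ws * n + wi) / (n + 1) for ws, wi in zip(w_swa, w)]

# Average three toy weight snapshots taken along the SGD trajectory.
snapshots = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w_swa, n = snapshots[0], 1
for w in snapshots[1:]:
    w_swa = swa_update(w_swa, w, n)
    n += 1
print(w_swa)  # -> [3.0, 4.0], the element-wise mean of all snapshots
```

In the actual implementation the same update is applied to every parameter tensor of the network rather than a flat list of floats.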

Structure

swa-paddle
├── models
│   ├── vgg.py
│   ├── preresnet.py
│   └── wide_resnet.py
├── eval.py
├── train.py
└── utils.py

Training:

!python train.py --swa  
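The `--swa` flag enables weight averaging, which the paper pairs with a cyclical or constant learning rate: within each cycle of c iterations the rate decays linearly from α1 down to α2, then resets. A sketch of that schedule (the constants c, alpha1, and alpha2 below are illustrative, not this repository's defaults):

```python
# Cyclical learning rate from the SWA paper:
#   t(i) = (mod(i - 1, c) + 1) / c
#   lr(i) = (1 - t(i)) * alpha1 + t(i) * alpha2
# NOTE: c, alpha1, alpha2 here are illustrative values, not the repo's defaults.
def cyclical_lr(i, c=4, alpha1=0.1, alpha2=0.01):
    """Learning rate at iteration i (1-indexed)."""
    t = ((i - 1) % c + 1) / c
    return (1 - t) * alpha1 + t * alpha2

# Rate decays over a 4-step cycle, then resets at iteration 5.
rates = [round(cyclical_lr(i), 4) for i in range(1, 6)]
print(rates)  # -> [0.0775, 0.055, 0.0325, 0.01, 0.0775]
```

SWA snapshots are typically collected at the end of each cycle, when the learning rate reaches its minimum α2.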

Evaluating:

!python eval.py --model_path="out/checkpoint.pdparams" 

Results:

| Method | Dataset  | Environment | Model  | Epochs | Test Accuracy |
|--------|----------|-------------|--------|--------|---------------|
| SWA    | CIFAR-10 | Tesla V100  | VGG-16 | 200    | 93.68         |

AI studio:

!python -m paddle.distributed.launch train.py --swa

Model:

The model we have trained is saved at: Baidu Aistudio SWA Paddle
