
A Systematic Comparison of Activation Functions

This project is a comparison of various activation functions on the MNIST dataset. We compared the performance of five different activation functions (plus a linear control) in terms of accuracy, precision, recall, training time, and number of epochs for their respective models.

Model

We utilized the same machine learning model for each experiment: a feedforward neural network with 784 inputs (one per pixel), 100 hidden units, and 10 outputs (one per digit, 0–9).
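
A minimal NumPy sketch of this architecture is shown below. It is illustrative only: the function names, the weight initialization, and the softmax output are assumptions, not code from this repository.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in=784, n_hidden=100, n_out=10):
    """Small random weights; the exact initialization scheme is an assumption."""
    return {
        "W1": rng.normal(0.0, 0.01, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.01, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(x, params, activation):
    """One forward pass; `activation` is any of the functions compared below."""
    h = activation(x @ params["W1"] + params["b1"])   # hidden layer, 100 units
    logits = h @ params["W2"] + params["b2"]          # output layer, 10 digits
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)          # softmax over the classes
```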

Activation Functions

Sigmoid/Tanh

The downside to sigmoid/tanh activation functions is that they are susceptible to the vanishing gradient problem, which drastically slows down training and makes them very sensitive to the initial weights.
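
The vanishing gradient is visible directly in the derivatives; a quick NumPy illustration (not project code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)           # peaks at 0.25 and vanishes for large |x|

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2   # also vanishes for large |x|

print(sigmoid_grad(np.array([0.0, 5.0, 10.0])))  # ≈ [0.25, 0.0066, 0.000045]
```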

ReLU

The ReLU activation function grows linearly for positive values, so for x >> 1 there is no vanishing gradient. This means you can get faster training times. The downside is that you collect many dead neurons: a unit whose input stays negative outputs zero and receives zero gradient. Additionally, activations can explode because the output is unbounded.
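
A sketch of ReLU and its gradient; the zero gradient on the negative side is what produces dead neurons:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # 1 for positive inputs (no vanishing gradient), but exactly 0 for
    # negative inputs, so a unit stuck on the negative side stops learning.
    return (x > 0).astype(float)

print(relu(np.array([-2.0, 3.0])), relu_grad(np.array([-2.0, 3.0])))  # [0. 3.] [0. 1.]
```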

Saturating Linear Function

This is essentially ReLU with an upper bound, which stops the output-explosion problem.
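
A sketch, assuming an upper bound of 1 as in MATLAB's satlin; treat the exact bound as an assumption:

```python
import numpy as np

def satlin(x, upper=1.0):
    # ReLU clipped at an upper bound; the bound of 1 follows MATLAB's satlin
    # convention and is an assumption here.
    return np.clip(x, 0.0, upper)

print(satlin(np.array([-1.0, 0.4, 3.0])))  # [0.  0.4 1. ]
```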

Leaky ReLU (custom)

This function is ReLU with a small gradient below zero. The slope is controlled by a hyperparameter α. This fixes the dead-neuron problem, since there is now a nonzero gradient for negative inputs.
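
A sketch of this custom function (the results table below uses α = 0.1):

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # alpha is the slope for negative inputs.
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.1):
    return np.where(x > 0, 1.0, alpha)   # never exactly zero, so no dead neurons

print(leaky_relu(np.array([-2.0, 3.0])))  # [-0.2  3. ]
```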

Linear

This is the control for our experiments: using a linear activation function is essentially the same as having no activation function at all, so the whole network collapses to a linear model.
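
The sketch below shows why: two stacked linear layers are exactly one linear map, so the hidden layer adds no representational power.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 784))
W1 = rng.normal(size=(784, 100))
W2 = rng.normal(size=(100, 10))

# Two layers with a linear (identity) activation...
two_layers = (x @ W1) @ W2
# ...are exactly one linear layer with weights W1 @ W2.
one_layer = x @ (W1 @ W2)
print(np.allclose(two_layers, one_layer))  # True
```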

Results

| Model | Accuracy | Precision | Recall |
| --- | --- | --- | --- |
| Sigmoid | 93.12% | 0.9786 | 0.9946 |
| Tanh | 91.81% | 0.9776 | 0.9939 |
| ReLU | 93.80% | 0.9735 | 0.9953 |
| Satlin | 92.86% | 0.9806 | 0.9945 |
| Leaky ReLU (α = 0.1) | 93.84% | 0.9786 | 0.9952 |
| Purelin (none) | 91.16% | 0.9684 | 0.9938 |
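
For reference, the metrics can be computed with scikit-learn as sketched below. How precision and recall were averaged over the ten digit classes is an assumption here (macro averaging shown); this is illustrative, not the project's evaluation code.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

def summarize(y_true, y_pred):
    """y_true, y_pred: digit labels 0-9 for the MNIST test set."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        # Macro averaging over the ten classes is an assumption; the project
        # may have averaged precision/recall differently.
        "precision": precision_score(y_true, y_pred, average="macro"),
        "recall": recall_score(y_true, y_pred, average="macro"),
    }
```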

Other Results / Meta Learning


[Figure: results of varying the Leaky ReLU hyperparameter α]
