We explore Multiplicative Normalizing Flows [1] in Bayesian neural networks with different prior distributions over the network weights. The prior over the parameters not only influences how the network behaves, but can also affect the uncertainty calibration and the achievable compression rate. We evaluate the effect of uniform, Cauchy, log-uniform, Gaussian, and standard Gumbel priors on predictive accuracy and predictive uncertainty.
We build on the authors' implementation, available at AMLab-Amsterdam/MNF_VBNN. The `src` folder contains the code for MNF, LeNet, and soft weight sharing [2]. To run all experiments with default parameters:
```
cd src/mnf
python mnf_lenet_mnist.py
```
To specify the prior distribution, modify `PARAMS` in `constants.py`. Available options are `['standard_normal', 'log_uniform', 'standard_cauchy', 'standard_gumbel', 'uniform']` (support for `'gaussian_mixture'` will be added soon).
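For illustration, here is a minimal sketch of what such an edit could look like; the actual structure of `PARAMS` in `constants.py` may differ, and the key names shown here are assumptions rather than the repository's real configuration:

```python
# constants.py -- hypothetical sketch; the real PARAMS dict may use different keys
PARAMS = {
    'prior': 'standard_cauchy',  # assumed key name; any option from the list above
    'epochs': 100,               # assumed training hyperparameters, shown for context
    'batch_size': 100,
    'learning_rate': 1e-3,
}
```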
Dependencies: The code requires TensorFlow. We have created an `environment.yml` file with the (working) package versions; the environment can be created with conda (e.g., `conda env create -f environment.yml`).
Predictive performance: The table below shows the validation and test accuracy achieved on the MNIST dataset.
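For reference, a minimal sketch of how such accuracy numbers could be obtained from a Bayesian classifier with stochastic forward passes; the helper `mc_sample_predictions` is hypothetical and only stands in for repeated passes through the trained MNF network:

```python
import numpy as np

def mc_predict(probs):
    """Average S stochastic softmax outputs into a single predictive distribution.

    probs: array of shape (S, N, C) -- S Monte Carlo samples, N examples, C classes.
    Returns the MC estimate of p(y|x) with shape (N, C).
    """
    return probs.mean(axis=0)

def accuracy(probs, labels):
    """Top-1 accuracy of the MC-averaged predictive distribution."""
    preds = mc_predict(probs).argmax(axis=1)
    return float((preds == labels).mean())

# Hypothetical usage (mc_sample_predictions is not part of this repo):
# probs = mc_sample_predictions(x_test, num_samples=100)   # shape (100, N, 10)
# print('test accuracy:', accuracy(probs, y_test))
```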
Uncertainty evaluation: For the task of uncertainty evaluation, we use the trained network to predict the class distribution for unseen classes. We train the models on the MNIST dataset and evaluate on the notMNIST [3] and MNIST-rot [4] datasets.
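A common way to quantify this "I don't know" behaviour is the entropy of the Monte-Carlo-averaged predictive distribution. The sketch below assumes the same `(S, N, C)` array of sampled softmax outputs as in the accuracy sketch above and is only illustrative:

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of the mean predictive distribution, one value per example.

    probs: array of shape (S, N, C) holding S sampled softmax outputs
    for N examples over C classes.
    """
    mean_probs = probs.mean(axis=0)      # MC estimate of p(y|x), shape (N, C)
    eps = 1e-12                          # guard against log(0)
    return -(mean_probs * np.log(mean_probs + eps)).sum(axis=1)

# On out-of-distribution inputs (notMNIST, MNIST-rot) a well-calibrated Bayesian
# network should yield high entropy, i.e. close-to-uniform class probabilities.
```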
Figure: Entropy of the predictive distribution for the MNIST-rot test set. The left panel is a histogram of entropy values and the right panel shows the corresponding cumulative distribution function.
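As a rough sketch (assuming matplotlib is available), a histogram/CDF pair like the one in the figure can be produced from the per-example entropies computed above:

```python
import numpy as np
import matplotlib.pyplot as plt

# `entropy` is the per-example predictive entropy from the previous sketch;
# random placeholder values are used here purely for illustration.
entropy = np.random.rand(1000) * np.log(10)

fig, (ax_hist, ax_cdf) = plt.subplots(1, 2, figsize=(10, 4))

ax_hist.hist(entropy, bins=50)
ax_hist.set_xlabel('predictive entropy')
ax_hist.set_ylabel('count')

xs = np.sort(entropy)
ax_cdf.plot(xs, np.arange(1, len(xs) + 1) / len(xs))
ax_cdf.set_xlabel('predictive entropy')
ax_cdf.set_ylabel('empirical CDF')

plt.tight_layout()
plt.show()
```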
Sparsity
- [1] Christos Louizos and Max Welling. Multiplicative Normalizing Flows for Variational Bayesian Neural Networks. arXiv:1703.01961.
- [2] Karen Ullrich, Edward Meeds, and Max Welling. Soft Weight-Sharing for Neural Network Compression. arXiv:1702.04008.
- [3] notMNIST dataset, available at: http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html
- [4] MNIST-rot dataset, available at: http://www-labs.iro.umontreal.ca/~lisa/icml2007data/