---
title: Structured Stochastic Gradient MCMC
booktitle: Proceedings of the 39th International Conference on Machine Learning
abstract: Stochastic gradient Markov Chain Monte Carlo (SGMCMC) is a scalable algorithm for asymptotically exact Bayesian inference in parameter-rich models, such as Bayesian neural networks. However, since mixing can be slow in high dimensions, practitioners often resort to variational inference (VI). Unfortunately, VI makes strong assumptions on both the factorization and functional form of the posterior. To relax these assumptions, this work proposes a new non-parametric variational inference scheme that combines ideas from both SGMCMC and coordinate-ascent VI. The approach relies on a new Langevin-type algorithm that operates on a "self-averaged" posterior energy function, where parts of the latent variables are averaged over samples from earlier iterations of the Markov chain. This way, statistical dependencies between coordinates can be broken in a controlled way, allowing the chain to mix faster. This scheme can be further modified in a "dropout" manner, leading to even more scalability. We test our scheme for ResNet-20 on CIFAR-10, SVHN, and FMNIST. In all cases, we find improvements in convergence speed and/or final accuracy compared to SGMCMC and parametric VI.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: alexos22a
month: 0
tex_title: Structured Stochastic Gradient {MCMC}
firstpage: 414
lastpage: 434
page: 414-434
order: 414
cycles: false
bibtex_author: Alexos, Antonios and Boyd, Alex J and Mandt, Stephan
author:
- given: Antonios
  family: Alexos
- given: Alex J
  family: Boyd
- given: Stephan
  family: Mandt
date: 2022-06-28
address:
container-title: Proceedings of the 39th International Conference on Machine Learning
volume: '162'
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 6
  - 28
pdf:
extras:
---
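
As a rough illustration of the "self-averaged" Langevin idea described in the abstract, here is a minimal toy sketch. The 2-D Gaussian energy, the split of the parameters into coordinates `a` and `b`, the 50-sample averaging window, and the step size are all assumptions made purely for illustration; this is not the authors' implementation.

```python
# Minimal sketch: Langevin-type updates where each coordinate is updated against
# an energy whose dependence on the *other* coordinate is averaged over earlier
# Markov-chain samples ("self-averaged" energy). Toy model and hyperparameters
# are illustrative assumptions, not the paper's reference code.
import numpy as np

rng = np.random.default_rng(0)

# Toy correlated Gaussian posterior over (a, b): U(a, b) = 0.5 * [a, b] P [a, b]^T
P = np.array([[1.0, 0.9],
              [0.9, 1.0]])

def grad_U(a, b):
    """Return (dU/da, dU/db) for the toy quadratic energy."""
    g = P @ np.array([a, b])
    return g[0], g[1]

step = 1e-2                          # Langevin step size
history_a, history_b = [0.0], [0.0]  # past chain samples used for self-averaging
a, b = 0.0, 0.0

for t in range(5000):
    # Update a on an energy whose b-dependence is averaged over earlier samples of b.
    g_a = np.mean([grad_U(a, b_old)[0] for b_old in history_b[-50:]])
    a = a - step * g_a + np.sqrt(2 * step) * rng.standard_normal()

    # Symmetric update for b, averaging over earlier samples of a.
    g_b = np.mean([grad_U(a_old, b)[1] for a_old in history_a[-50:]])
    b = b - step * g_b + np.sqrt(2 * step) * rng.standard_normal()

    history_a.append(a)
    history_b.append(b)

print("posterior mean estimate:", np.mean(history_a), np.mean(history_b))
```

Averaging the cross-coordinate coupling over past samples weakens the dependence between `a` and `b` within each update, which is the mechanism the abstract credits for faster mixing compared to running a joint Langevin chain on the strongly correlated posterior.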