2022-06-28-agarwala22a.md

---
title: Deep equilibrium networks are sensitive to initialization statistics
booktitle: Proceedings of the 39th International Conference on Machine Learning
abstract: 'Deep equilibrium networks (DEQs) are a promising way to construct models which trade off memory for compute. However, theoretical understanding of these models is still lacking compared to traditional networks, in part because of the repeated application of a single set of weights. We show that DEQs are sensitive to the higher order statistics of the matrix families from which they are initialized. In particular, initializing with orthogonal or symmetric matrices allows for greater stability in training. This gives us a practical prescription for initializations which allow for training with a broader range of initial weight scales.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: agarwala22a
month: 0
tex_title: Deep equilibrium networks are sensitive to initialization statistics
firstpage: 136
lastpage: 160
page: 136-160
order: 136
cycles: false
bibtex_author: Agarwala, Atish and Schoenholz, Samuel S
author:
- given: Atish
  family: Agarwala
- given: Samuel S
  family: Schoenholz
date: 2022-06-28
address:
container-title: Proceedings of the 39th International Conference on Machine Learning
volume: 162
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 6
  - 28
pdf:
extras:
---