field | value
---|---
title | An Initial Alignment between Neural Network and Target is Needed for Gradient Descent to Learn
booktitle | Proceedings of the 39th International Conference on Machine Learning
abstract | This paper introduces the notion of “Initial Alignment” (INAL) between a neural network at initialization and a target function. It is proved that if a network and a Boolean target function do not have a noticeable INAL, then noisy gradient descent with normalized i.i.d. initialization will not learn in polynomial time. Thus a certain amount of knowledge about the target (measured by the INAL) is needed in the architecture design. This also provides an answer to an open problem posed in (AS-NeurIPS’20). The results are based on deriving lower-bounds for descent algorithms on symmetric neural networks without explicit knowledge of the target function beyond its INAL.
layout | inproceedings
series | Proceedings of Machine Learning Research
publisher | PMLR
issn | 2640-3498
id | abbe22a
month | 0
tex_title | An Initial Alignment between Neural Network and Target is Needed for Gradient Descent to Learn
firstpage | 33
lastpage | 52
page | 33-52
order | 33
cycles | false
bibtex_author | Abbe, Emmanuel and Cornacchia, Elisabetta and Hazla, Jan and Marquis, Christopher
author |
date | 2022-06-28
address |
container-title | Proceedings of the 39th International Conference on Machine Learning
volume | 162
genre | inproceedings
issued |
extras |