---
title: Understanding the unstable convergence of gradient descent
booktitle: Proceedings of the 39th International Conference on Machine Learning
abstract: 'Most existing analyses of (stochastic) gradient descent rely on the condition that for $L$-smooth costs, the step size is less than $2/L$. However, many works have observed that in machine learning applications step sizes often do not fulfill this condition, yet (stochastic) gradient descent still converges, albeit in an unstable manner. We investigate this unstable convergence phenomenon from first principles, and discuss key causes behind it. We also identify its main characteristics, and how they interrelate based on both theory and experiments, offering a principled view toward understanding the phenomenon.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: ahn22a
month: 0
tex_title: Understanding the unstable convergence of gradient descent
firstpage: 247
lastpage: 257
page: 247-257
order: 247
cycles: false
bibtex_author: Ahn, Kwangjun and Zhang, Jingzhao and Sra, Suvrit
author:
- given: Kwangjun
  family: Ahn
- given: Jingzhao
  family: Zhang
- given: Suvrit
  family: Sra
date: 2022-06-28
address:
container-title: Proceedings of the 39th International Conference on Machine Learning
volume: '162'
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 6
  - 28
pdf:
extras:
---