AdaScale SGD: A User-Friendly Algorithm for Distributed Training #13

nocotan commented Jan 4, 2021

In one sentence

An extension of SGD that makes learning-rate tuning unnecessary when scaling training out across many workers.

Paper link

https://arxiv.org/abs/2007.05105

Authors / Affiliation

Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin (Apple)

Submission date (yyyy/MM/dd)

2020/07/09

Overview

(Screenshot of the paper's overview figure.)

Novelty / Differences

Makes the existing scaling rules for learning-rate schedules, the identity scaling rule and the linear scaling rule, adaptive.
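
A minimal sketch of how I read this adaptivity (not the paper's reference code; the function name, the bias correction, and the clipping are my assumptions): the gain r_t is estimated from the S per-worker gradients as r_t = (σ² + μ²) / (σ²/S + μ²), where σ² is the per-worker gradient variance and μ² is the squared norm of the true gradient, so r_t interpolates between the identity rule (r_t = 1) and the linear rule (r_t = S).

```python
import numpy as np

def adascale_gain(per_worker_grads, eps=1e-6):
    """Estimate the AdaScale gain r_t from S per-worker gradients (S >= 2).

    Sketch: r_t = (sigma^2 + mu^2) / (sigma^2 / S + mu^2), with sigma^2 the
    per-worker gradient variance and mu^2 = ||true gradient||^2, both
    estimated from the per-worker gradients (the moving-average smoothing
    used in practice is omitted here).
    """
    grads = np.stack(per_worker_grads)           # shape (S, num_params)
    S = grads.shape[0]
    g_avg = grads.mean(axis=0)                   # aggregated large-batch gradient
    # Unbiased estimate of the single-worker gradient variance (its trace).
    var = ((grads - g_avg) ** 2).sum() / (S - 1)
    # Bias-corrected estimate of ||true gradient||^2,
    # using E[||g_avg||^2] = mu^2 + sigma^2 / S.
    sqnorm = max((g_avg ** 2).sum() - var / S, 0.0)
    r = (var + sqnorm) / (var / S + sqnorm + eps)
    return float(np.clip(r, 1.0, S))             # the gain lies in [1, S]
```

When gradient noise dominates (σ² ≫ μ²) the gain approaches S, recovering the linear scaling rule; when the workers' gradients agree, it approaches 1, recovering the identity rule.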

Method

(Screenshot of the paper's algorithm description.)
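
For context, a rough sketch of how the gain is then used during training (the loop structure and names are my reconstruction, reusing adascale_gain and the numpy import from the sketch above): the SGD step size is the base single-worker schedule scaled by r_t, and the schedule is indexed by a "scale-invariant iteration" counter that advances by r_t per step, so the distributed run covers the same schedule progress as the single-worker run.

```python
def adascale_sgd_loop(params, compute_worker_grads, base_lr, total_single_worker_steps):
    """Rough sketch of an AdaScale-style training loop (structure and names are mine).

    base_lr(tau) is the single-worker learning-rate schedule; tau is the
    scale-invariant iteration count, advanced by the gain r_t instead of by 1.
    """
    tau = 0.0
    while tau < total_single_worker_steps:
        grads = compute_worker_grads(params)     # list of S per-worker gradients
        g_avg = np.mean(np.stack(grads), axis=0)
        r = adascale_gain(grads)                 # gain estimate from the sketch above
        lr = r * base_lr(tau)                    # base step size scaled by the gain
        params = params - lr * g_avg             # plain SGD step on the averaged gradient
        tau += r                                 # credit r_t single-worker steps of progress
    return params
```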

Results

(Screenshots of the paper's experimental results.)

Comments
