
Optimizer module overhaul #396

Merged
merged 43 commits into from
Jan 31, 2018

Conversation

iblislin
Member

will sort out summary later...

* Before

```julia
NadamScheduler(; mu0 = 0.99, delta = 0.004, gamma = 0.5, alpha = 0.96)
```

* After

```julia
NadamScheduler(; μ₀ = 0.99, δ = 0.004, γ = 0.5, α = 0.96)
```
Blocker: #394
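
The rename works because Julia accepts Unicode (including subscripted) identifiers as keyword names. A minimal, self-contained sketch of the renamed constructor style (illustrative only, not the actual MXNet.jl definition):

```julia
# Sketch of a scheduler type whose keyword arguments use the Greek symbols
# from the Nadam paper directly, as in the "After" example above.
struct NadamScheduler
    μ₀::Float64  # initial momentum
    δ::Float64   # decay term
    γ::Float64   # decay exponent
    α::Float64   # momentum schedule base
end

# Keyword constructor with the defaults shown in the PR description.
NadamScheduler(; μ₀ = 0.99, δ = 0.004, γ = 0.5, α = 0.96) =
    NadamScheduler(μ₀, δ, γ, α)

s = NadamScheduler(γ = 0.6)  # override one keyword, keep the other defaults
```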

renames:

* `get_momentum` -> `getmomentum`
* `get_momentum_scheduler` -> `getmomsched`
* `Momentum.Fixed.momentum` -> `Momentum.Fixed.μ`

Decouple the learning-rate update from OptimizationState, giving the user more control: updates are triggered via `update!`.

Let the user control it, and provide the default value `1/batch_size` only in the high-level API `fit!`.
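
The decoupling described in these commit messages can be sketched as follows (names and fields here are illustrative assumptions, not MXNet.jl's actual types): the schedule owns its state, and the caller decides when to advance it.

```julia
# Hypothetical schedule: the learning rate lives in the schedule itself
# rather than being tied to an OptimizationState.
mutable struct ExpSchedule
    η::Float64   # current learning rate
    γ::Float64   # multiplicative decay factor
end

# Caller-triggered update: advance the schedule by one step and
# return the new learning rate.
update!(s::ExpSchedule) = (s.η *= s.γ; s.η)

s = ExpSchedule(0.1, 0.5)
update!(s)   # the user chooses when the decay happens

# The high-level API alone would supply a rescale default such as:
batch_size = 32
rescale = 1 / batch_size
```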
@iblislin iblislin changed the title Optimizer module overhaul WIP: Optimizer module overhaul Dec 31, 2017
@iblislin iblislin changed the title WIP: Optimizer module overhaul Optimizer module overhaul Jan 1, 2018
@iblislin
Member Author

iblislin commented Jan 1, 2018

Ready for review.
I listed the changes in NEWS.md:
https://github.com/dmlc/MXNet.jl/pull/396/files#diff-8312ad0561ef661716b48d09478362f3R263

The motivation of this PR is to build more elegant APIs than Python's (thanks to the good Unicode support in Julia's REPL and editor plugins),
and to pave the way for porting Python's Gluon.

@iblislin iblislin requested a review from pluskid January 1, 2018 15:35
@iblislin iblislin added this to the 0.4.0 milestone Jan 7, 2018
@pluskid
Member

pluskid commented Jan 7, 2018

Thanks for the efforts.

  • I'm OK with most of the renamings to Unicode.
  • I'm a bit against using \eta for learning rate, \mu for momentum, and \lambda for weight decay. Although they seem to be commonly used, some papers use different symbols, which could be a bit confusing. But those are still OK if you insist.
  • I'm strictly against using ∇c for gradient clipping and ∇r for gradient rescaling. Those are just super confusing, because ∇ typically means taking the gradient with respect to something.

@iblislin
Member Author

iblislin commented Jan 8, 2018

> • I'm a bit against using \eta for learning rate, \mu for momentum, and \lambda for weight decay. Although they seem to be commonly used, some papers use different symbols, which could be a bit confusing. But those are still OK if you insist.
> • I'm strictly against using ∇c for gradient clipping and ∇r for gradient rescaling. Those are just super confusing, because ∇ typically means taking the gradient with respect to something.

I agree with your point of view on gradient clipping and rescaling.

I want to hear which naming you prefer. We can list all the candidates here, then vote.

@pluskid
Member

pluskid commented Jan 8, 2018

Unfortunately, I don't have a better naming suggestion apart from the more verbose `grad_clip` and `grad_scale`.

@iblislin
Member Author

iblislin commented Jan 8, 2018

Adding some other options:

  • clip
  • rescale or scale
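
For concreteness, a sketch of what the two operations under discussion do, written as plain functions named after the proposed options (illustrative only, not MXNet.jl's actual implementation):

```julia
# `rescale` multiplies each gradient entry by a scalar (e.g. 1/batch_size);
# `clip` bounds each entry to the interval [-c, c].
rescale(grad, r) = r .* grad
clip(grad, c) = clamp.(grad, -c, c)

g = [10.0, -8.0, 0.5]
g = rescale(g, 0.5)   # [5.0, -4.0, 0.25]
g = clip(g, 1.0)      # [1.0, -1.0, 0.25]
```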

@iblislin
Member Author

Is it okay to omit the `grad_` prefix, given that an optimizer deals with gradients anyway?

@iblislin
Member Author

I did the renaming, please check it out.

@iblislin
Member Author

good to go?

@pluskid
Member

pluskid commented Jan 22, 2018

Sorry for the late reply. I still prefer the `grad_` prefix, but I have no strong objection to the current naming, since these are options for optimizers as you mentioned (and we probably are not going to have scales for momentum or other quantities). Please feel free to merge.

@iblislin
Member Author

Thanks a lot.

About the `grad_` prefix: Julia 0.7 provides a way to add aliases (JuliaLang/julia#24960). Once we drop support for Julia 0.6, I can add the prefixed names as aliases.
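
One common way an alias is spelled in Julia is a `const` binding to the same function; a minimal sketch under that assumption (the exact mechanism discussed in JuliaLang/julia#24960 may differ):

```julia
# The short option name, as merged in this PR (illustrative definition).
clip(grad, c) = clamp.(grad, -c, c)

# A `const` binding gives the same function a second, prefixed name.
const grad_clip = clip

grad_clip([3.0, -3.0], 1.0)   # dispatches to the same method as `clip`
```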

@pluskid
Member

pluskid commented Jan 23, 2018

Alias sounds good!

@iblislin iblislin merged commit 9f4f533 into master Jan 31, 2018
@iblislin iblislin deleted the ib/opt-rework branch January 31, 2018 03:33