fix AdamW #198

Merged: 2 commits, Dec 11, 2024

CarloLucibello
Member

fix #197

src/rules.jl Outdated
@@ -538,7 +538,7 @@ end

function AdamW(η, β = (0.9, 0.999), λ = 0.0, ϵ = 1e-8; couple::Bool = true)
Member

One method puts epsilon 3rd, and one puts it 4th. Shouldn't these orders always agree?
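
To make the concern concrete, here is a minimal sketch using a hypothetical `DemoRule` (not the Optimisers.jl definition; `make_a` and `make_b` are invented names for illustration). When two constructors disagree on positional order, the same call silently fills different fields:

```julia
# Hypothetical rule for illustration only, not the Optimisers.jl AdamW.
struct DemoRule
    eta::Float64
    lambda::Float64
    epsilon::Float64
end

# Constructor A takes epsilon as the 2nd positional argument...
make_a(eta, epsilon = 1e-8, lambda = 0.0) = DemoRule(eta, lambda, epsilon)

# ...while constructor B takes lambda 2nd and epsilon 3rd.
make_b(eta, lambda = 0.0, epsilon = 1e-8) = DemoRule(eta, lambda, epsilon)

a = make_a(0.001, 0.01)  # 0.01 is stored in a.epsilon
b = make_b(0.001, 0.01)  # 0.01 is stored in b.lambda
```

Neither call errors, which is why a mismatch like this can go unnoticed.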

Member

mcabbott Dec 11, 2024

(That was supposed to highlight the struct definition, too.)

Also, why does this allow any types, rather than Float64 like the other rules? I think @def would just do that for you.

Edit, I tried...

julia> @def struct AdamWref <: AbstractRule
         eta = 0.001
         beta = (0.9, 0.999)
         lambda = 0.0
         epsilon = 1e-8
         couple = true
       end

julia> AdamWref()
AdamWref(0.001, (0.9, 0.999), 0.0, 1.0e-8, true)

julia> AdamWref(0.0123)
AdamWref(0.0123, (0.9, 0.999), 0.0, 1.0e-8, true)

julia> AdamWref(epsilon = 0.0123)
AdamWref(0.001, (0.9, 0.999), 0.0, 0.0123, true)

julia> AdamWref(epsilon = 0.0123, couple=false)
AdamWref(0.001, (0.9, 0.999), 0.0, 0.0123, false)

julia> AdamWref(0.0123; couple=false)  # this partly-keyword method is what it doesn't give you
ERROR: MethodError: no method matching AdamWref(::Float64; couple::Bool)
This method does not support all of the given keyword arguments (and may not support any).

julia> methods(AdamWref)
# 6 methods for type constructor:
 [1] AdamWref(; eta, beta, lambda, epsilon, couple)
     @ ~/.julia/packages/Optimisers/V8kHf/src/interface.jl:265
 [2] AdamWref(eta, beta, lambda, epsilon, couple)
     @ ~/.julia/packages/Optimisers/V8kHf/src/interface.jl:259
 [3] AdamWref(eta, beta, lambda, epsilon)
     @ ~/.julia/packages/Optimisers/V8kHf/src/interface.jl:259
 [4] AdamWref(eta, beta, lambda)
     @ ~/.julia/packages/Optimisers/V8kHf/src/interface.jl:259
 [5] AdamWref(eta, beta)
     @ ~/.julia/packages/Optimisers/V8kHf/src/interface.jl:259
 [6] AdamWref(eta)
     @ ~/.julia/packages/Optimisers/V8kHf/src/interface.jl:259

Perhaps the macro is no help, since you would need to overwrite methods [3] to [6] to append the keyword couple, if that's indeed how it must work.

Otherwise you could decide that one keyword means all keywords, like normal.
Or we could elaborate the macro to always allow n positional and N-n keywords for the remaining arguments.
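
The "n positional and N-n keywords" idea can be sketched without a macro. This is only an illustration under stated assumptions: `DemoAdamW`, `DEMO_DEFAULTS`, and `demo_adamw` are hypothetical names, and nothing here reflects how `@def` is actually implemented.

```julia
# Hypothetical stand-in for a rule struct; field names follow the REPL example above.
struct DemoAdamW
    eta::Float64
    beta::Tuple{Float64,Float64}
    lambda::Float64
    epsilon::Float64
    couple::Bool
end

# Defaults, in field order.
const DEMO_DEFAULTS = (eta = 0.001, beta = (0.9, 0.999), lambda = 0.0,
                       epsilon = 1e-8, couple = true)

# Accept the first n fields positionally and any of the remaining ones by keyword.
function demo_adamw(args...; kw...)
    names = keys(DEMO_DEFAULTS)
    length(args) <= length(names) ||
        throw(ArgumentError("too many positional arguments"))
    # Name the positional arguments after the leading fields.
    positional = NamedTuple{names[1:length(args)]}(args)
    for k in keys(kw)
        k in names || throw(ArgumentError("unknown keyword $k"))
        haskey(positional, k) &&
            throw(ArgumentError("$k given both positionally and by keyword"))
    end
    # Later arguments to merge override earlier ones; field order comes from the defaults.
    vals = merge(DEMO_DEFAULTS, positional, (; kw...))
    return DemoAdamW(vals...)
end

r = demo_adamw(0.0123; couple = false)
```

With this shape, `demo_adamw()`, `demo_adamw(0.0123)`, `demo_adamw(epsilon = 0.0123)`, and the mixed call `demo_adamw(0.0123; couple = false)` are all accepted.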

Member Author

I would allow at least AdamW(eta; couple=true), so the suggestion "Or we could elaborate the macro to always allow n positional and N-n keywords for the remaining arguments" seems a viable option that preserves all current construction modalities, right?

Member

Yes.

For now at least fix the types?
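
One way to read "fix the types" is sketched below, assuming concretely typed fields as in the other rules. `DemoTypedRule` is a hypothetical name, and this is not necessarily what the merged commit does. Typed fields let the default constructor convert inputs to Float64, while a hand-written outer method keeps the partly-keyword call working.

```julia
# Hypothetical rule with concrete field types; not the merged Optimisers.jl code.
struct DemoTypedRule
    eta::Float64
    beta::Tuple{Float64,Float64}
    lambda::Float64
    epsilon::Float64
    couple::Bool
end

# Positional arguments in the same order as the fields, plus the `couple` keyword.
# The default 5-argument constructor converts each value to the field type.
function DemoTypedRule(eta, beta = (0.9, 0.999), lambda = 0.0, epsilon = 1e-8;
                       couple::Bool = true)
    return DemoTypedRule(eta, beta, lambda, epsilon, couple)
end

r1 = DemoTypedRule(1, (0.9f0, 0.99f0))     # Int and Float32 inputs are converted
r2 = DemoTypedRule(0.0123; couple = false)  # the partly-keyword form is accepted
```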

Member Author

Ok

CarloLucibello merged commit 669798c into master on Dec 11, 2024
4 of 5 checks passed
murrellb added a commit to MurrellGroup/Optimisers.jl that referenced this pull request Dec 11, 2024
Development

Successfully merging this pull request may close these issues.

AdamW: epsilon and lambda swapped?
3 participants