Iteration over params(m) in explicit mode gives no gradient #2091

Closed

mcabbott opened this issue Oct 20, 2022 · 4 comments · Fixed by #2118
Comments

mcabbott (Member) commented Oct 20, 2022

The use of params within explicit-mode gradients was broken by #2054. Sadly, there were no tests covering this:

julia> m = (x=[1,2.0], y=[3.0]);

julia> gradient(m -> (sum(norm, Flux.params(m))), m)
((x = [0.4472135954999579, 0.8944271909999159], y = [1.0]),)  # before, Flux@0.13.5
(nothing,)  # after, Flux@0.13.6

julia> gradient(() -> (sum(norm, Flux.params(m))), Flux.params(m))
Grads(...)

julia> ans[m.x]  # unchanged
2-element Vector{Float64}:
 0.4472135954999579
 0.8944271909999159

(This form really is very annoying. Suggestions are fine, but required textboxes seem to be solving a problem we didn't have. It also seems to break links to issues, like #2054 above. Edited to delete the boilerplate.)
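
For comparison, a minimal explicit-mode check on the same toy model, written against the fields directly instead of params(m). The gradient still comes through there, so the regression is specific to params; the output below is what I'd expect, matching the values above:

julia> gradient(m -> norm(m.x) + norm(m.y), m)
((x = [0.4472135954999579, 0.8944271909999159], y = [1.0]),)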

@mcabbott mcabbott added the bug label Oct 20, 2022
ToucheSir (Member) commented Oct 20, 2022

How much do we care about this working? I'm not sure I have the wherewithal to figure it out while not regressing Flux.

> This form really is very annoying. Suggestions are fine, but required textboxes seem to be solving a problem we didn't have...

We (or really GitHub) provide an escape hatch: you can still open a blank issue via the "Don’t see your issue here? Open a blank issue." link at the bottom of the new-issue page.

mcabbott (Member, Author) commented:

IDK, it's the officially documented way to add a regularisation term: http://fluxml.ai/Flux.jl/stable/models/regularisation/

Ideally that would move to something like FluxML/Optimisers.jl#57. But if the goal is to provide a soft transition to explicit mode, so that you can adopt the new style gradually during 0.13 rather than all at once when moving to 0.14, then I think the old way ought to keep working a bit longer?
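
For anyone wanting a workaround today, one explicit-mode alternative is to flatten the model with Optimisers.destructure, which I believe is differentiable (it has its own pullback), and penalise the flat vector. A minimal sketch on the toy model above, with the output I'd expect:

julia> using Optimisers

julia> gradient(m) do m
           flat, _ = Optimisers.destructure(m)
           sum(abs2, flat)
       end
((x = [2.0, 4.0], y = [6.0]),)  # i.e. 2 .* m.x and 2 .* m.y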

mcabbott transferred this issue from FluxML/Zygote.jl Oct 20, 2022
ToucheSir (Member) commented:

Is there a smart rule we could write which avoids #2054? I think the main considerations would be how to map trainable pullbacks to the original tree structure and the final params vector. Maybe maintaining an auxiliary tree structure with indices at the leaves?
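
Something like this rough sketch, maybe (index_tree is a hypothetical helper, using Functors.fmap with exclude to treat arrays as leaves):

julia> using Functors

julia> function index_tree(m, ps)
           lookup = IdDict(p => i for (i, p) in enumerate(ps))
           fmap(x -> get(lookup, x, nothing), m; exclude = x -> x isa AbstractArray)
       end;

julia> index_tree(m, collect(Flux.params(m)))  # expected: (x = 1, y = 2)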

mcabbott (Member, Author) commented:

This is surely possible; someone just has to write one more Functors utility which builds this gradient tree, as the rrule for params(m).
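
A rough sketch of that missing piece (gradtree is a hypothetical name, and not necessarily how #2118 does it): given cotangents keyed by array identity, rebuild a tree matching m, which is what the pullback of params(m) would need to return:

julia> using Functors

julia> gradtree(m, grads::IdDict) =
           fmap(x -> get(grads, x, nothing), m; exclude = x -> x isa AbstractArray);

julia> gradtree(m, IdDict(m.x => [0.447, 0.894], m.y => [1.0]))
(x = [0.447, 0.894], y = [1.0])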
