Flux.Optimise.update! updating grads instead of params? #2121

Closed
Vilin97 opened this issue Nov 23, 2022 · 4 comments

@Vilin97

Vilin97 commented Nov 23, 2022

Package Version

v0.13.7

Julia Version

1.8.2

OS / Environment

Windows 11

Describe the bug

Flux.Optimise.update! seems to update the grads instead of the params. I must be doing something wrong, but this is the result I am getting.

Steps to Reproduce

using Flux
actual(x) = -x
x_train = hcat(0:5...)
y_train = actual.(x_train)
# predict = Dense(1 => 1)
predict = Chain(
  Dense(1 => 50, relu),
  Dense(50 => 50, relu),
  Dense(50 => 50, relu),
  Dense(50 => 1))
loss_(x, y) = sum( (predict(x) - y).^2 ) / sum(y_train.^2) ;
opt = Descent(10^-4)
parameters = Flux.params(predict)
grads = gradient(() -> loss_(x_train, y_train), parameters)
gr = maximum.([grads.grads[p] for p in Flux.params(predict)])
loss_(x_train, y_train)
p1=first(Flux.params(predict))
Flux.Optimise.update!(opt, Flux.params(predict), grads)
p2=first(Flux.params(predict)) # the parameter does not change 
loss_(x_train, y_train) # loss does not change
gr = maximum.([grads.grads[p] for p in Flux.params(predict)]) # all gradients go down by a factor of 10^4 -- the learning rate!

Expected Results

I was expecting params(predict) to change and the loss to go down.

Observed Results

Instead, the grads changed, and the loss and parameters of the NN did not change.

Relevant log output

No response

@Vilin97 Vilin97 added the bug label Nov 23, 2022
@Vilin97
Author

Vilin97 commented Nov 23, 2022

I am 99% sure this is not a bug and that I am just doing something weird. But perhaps the fact that I am getting this behavior and cannot figure out what I am doing wrong points to an issue in the documentation.

@mcabbott
Member

Pasting that in, I get an initial & final loss of 1.1265075f0 → 1.1257615f0, slightly changed. With Descent(0.1) instead, 1.6649617f0 → 0.6593005f0, a bigger change.

Flux.Optimise does mutate the gradients. #2098 removed one effect of this (on v0.13.8) but not the one seen here.
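
A minimal sketch of that mutation, assuming the implicit-params API of Flux v0.13 (the variable names here are illustrative, not taken from the snippet above):

using Flux

model = Dense(1 => 1)
ps = Flux.params(model)
opt = Descent(1e-4)

x = Float32.(hcat(0:5...))
y = -x

gs = gradient(() -> sum((model(x) .- y) .^ 2), ps)
w = first(ps)
g_before = copy(gs[w])              # save the raw gradient
Flux.Optimise.update!(opt, ps, gs)  # steps the params and rescales the stored gradient in place
gs[w] ≈ opt.eta .* g_before         # true: the gradient buffer now holds the scaled step

With a learning rate of 10^-4 the parameter step is tiny, so the params can look unchanged while the stored gradients visibly shrink by the learning rate, which matches what the original report saw.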

@Vilin97
Author

Vilin97 commented Nov 23, 2022

Hmm, you are right. I cannot reproduce the behavior I was observing anymore.
I do notice something weird, though: the NN is unable to approximate the x -> -x function! Does this point to a mistake in my code? I would expect approximating such an easy function to be a piece of cake.

using Flux, Random
Random.seed!(123)
actual(x) = -x
x_train = hcat(0:5...)
y_train = actual.(x_train)
loss_(x, y) = sum( (predict(x) - y).^2 ) / sum(y_train.^2) ;
for k in 1:6
    predict = Chain(
        Dense(1 => 50, relu),
        Dense(50 => 50, relu),
        Dense(50 => 50, relu),
        Dense(50 => 1));
    parameters = Flux.params(predict)
    grads = gradient(() -> loss_(x_train, y_train), parameters)
    learning_rate = 10. ^-k
    opt = Descent(learning_rate)
    loss_(x_train, y_train) # 1.241
    for _ in 1:10000 
        Flux.Optimise.update!(opt, Flux.params(predict), grads) 
    end
    @show learning_rate, loss_(x_train, y_train) 
    # (learning_rate, loss_(x_train, y_train)) = (0.1, 0.51563776f0)        
    # (learning_rate, loss_(x_train, y_train)) = (0.010000000000000002, 1.0894512f0)
    # (learning_rate, loss_(x_train, y_train)) = (0.001, 0.92247385f0)      
    # (learning_rate, loss_(x_train, y_train)) = (0.0001, 1.2685742f0)      
    # (learning_rate, loss_(x_train, y_train)) = (1.0e-5, 0.66605574f0)     
    # (learning_rate, loss_(x_train, y_train)) = (1.0e-6, 1.1450504f0) 
end

@Vilin97
Author

Vilin97 commented Nov 27, 2022

The problem was that I was not recomputing grads after each update! step.
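
For completeness, a sketch of the corrected loop (same names as the snippet above, learning rate and seed chosen just for illustration), recomputing the gradient at the current parameters before every update! — the same per-step recomputation that Flux.train! performs internally:

using Flux, Random
Random.seed!(123)

actual(x) = -x
x_train = Float32.(hcat(0:5...))
y_train = actual.(x_train)

predict = Chain(
    Dense(1 => 50, relu),
    Dense(50 => 50, relu),
    Dense(50 => 50, relu),
    Dense(50 => 1))
loss_(x, y) = sum((predict(x) - y).^2) / sum(y_train.^2)

opt = Descent(0.1)
ps = Flux.params(predict)
for _ in 1:10_000
    # recompute the gradient for the current parameters, then apply the update
    gs = gradient(() -> loss_(x_train, y_train), ps)
    Flux.Optimise.update!(opt, ps, gs)
end
loss_(x_train, y_train)  # should now be far below the initial value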

@Vilin97 Vilin97 closed this as completed Nov 27, 2022