Implicit gradient failing with matrices #137

theogf · 2020-07-17T14:00:50Z

Here is a MWE:

using KernelFunctions, Flux, LinearAlgebra
k = transform(SqExponentialKernel(), 2.0)
ps = Flux.params(k)
X = rand(10, 1); x = vec(X)
A = rand(10, 10)
g = gradient(ps) do
  tr(kernelmatrix(k, X, obsdim = 1) * A)
end
g[ps[1]] == nothing # true: the implicit gradient is missing when the input is a matrix

g2 = gradient(k) do k
  tr(kernelmatrix(k, X, obsdim = 1) * A)
end
g2[1].transform.s != nothing # true: the explicit gradient is computed

g3 = gradient(ps) do
  tr(kernelmatrix(k, x) * A)
end
g3[ps[1]] != nothing # true: the implicit gradient works when the input is a vector

I think this is related to FluxML/Zygote.jl#692.
Any idea how to solve this, @willtebbutt? It is probably connected to the ColVecs structure.

willtebbutt · 2020-07-17T14:10:49Z

This is outside my area of expertise, I'm afraid.

theogf · 2020-07-17T14:14:57Z

I think there is a general issue with the adjoint of ColVecs/RowVecs. Do you know who could help with it?

willtebbutt · 2020-07-17T14:18:21Z

Have you tried wrapping everything in a let block? Globals are hard, so it's possible that Zygote is buggy w.r.t. them.

Edit: I'm not sure exactly how the ColVecs etc. pullbacks would affect this. If they work under usual circumstances, I would expect them to work here 🤷‍♂️

theogf · 2020-07-17T14:24:48Z

You mean this?

let kernel = k
  g = gradient(ps) do
    tr(kernelmatrix(kernel, X, obsdim = 1) * A)
  end
end

willtebbutt · 2020-07-17T14:29:50Z

Nah, just

using KernelFunctions, Flux, LinearAlgebra

let

k = transform(SqExponentialKernel(), 2.0)
ps = Flux.params(k)
X = rand(10, 1); x = vec(X)
A = rand(10, 10)
g = gradient(ps) do
  tr(kernelmatrix(k, X, obsdim = 1) * A)
end
g[ps[1]] == nothing

g2 = gradient(k) do k
  tr(kernelmatrix(k, X, obsdim = 1) * A)
end
g2[1].transform.s != nothing

g3 = gradient(ps) do
  tr(kernelmatrix(k, x) * A)
end
g3[ps[1]] != nothing

end

theogf · 2020-07-17T14:31:10Z

Nope, same behavior.

theogf · 2020-07-26T16:00:35Z

I found a fix \o/! I think we should avoid relying on Base.map: removing it and replacing it directly with _map solves the problem.
I think this is connected to FluxML/Zygote.jl#522, which you and Mike apparently already looked at.
Also, it looks like the adjoints for ColVecs and RowVecs are not necessary. I will make a PR with a fix.

theogf mentioned this issue Jul 26, 2020

Fixing implicit gradients #141

Merged

theogf closed this as completed in #141 Jul 31, 2020
