Tied Weights #488
I can reproduce this with:

```julia
julia> x = param(cu(rand(10, 10)));

julia> sum((x') * x) |> Flux.back!
```

So that's something we should fix. OTOH it's probably not best to write your autoencoder this way. I would write it more like:

```julia
w = param(rand(5, 10))

function m(x)
    encoding = w * x
    decoding = w' * encoding
end
```

This way the transpose of `w` is taken inside the forward pass, so the encoder and the decoder share the same underlying parameter.
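As a rough sketch only (using the current Zygote-based `gradient`/`Flux.params` API rather than the `param`/`back!` calls above; sizes and names are illustrative), the tied-weight function can be exercised end to end like this:

```julia
# Rough sketch with the current Zygote-based API (illustrative, not the
# original Tracker code): the transpose is taken inside the forward pass,
# so encoder and decoder are backed by the single array `w`.
using Flux

w = rand(Float32, 5, 10)          # shared weight matrix

function m(x)
    encoding = w * x              # encoder: 10 -> 5
    decoding = w' * encoding      # decoder: 5 -> 10, reuses the same `w`
    return decoding
end

x  = rand(Float32, 10, 2)
gs = Flux.gradient(() -> Flux.Losses.mse(m(x), x), Flux.params(w))
gs[w]                             # a single gradient, with contributions from both uses of `w`
```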
This is fixed now,

```julia
x = rand(10, 2) |> gpu
encoder = Dense(10, 5, relu)
decoder = Dense(transpose(encoder.W), zeros(Float32, 10), relu)
m = Chain(encoder, decoder) |> gpu

gradient(() -> Flux.Losses.mse(m(x), x), Flux.params(m))
```

works fine.
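As a follow-up sketch (not from the thread; it assumes the same `W` field name and implicit-params API as the snippet above), one way to check the tying on the CPU is that `transpose` returns a lazy view of the encoder's matrix, so an optimiser step on the shared storage moves both layers together:

```julia
# Follow-up sketch (assumes the same `W` field name and implicit-params API
# as the snippet above): `transpose` is a lazy view, so the decoder shares
# the encoder's storage, and one optimiser step updates both layers at once.
using Flux

x       = rand(Float32, 10, 2)
encoder = Dense(10, 5, relu)
decoder = Dense(transpose(encoder.W), zeros(Float32, 10), relu)
m       = Chain(encoder, decoder)

@assert decoder.W.parent === encoder.W    # same underlying array

ps = Flux.params(m)
gs = Flux.gradient(() -> Flux.Losses.mse(m(x), x), ps)
Flux.Optimise.update!(Descent(0.1), ps, gs)

@assert decoder.W.parent === encoder.W    # still tied after the update
```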
Hi guys,
I'm trying to build an autoencoder with tied weights between the encoder and the decoder. The model is easy to define: the decoder layers simply reuse the transposed weights of the encoder layers. Here's an example with just one layer for the encoder and one for the decoder:
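(The original snippet is not preserved in this extract. Purely as a hedged illustration, a one-layer tied-weight autoencoder in the Flux/Tracker API of the time might have looked roughly like this; the layer sizes, the `param`-wrapped bias, and the `mse`/`back!` calls are assumptions, not the reporter's exact code.)

```julia
# Hypothetical reconstruction in the Flux/Tracker API of the time; sizes and
# helper calls are assumptions, not the reporter's exact code.
using Flux   # CuArrays was also needed at the time for GPU support

encoder = Dense(10, 5, relu)                          # tracked weights
decoder = Dense(encoder.W', param(zeros(10)), relu)   # decoder reuses the encoder's weights, transposed
m = Chain(encoder, decoder) |> gpu

x = rand(10, 2) |> gpu
loss = Flux.mse(m(x), x)
Flux.back!(loss)   # errors on the GPU when broadcasting over the transposed weights
```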
This model works quite well when I run it on the CPU. However, when I try to execute it on the GPU, I get a pretty ugly error message (see below). I'm pretty sure something is going wrong with the broadcasting over the transposed array on the GPU.
Do you have any clue about what's happening?