How to apply L2 regularization to a subset of parameters? #1284

When training a neural network with L2 regularization, it is often advised not to regularize the bias parameters (in contrast with the weight parameters). I implemented this as follows in AlphaZero.jl:
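Roughly along these lines (a sketch of the described approach, not the exact AlphaZero.jl code: the per-layer `regularized_params_` methods are an assumed reconstruction, the `weight` field name varies across Flux versions, and `Flux.modules` only exists in newer releases):

```julia
using Flux

# Per-layer methods name the arrays that should receive the L2 penalty:
# weights only, never biases. (Assumed names; older Flux stored the Dense
# weight as l.W, newer versions as l.weight.)
regularized_params_(l) = ()
regularized_params_(l::Flux.Dense) = (l.weight,)
regularized_params_(l::Flux.Conv) = (l.weight,)

# Collect them over every sub-layer of the model. Flux.modules walks the
# model tree; push! deduplicates shared arrays internally (see below).
function regularized_params(net)
    ps = Flux.Params()
    for l in Flux.modules(net), w in regularized_params_(l)
        push!(ps, w)
    end
    return ps
end
```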
This feels a bit hackish, though, and it also relies on internals, so it tends to break at every new Flux release. Do you see any better way? Shouldn't we make this easier?

Comments

Maybe you can obtain some slight simplification using nonregularized_params_(net) = ....
```julia
function regularized_params(net::FluxNetwork)
    ps = Flux.params(net)            # start from all trainable parameters
    for p in nonregularized_params_(net)
        delete!(ps, p)               # then drop the ones (e.g. biases) to exclude
    end
    return ps
end
```

Either way, with push! or delete!, you can avoid the presence check: it's done internally.
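Either variant might then be plugged into the loss, something like this (a sketch; the `mse` loss and the `1f-4` coefficient are placeholders):

```julia
sqnorm(w) = sum(abs2, w)

# The penalty sums over the filtered parameters only, so biases never
# contribute to it.
l2_penalty(net) = sum(sqnorm, regularized_params(net))

loss(net, x, y) = Flux.Losses.mse(net(x), y) + 1f-4 * l2_penalty(net)
```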
Maybe we can implement in Flux something similar to …
I don't see how the …
Sorry, forget what I said, I was being stupid. Yes, I don't see how to simplify this, besides making a …
Currently there isn't a simple way to filter out biases specifically. I can see this becoming a real need for bigger models. It will be a bit of a manual process with the current infrastructure, since we don't currently distinguish between weights and biases (both are assumed to be parameters), but defining a functor that splits these out would do the trick.
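One possible reading of that suggestion (a sketch, not an existing Flux API; it assumes biases are stored as 1-D vectors, so anything with more dimensions counts as a weight):

```julia
using Flux

# Walk every leaf of the model with fmap and collect only multi-dimensional
# arrays; 1-D vectors (biases, norm-layer scales) are skipped. fmap visits
# shared arrays once, and the rebuilt model it returns is simply discarded.
function weight_params(m)
    ps = Flux.Params()
    Flux.fmap(m) do x
        x isa AbstractArray{<:Number} && ndims(x) > 1 && push!(ps, x)
        return x
    end
    return ps
end
```

With that, `sum(sqnorm, weight_params(model))` penalizes weights only, with no per-layer method definitions needed.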