Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore Zeros when gathering parameters #1281

Closed
wants to merge 2 commits into from
Closed

Ignore Zeros when gathering parameters #1281

wants to merge 2 commits into from

Conversation

pxl-th
Copy link
Member

@pxl-th pxl-th commented Jul 12, 2020

This fixes #1277.
IMO, it makes more sense to ignore disabled parameters when gathering model parameters, than have something that is not used and may lead to confusion.

PR Checklist

  • Tests are added
  • Entry in NEWS.md
  • Documentation, if applicable
  • Final review from @dhairyagandhi96 (for API changes).

@DhairyaLGandhi
Copy link
Member

DhairyaLGandhi commented Jul 12, 2020

Removing Zeros from the list params might give incorrect results for loading complete chains where it is expected to see the bias term appear.

Another way of looking at it is, if you see an array where you try loading it with nothing (instead of zeros) you break the assumption that it might be populated with legitimate values if you train it further. If you see Flux.Zeros in the bias, ignore it, if you see an array either convert it to Flux.Zeros or load it with zeros.

@pxl-th
Copy link
Member Author

pxl-th commented Jul 12, 2020

In this case there is an issue, if we have a model with all biases disabled:

m = Chain(
    Conv((3, 3), 1=>2, bias=Flux.Zeros()),
    Conv((3, 3), 2=>3, bias=Flux.Zeros()),
    Conv((3, 3), 3=>4, bias=Flux.Zeros()),
)

Then in parameters we have Flux.Zeros only from first convolution, since parameters are stored in Zygote.Params which wraps IdSet{Any} and so we effectively loose information about all other disabled biases.

for p in Flux.params(m)
    println(typeof(p))
end

>>> Output:
Array{Float32,4}
Flux.Zeros{Bool,0}
Array{Float32,4}
Array{Float32,4}

@DhairyaLGandhi
Copy link
Member

The crucial thing to note here is that we check for the presence of Flux.Zeros in the model side. That way the information is preserved. To make them different, we could make the struct mutable to create new instances of it with every call.

@pxl-th
Copy link
Member Author

pxl-th commented Jan 13, 2022

Closing as it is not relevant anymore

@pxl-th pxl-th closed this Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Flux.Zeros conflicts with Flux.loadparams!
2 participants