
Update docstrings in upsample.jl, recurrent.jl, and normalise.jl #1995

Merged: 15 commits into FluxML:master from docstring-for-layers, Jun 26, 2022

Conversation

Saransh-cpp (Member)

Added missing docstrings (in the manual), updated the existing ones, and added doctests in the following files:

  • normalise.jl
  • recurrent.jl
  • upsample.jl

PR Checklist

  • Tests are added
  • Entry in NEWS.md
  • Documentation, if applicable

Saransh-cpp marked this pull request as draft June 11, 2022 08:43
codecov-commenter commented Jun 11, 2022

Codecov Report

Merging #1995 (2ab42cd) into master (0b01b77) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1995   +/-   ##
=======================================
  Coverage   87.10%   87.10%           
=======================================
  Files          20       20           
  Lines        1528     1528           
=======================================
  Hits         1331     1331           
  Misses        197      197           
Impacted Files            Coverage Δ
src/layers/normalise.jl   88.81% <ø> (ø)
src/layers/recurrent.jl   90.00% <ø> (ø)
src/layers/upsample.jl    69.56% <ø> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Saransh-cpp (Member, Author)

I tried a lot of filters, but I can't get the doctests for the Dropout layer to pass. The primary cause of the failure is the presence of 0.0 and -0.0 in the output, which the doctest output matching can't handle.

For example, the regex pattern at https://regex101.com/r/1KtxKj/1 correctly handles the 0.0/-0.0 case, but it still fails when the doctests run.

I have converted the jldoctest block to a julia block for the moment, but I will revert it if I find a working filter.

Saransh-cpp marked this pull request as ready for review June 11, 2022 10:52
Saransh-cpp (Member, Author)

Additionally, should the dropout function stay in the manual, or should it be removed (is it internal API)? If it stays, should I add examples for it too?

ToucheSir (Member) commented Jun 11, 2022

Instead of trying to match the output, you can add a check that the fraction of zeros equals the dropout rate ± some margin of error; a sketch follows below.
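
A minimal sketch of that check, assuming the free-function form Flux.dropout(x, p); the sample size and tolerance here are illustrative, not what landed in the PR:

```julia
using Flux

x = randn(Float32, 10_000)
p = 0.5
y = Flux.dropout(x, p)  # zeroes roughly a fraction p of the entries, scales the rest by 1/(1 - p)

# The fraction of exact zeros should match p up to sampling noise.
isapprox(count(iszero, y) / length(y), p; atol = 0.02)  # true with overwhelming probability
```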

I just checked the docs for dropout and it explicitly guides users towards Dropout, so additional examples are not necessary.

Saransh-cpp (Member, Author)

Thank you for the review, @ToucheSir!

> These should be unprefixed and imported from Statistics if not already.

Should I import Statistics in every doctest, or should I import it only in the first doctest and group the other doctests together?

> Instead of using comments and having to filter out the random numbers, why not compare directly against the thing in the comment? e.g. std(y[:, :, 1, 1]) \approx 1 && std(y[:, :, 1, 1]) \notapprox std(xs[:, :, 1, 1]) (can't find the symbols right now)

Yes, this would make the doctests much cleaner. However, I could not use the \approx symbol here because its default tolerance is too tight. Should I use the isapprox function with atol = some quantity (or perhaps rtol) here, along the lines of the sketch below?
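
For illustration, a minimal sketch of the isapprox route, using a random vector as a stand-in for a normalised layer output (the tolerances are assumptions, not the values from the PR):

```julia
using Statistics

y = randn(Float32, 1000)  # stand-in for an output whose std should be ~1
y_std = std(y)

y_std ≈ 1                       # almost certainly false: ≈ defaults to a tight relative tolerance
isapprox(y_std, 1; atol = 0.1)  # true, up to sampling noise
```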

> For this, I think it would actually be good to feed a fixed input through and display how the output changes, because the operation is not terribly intuitive.

I did this initially (with random values; I really need to stop using random values), but the extra vertical space it took changed my mind. I'll revert it!
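
If this refers to the upsampling docstrings, a fixed-input example might look like the following sketch (assuming Flux re-exports NNlib's upsample_nearest; the actual docstring may use different values):

```julia
using Flux

# WHCN layout: a 2×2 single-channel image with batch size 1.
x = reshape(Float32[1 2; 3 4], (2, 2, 1, 1))

Flux.upsample_nearest(x, (2, 2))[:, :, 1, 1]
# 4×4 Matrix{Float32}:
#  1.0  1.0  2.0  2.0
#  1.0  1.0  2.0  2.0
#  3.0  3.0  4.0  4.0
#  3.0  3.0  4.0  4.0
```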

Saransh-cpp force-pushed the docstring-for-layers branch from 914b12d to 10a2b52 June 11, 2022 20:34
Saransh-cpp and others added 2 commits June 12, 2022 13:01 (Co-authored-by: Brian Chen <ToucheSir@users.noreply.github.com>)
Saransh-cpp requested a review from ToucheSir June 14, 2022 09:22

Review thread on src/layers/recurrent.jl (diff context):

    # Examples
    ```jldoctest; filter = r"[+-]?([0-9]*[.])?[0-9]+"
    julia> r = RNN(1 => 1);
Member:

It would be interesting to set the weight and bias to fixed values (even [1.0f0] might work, but I'm sure you could come up with something better) and show deterministic outputs, much like you have for the accum example above. Really hammer home the iterative nature of the RNN interface.

Saransh-cpp (Member, Author):

I tried initializing an RNN with weights by passing them in through the RNNCell's constructor, but for some reason I cannot get it to work. Could you please let me know how I can set the weights to a fixed value?

ToucheSir (Member) commented Jun 23, 2022:

Can you share what you tried? That should work.

Saransh-cpp (Member, Author) commented Jun 24, 2022:

I tried everything again today and it works 😅

```julia
julia> r = Flux.RNNCell(σ, zeros(1,1), zeros(1,1), zeros(1,1), zeros(1,1));

julia> y = Flux.Recur(r, ones(1,1));

julia> y.state
1×1 Matrix{Float64}:
 1.0

julia> y(ones(1,1))  # σ(0)
1×1 Matrix{Float64}:
 0.5

julia> y.state
1×1 Matrix{Float64}:
 0.5

julia> Flux.reset!(y)
1×1 Matrix{Float64}:
 0.0

julia> y.state
1×1 Matrix{Float64}:
 0.0
```

I am not sure whether users are supposed to access these constructors like this. If they are, then I can probably add this example to the doctests.

Member:

Maybe just add a comment that one does not usually create an RNN this way, but that this shows how data and hidden states are propagated?

Saransh-cpp (Member, Author):

Not as clearly as the accum example shows it. I looked through the documentation but couldn't figure out how to use RNNCell to create an example similar to the one using accum. But this example does show that reset! resets the state to the RNNCell's state0 rather than to the initial state passed to Recur.

Member:

You can choose a slightly easier-to-interpret activation function for the cell. I think sigmoid is a little hard to visualize, but there may be something better than just identity.

Saransh-cpp (Member, Author):

relu might look good here. I have updated the example to use nontrivial weights instead of setting them all to 0:

```julia
julia> r = Flux.RNNCell(relu, ones(1,1), zeros(1,1), ones(1,1), zeros(1,1));

julia> y = Flux.Recur(r, ones(1,1));

julia> y.state
1×1 Matrix{Float64}:
 1.0

julia> y(ones(1,1))  # relu(1*1 + 1)
1×1 Matrix{Float64}:
 2.0

julia> y.state
1×1 Matrix{Float64}:
 2.0

julia> Flux.reset!(y)
1×1 Matrix{Float64}:
 0.0

julia> y.state
1×1 Matrix{Float64}:
 0.0
```

Member:

Looks good!

ToucheSir (Member) left a review:

Failure on Nightly is from ComponentArrays and can be safely ignored. Thanks for your patient work on this, @Saransh-cpp!

ToucheSir merged commit 97f981b into FluxML:master Jun 26, 2022
Saransh-cpp deleted the docstring-for-layers branch June 26, 2022 15:33
Saransh-cpp (Member, Author) commented Jun 26, 2022

Thank you for all the constructive reviews! 🎉🎉
