Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take testing seriously #73

Closed
wants to merge 5 commits into from
Closed

Take testing seriously #73

wants to merge 5 commits into from

Conversation

mcabbott
Copy link
Member

@mcabbott mcabbott commented Jan 12, 2022

This adds many tests from Zygote. I've done some filtering to leave out the most repetitive ones, things which clearly don't apply (like Params), and things testing rules for external packages.

All which didn't pass are marked broken... with (the local version of) #68. Thus many will fail. Some have the error copied next to them. Exactly two have SILENTLY WRONG ANSWER in the comment (and should probably become issues) but this is not automatically distinguished from just an error.

Probably we should move the existing tests here into another file or two, to be a bit more organised. (done) I also had somewhere a few test cases of rules in ChainRules which are meant to support higher derivatives, but not very well tested there to do so. (some added)

Closes #67, by adding the minimal piracy needed to allow Pairs to have a NamedTuple gradient (since many others tests failed without this).

@Keno
Copy link
Collaborator

Keno commented Jan 12, 2022

ForwardDiff and TaylorSeries are also sources of tests since we have and want to explicitly support forward mode.

@mcabbott
Copy link
Member Author

mcabbott commented Jan 12, 2022

Good point. Ideally as many as possible of these would run all ways. The extent of my attempt at this so far is these lines , which ran into #70 .

Edit: All of ForwardDiff's tests are copied here, and have loops set up to run them with all combinations of reverse jacobain over forward Hessian, etc. But the exotic combinations mostly don't work at the moment.

@mcabbott
Copy link
Member Author

These tests really want a different @test_broken macro, which accepts an error as a broken test, but fails on false -- a silently wrong answer is much worse than an error. Has anyone got one in their desk drawer?

@ToucheSir
Copy link

I looked into this a while back for an unrelated issue, but it seems the functionality behind those macros is not terribly extensible. Perhaps worthy of creating a TestMacros or TestExtras package if one doesn't already exist?

@codecov-commenter
Copy link

codecov-commenter commented Sep 18, 2022

Codecov Report

Merging #73 (1250a47) into main (55d2871) will increase coverage by 2.42%.
The diff coverage is 87.50%.

@@            Coverage Diff             @@
##             main      #73      +/-   ##
==========================================
+ Coverage   55.54%   57.96%   +2.42%     
==========================================
  Files          21       21              
  Lines        2157     2172      +15     
==========================================
+ Hits         1198     1259      +61     
+ Misses        959      913      -46     
Impacted Files Coverage Δ
src/runtime.jl 78.26% <66.66%> (-1.74%) ⬇️
src/extra_rules.jl 57.98% <100.00%> (+10.27%) ⬆️
src/stage1/recurse.jl 96.25% <0.00%> (-0.86%) ⬇️
src/interface.jl 70.90% <0.00%> (ø)
src/stage1/recurse_fwd.jl 94.28% <0.00%> (ø)
src/stage1/generated.jl 78.47% <0.00%> (+0.19%) ⬆️
src/stage1/forward.jl 80.13% <0.00%> (+5.98%) ⬆️
src/tangent.jl 60.00% <0.00%> (+25.95%) ⬆️
src/stage1/broadcast.jl 92.30% <0.00%> (+92.30%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@oscardssmith
Copy link
Member

should this be updated and merged?

@mcabbott
Copy link
Member Author

Will circle back, but last time I tried to update this, so many things were broken that I stopped.

@oscardssmith
Copy link
Member

makes sense. Right now first order scalar forward mode is probably the thing that is most reliable.

@staticfloat
Copy link
Collaborator

staticfloat commented Feb 27, 2023

These tests really want a different @test_broken macro, which accepts an error as a broken test, but fails on false -- a silently wrong answer is much worse than an error. Has anyone got one in their desk drawer?

For what it's worth, this was fixed in JuliaLang/julia#47804, which will ship in Julia v1.10

@mcabbott
Copy link
Member Author

Thanks. But this isn't quite what's wanted here, which is to prioritise silently wrong answers. It does this:

julia> @test_broken error()  # this I want to accept -- broken functionality
Test Broken
  Expression: error()

julia> @test_broken false  # this I would like to fail -- it's a silently wrong gradient
Test Broken
  Expression: false

julia> @test_broken 1  # this changed in 47804, ok
Error During Test at REPL[832]:1
  Expression evaluated to non-Boolean

@staticfloat
Copy link
Collaborator

You don't want @test_broken then, you want @test_throws:

julia> @test_throws ErrorException error()
Test Passed
      Thrown: ErrorException

julia> @test_throws ErrorException false
Test Failed at REPL[5]:1
  Expression: false
    Expected: ErrorException
  No exception thrown
ERROR: There was an error during testing

julia> @test_throws ErrorException 1
Test Failed at REPL[6]:1
  Expression: 1
    Expected: ErrorException
  No exception thrown
ERROR: There was an error during testing

@mcabbott
Copy link
Member Author

Maybe. My hope was that an upgrade which changes an error to a correct answer would cause warnings (to fix at leisure) not failure, but changing an error to false would cause warnings (as now you have silently wrong answers). Maybe describing that as a variant of @test_broken is weird, since "Unexpected Pass" is a failure not a warning.

@mcabbott mcabbott closed this Mar 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failure with ComposedFunction
6 participants