Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added generated seeds for FuzzRangeNodes #35

Merged
merged 1 commit into from
Jul 25, 2022

Conversation

mhutchinson
Copy link
Contributor

@mhutchinson mhutchinson commented Jul 21, 2022

Reading and extrapolating from https://go.dev/doc/fuzz/, the Fuzz test will only be fully exercised when a single Fuzz test is run explicitly with the -fuzz parameter to go test. When running go test without this, the Fuzz test will be invoked, but only on the seed corpus. As FuzzRangeNodes did not have a seed corpus, then this test is effectively skipped in our presubmits and CI pipeline.

This change adds the generated corpus from running the fuzz test as the seed corpus. This shouldn't change how well the test is covered when the -fuzz parameter is provided, but does mean that the test will be exercised as part of CI. This seems like a win, but maybe it's bonkers to be submitting generated files?

The generated corpus was acquired with the following commands:

go test --fuzz=FuzzRangeNodes --fuzztime=20s ./compact
mkdir -p compact/testdata/fuzz
cp -R `go env GOCACHE`/fuzz/github.com/transparency-dev/merkle/compact/FuzzRangeNodes compact/testdata/fuzz

Reading and extrapolating from https://go.dev/doc/fuzz/, the Fuzz test will only be fully exercized when a single Fuzz test is run explicitly with the -fuzz parameter to go test. When running go test without this, the Fuzz test will be invoked, but only on the seed corpus. As FuzzRangeNodes did not have a seed corpus, then this test is effectively skipped in our presubmits and CI pipeline.

This change adds the generated corpus from running the fuzz test as the seed corpus. This shouldn't change how well the test is covered when the -fuzz parameter is provided, but does mean that the test will be exercised as part of CI. This seems like a win, but maybe it's bonkers to be submitting generated files?
@mhutchinson mhutchinson requested a review from a team as a code owner July 21, 2022 10:37
@mhutchinson
Copy link
Contributor Author

@hickford PTAL as the instigator of this fuzzing adventure.

@mhutchinson
Copy link
Contributor Author

It doesn't appear that this generated corpus ever becomes stable; after copying the generated corpus into the seed corpus and running the fuzz command again, new "interesting" entries are always found. I've tried repeating this 3 times and there's always something new.

@hickford
Copy link
Contributor

hickford commented Jul 21, 2022

Indeed go test (rather than go test -fuzz) only runs the seed corpus, currently empty.

Simplest solution is to add some explicit cases with F.add:

for begin := 0; begin <= 10; begin++ {
    for end := begin; end <= 20; end++ {
        f.Add(uint64(end), uint64(end))
    }
}

Ultimately could add the project to https://google.github.io/oss-fuzz/ (continuous fuzzing on Google infrastructure) and https://google.github.io/clusterfuzzlite/ (fuzzes pull requests in GitHub Actions)

@mhutchinson
Copy link
Contributor Author

I considered the F.add approach, but this approach of copying in the generated seeds seems more comprehensive[1] and ~equivalent complexity (swap the complexity of simple code for that of extra files).

[1] My understanding is that the "interestingness" of the test cases is determined via code coverage instrumentation, so using these rather than a simple for loop should allow more interesting test cases that a human might not predict or be lucky enough to capture with the simple for loop seed generation.

@hickford
Copy link
Contributor

Good point

@AlCutter
Copy link
Collaborator

The middle ground could be more along the lines of the classic test cases pattern, with params taken from the generated files:

for s, e, p := range []struct{
  start int64
  end int64
  phaseOfMoon
  }{
    { 1, 3, 6 },
    ...
  } {
   // fuzziness(s, e, p)
  }

I do kinda like the idea of having the fuzz seed data near the fuzz test it's for (thinking of future me wondering what on earth these weird files checked in under testdata/... are) :)

@mhutchinson
Copy link
Contributor Author

There's no need to define the struct and iterate it; you can just repeatedly add things:

f.Add(1, 3, 6)
f.Add(6, 8, 2)
...

But even this approach I find a bit weird; these could just be normal unit tests at this point. I proposed this approach to just lean into the automated generation and tooling that fuzzing provides, but ensure it's being used. I think the world's not ready for this yet though.

For now, I think we just take the easy seeding approach @hickford suggests above: for loops over small numbers to add some seeding, and then see how the future tooling for these files develops. I think we'd want to have:

  • presubmit checks that all fuzz tests have these files
  • presubmit checks that no files exist for fuzz tests that don't exist (I'm thinking of renamed/deleted tests)
  • IDE support that easily links from a fuzz test to the corpus

We should also have a policy that any failures that are found through deeper fuzzing are moved to a standard unit test to prevent regressions.

@AlCutter
Copy link
Collaborator

Having the explicit f.Add lines in the test somewhat removes the need for all the bullets in your list, I think?

I also suspect this blurry line between fuzz and unit test is by design;
essentially you are adding specific regression tests against previously failed inputs that happened to be have been discovered by fuzzing (and/or tests whose parameters are known to have increased test coverage), it's just that you can turbo charge these unit tests to go off on an adventure to find new params.

@hickford
Copy link
Contributor

hickford commented Jul 25, 2022

#37 fuzzes beyond the seed corpus in GitHub Actions using ClusterFuzzLite.

@AlCutter AlCutter merged commit 52891d8 into transparency-dev:main Jul 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants