Continuous fuzzing #55
Conversation
It looks like the Travis build failed because the API key is missing. I've run successful tests on my own Travis account. If you decide to go for this, getting the API key in should make that work.
Interesting. I'm going to try to build this up in a separate PR. Will update with results.
Oops.
So I think this package is a great candidate for fuzzing, but I'd like to do a few things before committing any kind of change to master. Some of this will seem really basic if you're already familiar with fuzzing, but please bear with me :)
First, and ignoring the continuous fuzzing stuff for now, I want to make sure we all understand exactly the value proposition. It's clear that Fuzz{New,Parse,ParseStrict} exercise the robustness of the various parsers, but are we using the input []byte the right way? Should we be parsing it in different ways? And what do the return codes imply?
Second, what should we do with the "interesting" inputs discovered by fuzzing? For example, I ran it locally, and it found something that I converted into this unit test:
func TestFuzzCorpus(t *testing.T) {
for _, testcase := range []struct {
name string
ms []byte
entropy []byte
want error
}{
{
name: "1",
ms: []byte{0x45, 0x6f, 0xef, 0xbf, 0xbd, 0x01, 0x00, 0x00},
entropy: []byte{0x00, 0x10, 0xbf, 0x80, 0xef, 0xbf, 0xbd, 0x6e, 0x66},
want: io.ErrUnexpectedEOF, // need at least 10 bytes of entropy
},
} {
t.Run(testcase.name, func(t *testing.T) {
var ms uint64
if err := binary.Read(bytes.NewReader(testcase.ms), binary.LittleEndian, &ms); err != nil {
t.Fatal(err)
}
_, err := ulid.New(ms, bytes.NewReader(testcase.entropy))
if want, have := testcase.want, err; want != have {
t.Fatalf("error: want %v, have %v", want, have)
}
})
}
}
Is that the goal? Should we be doing something more? Something different, i.e. keeping the interesting data files in a subdirectory that we commit and parsing them from there?
Finally, besides the ulid_fuzz.go file, I'd like to find a way to sequester all of the required scaffolding (the updated build environment, fuzz scripts, corpus data, etc.) separately from the existing CI stuff. Is that feasible? Maybe @bookmoons you can provide a link to some other repos where this stuff is integrated that I can look to for examples?
@tsenart any opinions?
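To make the committed-subdirectory idea concrete, a replay test along these lines would work. This is only a minimal sketch: the testdata/corpus path and the TestCommittedCorpus name are hypothetical, and it assumes the usual imports of io/ioutil, path/filepath, and testing alongside the ulid package.

func TestCommittedCorpus(t *testing.T) {
	// Hypothetical layout: one file per interesting input under testdata/corpus.
	files, err := filepath.Glob(filepath.Join("testdata", "corpus", "*"))
	if err != nil {
		t.Fatal(err)
	}
	for _, file := range files {
		file := file
		t.Run(filepath.Base(file), func(t *testing.T) {
			data, err := ioutil.ReadFile(file)
			if err != nil {
				t.Fatal(err)
			}
			// Only robustness is asserted: parse errors are fine, panics are not.
			_, _ = ulid.Parse(string(data))
			_, _ = ulid.ParseStrict(string(data))
		})
	}
}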
Thanks for looking at it @peterbourgon. Going to run through answers to all of these questions.
coredns recently got Fuzzit set up. So did systemd.
Fuzzit uses go-fuzz, which is a genetic (coverage-guided) fuzzer. It instruments your binary, detects coverage changes with new inputs, and uses that to decide what to mutate next and what to keep the same. It uses this kind of natural-selection model to evolve knowledge about your program and automatically walk all the branches. The idea is you just let it watch what happens with each input and build up that information; it will try to hit all possible behavior. The return codes tell go-fuzz how to treat an input: return 1 to mark it interesting and give it priority in the corpus, -1 to keep it out of the corpus entirely, and 0 to express no preference.
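To make that concrete, a minimal sketch of a go-fuzz target for Parse could look like the following (illustrative only, not necessarily the exact ulid_fuzz.go in this PR):

// +build gofuzz

package ulid

// Fuzz is the go-fuzz entry point. The return value steers the fuzzer:
// 1 prioritizes this input in the corpus, -1 forbids adding it, and 0
// expresses no preference.
func Fuzz(data []byte) int {
	if _, err := Parse(string(data)); err != nil {
		return 0 // rejected input: expected, nothing special
	}
	return 1 // parsed as a valid ULID: rare, so keep and prioritize it
}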
The Fuzzit CI setup has two stages. One is the main fuzzing job that tries to find new crashes. The other is a fuzz regression test that just replays past crashes: Fuzzit keeps a log of those crashes, and the regression stage is a quick run of that list in CI. Fuzzit also tracks a corpus of all the interesting cases that led to new branches in the program, so future fuzzing runs build on the knowledge developed by past runs. It will send you an email if there's a new crash. Basically you take it as a bug report, fix the bug, and then that input won't crash again in the fuzz regression tests.
I'm not sure if there's a way to do this. I think Fuzzit only runs through Travis CI. You could set up go-fuzz to run manually, but then you might not be running it on every commit. All the corpus data should be tracked by Fuzzit, so I don't think you'll end up with that in the repo. But that CI config seems to be the channel they're using.
There's a successful run under my Travis account if you'd like to see what it looks like. Fuzz regression tests pull data from Fuzzit and run in Travis; it's a short test. The main fuzz stage submits to Fuzzit for a longer run and ends the build as soon as the job is in the queue.
The fasthttp package also just got Fuzzit set up.
Can we cover this too? |
I think so, will look into that. |
Added a new target that tests construction by passing fuzz through a |
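The comment above is cut off, but a construction-fuzzing target could look roughly like this sketch, which assumes the first 8 bytes of the fuzz input become the millisecond timestamp and the remainder is used as the entropy reader (the name FuzzNew and the exact split are illustrative, not necessarily what this PR does):

// +build gofuzz

package ulid

import (
	"bytes"
	"encoding/binary"
)

// FuzzNew exercises construction rather than parsing: it carves a
// timestamp out of the fuzz input and feeds the rest to New as the
// entropy source.
func FuzzNew(data []byte) int {
	if len(data) < 8 {
		return -1 // not enough bytes for a timestamp; never keep this input
	}
	ms := binary.LittleEndian.Uint64(data[:8])
	if _, err := New(ms, bytes.NewReader(data[8:])); err != nil {
		return 0 // construction failed, e.g. too little entropy or time overflow
	}
	return 1 // constructed a ULID; mark the input as interesting
}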
Fuzzit have kindly made a ulid org for the project. I've preconfigured all the targets and added @tsenart and @peterbourgon to the org. A test build succeeded and I can see fuzzing jobs in the queue. The final setup step is to get the API key into CI.
Thank you.
Proposing to integrate continuous fuzzing. Fuzzit gives free service for open source projects.
This targets Parse and ParseStrict by input, and New by fuzzing the timestamp and entropy. Setup is like this:
Fuzzit targets: ulid-new, ulid-parse, ulid-parse-strict.
CI authenticates to Fuzzit with the FUZZIT_API_KEY secret.
Thanks for considering.
@peterbourgon @tsenart