Continuous fuzzing #55

Merged: 7 commits into oklog:master on Sep 4, 2019

Conversation

@bookmoons (Contributor)

Proposing to integrate continuous fuzzing. Fuzzit gives free service for open source projects.

This targets Parse and ParseStrict with the fuzz input directly, plus New by fuzzing the timestamp and entropy.
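
Roughly, the New target works along these lines (a simplified sketch, not the literal code in the PR; the real ulid_fuzz.go may split the input differently):

package ulid

import (
	"bytes"
	"encoding/binary"
)

// FuzzNew treats the first 8 bytes of the fuzz input as the timestamp and
// the remainder as the entropy source for New.
func FuzzNew(data []byte) int {
	if len(data) < 8 {
		return 0 // not enough bytes to form a timestamp
	}
	ms := binary.LittleEndian.Uint64(data[:8])
	if _, err := New(ms, bytes.NewReader(data[8:])); err != nil {
		return 0
	}
	return 1
}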

Setup is like this:

  • In Fuzzit, create the targets ulid-new, ulid-parse, and ulid-parse-strict.
  • In the Fuzzit settings, grab an API key. In the repo settings on Travis, paste it into the FUZZIT_API_KEY environment variable.

Thanks for considering.

@peterbourgon @tsenart

@coveralls commented Aug 22, 2019

Coverage remained the same at 97.959% when pulling 27fd369 on bookmoons:master into e51a56f on oklog:master.

@bookmoons (Contributor, Author)

It looks like the Travis build failed due to missing the API key. I've run successful tests on my own Travis account. If you decide to go for this, getting the API key in should make that work.

@peterbourgon (Member)

Interesting. I'm going to try to build this up in a separate PR. Will update with results.

@peterbourgon (Member)

Oops.

@peterbourgon reopened this on Aug 22, 2019
@peterbourgon (Member) commented Aug 24, 2019

So I think this package is a great candidate for fuzzing, but I'd like to do a few things before committing any kind of change to master. Some of this will seem really basic if you're already familiar with fuzzing, but please bear with me :)

First, and ignoring the continuous fuzzing stuff for now, I want to make sure we all understand exactly what the value proposition is. It's clear that Fuzz{New,Parse,ParseStrict} exercise the robustness of the various parsers, but are we using the input []byte the right way? Should we be parsing it in different ways? And what do the return codes imply?

Second, what should we do with the "interesting" inputs discovered by fuzzing? For example, I ran it locally, and it found something that I converted into this unit test:

package ulid_test

import (
	"bytes"
	"encoding/binary"
	"io"
	"testing"

	"github.com/oklog/ulid"
)

func TestFuzzCorpus(t *testing.T) {
	for _, testcase := range []struct {
		name    string
		ms      []byte
		entropy []byte
		want    error
	}{
		{
			name:    "1",
			ms:      []byte{0x45, 0x6f, 0xef, 0xbf, 0xbd, 0x01, 0x00, 0x00},
			entropy: []byte{0x00, 0x10, 0xbf, 0x80, 0xef, 0xbf, 0xbd, 0x6e, 0x66},
			want:    io.ErrUnexpectedEOF, // need at least 10 bytes of entropy
		},
	} {
		t.Run(testcase.name, func(t *testing.T) {
			var ms uint64
			if err := binary.Read(bytes.NewReader(testcase.ms), binary.LittleEndian, &ms); err != nil {
				t.Fatal(err)
			}
			_, err := ulid.New(ms, bytes.NewReader(testcase.entropy))
			if want, have := testcase.want, err; want != have {
				t.Fatalf("error: want %v, have %v", want, have)
			}
		})
	}
}

Is that the goal? Should we be doing something more? Something different, e.g. keeping the interesting data files in a subdirectory that we commit and parsing them from there?
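
Something like this is what I have in mind (purely a sketch, with a hypothetical testdata/fuzz-corpus directory that doesn't exist in the repo today):

package ulid_test

import (
	"io/ioutil"
	"path/filepath"
	"testing"

	"github.com/oklog/ulid"
)

// TestFuzzCorpusFiles replays committed fuzz inputs through the parsers.
// It only asserts that parsing doesn't panic; errors are expected for most
// corpus entries.
func TestFuzzCorpusFiles(t *testing.T) {
	files, err := filepath.Glob(filepath.Join("testdata", "fuzz-corpus", "*"))
	if err != nil {
		t.Fatal(err)
	}
	for _, file := range files {
		t.Run(filepath.Base(file), func(t *testing.T) {
			data, err := ioutil.ReadFile(file)
			if err != nil {
				t.Fatal(err)
			}
			_, _ = ulid.Parse(string(data))
			_, _ = ulid.ParseStrict(string(data))
		})
	}
}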

Finally, besides the ulid_fuzz.go file, I'd like to find a way to sequester all of the required scaffolding (the updated build environment, fuzz scripts, corpus data, etc.) separately from the existing CI stuff. Is that feasible? Maybe @bookmoons you can provide a link to some other repos where this stuff is integrated that I can look to for examples?

@tsenart any opinions?

@bookmoons (Contributor, Author) commented Aug 24, 2019

Thanks for looking at it @peterbourgon. Going to run through answers to all of these questions.


> Maybe @bookmoons you can provide a link to some other repos where this stuff is integrated that I can look to for examples?

coredns recently got Fuzzit set up. So did systemd.


> It's clear that Fuzz{New,Parse,ParseStrict} exercise the robustness of the various parsers, but are we using the input []byte the right way? Should we be parsing it in different ways? And what do the return codes imply?

Fuzzit uses go-fuzz, a coverage-guided (genetic) fuzzer. It instruments your binary, detects coverage changes from new inputs, and uses that to decide what to mutate next and what to keep the same. Through this natural-selection model it builds up knowledge about your program and automatically walks its branches. So the idea is that you just let it watch what happens with each input and accumulate that information; it will try to hit all reachable behavior.

Returning 0 means the case has no special meaning, so the normal coverage logic applies. Returning 1 means increase the priority of this input; you give that signal when something parses successfully. These values are documented near the top of the go-fuzz README:

> Data is a random input generated by go-fuzz, note that in most cases it is invalid. The function must return 1 if the fuzzer should increase priority of the given input during subsequent fuzzing (for example, the input is lexically correct and was parsed successfully); -1 if the input must not be added to corpus even if gives new coverage; and 0 otherwise; other values are reserved for future use.
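
Applied to Parse, that convention would look roughly like this (sketch only, not the literal target code in the PR):

package ulid

// FuzzParse feeds the raw fuzz input to Parse and reports back to go-fuzz.
// Returning -1 instead would keep an input out of the corpus entirely.
func FuzzParse(data []byte) int {
	if _, err := Parse(string(data)); err != nil {
		return 0 // invalid input: no special priority, rely on coverage feedback
	}
	return 1 // parsed successfully: tell go-fuzz to prioritize this input
}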


> Second, what should we do with the "interesting" inputs discovered by fuzzing? For example, I ran it locally, and it found something that I converted into this unit test

The Fuzzit CI setup has two stages. One is the main fuzzing job, which tries to find new crashes. The other is a fuzz regression test that just replays past crashes: Fuzzit keeps a log of them, and the regression stage is a quick run through that list in CI.

It also tracks a corpus, all the interesting cases that led to new branches in the program. So future fuzzing runs build on the knowledge developed by past runs.

It'll send you an email if there's a new crash. Basically I think you take it as a bug report, get the bug fixed, and then it won't crash again in fuzz regression tests.


> Finally, besides the ulid_fuzz.go file, I'd like to find a way to sequester all of the required scaffolding (the updated build environment, fuzz scripts, corpus data, etc.) separately from the existing CI stuff. Is that feasible?

I'm not sure if there's a way to do this. I think Fuzzit only runs through Travis CI. You could set up go-fuzz to run manually, but then you're maybe not getting it on every commit.

All the corpus data should be tracked by Fuzzit, so I don't think you'll end up with that in the repo. But that CI config seems to be the channel they're using.

@bookmoons (Contributor, Author) commented Aug 25, 2019

There's a successful run under my Travis account if you'd like to see what it looks like.

Fuzz regression tests pull data from Fuzzit and run in Travis. It's a short little test. The main Fuzz test submits to Fuzzit to do a longer run and ends the build as soon as it's in the queue.

https://travis-ci.org/bookmoons/ulid/builds/575119636

@bookmoons (Contributor, Author)

The fasthttp package also just got Fuzzit set up.

valyala/fasthttp#634

@tsenart (Contributor) commented Sep 3, 2019

@bookmoons (Contributor, Author)

I think so, will look into that.

@bookmoons (Contributor, Author)

Added a new target that tests construction by passing fuzz through a MonotonicEntropy. I've tried to exercise the monotonicity logic by constructing 10 ULIDs each iteration.
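
Roughly along these lines (a simplified sketch; the real target's name and exact input handling may differ):

package ulid

import (
	"bytes"
	"encoding/binary"
)

// FuzzNewMonotonic seeds a MonotonicEntropy with the fuzz input and then
// constructs 10 ULIDs for the same timestamp, so the monotonic increment
// path is exercised after the first read.
func FuzzNewMonotonic(data []byte) int {
	if len(data) < 8 {
		return 0
	}
	ms := binary.LittleEndian.Uint64(data[:8])
	entropy := Monotonic(bytes.NewReader(data[8:]), 0)
	for i := 0; i < 10; i++ {
		if _, err := New(ms, entropy); err != nil {
			return 0
		}
	}
	return 1
}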

@bookmoons (Contributor, Author)

Fuzzit have kindly made a ulid org for the project. I've preconfigured all the targets and added @tsenart and @peterbourgon to the org. A test build succeeded and I see fuzzing jobs in the queue:
https://travis-ci.org/bookmoons/ulid/builds/580569854

The final setup step is to get the API key into CI:

  • Grab an API key from Fuzzit.
  • Paste it into FUZZIT_API_KEY in the Travis repo settings.

@tsenart (Contributor) left a comment

Thank you.

@tsenart merged commit a41d229 into oklog:master on Sep 4, 2019