Skip to content
This repository has been archived by the owner on Jan 10, 2024. It is now read-only.

Add support to fuzz from go-fuzz input #35

Merged
merged 2 commits into from
Mar 17, 2020
Merged

Conversation

posener
Copy link
Contributor

@posener posener commented Jan 23, 2020

Add a helper function that enables using gofuzz (this project) with go-fuzz for continuose fuzzing. Essentially, it enables translating the fuzzing bytes from go-fuzz to any Go object using this library.

This change will enable using this project with fuzzing websites such as fuzzit.dev or fuzzbuzz.io.

The underlying implementation was an idea of @lavalamp, by which a random source is created that generates random numbers from the input bytes slice. In this way changes in the source result in logical changes in the random data, which enables the fuzzer to efficiently search the space of inputs.

Fixes #33

fuzz.go Show resolved Hide resolved
fuzz.go Outdated Show resolved Hide resolved
@posener
Copy link
Contributor Author

posener commented Jan 23, 2020

Seems like the build failure has nothing to do with my code.

fuzz.go Outdated Show resolved Hide resolved
@lavalamp
Copy link
Contributor

I'll have to look into the test failure

@posener posener force-pushed the gofuzz branch 6 times, most recently from 9f1dad6 to 019391e Compare January 24, 2020 06:35
@posener posener changed the title Add NewFromGoFuzz Add support to fuzz from go-fuzz input Jan 24, 2020
@posener
Copy link
Contributor Author

posener commented Jan 24, 2020

@lavalamp Thanks for the help in this one.
Please re-review and see if you find the code and style as you like them.
I also updated the commit message and PR message.

BTW, I'll be off the grid in the upcoming week.

Copy link
Contributor

@lavalamp lavalamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great-- my comments are mostly cosmetic.

Thanks!

(p.s. can you edit the commit messages to e.g. not include the @ in front of lavalamp? github sends people email every time anything is done with such commits :) )

bytessource.go Outdated Show resolved Hide resolved
bytessource.go Outdated Show resolved Hide resolved
fuzz_test.go Outdated Show resolved Hide resolved
fuzz_test.go Outdated Show resolved Hide resolved
fuzz.go Show resolved Hide resolved
fuzz.go Show resolved Hide resolved
@posener
Copy link
Contributor Author

posener commented Jan 25, 2020

A few more things:

  • What do you think about the name of the exported function?
  • What do you think about adding something to the README?

Copy link
Contributor

@lavalamp lavalamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can only find 2 tiny nits, sorry for the delay, I got busy.

I'm willing to take this as is, but I was thinking and I think we can improve on it a bit, and since we won't be able to change this after the fact except by introducing a new version of it, I'll say what I'm thinking and you can decide if it's worth changing.

The current version lets the fuzzer lock in the values of the first N random calls, but changing the length of the byte slice completely changes the random sequence for the rest of the calls, in two ways:

  • Since the seed is padded, changing the length changes how much padding is needed, meaning only insertions/removals of 8 bytes keep the same sequence
  • ...but an insertion (removal) of 8 bytes will delay (hasten) the first call to the fallback generator, meaning that although the sequence is the same, it's being called for different locations, which will change every site.

Here's an idea that solves these problems:

  • Use the first 8 bytes as a seed.
  • When returning a Uint64 directly from the data, also get a random Uint64 from the fallback source and throw it away.

I think this makes length changes much less consequential, which should let the fuzzer search with them much more effectively?

bytessource.go Outdated Show resolved Hide resolved
bytessource.go Outdated Show resolved Hide resolved
@lavalamp
Copy link
Contributor

lavalamp commented Feb 3, 2020

oh, and the questions:

  • Yes, a readme change is welcome
  • The name is OK, you could also consider making the data-backed random source public, move the comment there, and just let people chain it together with the RandSource() method. That would also let people reuse this for other purposes, I don't have anything in mind but it feels like the sort of thing that might be useful in general.

@posener
Copy link
Contributor Author

posener commented Feb 4, 2020

Thanks!
Not sure about exposing the struct. I thought that exposing the bare functionality through a function might be easier to use. It can always be exposed later when needed I guess.

@posener posener force-pushed the gofuzz branch 2 times, most recently from a9a49e9 to e11177b Compare February 8, 2020 11:06
Copy link
Contributor

@lavalamp lavalamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final round of nits and it looks like you'll need to rebase, too, sorry...

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
bytesource/bytesource.go Outdated Show resolved Hide resolved
bytesource/bytesource.go Show resolved Hide resolved
// New returns a new ByteSource from a given slice of bytes.
func New(input []byte) *ByteSource {
if len(input) == 0 {
panic("ByteSource was initiated with empty input")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, mention this in the function comment, or (my preference) use a seed of 0 in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea!

bytesource/bytesource.go Outdated Show resolved Hide resolved
bytesource/bytesource.go Outdated Show resolved Hide resolved
@posener
Copy link
Contributor Author

posener commented Feb 11, 2020

Before commiting this, I would like you to take a look at https://github.com/posener/fuzzing.
There, I avoided going through the rand library (except for the fallback that we run out of input bytes), and just decode the bytes to Go types.
I think that approach has slight advantages, regarding the fuzzer exploration, and additionally, it is a bit simpler. Maybe you have ideas how to use that method to work with gofuzz?

Add a helper function that enables using gofuzz (this project) with
go-fuzz (https://github.com/dvyukov/go-fuzz) for continuose
fuzzing. Essentially, it enables translating the fuzzing bytes from
go-fuzz to any Go object using this library.

This change will enable using this project with fuzzing websites such as fuzzit.dev
or fuzzbuzz.io.

The underlying implementation was an idea of lavalamp, by which a random source
is created that generates random numbers from the input bytes slice. In this way
changes in the source result in logical changes in the random data, which enables
the fuzzer to efficiently search the space of inputs.

Fixes google#33
@lavalamp
Copy link
Contributor

Sorry I've been super busy and that was a complicated question :)

Do you have a specific spot in that repo I can look at? I don't think I found what you were talking about in a quick skim...

@posener
Copy link
Contributor Author

posener commented Feb 20, 2020

It is a small repo, you can check the fuzz.go. The technique is similar and simple: create a random seed from the first 8 bytes. Then read an amount of required bytes according to the requested type and use the binary.BigEndian to decode them.

@lavalamp
Copy link
Contributor

lavalamp commented Mar 9, 2020

Ah, I understand now. Yes, I agree, that is nicer. I'm not sure of a way to do it without defining an interface that covers rand.Rand?

(sorry for the delay; busy few weeks!)

@posener
Copy link
Contributor Author

posener commented Mar 11, 2020

Maybe it is better just to close this one.
I'm not sure about it anymore.
Please do so if you agree.

func (s *ByteSource) consumeUint64() uint64 {
var bytes [8]byte
_, err := s.Read(bytes[:])
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a test with data that's a length not divisible by 8-- this might return an EOF at some point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK. https://play.golang.org/p/9uC8M9Nm5pE
But according to the io.Reader interface definition, you are right, the implementation can return io.EOF even when there were bytes that were read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is however tested, when the input bytes are the numbers 1..9.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I do see that test now, sorry!

@lavalamp
Copy link
Contributor

After more thought, I think this is still useful. Thanks for the idea! I did see one more thing that should probably be addressed though, if you still want to pursue this.

@posener
Copy link
Contributor Author

posener commented Mar 17, 2020

Yes sure, lets continue with it. Why not

@lavalamp
Copy link
Contributor

Thanks!

@lavalamp lavalamp merged commit c89cefb into google:master Mar 17, 2020
@posener
Copy link
Contributor Author

posener commented Mar 17, 2020

Thank for the review! It was a pleasure!

@posener posener deleted the gofuzz branch March 17, 2020 20:12
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integrate with fuzzit?
3 participants