fuzz: new fuzz package and lnwire parsing harnesses #1895
8fef0ae
to
f73abf3
Compare
I fuzzed the watchtower messages and a couple of the lnwire messages for 70+ hours each, but found nothing. I haven't had much luck fuzzing the standard lnwire messages. I will add more updates to this branch when I have time, such as building correctly formatted messages for fields that require a specific encoding. I think I will design a new fuzzing harness in the same fashion as Bitcoin Core's. Besides the lnwire and wtwire messages, I am open to suggestions for fuzzing other things in the project. I currently can't fuzz consistently, which hampers bug finding. If anybody has computational resources, HIT ME UP. I have also been toying with the idea of using llgo with KLEE to run symbolic execution tests. This could even be combined with fuzzing, as was done in the DARPA CGC (see: Driller).
Ping on this PR: is it good to go/review as is, or do you plan to address the lingering TODOs (or update the list)?
I'm going to do the first TODO and commit that, and then it should be ready to review. The second TODO should be in a different PR because generating better inputs can get quite involved (see above comment). For the third TODO, I'll be able to start fuzzing at work, so I can run for a long time and it doesn't need a PR.
Experiencing this problem: dvyukov/go-fuzz#195 because of go modules. I was able to get around it, but the workaround isn't ideal or neat because it doesn't put the binaries in the project directory.
I wasn't entirely sure up until this point that separating harnesses was a good idea, but I am now positive that it is. I was using Bitcoin Core as a guide, which switches on input even though that is less efficient: it can confuse the fuzzer, which ends up using input intended for one target on a different target. Separate harnesses also let us know when we have reached the maximum coverage for a specific target (like the I am maybe a bit confused on what to do about the "better corpus messages" because I have read (I'll try to find the link) that an "ideal" corpus is one that is small in size yet generates the maximum coverage. I am not sure how I feel about this, but I can leave that for a later time. As for my comment about symbolic execution, that is totally a bust. The library that converts Go to LLVM will only build in Docker with a very old version of Go, and it will not work with KLEE (the symbolic execution engine). In general, I don't know enough about symbolic execution anyway, and it's sort of unrelated.
84f79b0
to
7064f9d
Compare
Should be ready for review @Roasbeef
9c0543d
to
4eb7e40
Compare
Removed message prefixes from the seed corpus, so everything is good to go now. Going to do some coverage comparison tests on a compute engine instance (new approach vs. old, general harness). Will post results.
Old, slightly patched harness fuzzes some messages more than others (24 hour test on an 8 core machine running 2 workers, ~14k execs/sec, resulting corpus size of 72, which is extremely small given the time). Specifically Coverage here: https://crypt-iq.github.io/count_wirefuzz.html
Added zlib harnesses to hit the
0502b2a
to
f612ef6
Compare
Updated the PR with just 2 commits and included better building instructions since
f612ef6
to
c7f74b4
Compare
c7f74b4
to
2445034
Compare
Removed the corpus from the PR as it's overly burdensome to review with hundreds of files in the way. Updated coverage is here.
Stoked to finally see this land. Mostly some high level questions w.r.t methodology, also haven't yet run the latest versions of the per-message fuzzers.
)

// Fuzz is used by go-fuzz.
func Fuzz(data []byte) int {
High level question: why do certain message targets like AcceptChannel use a custom scenario rather than the fuzz.Harness scenario as many of these below do?
Yeah, so this could have been more explicit. It's because of parsing UpfrontShutdownScript: if it's empty (a nil slice) before the test starts, it will have been converted into a non-nil slice by the end of the test (the decoding+encoding changes it). Since there's no way to reflect.DeepEqual only a subset of a struct (or to switch on the error / determine which field the error occurred in), I had to do it this way.
@@ -0,0 +1,21 @@
// +build gofuzz

package fundinglocked
What's the rationale for each of them having a new package rather than just a series of files in a blanket package?
Yeah, so at the time of writing the tests you couldn't have multiple Fuzz funcs in the same package, but that's changed now, so I can change it to this. It would make the directory look cleaner too.
fuzz/lnwire/fuzz_utils.go
Outdated
// PrefixWithMsgType takes []byte and adds a wire protocol prefix
// to make the []byte into an actual message to be used in fuzzing.
func PrefixWithMsgType(data []byte, prefix lnwire.MessageType) []byte {
	prefixBytes := make([]byte, 2)
Alternatively we can just declare it as:
var prefixBytes [2]byte
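Spelled out, the array-based variant might look roughly like the sketch below. The local `MessageType` is a stand-in for lnwire.MessageType (assumed here to be a uint16), and the big-endian encoding is likewise an assumption, since the rest of the function body isn't shown in the diff.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// MessageType is a local stand-in for lnwire.MessageType
// (assumed to be a uint16 for this sketch).
type MessageType uint16

// PrefixWithMsgType prepends a 2-byte message type to data. The prefix
// lives in a fixed-size array, so no separate make call is needed.
func PrefixWithMsgType(data []byte, prefix MessageType) []byte {
	var prefixBytes [2]byte
	// Big-endian byte order is assumed here.
	binary.BigEndian.PutUint16(prefixBytes[:], uint16(prefix))
	return append(prefixBytes[:], data...)
}

func main() {
	out := PrefixWithMsgType([]byte{0xaa}, MessageType(33))
	fmt.Printf("%x\n", out) // 0021aa
}
```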
Will get rid of all the packages. NOTE: Some of the harnesses are a bit different (
Flattened all packages to single
LGTM 🦅
I think this is ready to land after the fixup commits have been squashed, and this PR rebased to master.
c0f2ef6
to
cdd33f8
Compare
@Roasbeef rebased+squashed, not sure why Travis is failing. I would also like to bring to your attention 0b78f59 which errors if the decoded
lnwire/query_short_chan_ids.go
Outdated
// Check whether the encodingType is valid or not.
if encodingType > EncodingSortedZlib {
	return 0, nil, ErrUnknownShortChanIDEncoding(encodingType)
}
Is it possible just to remove this whole check, i.e. the zero length check? To me it looks like both of the existing sections below handle the zero-length case gracefully, and the error will instead be produced by the default case.
Yeah I can just remove this check
Removed the check, but had to add a special check here. The first ReadMessage call would succeed and return an empty list of ShortChannelIDs (it would succeed because there would be bytes, but not enough to be a zlib-encoded *ShortChannelID); the subsequent call would error on ReadMessage when creating the zlib reader because there were no more bytes left to be read.
On second thought, removing this check makes TestReplyChannelRangeEmpty fail. Now any QueryShortChanIDs or ReplyChannelRange message with zlib encoding and no ShortChannelIDs will fail with unable to create zlib reader: unexpected EOF. I don't think the peer should fail on a legitimate message like this.
e303410
to
7d28d7e
Compare
// At this point, if there's no body remaining, then only the encoding
// type was specified, meaning that there're no further bytes to be
// parsed.
if len(queryBody) == 0 {
i think we just need to move the check to the beginning of the zlib case and the tests should pass again?
yup that should work
New test failure?
Have to do this: #1895 (comment)
7d28d7e
to
363bdc4
Compare
Ran all the fuzzers one last time and found no more bugs with the fuzzing harness, from what I can tell. Should be g2g now.
This PR updates the fuzzing harness and converts it into one harness per message, updates the README for updated building instructions, and updates the corpus to include support for the newer messages. A new fuzz package was introduced which contains all of the message harnesses.
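As a rough illustration of the one-harness-per-message idea, a single target in go-fuzz style might look like the sketch below. The `ping` type, its wire layout, and `FuzzPing` are hypothetical and exist only for this example; they are not the harnesses added in this PR.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// ping is a hypothetical message: a uint16 value followed by
// length-prefixed padding. It is not an lnwire type.
type ping struct {
	NumPongBytes uint16
	Padding      []byte
}

// decodePing parses the hypothetical wire layout.
func decodePing(data []byte) (*ping, error) {
	if len(data) < 4 {
		return nil, fmt.Errorf("short message: %d bytes", len(data))
	}
	padLen := int(binary.BigEndian.Uint16(data[2:4]))
	if len(data)-4 < padLen {
		return nil, fmt.Errorf("truncated padding")
	}
	return &ping{
		NumPongBytes: binary.BigEndian.Uint16(data[0:2]),
		Padding:      data[4 : 4+padLen],
	}, nil
}

// encode serializes the message back into the same layout.
func (p *ping) encode() []byte {
	out := make([]byte, 4, 4+len(p.Padding))
	binary.BigEndian.PutUint16(out[0:2], p.NumPongBytes)
	binary.BigEndian.PutUint16(out[2:4], uint16(len(p.Padding)))
	return append(out, p.Padding...)
}

// FuzzPing targets exactly one message type. It follows the go-fuzz
// return convention: 1 raises the input's priority, 0 is neutral.
func FuzzPing(data []byte) int {
	msg, err := decodePing(data)
	if err != nil {
		return 0
	}
	// Round trip: re-encoding and decoding must yield the same message.
	again, err := decodePing(msg.encode())
	if err != nil {
		panic("re-decoding an encoded message failed")
	}
	if again.NumPongBytes != msg.NumPongBytes ||
		!bytes.Equal(again.Padding, msg.Padding) {
		panic("round trip mismatch")
	}
	return 1
}

func main() {
	fmt.Println(FuzzPing([]byte{0, 1, 0, 2, 0xaa, 0xbb}))
	fmt.Println(FuzzPing([]byte{0}))
}
```

Because each target only ever sees input meant for its own message, coverage for that message can be tracked on its own.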
TODO:
Update the README with newer build instructions.
Build script
Update the go-fuzz fuzzing corpus to include support for the newer lnwire messages.
Add a fuzzing harness for each lnwire message so that things like correct encodings / signatures can be added and so the fuzzer doesn't get "confused".
Future Work:
Flame graph tooling.
Fuzz the wtwire messages.
Add better corpus messages for increased code coverage.
Add correct signatures / encodings in the harnesses and other things the fuzzer won't guess.
Run for a long time. See: Offering my Bitcoin fuzzers bitcoin/bitcoin#11045 (comment)