fuzz: new fuzz package and lnwire parsing harnesses #1895

Crypt-iQ · 2018-09-13T01:24:02Z

This PR converts the fuzzing harness into one harness per message, updates the README with new build instructions, and extends the corpus to cover the newer messages. It also introduces a new fuzz package that contains all of the message harnesses.

TODO:

Update the README with newer build instructions.
Build script
Update the go-fuzz fuzzing corpus to include support for the newer lnwire messages.
Add a fuzzing harness for each lnwire message so that things like correct encodings / signatures can be added and so the fuzzer doesn't get "confused".

Future Work:

Flame graph tooling.
Fuzz the wtwire messages.
Add better corpus messages for increased code coverage.
Add correct signatures / encodings in the harnesses and other things the fuzzer won't guess.
Run for a long time. See: Offering my Bitcoin fuzzers bitcoin/bitcoin#11045 (comment)

Crypt-iQ · 2018-11-17T11:24:33Z

I fuzzed watchtower messages and a couple of the lnwire messages for ~70+ hours each, but came up short. I haven't been having much luck fuzzing the standard lnwire messages - I will add more updates to this branch when I have time like formatting correct messages that require encoding and stuff like that. I think I will design a new fuzzing harness in the same fashion that bitcoin-core does theirs.

Besides the lnwire and wtwire messages, I am open to suggestions for fuzzing other things in the project. I currently don't have the ability to fuzz consistently, so this is hampering any bug finding - if anybody has computational resources, HIT ME UP. I have also been toying with the idea of using llgo with klee to run symbolic execution tests. This could even be combined with fuzzing, as was done in the DARPA CGC (see: Driller).

Roasbeef · 2018-12-04T04:02:25Z

Ping on this PR, is it good to go/review as is, or do you plan to address the lingering TODO's (or update the list)?

Crypt-iQ · 2018-12-05T15:29:10Z

I'm going to do the first TODO and commit that, and then it should be ready to review. The second TODO should be in a different PR because generating better inputs can get quite involved (see above comment). For the third TODO, I'll be able to start fuzzing at work, so I can run for a long time and it doesn't need a PR.

Crypt-iQ · 2018-12-18T16:30:39Z

Experiencing this problem: dvyukov/go-fuzz#195 because of go modules. Was able to get around it though, but not ideal / neat because it doesn't put the binaries in the project directory.

Crypt-iQ · 2018-12-19T19:24:42Z

I wasn't entirely sure up until this point that separating harnesses was a good idea, but I am now positive that it is. I was using Bitcoin Core as a guide, which switches on the input to pick a target; that is less efficient because it can confuse the fuzzer into using input intended for one target on a different target. Separate harnesses also let us know when we have reached the maximum coverage for a specific target (like the init message, for example), though it can be argued that we should continue to fuzz even once coverage stops changing; see this comment.
I am currently fuzzing all protocol messages and have noticed a ceiling for the coverage (~1850), but I think this is because go-fuzz is in this "confused" state and that our coverage could easily increase with better, split-up harnesses. So far, I have only made a fuzzing harness for the init message.
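
To make the difference between the two harness styles concrete, here is a minimal sketch. Everything in it is an assumption for illustration: decodeInit and decodePing are hypothetical stand-ins rather than lnwire functions, and the type numbers 16 and 18 are only sample selectors. The return values follow the go-fuzz convention (1 to prioritize an input, 0 otherwise, -1 to keep it out of the corpus).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errShort = errors.New("short read")

// decodeInit is a hypothetical stand-in for an init-message decoder: it
// expects two byte fields, each prefixed with a 2-byte big-endian length.
func decodeInit(p []byte) error {
	for i := 0; i < 2; i++ {
		if len(p) < 2 {
			return errShort
		}
		n := int(binary.BigEndian.Uint16(p[:2]))
		p = p[2:]
		if len(p) < n {
			return errShort
		}
		p = p[n:]
	}
	return nil
}

// decodePing is a hypothetical stand-in that only checks a minimum length.
func decodePing(p []byte) error {
	if len(p) < 4 {
		return errShort
	}
	return nil
}

// score maps a decode result onto go-fuzz style return values.
func score(err error) int {
	if err != nil {
		return 0
	}
	return 1
}

// fuzzAll is the general style: the first two bytes of the input select the
// target, so mutations that change those bytes silently retarget the input.
func fuzzAll(data []byte) int {
	if len(data) < 2 {
		return -1
	}
	switch binary.BigEndian.Uint16(data[:2]) {
	case 16:
		return score(decodeInit(data[2:]))
	case 18:
		return score(decodePing(data[2:]))
	default:
		return -1
	}
}

// fuzzInit is the split-up style: every input byte goes to one decoder.
func fuzzInit(data []byte) int {
	return score(decodeInit(data))
}

func main() {
	fmt.Println(fuzzAll([]byte{0, 16, 0, 0, 0, 0}))
	fmt.Println(fuzzAll([]byte{0, 99, 0, 0}))
	fmt.Println(fuzzInit([]byte{0, 0, 0, 0}))
}
```

With the dedicated harness, a coverage plateau can be attributed to a single decoder, which is harder to do when one corpus is shared across all targets.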

I am maybe a bit confused on what to do about the "better corpus messages" because I have read (I'll try to find the link) that an "ideal" corpus is one that is small in size yet generates the maximum coverage. I am not sure how I feel about this, but I can leave that for a later time.

As far as my comment about symbolic execution, that is totally a bust. The library that converts golang to LLVM will only build in docker with a super old version of go and it will not work with KLEE (the symex). In general, I don't know enough about symbolic execution anyways. And it's sort of unrelated.

Crypt-iQ · 2018-12-24T00:39:28Z

I was talking to @tuxx42 and we can get very good coverage information similar to AFL by using gccgo - so if the fuzzer is trying to guess a magic value, we will know and can add this input to the corpus.

Crypt-iQ · 2018-12-25T01:22:41Z

Should be ready for review @Roasbeef

Crypt-iQ · 2019-01-16T12:21:43Z

Removed message prefixes from the seed corpus, so everything is good to go now. Going to do some coverage comparison tests on a compute engine instance (new approach vs. old, general harness). Will post results

Crypt-iQ · 2019-01-23T00:10:14Z

Old, slightly patched harness fuzzes some messages more than others (24 hour test on an 8 core machine running 2 workers ~14k execs/sec, resulting corpus size of 72 which is extremely small given the time). Specifically openchannel, acceptchannel, nodeannouncement, queryshortchanids, & replychannelrange (bad - we want an even distribution!!). These are pretty shallow fuzzing targets (no state machines or anything fancy, just deserialization + serialization), so individually fuzzing messages might not give us much more coverage. The decodeShortChanIDs function doesn't hit the EncodingSortedZlib block, so that might be worth fuzzing individually... Will add a brontide fuzzing PR soonish that depends on this one!

Coverage here: https://crypt-iq.github.io/count_wirefuzz.html

Crypt-iQ · 2019-02-16T20:45:44Z

Added zlib harnesses to hit the EncodingSortedZlib block

Crypt-iQ · 2019-08-09T16:04:15Z

Updated the PR with just 2 commits and included better building instructions since go-fuzz-build master fails to build the targets. Also included a one-liner in fuzz.md that builds all fuzzing binaries for a given package. The only thing left to do would be to generate a decent corpus for each message and zip it up in each harness directory. However, I don't have the resources to do that right now so maybe somebody else could do it 😉

Crypt-iQ · 2020-01-18T05:40:40Z

Removed the corpus from the PR as it's overly burdensome to review with hundreds of files in the way. Updated coverage is here.

Roasbeef

Stoked to finally see this land. Mostly some high level questions w.r.t methodology, also haven't yet run the latest versions of the per-message fuzzers.

Roasbeef · 2020-01-24T21:38:53Z

fuzz/lnwire/channelannouncement/channel_announcement.go

+)
+
+// Fuzz is used by go-fuzz.
+func Fuzz(data []byte) int {


High level question: why do certain message targets like AcceptChannel use a custom scenario rather than the fuzz.Harness scenario as many of these below do?

Yeah, so this could have been more explicit. It's because of parsing UpfrontShutdownScript: if it's empty (a nil slice) before the test starts, it will be converted into a non-nil slice by the end of the test (the decoding + encoding changes it). Since there's no way to reflect.DeepEqual only a subset of a struct (or to switch on the error / determine which field the error occurred in), I had to do it this way.
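
The nil versus non-nil slice issue can be reproduced in isolation. The sketch below is an assumption-laden illustration: the msg struct and the roundTrip function are hypothetical and only mimic a decoder that always allocates the script field; they are not the lnwire types.

```go
package main

import (
	"bytes"
	"fmt"
	"reflect"
)

// msg is a hypothetical message with one optional byte-slice field.
type msg struct {
	ChannelID             uint64
	UpfrontShutdownScript []byte
}

// roundTrip mimics an encode+decode cycle in which the decoder always
// allocates the script, so a nil slice comes back as an empty non-nil one.
func roundTrip(m msg) msg {
	out := msg{ChannelID: m.ChannelID}
	out.UpfrontShutdownScript = make([]byte, len(m.UpfrontShutdownScript))
	copy(out.UpfrontShutdownScript, m.UpfrontShutdownScript)
	return out
}

// deepEqual is the strict comparison a generic harness would use.
func deepEqual(a, b msg) bool {
	return reflect.DeepEqual(a, b)
}

// equalIgnoringNil compares field by field; bytes.Equal treats a nil slice
// and an empty slice as equal.
func equalIgnoringNil(a, b msg) bool {
	return a.ChannelID == b.ChannelID &&
		bytes.Equal(a.UpfrontShutdownScript, b.UpfrontShutdownScript)
}

func main() {
	m := msg{ChannelID: 1} // UpfrontShutdownScript is nil
	rt := roundTrip(m)
	fmt.Println("DeepEqual:", deepEqual(m, rt))
	fmt.Println("field-wise:", equalIgnoringNil(m, rt))
}
```

reflect.DeepEqual documents that a nil slice and an empty non-nil slice are not deeply equal, which is why a custom comparison is needed for messages with such a field.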

Roasbeef · 2020-01-24T21:39:30Z

fuzz/lnwire/fundinglocked/funding_locked.go

@@ -0,0 +1,21 @@
+// +build gofuzz
+
+package fundinglocked


What's the rationale for each of them having a new package rather than just a series of files in a blanket package?

yeah so at the time of writing the tests you couldn't have multiple Fuzz funcs in the same package, but that's changed now so can change it to this - would make the directory look cleaner too..

Roasbeef · 2020-01-24T21:40:09Z

fuzz/lnwire/fuzz_utils.go

+// PrefixWithMsgType takes []byte and adds a wire protocol prefix
+// to make the []byte into an actual message to be used in fuzzing.
+func PrefixWithMsgType(data []byte, prefix lnwire.MessageType) []byte {
+	prefixBytes := make([]byte, 2)


Alternatively we can just declare it as:

var prefixBytes [2]byte
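
A sketch of how the helper might look with the array declaration suggested above. The messageType type is a local stand-in (assumed to be a uint16, like lnwire.MessageType), and the function body is an illustration rather than the code that was merged.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// messageType is a local stand-in for lnwire.MessageType.
type messageType uint16

// prefixWithMsgType prepends the 2-byte big-endian message type to data.
func prefixWithMsgType(data []byte, prefix messageType) []byte {
	// A fixed-size array avoids the explicit make([]byte, 2) call.
	var prefixBytes [2]byte
	binary.BigEndian.PutUint16(prefixBytes[:], uint16(prefix))

	// Slicing the array yields a []byte that append can extend.
	return append(prefixBytes[:], data...)
}

func main() {
	fmt.Printf("%x\n", prefixWithMsgType([]byte{0xaa, 0xbb}, 16))
}
```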

Crypt-iQ · 2020-01-27T16:00:30Z

Will get rid of all the packages. NOTE: Some of the harnesses are a bit different (acceptchannel, nodeannouncement, openchannel), since the channel ones have a quirk during serialization + deserialization with the UpfrontShutdownScript. nodeannouncement contains []net.Addr values, which don't serialize + deserialize back to the byte-by-byte same address (an IPv6 address can be converted into an IPv4 address to be minimally encoded).
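
The address quirk can be seen with the standard library alone. This sketch only shows how Go's net.IP treats an IPv4-mapped IPv6 address; the helper name minimalEncodingLen is hypothetical, and the example does not claim to reproduce the exact lnwire code path.

```go
package main

import (
	"fmt"
	"net"
)

// minimalEncodingLen returns the number of bytes in the shortest form of
// addr: 4 if it can be represented as IPv4, otherwise 16. It returns -1 if
// addr does not parse.
func minimalEncodingLen(addr string) int {
	ip := net.ParseIP(addr)
	if ip == nil {
		return -1
	}
	// To4 returns a 4-byte slice for IPv4 and IPv4-mapped IPv6 addresses.
	if v4 := ip.To4(); v4 != nil {
		return len(v4)
	}
	return len(ip.To16())
}

func main() {
	fmt.Println(minimalEncodingLen("::ffff:192.0.2.1")) // IPv4-mapped IPv6
	fmt.Println(minimalEncodingLen("2001:db8::1"))      // plain IPv6
}
```

An input that carries the 16-byte mapped form therefore cannot be expected to re-encode to the same bytes, so a byte-for-byte round-trip check does not hold for this field.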

Crypt-iQ · 2020-01-28T21:48:42Z

Flattened all packages into a single lnwirefuzz package. Had to modify the decodeShortChanIDs function so that an invalid EncodingType returns an error, rather than no error, when there are no short channel IDs.

Roasbeef

LGTM 🦅

I think this is ready to land after the fixup commits have been squashed, and this PR rebased to master.

Crypt-iQ · 2020-02-04T02:43:33Z

@Roasbeef rebased + squashed, not sure why Travis is failing. I would also like to draw your attention to 0b78f59, which errors if the decoded EncodingType is unknown. I think the fix more accurately follows the ideal behavior of decoding and mimics the encoding; it also helps in fuzz testing, since the general harness will panic if WriteMessage fails, which it would here.

cfromknecht · 2020-02-05T01:23:07Z

lnwire/query_short_chan_ids.go

+		// Check whether the encodingType is valid or not.
+		if encodingType > EncodingSortedZlib {
+			return 0, nil, ErrUnknownShortChanIDEncoding(encodingType)
+		}


Is it possible to just remove this whole check, i.e. the zero-length check? To me it looks like both of the existing sections below handle the zero-length case gracefully, and the error will then be produced by the default case instead.

Yeah I can just remove this check

Removed the check, but had to add a special check here. The first ReadMessage call would succeed and return an empty list of ShortChannelIDs (it would succeed because there would be bytes, but not enough to be a zlib-encoded *ShortChannelID); the subsequent call would error on ReadMessage when creating the zlib reader because there were no more bytes left to be read.

On second thought, removing this check makes TestReplyChannelRangeEmpty fail. Now any QueryShortChanIDs or ReplyChannelRange message with zlib encoding and no ShortChannelIDs will fail with unable to create zlib reader: unexpected EOF. I don't think the peer should fail on a legitimate message like this.

cfromknecht · 2020-02-25T18:40:07Z

lnwire/query_short_chan_ids.go

-	// At this point, if there's no body remaining, then only the encoding
-	// type was specified, meaning that there're no further bytes to be
-	// parsed.
-	if len(queryBody) == 0 {


i think we just need to move the check to the beginning of the zlib case and the tests should pass again?

yup that should work
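
A minimal sketch of the arrangement agreed on in this thread, under stated assumptions: decodeBody, the encoding constants, and the error strings are hypothetical simplifications and not the lnwire implementation. The zero-length check sits at the start of the zlib case, so an empty zlib body decodes to nothing, while an unknown encoding type still falls through to the default case and errors.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// Hypothetical encoding-type constants for this sketch.
const (
	encodingSortedPlain byte = 0
	encodingSortedZlib  byte = 1
)

// newZlibReaderErr shows what zlib.NewReader reports for a given body.
func newZlibReaderErr(body []byte) error {
	_, err := zlib.NewReader(bytes.NewReader(body))
	return err
}

// decodeBody returns the raw, decompressed body for the given encoding.
func decodeBody(encoding byte, body []byte) ([]byte, error) {
	switch encoding {
	case encodingSortedPlain:
		return body, nil

	case encodingSortedZlib:
		// Only the encoding type was sent: nothing to decompress.
		if len(body) == 0 {
			return nil, nil
		}
		zr, err := zlib.NewReader(bytes.NewReader(body))
		if err != nil {
			return nil, fmt.Errorf("unable to create zlib reader: %w", err)
		}
		defer zr.Close()
		return io.ReadAll(zr)

	default:
		return nil, fmt.Errorf("unknown encoding type: %d", encoding)
	}
}

func main() {
	_, err := decodeBody(encodingSortedZlib, nil)
	fmt.Println("empty zlib body:", err)
	fmt.Println("zlib.NewReader on empty input:", newZlibReaderErr(nil))
	_, err = decodeBody(2, nil)
	fmt.Println("unknown encoding:", err)
}
```

Without the length check, zlib.NewReader fails on an empty body because it cannot read the 2-byte zlib header, which matches the "unexpected EOF" seen in TestReplyChannelRangeEmpty.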

Roasbeef · 2020-02-26T14:55:42Z

New test failure?

--- FAIL: TestReplyChannelRangeEmpty (0.00s)
    --- FAIL: TestReplyChannelRangeEmpty/empty_zlib_encoding (0.00s)
        reply_channel_range_test.go:86: unable to decode req: unable to create zlib reader: unexpected EOF

Crypt-iQ · 2020-02-26T14:57:38Z

Have to do this: #1895 (comment)

Crypt-iQ · 2020-03-03T19:55:56Z

Ran all the fuzzers one last time and no more bugs with the fuzzing harness from what i can tell. Should be g2g now

halseth added the documentation, testing, and P3 labels Sep 13, 2018

halseth modified the milestones: 0.5.1, 0.5.2 Sep 13, 2018

Crypt-iQ force-pushed the lnwire_fuzzing_09_11_2018 branch from 8fef0ae to f73abf3 Compare September 23, 2018 10:21

Crypt-iQ force-pushed the lnwire_fuzzing_09_11_2018 branch from 84f79b0 to 7064f9d Compare December 21, 2018 16:00

Crypt-iQ force-pushed the lnwire_fuzzing_09_11_2018 branch from 9c0543d to 4eb7e40 Compare January 14, 2019 12:59

Roasbeef removed this from the 0.5.2 milestone Jan 16, 2019

Crypt-iQ force-pushed the lnwire_fuzzing_09_11_2018 branch from 0502b2a to f612ef6 Compare August 9, 2019 16:00

Crypt-iQ changed the title from "docs+lnwire: Updating corpus, fuzz test, README" to "fuzz: new fuzz package and lnwire parsing harnesses" Aug 9, 2019

Crypt-iQ force-pushed the lnwire_fuzzing_09_11_2018 branch from f612ef6 to c7f74b4 Compare January 8, 2020 01:53

Crypt-iQ added the fuzzing label Jan 8, 2020

Roasbeef requested a review from cfromknecht January 14, 2020 07:34

Roasbeef added this to the 0.10.0 milestone Jan 14, 2020

Roasbeef added the v0.10 label Jan 17, 2020

Crypt-iQ force-pushed the lnwire_fuzzing_09_11_2018 branch from c7f74b4 to 2445034 Compare January 18, 2020 05:33

Crypt-iQ mentioned this pull request Jan 23, 2020

fuzz: adding fuzz harnesses for acts 1-3, encryption+decryption #2593

Merged

5 tasks

Roasbeef requested changes Jan 24, 2020

Crypt-iQ mentioned this pull request Jan 29, 2020

fuzz/wtwire: adding wtwire fuzzers #3969

Merged

Roasbeef approved these changes Feb 4, 2020

Crypt-iQ force-pushed the lnwire_fuzzing_09_11_2018 branch 2 times, most recently from c0f2ef6 to cdd33f8 Compare February 4, 2020 01:48

Crypt-iQ requested a review from Roasbeef February 4, 2020 03:00

cfromknecht reviewed Feb 5, 2020

Crypt-iQ force-pushed the lnwire_fuzzing_09_11_2018 branch 2 times, most recently from e303410 to 7d28d7e Compare February 19, 2020 03:15

cfromknecht reviewed Feb 25, 2020

Crypt-iQ added 3 commits March 3, 2020 13:58

docs: add fuzz.md

7b9e0c8

lnwire: move zero-length queryBody check to zlib case

5a03fe5

fuzz/lnwire: adding fuzz harnesses for all lnwire messages + zlib

363bdc4

Crypt-iQ force-pushed the lnwire_fuzzing_09_11_2018 branch from 7d28d7e to 363bdc4 Compare March 3, 2020 18:59

Roasbeef merged commit 462048a into lightningnetwork:master Mar 10, 2020

Crypt-iQ deleted the lnwire_fuzzing_09_11_2018 branch March 10, 2020 00:39

Crypt-iQ mentioned this pull request Oct 8, 2020

Add fuzzing commands to Makefile #4643

Merged
