-
Notifications
You must be signed in to change notification settings - Fork 471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more fuzzers and libFuzzer harness #553
Conversation
The windows failure appears to be related to a problem fixed in #550 about those functions not having |
if(ENABLE_LIBFUZZER) | ||
list(APPEND H3_COMPILE_FLAGS -fsanitize=fuzzer,address,undefined) | ||
list(APPEND H3_LINK_FLAGS -fsanitize=fuzzer,address,undefined) | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that these settings don't really work for the other binaries since main
gets redefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great overall - I think my confusion here is mostly my own lack of understanding of fuzzers, not problems with the code, but more comments might be helpful overall.
@@ -44,6 +44,9 @@ option(BUILD_BENCHMARKS "Build benchmarking applications." ON) | |||
option(BUILD_FUZZERS "Build fuzzer applications (for use with afl)." ON) | |||
option(BUILD_FILTERS "Build filter applications." ON) | |||
option(BUILD_GENERATORS "Build code generation applications." ON) | |||
# If ON, libfuzzer settings are used to build the fuzzer harnesses. If OFF, a frontend | |||
# for afl++ is provided instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the frontend part of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
frontend
in this context means an implementation of main
that accepts arguments in a way that AFL can integrate with. Should I expand on the comments for this option?
H3Index *compacted = calloc(inputSize, sizeof(H3Index)); | ||
H3_EXPORT(compactCells)(input, compacted, inputSize); | ||
|
||
// fuzz uncompactCells using output of above |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make sense? Would we miss cases for uncompactCells
that can't be generated by the output of compact
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A second run of uncompactCells is done below, using the original input data
. Not sure if this additional run of uncompact
on the compact
output is needed but it seemed like a reasonable check to append.
src/apps/fuzzers/fuzzerCompact.c
Outdated
|
||
// fuzz compactCells | ||
H3Index *compacted = calloc(inputSize, sizeof(H3Index)); | ||
H3_EXPORT(compactCells)(input, compacted, inputSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need the error here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I consider failing and returning the appropriate error code acceptable behavior. Unacceptable behavior would be undefined behavior, faulting/crashing/out of bounds reads, etc.
Edit: Oh, I see we use the output of this function later. In this particular case I don't think it's too bad since we are guaranteed that the memory pointed to by compacted
has been zeroed by calloc
. If this were data on the stack using it in a potentially unintialized state could be bad. I will update this one to check the error code.
} | ||
const inputArgs *args = (const inputArgs *)data; | ||
if (args->sz >= 1024) { | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my info - it seems like there's a class of issue here where the user passes in an array and a size that don't match, and we get an invalid array access. Since we can't deal with this (we don't know the array size), are we assuming this is outside the bounds of testing, and is user error? Since this is a pretty serious error, do we need to indicate where users might need to be careful in their own code not to hit this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a serious issue with memory management in C. I considered this case to be out of scope for the fuzzers since I don't know of any way to make H3 resilient to this kind of error.
We should be careful to document the expected array sizes when users pass memory into H3 so this can be correctly implemented on the user side.
return 0; | ||
} | ||
inputArgs args; | ||
memcpy(&args, data, sizeof(inputArgs)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my info - why memcpy
here and casts in the other tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed because we modify args.str
on the next line to ensure it meets the contract that it is null terminated. This cannot be done on the original buffer because of const
and because libfuzzer will complain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth a comment in that case I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add. Come to think of it we may wish to address this in the API of stringToH3
by passing the buffer length.
H3Error err = H3_EXPORT(maxPolygonToCellsSize)(geoPolygon, res, &sz); | ||
if (!err && sz < MAX_SZ) { | ||
if (sz < 0) { | ||
printf("Oh no - sz is negative\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused here - should this return 0
and exit? If not, why print the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it is it will crash on line 57. Not sure if this will be relevant after #551.
return 0; | ||
} | ||
geoPolygon.holes = calloc(geoPolygon.numHoles, sizeof(GeoLoop)); | ||
size_t offset = sizeof(inputArgs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm quite confused here - isn't the fuzzer only going to provide sizeof(inputArgs)
worth of data? Where does the rest of the data after offset
come from in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will need to check if this will work with AFL (might need to adjust the seed file so it has data after inputArgs). For libFuzzer it does not know about inputArgs
so it will generate larger test cases.
248b1cf
to
323d9e9
Compare
double out; | ||
H3_EXPORT(getHexagonAreaAvgKm2)(args->res, &out); | ||
H3_EXPORT(getHexagonAreaAvgM2)(args->res, &out); | ||
H3_EXPORT(getHexagonEdgeLengthAvgKm)(args->res, &out); | ||
H3_EXPORT(getHexagonEdgeLengthAvgM)(args->res, &out); | ||
|
||
int64_t outInt; | ||
H3_EXPORT(getNumCells)(args->res, &outInt); | ||
|
||
H3Index pentagons[12]; | ||
H3_EXPORT(getPentagons)(args->res, pentagons); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably just a reflection of me not knowing how fuzzers work, but when you're doing multiple sequential tests like this, does it become harder to test the last function because you need to "get past" all the previous ones? Is there any value is splitting out each function separately? But that's also a lot of annoying boilerplate...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's correct that if any of the previous functions called were to crash, this function wouldn't be exercised. But in that case we'd still have found a crash warranting investigation anyways. If we explicitly terminate the fuzzing run before reaching this function (i.e. if getNumCells were to return an error then we return 0), then that would not be a good test because this function's invocation would be dependent on the previous function. (And that is not part of the contract of calling this function, as with the memory allocation size functions.) That doesn't happen here so I believe this is a valid exercise of all the functions in this file.
This adds additional fuzzing harnesses (consult
README.md
in the PR for which functions should be covered and uncovered - a few are left uncovered since they are trivial) and also adds an adapter from AFL++ to libFuzzer. The idea being we can move the fuzzer harness from https://github.com/google/oss-fuzz/blob/master/projects/h3/h3_fuzzer.c to in repo (replaces #448) and cover all functions.Possible improvements include testing the build using afl++ itself and hooking this up to OSS-Fuzz.
cc @AdamKorcz