Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fuzz testing harness to smithy-json and fix bug found by it #462

Merged
merged 4 commits into from
Jun 4, 2021
Merged

Add fuzz testing harness to smithy-json and fix bug found by it #462

merged 4 commits into from
Jun 4, 2021

Conversation

jdisanti
Copy link
Collaborator

@jdisanti jdisanti commented Jun 4, 2021

This is more work towards #161. It adds a fuzz testing harness for testing the token streaming JSON deserializer that looks for panics and also compares valid output against serde_json.

I ran fuzzing with the included corpus, and also with the JSONTestSuite test cases. Fuzzing revealed an issue where unescape_string incorrectly accepted \u+04D as a valid JSON Unicode escape sequence, which is fixed in this PR.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jdisanti jdisanti requested a review from rcoh June 4, 2021 20:02
@@ -0,0 +1,7 @@

target
corpus
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean to exclude corpus here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the default from cargo fuzz. I force added the seed files, and this ignore should prevent adding any of the libFuzzer generated corpus documents. Whether or not it will be valuable to commit those is open for debate.

[package]
name = "smithy-json-fuzz"
version = "0.0.0"
authors = ["Automatically generated"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add an author here


// Error on characters `u16::from_str_radix` would otherwise accept, such as `+`
for byte in codepoint_str.bytes() {
match byte {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, could do

if !matches(byte, b'0'..=b'9' | b'a'..=b'f' | b'A'..=b'F') {
  return Err(..)
} 

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's much nicer, thanks.

position: usize,
}

impl<'a> RewindingTokenIter<'a> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't really matter, but this could be replaced with https://doc.rust-lang.org/std/iter/struct.Peekable.html

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of this. Thanks.

if codepoint_str.bytes().any(|byte| !matches!(byte, b'0'..=b'9' | b'a'..=b'f' | b'A'..=b'F')) {
return Err(Error::InvalidUnicodeEscape(codepoint_str.into()))
if codepoint_str
.bytes()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, I was going to suggest this, great minds :-)

@jdisanti jdisanti merged commit 5ef157d into smithy-lang:main Jun 4, 2021
@jdisanti jdisanti deleted the json-de-fuzz branch June 4, 2021 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants