Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor JSON deserialization to assist code generation #474

Merged
merged 9 commits into from
Jun 11, 2021
Merged

Refactor JSON deserialization to assist code generation #474

merged 9 commits into from
Jun 11, 2021

Conversation

jdisanti
Copy link
Collaborator

@jdisanti jdisanti commented Jun 8, 2021

More work towards #161. These changes to the rust-runtime facilitate the JSON parser generator by making it possible to create custom errors, and also to expect certain value types or skip entire values in the token stream.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jdisanti jdisanti requested a review from rcoh June 8, 2021 22:59

/// Represents the location of a token
#[derive(Debug, Eq, PartialEq, Copy, Clone)]
pub struct TokenOffset(pub usize);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably just name this Offset. We're already in a module named Token

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

expect_fn!(expect_start_array, StartArray);

/// Expects a string or null token, and if its a string, returns its unescaped value.
pub fn expect_string_or_null(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a little surprised to find no expect_string.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to think carefully about the null handling. Right now, it just passes off the Option to the builder's set function, and later, if that field was actually required, the fallible builder will fail. I'm not certain if this works correctly for, say, a primitive int field.

expect_fn!(expect_start_object, StartObject);
expect_fn!(expect_start_array, StartArray);

/// Expects a string or null token, and if its a string, returns its unescaped value.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Expects a string or null token, and if its a string, returns its unescaped value.
/// Expects a string or null token. If the value was a string, its **unescaped** value will be returned

) -> Result<Option<String>, Error> {
match token.transpose()? {
Some(Token::ValueNull { .. }) => Ok(None),
Some(Token::ValueString { value, .. }) => Ok(Some(value.to_unescaped()?.to_string())),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you know you're definitely going to need a string at this point so there is no point in keeping at a Cow?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. The only time this gets called is when we're moving a string value into a builder.

#[test]
fn skip_simple_value() {
let mut tokens = json_token_iter(b"null true");
skip_value(&mut tokens).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it seems like json_token_iter accepts invalid JSON? not necessarily a problem, just worth documenting what is accepted and what is not

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's necessarily invalid, but rather, multiple values. I could make it only read one value and then fail if there's more, but that's effectively what the codegen doing with it.

Ok(Token::ValueBool {
offset,
value: false,
})
}
_ => unreachable!(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth commenting why this is unreachable or putting a message in unreachable

Suggested change
_ => unreachable!(),
_ => unreachable!("this function MUST only be called when the next byte is t or f"),

/// Validation is done on the fly, so it is possible for it to parse an invalid JSON document
/// until it gets to the first [Error].
///
/// JSON string values are left unescaped in the [Token::ValueString] as an [EscapedStr],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean "escaped" here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Good catch!

Copy link
Collaborator

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moving base64 & other refactorings LGTM!

@jdisanti jdisanti merged commit c35c9fa into smithy-lang:main Jun 11, 2021
@jdisanti jdisanti deleted the json-de-runtime branch June 11, 2021 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants