Added rust-style raw string syntax support for the Rhai Tokenizer #908

Merged
merged 3 commits into from
Aug 29, 2024

Conversation

cellomath
Contributor

Added support for Rust-style raw-string literal syntax in Rhai.

Example:

let example_raw_string = r##"I can use quotes and / and even one "# without escaping this string
How about we put a newline in it too!"##;
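
For comparison, here is a minimal sketch of the Rust raw-string literal that this syntax is modelled on. The function name `example_raw_string` is only illustrative and is not part of the PR.

```rust
/// Returns a Rust raw string that contains a bare quote, a slash,
/// a `"#` sequence and a newline, none of which need escaping.
/// Two hashes are used so that the embedded `"#` does not end the literal.
fn example_raw_string() -> &'static str {
    r##"I can use quotes " and / and even one "# without escaping
How about a newline too!"##
}
```

The literal ends only at a quote followed by the same number of hashes that opened it, so `"#` inside a `r##"..."##` literal is ordinary content.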

@schungx
Collaborator

schungx commented Aug 27, 2024

Looks great, except that I'm not sure whether r### is the best syntax. Rhai already looks more like JavaScript than Rust, so it's best to keep to a consistent style.

Hopefully JS formatters can successfully format most of Rhai...

Do you know how other languages handle the same?

@cellomath
Contributor Author

Python uses r"raw string" syntax for strings with no escapes in them, but I think Rust is pretty unique in allowing the programmer to specify, in place, the number of #'s they need to represent the data without any distracting escape characters.
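
As a rough illustration of "the number of #'s they need", the sketch below computes the smallest hash count that can safely delimit a given piece of text under Rust's rule (the literal closes at a quote followed by that many hashes). The helper `min_hashes` is hypothetical and does not come from the PR.

```rust
/// Smallest number of `#` marks needed to wrap `content` in a Rust-style
/// raw string. Every `"` followed by a run of k hashes inside the content
/// would close a literal that uses k or fewer hashes, so k + 1 are needed.
/// Content without any `"` needs no hashes at all.
fn min_hashes(content: &str) -> usize {
    let bytes = content.as_bytes();
    let mut needed = 0;
    for (i, &b) in bytes.iter().enumerate() {
        if b == b'"' {
            // Count the hashes that directly follow this quote.
            let run = bytes[i + 1..].iter().take_while(|&&c| c == b'#').count();
            needed = needed.max(run + 1);
        }
    }
    needed
}
```

For example, text containing `"#` needs two hashes, which is why the PR's example is written with `r##"..."##`.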

I see what you're saying about JavaScript though. I think raw strings could supplement the verbatim ones while adding negligible overhead, and aligning the language more with standard Rust features.

@schungx
Collaborator

schungx commented Aug 27, 2024

I do agree that the r### solution is quite elegant, just that the syntax doesn't jibe very well with the rest of Rhai's syntax...

I agree that this can be very useful -- I have immediate uses for this myself in my projects!

Would you mind taking care of the review questions as well?

@cellomath
Contributor Author

Could you clarify what/where the review questions are?

@schungx
Collaborator

schungx commented Aug 27, 2024

image

Can't you see it?

@cellomath
Contributor Author

Your review says "Pending", probably why I can't see it. Is there a "submit review" button?

Review threads (all resolved): src/tokenizer.rs (5, outdated), tests/string.rs (1)
@schungx
Collaborator

schungx commented Aug 27, 2024

> Your review says "Pending", probably why I can't see it. Is there a "submit review" button?

Ah OK! Pressed it!

@cellomath cellomath requested a review from schungx August 27, 2024 16:25
@schungx schungx merged commit a0c2ebc into rhaiscript:main Aug 29, 2024
38 of 39 checks passed
@schungx
Collaborator

schungx commented Aug 29, 2024

I'll need to fix it up a bit because you cannot just return LexError on EOL... that would cause errors for editors, which may be parsing a script line by line. However, this is more intricate, so I'll do it myself.
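
One way to picture the line-by-line concern is the sketch below, which is not Rhai's actual tokenizer: when a line ends inside a raw string, the scanner reports that the string is still open instead of raising an error, and the caller carries that state over to the next line. The names `LineResult` and `scan_raw_string_line` are assumptions made for illustration.

```rust
#[derive(Debug, PartialEq)]
enum LineResult {
    /// The closing delimiter was found; holds the byte offset just past it.
    Closed(usize),
    /// The line ended inside the raw string; keep scanning on the next line.
    StillOpen,
}

/// Scans one line that is already inside a raw string opened with
/// `hashes` hash marks, looking for a quote followed by that many hashes.
fn scan_raw_string_line(line: &str, hashes: usize) -> LineResult {
    let closer = format!("\"{}", "#".repeat(hashes));
    match line.find(&closer) {
        Some(pos) => LineResult::Closed(pos + closer.len()),
        // Reaching end of line is not an error by itself.
        None => LineResult::StillOpen,
    }
}
```

An editor feeding lines one at a time would only report an unterminated string if the input truly ends while the state is still `StillOpen`.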

@schungx
Collaborator

schungx commented Aug 30, 2024

@cellomath Come to think of it, since # is only used to start an object map literal... can we just use #" ... "# as raw string literals instead? This way, we do away with the r and we'd always have hashes...

@cellomath
Contributor Author

cellomath commented Aug 30, 2024

My objective was to match Rust's syntax as much as possible, and prefixing raw strings with an "r" is pretty common in other languages too. Changing the syntax would undermine this, but if you like, I wouldn't be opposed to your #"..."# syntax being an alias for r#"..."#.

@schungx
Collaborator

schungx commented Aug 30, 2024

Rhai already deviates from Rust quite significantly, such as using switch instead of match. To me, r" ... " always seems incomplete because of the lack of symmetry, and it is not easy for the user to spot immediately whether something is a raw string. The small r is quite easy to miss.

One more benefit is that the code is cleaner because we no longer have any raw strings without hashes...

EDIT: Caught a position bug

@schungx
Collaborator

schungx commented Aug 30, 2024

I have an implementation out and it seems to make the code quite a bit simpler. So we'd have:

#"hello world!"#

#"Multiple lines:
Second line
             Final line
"#

##"Lines starting with #'s are comments while "'s are quoted"##

####"You can do ###"hello"### and it'll work"####
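
The rule behind these examples can be sketched as follows: count the leading hashes, require a quote, then read until a quote followed by the same number of hashes. This is a simplified illustration under that assumption, not the implementation that was merged; `parse_hash_raw_string` is a hypothetical name.

```rust
/// Parses a `#"..."#`-style raw string at the start of `src`.
/// Returns the string content and the number of bytes consumed,
/// or `None` if `src` does not start with a complete raw string.
fn parse_hash_raw_string(src: &str) -> Option<(&str, usize)> {
    // Count the opening hashes; at least one is required in this syntax.
    let hashes = src.bytes().take_while(|&b| b == b'#').count();
    if hashes == 0 || src.as_bytes().get(hashes) != Some(&b'"') {
        // For instance `#{` starts an object map literal, not a raw string.
        return None;
    }
    let body_start = hashes + 1;
    // The closer is a quote followed by exactly as many hashes as the opener.
    let closer = format!("\"{}", "#".repeat(hashes));
    let rel_end = src[body_start..].find(&closer)?;
    let body_end = body_start + rel_end;
    Some((&src[body_start..body_end], body_end + closer.len()))
}
```

Because the closer must carry as many hashes as the opener, a shorter sequence such as `"###` inside a four-hash literal is treated as plain content.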

@cellomath
Contributor Author

Okay, I see your point. Maybe keeping the languages separated makes it more distinguishable anyways!
¯_(ツ)_/¯

@schungx
Collaborator

schungx commented Aug 30, 2024

Now we just have to figure out a way to make syntax highlighting work... Not sure how the Rust grammar does it...

Do you know anything about this?

@cellomath
Contributor Author

cellomath commented Aug 30, 2024

Not sure what you mean. Can you link to Rhai's existing syntax highlighting? I'm not too familiar with language servers, but I figured they used something similar to regular expressions.

If the syntax highlighter uses the Rhai parser, it should automatically appear as a string literal, since that's what the parser yields for the raw string literal syntax.
