Handle string literals within macro arguments #685
I'm confident that this is a coherent change to the lexer, because the only test case that changed is the one that I expected to change: passing unclosed string literals to a macro. As expected, passing a single quote gives an unclosed string literal warning, and passing three quotes begins a multi-line string literal.
However, this still needs test cases of its own to test the changed behavior more thoroughly, as well as review to verify that it's even a desirable change. (Depending on the #683 discussion, some other solution might be taken.)
It might be worth factoring this repeated pattern into a macro:
For example:
This now has test cases that demonstrate more macro arg lexing.
Alright, I think this now correctly implements a consistent model for passing macro args and expanding strings.
(As a side effect, this enables the quine2.asm example that I had in mind earlier, but that wasn't the goal: it's just what I had expected from my mental model of how these rgbasm features "ought to" work. Basically this PR fixes the inconsistencies that were preventing that, like if |
Regarding compatibility with existing code: the tests verify this against pokecrystal, pokered, and ucity, and Polished Crystal builds correctly with three deprecation warnings about
Edit: I still think it might be cleaner if
Change summary:
I see this as essentially a bugfix PR for other post-0.4.2 changes:
Another demo test case:
This all seems very reasonable, except for one part, that is actually not related to quoting:
Considering that user-defined functions are coming eventually, this is really unintuitive. Commas inside function call parentheses (or, if that is too complicated, inside any parentheses) should not delimit macro arguments. (Consider how the C preprocessor does the same thing, for instance.)
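To make the suggestion concrete, here is a minimal Python sketch of parenthesis-aware argument splitting, loosely modeled on how the C preprocessor treats commas inside parentheses. It is not rgbasm code: `split_args_paren_aware` and the call `f(1, 2)` are hypothetical names used only for illustration, and string literals are deliberately ignored.

```python
def split_args_paren_aware(raw):
    """Split on commas, except commas nested inside parentheses.

    Hypothetical model of the reviewer's proposal; string literals,
    escapes, and comments are not handled here.
    """
    args = []
    current = []
    depth = 0  # current parenthesis nesting level
    for ch in raw:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        if ch == "," and depth == 0:
            # A top-level comma ends the current argument.
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    args.append("".join(current).strip())
    return args


print(split_args_paren_aware("f(1, 2), 3"))
```

Under this model, `f(1, 2), 3` is two arguments rather than three, which is the behavior the comment above asks for.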
Alright, I understand that this tries to make the expansion / escaping etc. model feel more consistent. (I have yet to look at the code.) The only change I'm not sure about is
I'll now start reviewing the actual code with what I've read in mind.
Fixes gbdev#683
The lexer's raw mode for reading macro args already attempted to handle semicolons inside string literals, versus outside ones which start comments. This change reuses the same function for reading string literals in normal and raw modes, also handling:
- Commas in strings versus between macro args
- Character escapes
- {Interpolations} and \1-\9 args inside vs. outside strings
- Multi-line string literals
All test case outputs remain identical except for passing `"""` as a macro argument, which now starts a multi-line string literal instead of terminating the passed argument at the carriage return.
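The comma and semicolon handling described in this commit message can be sketched in Python as a simplified model. This is not the lexer's C code: `split_macro_args` is a hypothetical name, only single-line `"..."` literals are modeled, and interpolation, `\1`-`\9` expansion, and multi-line strings are left out.

```python
def split_macro_args(line):
    """Split the text after a macro name into raw arguments.

    Simplified model: commas and semicolons inside a "..." literal are
    ordinary characters; outside one, a comma separates arguments and
    a semicolon starts a comment.
    """
    args = []
    current = []
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            current.append(ch)
            if ch == "\\" and i + 1 < len(line):
                # Keep the escaped character so \" does not close the string.
                current.append(line[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            current.append(ch)
        elif ch == ",":
            args.append("".join(current).strip())
            current = []
        elif ch == ";":
            break  # the rest of the line is a comment
        else:
            current.append(ch)
        i += 1
    args.append("".join(current).strip())
    return args


print(split_macro_args('"a, b", c ; trailing comment'))
```

In this model the quoted comma stays inside the first argument, and the comment is dropped only because its semicolon appears outside a string.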
In macro args, '\,' does not start a new argument, but passes a literal ',' instead. This adds more escapes:
- '\\' does not start a line continuation
- '\"' does not start a string literal
This makes '\\' in a macro arg pass one backslash, not two, which affects a test result.
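A minimal Python sketch of these three escapes, under stated assumptions: `read_macro_args` and `ARG_ESCAPES` are hypothetical names, and keeping escapes verbatim inside string literals is an assumption of this model rather than something the commit message states.

```python
# Characters that, after a backslash outside a string, are passed literally.
ARG_ESCAPES = {",", "\\", '"'}


def read_macro_args(raw):
    """Split macro args, honoring the \\, \\\\ and \\" escapes outside strings."""
    args = []
    current = []
    in_string = False
    i = 0
    while i < len(raw):
        ch = raw[i]
        nxt = raw[i + 1] if i + 1 < len(raw) else ""
        if in_string:
            current.append(ch)
            if ch == "\\" and nxt:
                current.append(nxt)  # assumed: escapes in strings kept as-is
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == "\\" and nxt in ARG_ESCAPES:
            current.append(nxt)  # pass the escaped character itself
            i += 1
        elif ch == '"':
            in_string = True
            current.append(ch)
        elif ch == ",":
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    args.append("".join(current).strip())
    return args


print(read_macro_args(r"a\,b, c"))
print(read_macro_args(r"x\\y"))
print(read_macro_args(r"\"a, b"))
```

The last call shows the difference the commit describes: because the quote is escaped, no string literal starts, so the following comma still separates two arguments.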
within macro args, string literals, and normal context
- "{S}" should always equal the contents of S
- "\1" should always act like quoting the value of \1
Fixes gbdev#691
`appendIfLiteral` appends a character to yylval.tzString if keepLiteral is true and it's not too long already.
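As a rough illustration of that description, here is a Python model of the guard. The real helper is C and writes to `yylval.tzString`; the class `LiteralBuffer` and the limit `MAX_STRING_LEN = 255` are assumptions made for this sketch, not values taken from the PR.

```python
MAX_STRING_LEN = 255  # assumed limit, chosen only for illustration


class LiteralBuffer:
    """Hypothetical stand-in for the token's string buffer."""

    def __init__(self, keep_literal):
        self.keep_literal = keep_literal
        self.text = ""

    def append_if_literal(self, ch):
        # Append only when literals are being kept and there is room left;
        # otherwise the character is silently dropped in this model.
        if self.keep_literal and len(self.text) < MAX_STRING_LEN:
            self.text += ch


kept = LiteralBuffer(keep_literal=True)
for c in "abc":
    kept.append_if_literal(c)
print(kept.text)
```

Whether the real function also reports a warning when the buffer is full is not stated in the commit message, so this sketch does not model one.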
Normal mode uses readString, and plain copy-til-'\0' loops for appending interpolations or macro arg values. Raw mode uses appendStringLiteral, and appendEscapedSubstring for appending interpolations or macro arg values.
Since
Otherwise, I think this is good; I'll let you handle the rebasing or squashing.
You're right; corrected. I'll go ahead and squash these after CI; they close a few issues + implement a few changes, but those are pretty closely tied together.
Fixes #683, fixes #691
The lexer's raw mode for reading macro args already attempted to handle semicolons inside string literals, versus outside ones which start comments. This change reuses the same function for reading string literals in normal and raw modes, also handling:
- Commas in strings versus between macro args
- Character escapes
- {Interpolations} and \1-\9 args inside vs. outside strings
- Multi-line string literals
All test case outputs remain identical except for passing `"""` as a macro argument, which now starts a multi-line string literal instead of terminating the passed argument at the carriage return.
Related discussion in #668