lexer: better handling of character and byte literals #1004

Merged: 6 commits, Sep 29, 2023


@feds01 feds01 commented Sep 29, 2023

This PR fixes several issues with Unicode handling in the compiler. Previously, the compiler broke when it encountered combining marks: column offsets and report rendering were incorrect, and the lexer error message was confusing. This PR does the following:

  • lexer: emit error for non-ASCII byte literals
  • source: correctly compute column offsets (accounting for Unicode)
  • reporting: account for combining marks when drawing span annotations
  • lexer: improve errors for combining marks in byte and character literals
  • reporting: wrap newlines correctly in the reporting view (fixes #1003)
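To illustrate the first four items, here is a minimal sketch of Unicode-aware column computation and byte-literal validation. It is not the code from this PR: `is_combining_mark`, `display_column`, `check_byte_literal`, and `ByteLitError` are hypothetical names, the combining-mark check covers only a few common Unicode blocks rather than the full general-category data, and escape sequences are not handled.

```rust
/// Hypothetical helper: true for code points in a few common combining-mark
/// blocks. This is an assumption for illustration; a real implementation
/// would consult full Unicode general-category data.
fn is_combining_mark(c: char) -> bool {
    matches!(
        c as u32,
        0x0300..=0x036F       // Combining Diacritical Marks
            | 0x1AB0..=0x1AFF // Combining Diacritical Marks Extended
            | 0x1DC0..=0x1DFF // Combining Diacritical Marks Supplement
            | 0x20D0..=0x20FF // Combining Diacritical Marks for Symbols
            | 0xFE20..=0xFE2F // Combining Half Marks
    )
}

/// Zero-based display column of `byte_offset` within `line`.
/// Counts characters rather than bytes, and skips combining marks because
/// they are drawn on top of the preceding character.
fn display_column(line: &str, byte_offset: usize) -> usize {
    line.char_indices()
        .take_while(|(i, _)| *i < byte_offset)
        .filter(|(_, c)| !is_combining_mark(*c))
        .count()
}

/// Hypothetical error type for the body of a byte literal (the text between
/// the quotes).
#[derive(Debug, PartialEq)]
enum ByteLitError {
    Empty,
    NonAscii(char),
    CombiningMark(char),
    TooLong,
}

/// Validates the body of a byte literal. Escape sequences are deliberately
/// not handled in this sketch.
fn check_byte_literal(body: &str) -> Result<u8, ByteLitError> {
    let mut chars = body.chars();
    let first = chars.next().ok_or(ByteLitError::Empty)?;
    if !first.is_ascii() {
        return Err(ByteLitError::NonAscii(first));
    }
    match chars.next() {
        None => Ok(first as u8),
        // Report a trailing combining mark specifically, since it is
        // usually invisible as a separate character in an editor.
        Some(c) if is_combining_mark(c) => Err(ByteLitError::CombiningMark(c)),
        Some(_) => Err(ByteLitError::TooLong),
    }
}
```

For example, in the line `"e\u{301}x"` the `x` starts at byte offset 3 but sits in display column 1, because the combining acute accent occupies two bytes and no column of its own. A span annotation drawn from the byte offset alone would therefore be misaligned.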

@feds01 feds01 self-assigned this Sep 29, 2023
@feds01 feds01 added the parser Issues related with parsing sub-system. label Sep 29, 2023
@feds01 feds01 force-pushed the disallow-unicode-byte-lits branch from b428007 to 0d24647 Compare September 29, 2023 00:39
@feds01 feds01 force-pushed the disallow-unicode-byte-lits branch from 0d24647 to cb04ad4 Compare September 29, 2023 12:24
@feds01 feds01 added the error-reporting Error reporting sub-system issues label Sep 29, 2023
kontheocharis previously approved these changes Sep 29, 2023
compiler/hash-reporting/src/render.rs (outdated, resolved)
compiler/hash-reporting/src/report.rs (resolved)
@feds01 feds01 force-pushed the disallow-unicode-byte-lits branch from eb65e69 to 235cac5 Compare September 29, 2023 14:03
@feds01 feds01 merged commit dcb5488 into main Sep 29, 2023
@feds01 feds01 deleted the disallow-unicode-byte-lits branch September 29, 2023 14:06