Started with the new lexer implementation #432
I have been looking through this and how the lexer used to work, and I think I have a basic understanding of where things are. Is there an area within the new lexer that I could work on?
Basically, porting the old lexer to the new architecture would be nice. You can create PRs against this branch. If you find you need something new from the cursor, let me know and I can add it. I might have some time this week to finish the unimplemented functions in the cursor. After that, we need some extra logic to use the goal symbols.
I have started by working my way through the old lex() function and moving across the code for each of the token types. Moved across (if something wasn't implemented before, I haven't implemented it yet; TODOs etc. remain)
When it comes to matching the start of a token, it would be nice to keep the characters matched on in the same file that the lexing is done in, i.e. in … I think it would be cleaner to move the …
Co-authored-by: Iban Eguia <razican@protonmail.ch>
I see the cursor gets ASCII bytes. What happens if Unicode is used?
The cursor goes through bytes, regardless of whether they are ASCII or not. Then there is a wrapper that converts them to Unicode if needed.
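The byte-then-Unicode layering described above could look roughly like the sketch below. It is not the cursor from this PR: `CharCursor`, `next_byte`, `next_char` and `invalid` are hypothetical names, and the sketch assumes the input is UTF-8 and that the lead byte alone decides how many bytes to read.

```rust
use std::io::{self, Bytes, Read};

/// Hypothetical wrapper: pulls raw bytes from any `Read` and decodes UTF-8 on demand.
struct CharCursor<R: Read> {
    bytes: Bytes<R>,
}

impl<R: Read> CharCursor<R> {
    fn new(reader: R) -> Self {
        Self { bytes: reader.bytes() }
    }

    /// Returns the next raw byte, or `None` at end of input.
    fn next_byte(&mut self) -> io::Result<Option<u8>> {
        self.bytes.next().transpose()
    }

    /// Returns the next Unicode scalar value, reading 1 to 4 bytes.
    fn next_char(&mut self) -> io::Result<Option<char>> {
        let first = match self.next_byte()? {
            Some(b) => b,
            None => return Ok(None),
        };
        // The lead byte determines the length of the UTF-8 sequence.
        let len = match first {
            0x00..=0x7F => 1,
            0xC0..=0xDF => 2,
            0xE0..=0xEF => 3,
            0xF0..=0xF7 => 4,
            _ => return Err(invalid("unexpected UTF-8 lead byte")),
        };
        let mut buf = [first, 0, 0, 0];
        for slot in buf[1..len].iter_mut() {
            *slot = self
                .next_byte()?
                .ok_or_else(|| invalid("truncated UTF-8 sequence"))?;
        }
        // `from_utf8` rejects bad continuation bytes, overlong forms and surrogates.
        let s = std::str::from_utf8(&buf[..len])
            .map_err(|_| invalid("invalid UTF-8 sequence"))?;
        Ok(s.chars().next())
    }
}

/// Helper that builds an `InvalidData` I/O error.
fn invalid(msg: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}
```

With this split, ASCII-only paths (punctuators, most keywords) can stay on `next_byte`, while identifiers and string literals can call `next_char` only where non-ASCII input is allowed.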
@Razican is it possible to allow putting tokens back onto the cursor? It would be useful for handling cases like regex (or, alternatively, give the option to peek more than a single character ahead).
Yep, we should be able to peek at most 4 characters. Maybe during the weekend I will have time to implement that in the cursor.
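A bounded lookahead like the one discussed here can be sketched with a small buffer in front of the input. This is only an illustration under assumptions: `PeekCursor`, `peek_nth`, `next_char` and `MAX_PEEK` are hypothetical names, and the sketch works on an iterator of `char` rather than on the byte cursor from this branch.

```rust
use std::collections::VecDeque;

/// Maximum lookahead, taken from the "at most 4 characters" mentioned above.
const MAX_PEEK: usize = 4;

/// Hypothetical cursor that buffers up to `MAX_PEEK` characters ahead.
struct PeekCursor<I: Iterator<Item = char>> {
    input: I,
    lookahead: VecDeque<char>,
}

impl<I: Iterator<Item = char>> PeekCursor<I> {
    fn new(input: I) -> Self {
        Self {
            input,
            lookahead: VecDeque::with_capacity(MAX_PEEK),
        }
    }

    /// Looks `n` characters ahead (0 is the next character) without consuming.
    fn peek_nth(&mut self, n: usize) -> Option<char> {
        assert!(n < MAX_PEEK, "lookahead is limited to {} characters", MAX_PEEK);
        // Fill the buffer only as far as needed; stop early at end of input.
        while self.lookahead.len() <= n {
            let c = self.input.next()?;
            self.lookahead.push_back(c);
        }
        self.lookahead.get(n).copied()
    }

    /// Consumes the next character, draining the lookahead buffer first.
    fn next_char(&mut self) -> Option<char> {
        self.lookahead.pop_front().or_else(|| self.input.next())
    }
}
```

Peeking a few characters ahead covers the same ground as pushing input back onto the cursor: the lexer inspects what follows before committing to a token, so nothing has to be un-read.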
Is this PR superseded by #486?
I have no further local changes, we can close this :)
This Pull Request fixes #294.
It changes the following:
… Read. Ideally, we should use either a Cursor<String> if we are reading input from the user (for the console, for example), or a buffered reader when reading from files.
… / and a regular expression literal starting with /.
Note that this is still WIP. I have only laid out the initial pieces with the new cursor for the lexer, but I wanted to have it here in order to have benchmarks soon, and to receive feedback.
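The choice between a `Cursor<String>` for console input and a buffered reader for files can be sketched by making the lexer generic over `Read`. This is a minimal sketch, not the code in this PR: it assumes `Cursor` here means `std::io::Cursor`, and `Lexer`, `count_bytes`, `lexer_for_console` and `lexer_for_file` are hypothetical names, with `count_bytes` standing in for real tokenising.

```rust
use std::fs::File;
use std::io::{self, BufReader, Cursor, Read};
use std::path::Path;

/// Hypothetical stand-in for the lexer: all it needs is a byte source.
struct Lexer<R: Read> {
    reader: R,
}

impl<R: Read> Lexer<R> {
    fn new(reader: R) -> Self {
        Self { reader }
    }

    /// Drains the input and counts its bytes; a placeholder for tokenising.
    fn count_bytes(&mut self) -> io::Result<usize> {
        let mut total = 0;
        let mut chunk = [0u8; 64];
        loop {
            let n = self.reader.read(&mut chunk)?;
            if n == 0 {
                return Ok(total);
            }
            total += n;
        }
    }
}

/// Console input is already in memory, so an in-memory cursor is enough.
fn lexer_for_console(line: String) -> Lexer<Cursor<String>> {
    Lexer::new(Cursor::new(line))
}

/// File input goes through a buffered reader to avoid one syscall per byte.
fn lexer_for_file(path: &Path) -> io::Result<Lexer<BufReader<File>>> {
    Ok(Lexer::new(BufReader::new(File::open(path)?)))
}
```

Because both constructors return the same generic `Lexer`, the tokenising code does not need to know where the bytes come from.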