lexer+parser: many improvements and cleanups #985

Merged
merged 20 commits into from
Sep 19, 2023
Conversation

feds01
Contributor

@feds01 feds01 commented Sep 18, 2023

  • token: remove DelimiterVariant
  • lexer: flatten token stream
  • parser: remove AstGenFrame::token_at()
  • parser: cleanup accesses to offset in AstGenFrame
  • token: introduce -> and => tokens to simplify parsing & error reporting
  • token: introduce :: tokens to simplify parsing access expr/ty/pats
  • parser: cleanup begins_pat() implementation
  • parser: avoid using peek_nth() in binary expression parsing
  • token: introduce .., ..<, ... tokens to simplify spread/range parsing
  • parser: fix typo in diagnostics
  • parser: several cleanups, and stricter use of token stream API
  • parser: Integrated new lexer into the parser
  • parser: directly use TokenCursor API
  • parser: avoid using confusing next_pos() function
  • analysis: use indexmap in pattern bind analysis to enforce stable error order
  • parser: name errors consistently, and remove a bunch of old unused variants
  • parser: remove use of confusing next_pos() and replace with eof_pos() or expected_pos()
  • source: Change ByteRange to be inclusive on both ends
  • lexer: cleanup + greatly improve lexer errors
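
Several of the commits above introduce dedicated tokens for multi-character punctuation (`->`, `=>`, `::`, `..`, `..<`, `...`) so the parser no longer has to stitch them together from single characters. A minimal sketch of that idea, using longest-match ("maximal munch") lexing, might look like the following. The `Punct` enum and `lex_punct` function are hypothetical names chosen for illustration and are not taken from the PR.

```rust
/// Hypothetical punctuation token kinds (illustrative names only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Punct {
    Minus,     // -
    Eq,        // =
    Colon,     // :
    Dot,       // .
    ThinArrow, // ->
    FatArrow,  // =>
    Access,    // ::
    DotDot,    // ..
    DotDotLt,  // ..<
    Ellipsis,  // ...
}

/// Lex only the punctuation in `src`, always preferring the longest match.
/// Any character that does not start a known punctuation token is skipped,
/// which keeps the sketch short; a real lexer would emit other token kinds.
pub fn lex_punct(src: &str) -> Vec<Punct> {
    // Ordered longest-first so that `...` wins over `..`, and `..` over `.`.
    const TABLE: &[(&str, Punct)] = &[
        ("...", Punct::Ellipsis),
        ("..<", Punct::DotDotLt),
        ("..", Punct::DotDot),
        ("->", Punct::ThinArrow),
        ("=>", Punct::FatArrow),
        ("::", Punct::Access),
        ("-", Punct::Minus),
        ("=", Punct::Eq),
        (":", Punct::Colon),
        (".", Punct::Dot),
    ];

    let mut out = Vec::new();
    let mut rest = src;
    while !rest.is_empty() {
        if let Some((text, kind)) = TABLE.iter().find(|(text, _)| rest.starts_with(*text)) {
            out.push(*kind);
            rest = &rest[text.len()..];
        } else {
            // Not punctuation we know about: skip exactly one character.
            let mut chars = rest.chars();
            chars.next();
            rest = chars.as_str();
        }
    }
    out
}
```

With tokens like these, a parser can match on a single `ThinArrow` or `Access` kind instead of peeking two or three tokens ahead, which is presumably what makes removing calls such as `peek_nth()` easier.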

- This change moves away from the parser bits "directly" accessing
`TokenKind::Tree(..)`s in preparation for using the new lexing
system.

- Furthermore, the parser now uses either `skip_token()`, which in
the future will be a "safe" variant of setting the cursor, or
`skip_fast()`, which should be used to skip atomic tokens
(like `;` or `,`).

- Remove access to `backtrack()` completely.

- Use `peek_kind()` where possible to simplify token matching.

- Prepare for removing `peek_nth()`

- Prepare for switching over to a new `TokenCursor` API.
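
To make the points above concrete, here is a rough sketch of a cursor over a flat token slice with `peek_kind()`, `skip_fast()`, and `skip_token()` methods. Only the method names come from the description; the signatures, the `Cursor` type, the `TokenKind` variants, and the `parse_ident_list` helper are assumptions made for illustration, and the real `AstGenFrame`/`TokenCursor` API may differ.

```rust
/// Hypothetical token kinds for the sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Ident,
    Comma,
    Semi,
}

/// A forward-only cursor over a flat token slice (no backtracking).
pub struct Cursor<'t> {
    tokens: &'t [TokenKind],
    offset: usize,
}

impl<'t> Cursor<'t> {
    pub fn new(tokens: &'t [TokenKind]) -> Self {
        Cursor { tokens, offset: 0 }
    }

    /// Look at the kind of the current token without consuming it.
    pub fn peek_kind(&self) -> Option<TokenKind> {
        self.tokens.get(self.offset).copied()
    }

    /// Unchecked advance; intended for atomic tokens such as `;` or `,`
    /// that the caller has already inspected with `peek_kind()`.
    pub fn skip_fast(&mut self) {
        self.offset += 1;
    }

    /// Checked advance (assumed semantics): consumes and returns the current
    /// token kind, or returns `None` at the end of the stream.
    pub fn skip_token(&mut self) -> Option<TokenKind> {
        let kind = self.peek_kind()?;
        self.offset += 1;
        Some(kind)
    }
}

/// Parse `ident (, ident)* ;` and return the number of identifiers.
pub fn parse_ident_list(cursor: &mut Cursor<'_>) -> Result<usize, String> {
    let mut count = 0;
    loop {
        match cursor.peek_kind() {
            Some(TokenKind::Ident) => {
                let _ident = cursor.skip_token();
                count += 1;
            }
            other => return Err(format!("expected identifier, found {:?}", other)),
        }
        match cursor.peek_kind() {
            Some(TokenKind::Comma) => cursor.skip_fast(),
            Some(TokenKind::Semi) => {
                cursor.skip_fast();
                return Ok(count);
            }
            other => return Err(format!("expected `,` or `;`, found {:?}", other)),
        }
    }
}
```

The parser in this sketch only ever peeks one token ahead and moves forward, so it needs neither `backtrack()` nor `peek_nth()`.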

This commit replaces the original lexer sources with those of the `v2`
lexer and hooks the new lexer up to the rest of the parsing pipeline.

This commit uses the new experimental `v2` lexer for the parser, which
now accepts the "flat" version of the token stream. Since much work was
done before this commit to abstract away dealing with the token trees,
the migration was reasonably simple (with some minor span calculation
adjustments).
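
As an illustration of what a "flat" token stream means in contrast to nested token trees, the sketch below stores opening and closing delimiters as ordinary tokens in a single sequence and finds the matching close by tracking a delimiter stack. The `FlatToken` and `Delim` types and the `matching_close` function are hypothetical and are not the representation used in this PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Delim {
    Paren,
    Brace,
    Bracket,
}

/// In a flat stream, delimiters are plain tokens rather than tree nodes
/// that own their children.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FlatToken {
    Open(Delim),
    Close(Delim),
    Atom(char),
}

/// Return the index of the closing delimiter that matches the opening
/// delimiter at index `open`. Returns `None` if `open` is not an opening
/// delimiter, if the delimiters are mismatched, or if the stream ends first.
pub fn matching_close(tokens: &[FlatToken], open: usize) -> Option<usize> {
    if !matches!(tokens.get(open), Some(FlatToken::Open(_))) {
        return None;
    }
    let mut stack = Vec::new();
    for (i, tok) in tokens.iter().enumerate().skip(open) {
        match *tok {
            FlatToken::Open(d) => stack.push(d),
            FlatToken::Close(d) => {
                // The most recently opened delimiter must be the one closed.
                if stack.pop() != Some(d) {
                    return None;
                }
                if stack.is_empty() {
                    return Some(i);
                }
            }
            FlatToken::Atom(_) => {}
        }
    }
    None
}
```

A lexer could also precompute this index once and store it alongside each opening token, which would let a parser skip a whole delimited group in constant time; whether the `v2` lexer does so is not stated here.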
@feds01 feds01 self-assigned this Sep 18, 2023
@feds01 feds01 added the parser and interface labels Sep 18, 2023
kontheocharis
kontheocharis previously approved these changes Sep 19, 2023
Two review comments on compiler/hash-parser/src/parser/pat.rs (outdated, resolved)
@kontheocharis kontheocharis self-requested a review September 19, 2023 14:33
@kontheocharis kontheocharis dismissed their stale review September 19, 2023 14:34

Clicked the wrong button

@feds01 feds01 merged commit 54432fe into main Sep 19, 2023
1 check passed
@feds01 feds01 deleted the lexer-experiment branch September 19, 2023 15:35
2 participants