Replace autogenerated parser with hand-written parser #9152

Closed
wants to merge 122 commits into from
Changes from 87 commits
6b37a85
Replace `String` with `SmolStr` for nodes `Identifier` and `ExprName`
LaBatata101 Nov 26, 2023
5eeef8b
Add `Invalid` node for `Expr` and `Pattern`
LaBatata101 Nov 24, 2023
f6ddb0d
Use `TextRange` instead of `TextSize` for the error location in `Lexi…
LaBatata101 Nov 26, 2023
41d4832
Replace autogenerated parser with hand-written parser
LaBatata101 Dec 1, 2023
8411685
Update code to use `SmolStr` instead of `String`
LaBatata101 Dec 13, 2023
2345381
Add `Invalid` to `FStringElement` AST node to handle invalid syntax i…
LaBatata101 Dec 13, 2023
66e3688
Remove unused `ParenthesizedExpr`
LaBatata101 Dec 13, 2023
f3559ca
Update parser to use new string and f-string AST nodes
LaBatata101 Dec 13, 2023
aad7cbd
Remove error handling from the lexer that should be in the parser
LaBatata101 Dec 14, 2023
d51c351
Prevent the parser from panicking when calling `parse_expression_star…
LaBatata101 Dec 14, 2023
28096d9
Add `FStringError` to `ParseErrorType`
LaBatata101 Dec 14, 2023
a4dda02
Add errors for when the Unicode escape sequence is missing the `{` or…
LaBatata101 Dec 14, 2023
995dd93
Move tests to tests directory
LaBatata101 Dec 14, 2023
9b5ffdb
fix warning
LaBatata101 Dec 15, 2023
82a03e5
Fix typo
LaBatata101 Dec 15, 2023
37b538f
Handle `Invalid` node
LaBatata101 Dec 15, 2023
7939485
Fix more typos
LaBatata101 Dec 15, 2023
0204644
Fix clippy warnings
LaBatata101 Dec 15, 2023
7c01284
Fix syntax error in fuzzer
LaBatata101 Dec 15, 2023
017fd87
Fix incorrect range in `Alias`'s `name`
LaBatata101 Dec 15, 2023
b9fe55e
Parse named expression in decorators
LaBatata101 Dec 19, 2023
c8202b4
Fix incorrect range for tuple parenthesized twice, e.g. `((a, b, c))`
LaBatata101 Dec 20, 2023
2c54d94
Fix parsing parenthesized `WithItem`s
LaBatata101 Dec 20, 2023
416c354
Improve error recovery
LaBatata101 Dec 22, 2023
4407135
Merge branch 'main' into new-parser
LaBatata101 Dec 22, 2023
a519eea
Handle `Invalid` node in `is_splittable_expresssion`
LaBatata101 Dec 22, 2023
23e8353
Fix type error
LaBatata101 Dec 22, 2023
8f02b4a
Fix formatting
LaBatata101 Dec 22, 2023
ea870c5
Fix incorrect precedence in `await` expression parsing
LaBatata101 Dec 23, 2023
c74ddd7
Fix binary expressions not being parsed in `PatternMatchMapping` key
LaBatata101 Dec 24, 2023
a12c2d0
Fix false positive `DefaultArgumentError` and fix incorrect ending to…
LaBatata101 Dec 24, 2023
1fe3d6e
Factor out `NamedExpr` parsing into its own function
LaBatata101 Dec 26, 2023
b2b45e6
Improve the heuristic of `WithItem` parsing
LaBatata101 Dec 26, 2023
9e3d962
Add missing `From` keyword token to `SIMPLE_STMT_SET`
LaBatata101 Dec 26, 2023
d701e11
Revert change of `ExprName` and `Identifier` from `String` to `SmolStr`
LaBatata101 Dec 26, 2023
938c647
Merge branch 'main' into new-parser
LaBatata101 Dec 26, 2023
76dfc1e
Update Cargo.toml and Cargo.lock
LaBatata101 Dec 26, 2023
590b1f0
Add missing `Clone` to `FStringErrorType`
LaBatata101 Dec 26, 2023
b55d289
Bring back the old auto-generated parser code
LaBatata101 Dec 27, 2023
79d1075
Use `TextRange` in errors instead of `TextSize` in the auto-generated…
LaBatata101 Dec 27, 2023
f867fcc
Switch between the auto-generated parser and handwritten parser with …
LaBatata101 Dec 27, 2023
24aef17
Fix clippy lints
LaBatata101 Dec 27, 2023
82af1c1
Merge branch 'main' into new-parser
LaBatata101 Dec 28, 2023
63fa1a1
Handle `Invalid` node in `bit_count`
LaBatata101 Dec 28, 2023
dbccff1
Fix check for `value` in `return` statement parsing
LaBatata101 Dec 28, 2023
c6c7b59
Fix false positive error messages
LaBatata101 Dec 28, 2023
5510c50
Check for an expression before parsing `orelse` in `IfExpr`
LaBatata101 Dec 29, 2023
99d50c5
Revert some changes made in the lexer for the new parser
LaBatata101 Dec 30, 2023
de5ba58
Equivalency fuzzer for new parser
addisoncrump Jan 1, 2024
12c556f
Merge branch 'main' into new-parser
LaBatata101 Jan 1, 2024
5d0fcd9
Remove source path from the new parser
LaBatata101 Jan 1, 2024
7008216
Fix type error
LaBatata101 Jan 1, 2024
3940504
Fix field name
LaBatata101 Jan 1, 2024
8c4080e
Add missing `From<LexicalError>` impl in `ParseError`
LaBatata101 Jan 1, 2024
a4a2877
Remove unreferenced snapshots
LaBatata101 Jan 1, 2024
48b31e4
Update tests cases to use the error types produced by the autogenerat…
LaBatata101 Jan 1, 2024
c3a88e5
Add missing dependency
LaBatata101 Jan 2, 2024
e847e63
Fix clippy lint
LaBatata101 Jan 2, 2024
e7c93ae
Add `in` keyword token to `END_EXPR_SET`
LaBatata101 Jan 2, 2024
b389b42
Create error when the identifier is not found when parsing an identifier
LaBatata101 Jan 2, 2024
5e0bbc4
Add function to remove a specific `TokenKind` from the `TokenSet`
LaBatata101 Jan 2, 2024
c6be15a
Fix error not being created when the comma wasn't present after the f…
LaBatata101 Jan 2, 2024
097cb63
Update test expected error message
LaBatata101 Jan 3, 2024
221df0b
Merge branch 'main' into new-parser
LaBatata101 Jan 3, 2024
6034d81
Handle `Pattern::Invalid` node
LaBatata101 Jan 3, 2024
233c8d7
Make `parse_ok_tokens_lalrpop` and `_new` private
MichaReiser Jan 8, 2024
4367b13
Create `lalrpop` parser module.
MichaReiser Jan 8, 2024
1b0188d
Merge branch 'main' into new-parser
MichaReiser Jan 8, 2024
9883b6f
Merge remote-tracking branch 'origin/main' into new-parser
MichaReiser Jan 8, 2024
74793ff
Revert no longer needed changes for String -> SmolStr
MichaReiser Jan 8, 2024
d4a4d05
Expose API to switch between old/new parser and use it in the fuzzer
MichaReiser Jan 8, 2024
0dec5dd
Minor tweaks
LaBatata101 Jan 8, 2024
2735edb
Move function from `functions.rs` to `lib.rs` and move `parse_ok_toke…
LaBatata101 Jan 8, 2024
c5a4d0a
Replace test generation macros for functions
LaBatata101 Jan 8, 2024
3432c4e
Check for the `EndOfFile` token when handling unexpected indentation
LaBatata101 Jan 8, 2024
e5f11fb
Merge remote-tracking branch 'origin/main' into new-parser
MichaReiser Jan 9, 2024
38031a9
Reduce the use of `ParserCtxFlags`
LaBatata101 Jan 9, 2024
df64998
Defer the creation of the invalid node for the skipped unexpected tokens
LaBatata101 Jan 9, 2024
63fab65
Fix `needless_pass_by_value` clippy lint
LaBatata101 Jan 9, 2024
71404cc
Merge branch 'main' into new-parser
LaBatata101 Jan 10, 2024
6ce6beb
Fix invalid syntax check in annotated assignment parsing
LaBatata101 Jan 10, 2024
1752025
Remove unnecessary `ParserCtxFlags`
LaBatata101 Jan 10, 2024
996e4d6
Merge branch 'main' into new-parser
LaBatata101 Jan 11, 2024
eb2cdce
Replace `I` type parameter on parser with TokenSource
MichaReiser Jan 12, 2024
b66a65c
Create parser fixture test infrastructure
MichaReiser Jan 12, 2024
bfb4927
Make clippy happy and use `assert` instead of `panic`
MichaReiser Jan 12, 2024
583073e
Add `ParsedExpr` struct to allow knowing whether the parsed expressio…
LaBatata101 Jan 12, 2024
43c17fc
Merge branch 'main' into new-parser
LaBatata101 Jan 12, 2024
de1ca39
Update snapshots
LaBatata101 Jan 12, 2024
432d826
Provide infrastructure to automatically compute node range
MichaReiser Jan 15, 2024
2a6be10
Remove `StmtWithRange`
MichaReiser Jan 15, 2024
20acd2b
Refactor some expression parsing
MichaReiser Jan 16, 2024
1d98742
Remove `StmtWithRange`, `ExprWithRange` and all `(Node, TextRange)` r…
MichaReiser Jan 16, 2024
f3ebee4
Split parser into multiple files
MichaReiser Jan 17, 2024
7aacfe5
Move some methods around
MichaReiser Jan 17, 2024
a7f3d91
Introduce `ParserProgress` to detect when the parser is stuck in an i…
MichaReiser Jan 17, 2024
b1d94e3
Remove `expect_and_recover` and experiment with error recovery
MichaReiser Jan 17, 2024
9760623
Remove `parse_expression_with_recovery`
MichaReiser Jan 17, 2024
4e5718c
Fix starts_at parsing
MichaReiser Jan 18, 2024
69d6853
Merge remote-tracking branch 'origin/main' into parser-in-progress
MichaReiser Jan 18, 2024
bb641c6
Basic error recovery infrastructure for parsing lists
MichaReiser Jan 18, 2024
23dfb80
Remove `ctx_stack` from `Parser` by using the call stack
MichaReiser Jan 18, 2024
16a21ba
Assignment error recovery
MichaReiser Jan 18, 2024
01c31d0
parse delimited
MichaReiser Jan 18, 2024
08612bc
Import alias recovery, deprecate a few APIs
MichaReiser Jan 18, 2024
3ce93be
Fix panic in `self.node_range`
MichaReiser Jan 19, 2024
06dca98
Merge branch 'parser-in-progress' of github.com:astral-sh/ruff into p…
MichaReiser Jan 19, 2024
4999153
fixup! Import alias recovery, deprecate a few APIs
MichaReiser Jan 19, 2024
11b6903
Remove last_ctx
MichaReiser Jan 19, 2024
faa9122
Add some docs
LaBatata101 Jan 23, 2024
9fb4723
Add enum to represent the expression precedence instead of using u8 v…
LaBatata101 Jan 24, 2024
1a0b01c
Merge branch 'main' into new-parser
LaBatata101 Jan 24, 2024
ca50ef3
Update snapshots
LaBatata101 Jan 24, 2024
73007f2
Fix wrong `Precedence` for `CircumFlex` and `Amper` tokens
LaBatata101 Jan 26, 2024
5d6ec57
Fix false positive error when parsing `NamedExpr` in the `lower` part…
LaBatata101 Jan 27, 2024
a96a010
Add missing token in `END_EXPR_SET`
LaBatata101 Jan 27, 2024
e453fb9
Merge branch 'main' into new-parser
LaBatata101 Feb 9, 2024
c23c69a
Fix minor syntax error
LaBatata101 Feb 11, 2024
5f66f5e
Fix assert panic
LaBatata101 Feb 11, 2024
bdc9a6b
Solve FIXME, don't consume the token in `parse_match_pattern_literal`
LaBatata101 Feb 14, 2024
d502bbd
Small refactor in `parse_string_expression` to remove `clone`
LaBatata101 Feb 14, 2024
f8e577d
Merge branch 'main' into new-parser
LaBatata101 Feb 14, 2024
21 changes: 7 additions & 14 deletions Cargo.lock


26 changes: 3 additions & 23 deletions crates/ruff_linter/src/logging.rs
Original file line number Diff line number Diff line change
@@ -173,7 +173,7 @@ impl DisplayParseError {
// Translate the byte offset to a location in the originating source.
let location =
if let Some(jupyter_index) = source_kind.as_ipy_notebook().map(Notebook::index) {
-let source_location = source_code.source_location(error.offset);
+let source_location = source_code.source_location(error.location.start());

ErrorLocation::Cell(
jupyter_index
@@ -187,7 +187,7 @@ impl DisplayParseError {
},
)
} else {
-ErrorLocation::File(source_code.source_location(error.offset))
+ErrorLocation::File(source_code.source_location(error.location.start()))
};

Self {
@@ -254,27 +254,7 @@ impl<'a> DisplayParseErrorType<'a> {

impl Display for DisplayParseErrorType<'_> {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
-match self.0 {
-ParseErrorType::Eof => write!(f, "Expected token but reached end of file."),
-ParseErrorType::ExtraToken(ref tok) => write!(
-f,
-"Got extraneous token: {tok}",
-tok = TruncateAtNewline(&tok)
-),
-ParseErrorType::InvalidToken => write!(f, "Got invalid token"),
-ParseErrorType::UnrecognizedToken(ref tok, ref expected) => {
-if let Some(expected) = expected.as_ref() {
-write!(
-f,
-"Expected '{expected}', but got {tok}",
-tok = TruncateAtNewline(&tok)
-)
-} else {
-write!(f, "Unexpected token {tok}", tok = TruncateAtNewline(&tok))
-}
-}
-ParseErrorType::Lexical(ref error) => write!(f, "{error}"),
-}
+write!(f, "{}", TruncateAtNewline(&self.0))
}
}
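
The hunk above replaces a per-variant `match` with a single `write!` that relies on the error type's own `Display` output, truncated at the first newline. A minimal sketch of that delegation pattern follows. `SketchErrorType`, `FirstLine`, and `DisplaySketchError` are hypothetical names invented for illustration; they are not the real `ParseErrorType` or `TruncateAtNewline`, whose exact behavior may differ.

```rust
use std::fmt::{self, Display, Formatter};

/// Hypothetical stand-in for the parser's error type (not the real `ParseErrorType`).
#[derive(Debug)]
pub enum SketchErrorType {
    Eof,
    UnexpectedToken(String),
}

// The error type renders its own message, so callers no longer match on variants.
impl Display for SketchErrorType {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        match self {
            SketchErrorType::Eof => write!(f, "Expected token but reached end of file."),
            SketchErrorType::UnexpectedToken(tok) => write!(f, "Unexpected token {tok}"),
        }
    }
}

/// Hypothetical wrapper that displays only the first line of the wrapped value.
pub struct FirstLine<'a, T: Display>(pub &'a T);

impl<T: Display> Display for FirstLine<'_, T> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        // Render the inner value once, then keep everything before the first newline.
        let rendered = self.0.to_string();
        f.write_str(rendered.lines().next().unwrap_or(""))
    }
}

/// Mirrors the shape of the one-line `fmt` body introduced by the diff.
pub struct DisplaySketchError<'a>(pub &'a SketchErrorType);

impl Display for DisplaySketchError<'_> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(f, "{}", FirstLine(self.0))
    }
}
```

With this shape, adding a new error variant only requires touching the error type's `Display` impl; the logging wrapper stays unchanged.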

Original file line number Diff line number Diff line change
@@ -74,6 +74,7 @@ fn is_empty_or_null_fstring_element(element: &ast::FStringElement) -> bool {
ast::FStringElement::Expression(ast::FStringExpressionElement { expression, .. }) => {
is_empty_or_null_string(expression)
}
+ast::FStringElement::Invalid(_) => false,
}
}

4 changes: 2 additions & 2 deletions crates/ruff_linter/src/rules/pycodestyle/rules/errors.rs
@@ -81,7 +81,7 @@ pub(crate) fn syntax_error(
parse_error: &ParseError,
locator: &Locator,
) {
-let rest = locator.after(parse_error.offset);
+let rest = locator.after(parse_error.location.start());

// Try to create a non-empty range so that the diagnostic can print a caret at the
// right position. This requires that we retrieve the next character, if any, and take its length
@@ -95,6 +95,6 @@ pub(crate) fn syntax_error(
SyntaxError {
message: format!("{}", DisplayParseErrorType::new(&parse_error.error)),
},
-TextRange::at(parse_error.offset, len),
+TextRange::at(parse_error.location.start(), len),
));
}
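
The two hunks above switch the error position from a single offset to the start of a range, and the code comment describes widening that position to cover the next character so a caret can be printed. A minimal sketch of that idea, under stated assumptions: `Range`, `SketchParseError`, and `caret_range` are hypothetical simplifications, not the real `TextRange`, `ParseError`, or `syntax_error` from the PR, and offsets are plain `u32` byte offsets.

```rust
/// Hypothetical, simplified byte range (stand-in for `TextRange`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Range {
    start: u32,
    end: u32,
}

impl Range {
    pub fn new(start: u32, end: u32) -> Self {
        Range { start, end }
    }

    /// Range that begins at `start` and spans `len` bytes.
    pub fn at(start: u32, len: u32) -> Self {
        Range { start, end: start + len }
    }

    pub fn start(self) -> u32 {
        self.start
    }
}

/// Hypothetical parse error carrying a range instead of a single offset.
pub struct SketchParseError {
    pub message: String,
    pub location: Range,
}

/// Builds the range a diagnostic would underline: the first character at the
/// error's start offset, or an empty range if the error sits at end of input.
pub fn caret_range(source: &str, error: &SketchParseError) -> Range {
    let start = error.location.start();
    // `get` returns `None` for out-of-bounds or non-boundary offsets.
    let rest = source.get(start as usize..).unwrap_or("");
    // Use the UTF-8 length of the next character so multi-byte characters are covered.
    let len = rest.chars().next().map_or(0, |c| c.len_utf8() as u32);
    Range::at(start, len)
}
```

Taking `location.start()` keeps the caret at the same position the old `offset` field pointed to, while the full range remains available to callers that want to underline the whole offending span.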
Original file line number Diff line number Diff line change
@@ -77,7 +77,9 @@ pub(crate) fn assert_on_string_literal(checker: &mut Checker, test: &Expr) {
ast::FStringElement::Literal(ast::FStringLiteralElement {
value, ..
}) => value.is_empty(),
-ast::FStringElement::Expression(_) => false,
+ast::FStringElement::Expression(_) | ast::FStringElement::Invalid(_) => {
+false
+}
})
}
}) {
@@ -89,7 +91,9 @@ pub(crate) fn assert_on_string_literal(checker: &mut Checker, test: &Expr) {
ast::FStringElement::Literal(ast::FStringLiteralElement {
value, ..
}) => !value.is_empty(),
-ast::FStringElement::Expression(_) => false,
+ast::FStringElement::Expression(_) | ast::FStringElement::Invalid(_) => {
+false
+}
})
}
}) {
Original file line number Diff line number Diff line change
@@ -203,6 +203,7 @@ fn is_allowed_value(expr: &Expr) -> bool {
| Expr::YieldFrom(_)
| Expr::Starred(_)
| Expr::Slice(_)
-| Expr::IpyEscapeCommand(_) => false,
+| Expr::IpyEscapeCommand(_)
+| Expr::Invalid(_) => false,
}
}
3 changes: 2 additions & 1 deletion crates/ruff_linter/src/rules/refurb/rules/bit_count.rs
@@ -159,7 +159,8 @@ pub(crate) fn bit_count(checker: &mut Checker, call: &ExprCall) {
| Expr::NoneLiteral(_)
| Expr::EllipsisLiteral(_)
| Expr::Attribute(_)
-| Expr::Subscript(_) => false,
+| Expr::Subscript(_)
+| Expr::Invalid(_) => false,
};

let replacement = if parenthesize {
5 changes: 4 additions & 1 deletion crates/ruff_linter/src/rules/ruff/rules/unreachable.rs
@@ -439,7 +439,8 @@ fn is_wildcard(pattern: &MatchCase) -> bool {
| Pattern::MatchSequence(_)
| Pattern::MatchMapping(_)
| Pattern::MatchClass(_)
-| Pattern::MatchStar(_) => false,
+| Pattern::MatchStar(_)
+| Pattern::Invalid(_) => false,
Pattern::MatchAs(PatternMatchAs { pattern, .. }) => pattern.is_none(),
Pattern::MatchOr(PatternMatchOr { patterns, .. }) => {
patterns.iter().all(is_wildcard_pattern)
@@ -648,6 +649,8 @@ impl<'stmt> BasicBlocksBuilder<'stmt> {
| Expr::Name(_)
| Expr::List(_)
| Expr::IpyEscapeCommand(_)
+// NOTE: Is this the correct place to handle this node?
+| Expr::Invalid(_)
| Expr::Tuple(_)
| Expr::Slice(_) => self.unconditional_next_block(after),
// TODO: handle these expressions.
6 changes: 6 additions & 0 deletions crates/ruff_python_ast/src/comparable.rs
@@ -234,6 +234,7 @@ pub enum ComparablePattern<'a> {
MatchStar(PatternMatchStar<'a>),
MatchAs(PatternMatchAs<'a>),
MatchOr(PatternMatchOr<'a>),
+Invalid,
}

impl<'a> From<&'a ast::Pattern> for ComparablePattern<'a> {
@@ -286,6 +287,7 @@ impl<'a> From<&'a ast::Pattern> for ComparablePattern<'a> {
patterns: patterns.iter().map(Into::into).collect(),
})
}
+ast::Pattern::Invalid(_) => Self::Invalid,
}
}
}
@@ -513,6 +515,7 @@ impl<'a> From<&'a ast::ExceptHandler> for ComparableExceptHandler<'a> {
pub enum ComparableFStringElement<'a> {
Literal(&'a str),
FStringExpressionElement(FStringExpressionElement<'a>),
+Invalid,
}

#[derive(Debug, PartialEq, Eq, Hash)]
@@ -540,6 +543,7 @@ impl<'a> From<&'a ast::FStringElement> for ComparableFStringElement<'a> {
.map(|spec| spec.elements.iter().map(Into::into).collect()),
})
}
+ast::FStringElement::Invalid(_) => Self::Invalid,
}
}
}
@@ -864,6 +868,7 @@ pub enum ComparableExpr<'a> {
Tuple(ExprTuple<'a>),
Slice(ExprSlice<'a>),
IpyEscapeCommand(ExprIpyEscapeCommand<'a>),
+Invalid,
}

impl<'a> From<&'a Box<ast::Expr>> for Box<ComparableExpr<'a>> {
@@ -1093,6 +1098,7 @@ impl<'a> From<&'a ast::Expr> for ComparableExpr<'a> {
kind: *kind,
value: value.as_str(),
}),
+ast::Expr::Invalid(_) => Self::Invalid,
}
}
}
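
In the `comparable.rs` hunks above, each `Invalid` AST node maps to a payload-free `Invalid` variant, which means any two invalid nodes compare equal regardless of where they sit in the source. A minimal sketch of that consequence follows; `MiniExpr` and `ComparableMiniExpr` are hypothetical miniature types for illustration, and the fields on `MiniExpr::Invalid` are an assumption rather than the real `ExprInvalid` layout.

```rust
/// Hypothetical miniature AST (the real `ast::Expr` has many more variants).
#[derive(Debug)]
pub enum MiniExpr {
    Name { id: String, range: (u32, u32) },
    // Assumed shape: only a source range is modeled here.
    Invalid { range: (u32, u32) },
}

/// Range-insensitive view of `MiniExpr`, in the spirit of `ComparableExpr`.
#[derive(Debug, PartialEq, Eq, Hash)]
pub enum ComparableMiniExpr<'a> {
    Name(&'a str),
    // Unit variant: carries no data, so all invalid nodes are equal to each other.
    Invalid,
}

impl<'a> From<&'a MiniExpr> for ComparableMiniExpr<'a> {
    fn from(expr: &'a MiniExpr) -> Self {
        match expr {
            // Ranges are dropped so that structurally equal nodes compare equal.
            MiniExpr::Name { id, .. } => Self::Name(id.as_str()),
            MiniExpr::Invalid { .. } => Self::Invalid,
        }
    }
}
```

Whether treating all invalid nodes as equal is desirable depends on the caller; a rule that compares expressions for duplication may want to skip invalid nodes entirely.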
4 changes: 4 additions & 0 deletions crates/ruff_python_ast/src/expression.rs
@@ -38,6 +38,7 @@ pub enum ExpressionRef<'a> {
Tuple(&'a ast::ExprTuple),
Slice(&'a ast::ExprSlice),
IpyEscapeCommand(&'a ast::ExprIpyEscapeCommand),
+Invalid(&'a ast::ExprInvalid),
}

impl<'a> From<&'a Box<Expr>> for ExpressionRef<'a> {
@@ -81,6 +82,7 @@ impl<'a> From<&'a Expr> for ExpressionRef<'a> {
Expr::Tuple(value) => ExpressionRef::Tuple(value),
Expr::Slice(value) => ExpressionRef::Slice(value),
Expr::IpyEscapeCommand(value) => ExpressionRef::IpyEscapeCommand(value),
+Expr::Invalid(value) => ExpressionRef::Invalid(value),
}
}
}
@@ -285,6 +287,7 @@ impl<'a> From<ExpressionRef<'a>> for AnyNodeRef<'a> {
ExpressionRef::IpyEscapeCommand(expression) => {
AnyNodeRef::ExprIpyEscapeCommand(expression)
}
+ExpressionRef::Invalid(expression) => AnyNodeRef::ExprInvalid(expression),
}
}
}
@@ -324,6 +327,7 @@ impl Ranged for ExpressionRef<'_> {
ExpressionRef::Tuple(expression) => expression.range(),
ExpressionRef::Slice(expression) => expression.range(),
ExpressionRef::IpyEscapeCommand(expression) => expression.range(),
+ExpressionRef::Invalid(expression) => expression.range(),
}
}
}
10 changes: 6 additions & 4 deletions crates/ruff_python_ast/src/helpers.rs
@@ -258,7 +258,8 @@ pub fn any_over_expr(expr: &Expr, func: &dyn Fn(&Expr) -> bool) -> bool {
| Expr::BooleanLiteral(_)
| Expr::NoneLiteral(_)
| Expr::EllipsisLiteral(_)
-| Expr::IpyEscapeCommand(_) => false,
+| Expr::IpyEscapeCommand(_)
+| Expr::Invalid(_) => false,
}
}

@@ -277,7 +278,7 @@ pub fn any_over_pattern(pattern: &Pattern, func: &dyn Fn(&Expr) -> bool) -> bool
Pattern::MatchValue(ast::PatternMatchValue { value, range: _ }) => {
any_over_expr(value, func)
}
-Pattern::MatchSingleton(_) => false,
+Pattern::MatchSingleton(_) | Pattern::Invalid(_) => false,
Pattern::MatchSequence(ast::PatternMatchSequence { patterns, range: _ }) => patterns
.iter()
.any(|pattern| any_over_pattern(pattern, func)),
@@ -310,7 +311,7 @@ pub fn any_over_pattern(pattern: &Pattern, func: &dyn Fn(&Expr) -> bool) -> bool

pub fn any_over_f_string_element(element: &FStringElement, func: &dyn Fn(&Expr) -> bool) -> bool {
match element {
-FStringElement::Literal(_) => false,
+FStringElement::Literal(_) | FStringElement::Invalid(_) => false,
FStringElement::Expression(ast::FStringExpressionElement {
expression,
format_spec,
@@ -1184,6 +1185,7 @@ impl Truthiness {
value, ..
}) => !value.is_empty(),
ast::FStringElement::Expression(_) => true,
+ast::FStringElement::Invalid(_) => false,
})
{
Self::Truthy
@@ -1427,7 +1429,7 @@ mod tests {
fn any_over_stmt_type_alias() {
let seen = RefCell::new(Vec::new());
let name = Expr::Name(ExprName {
-id: "x".to_string(),
+id: "x".into(),
range: TextRange::default(),
ctx: ExprContext::Load,
});
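
The `helpers.rs` hunks above add `Invalid` to the leaf arms of the `any_over_*` traversals, so an invalid node has no children to search. A minimal sketch of how such a traversal treats a leaf variant follows. `MiniExpr` and `any_over_mini_expr` are hypothetical, and the sketch assumes the predicate is tested on the node itself before descending; the real `any_over_expr` covers many more variants.

```rust
/// Hypothetical miniature expression tree for illustration only.
pub enum MiniExpr {
    Name(String),
    BinOp(Box<MiniExpr>, Box<MiniExpr>),
    Invalid,
}

/// Returns `true` if `func` holds for `expr` or any of its sub-expressions.
pub fn any_over_mini_expr(expr: &MiniExpr, func: &dyn Fn(&MiniExpr) -> bool) -> bool {
    // The node itself is tested first (an assumption of this sketch).
    if func(expr) {
        return true;
    }
    match expr {
        // Composite nodes recurse into their children.
        MiniExpr::BinOp(left, right) => {
            any_over_mini_expr(left, func) || any_over_mini_expr(right, func)
        }
        // Leaf nodes, including `Invalid`, have nothing further to visit.
        MiniExpr::Name(_) | MiniExpr::Invalid => false,
    }
}
```

Because the `match` is exhaustive, adding a variant such as `Invalid` to the enum forces every traversal like this one to state explicitly how it handles the new node, which is why the PR touches so many rule files.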