Allow token-based rules to work on source code with syntax errors #11915

Closed
dhruvmanila opened this issue Jun 18, 2024 · 0 comments · Fixed by #11950
Currently, the rules that work with tokens don't emit diagnostics past the location of the first syntax error. For example:

```python
foo;

"hello world

bar;
```

This will only raise useless-semicolon (E703) for `foo` and not for `bar`, because there's an unterminated string literal between them.

Playground: https://play.ruff.rs/4c17a92c-0189-4b98-b961-27b04db14599

The task here is to remove this limit and allow token-based rules to check all of the tokens. This raises a question: now that the parser can recover from an unclosed parenthesis (#11845), how do we make sure the rule logic knows about this and has correct information about the nesting level? Should we reduce the nesting level when we encounter a `Newline` token?
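
For illustration, here's a minimal sketch of that idea, assuming a simplified token stream; the `TokenKind` variants and the `nesting_after` helper are hypothetical and not Ruff's actual API:

```rust
// Hypothetical sketch (not Ruff's actual implementation): track the bracket
// nesting level over a simplified token stream, resetting it on `Newline`,
// since a newline token only ends a logical line outside of brackets.
enum TokenKind {
    Lpar,    // `(`
    Rpar,    // `)`
    Newline, // end of a logical line
    Other,   // anything else
}

fn nesting_after(tokens: &[TokenKind]) -> u32 {
    let mut nesting = 0u32;
    for token in tokens {
        match token {
            TokenKind::Lpar => nesting += 1,
            TokenKind::Rpar => nesting = nesting.saturating_sub(1),
            // Seeing a newline while `nesting > 0` means the parser recovered
            // from an unclosed `(`; reset so the following lines are checked
            // with the correct context.
            TokenKind::Newline => nesting = 0,
            TokenKind::Other => {}
        }
    }
    nesting
}

fn main() {
    // Roughly the tokens of `foo = (1, 2` with the `)` missing, followed by
    // the newline emitted once the logical line ends.
    let tokens = [
        TokenKind::Other,   // foo
        TokenKind::Other,   // =
        TokenKind::Lpar,    // (
        TokenKind::Other,   // 1, 2
        TokenKind::Newline, // line ends; `)` was never seen
    ];
    assert_eq!(nesting_after(&tokens), 0);
}
```

With a reset like this, token-based rules that consult the nesting level can keep producing sensible diagnostics on the lines that follow the error.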

We also need to make sure that this doesn't panic in any scenario (valid or invalid source code). This can be verified with extensive fuzzing.

dhruvmanila added the linter (Related to the linter) label Jun 18, 2024
dhruvmanila self-assigned this Jun 18, 2024
dhruvmanila added a commit that referenced this issue Jul 2, 2024
## Summary

This PR updates the linter, specifically the token-based rules, to work
on the tokens that come after a syntax error.

For context, the token-based rules previously only diagnosed tokens up to the first lexical error. This PR adds error resilience by introducing a `TokenIterWithContext` which updates the `nesting` level and tries to keep it in sync with what the lexer sees. This isn't 100% accurate: if the parser recovered from an unclosed parenthesis in the middle of a line, the context won't reduce the nesting level until it sees the newline token at the end of that line (a rough sketch of this idea appears after the test plan below).

resolves: #11915

## Test Plan

* Add test cases for a number of rules that are affected by this change.
* Run the fuzzer for an extended period and fix any other bugs it surfaces.
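
For a rough mental model of the `TokenIterWithContext` described above, here is a sketch of an iterator adapter that carries a nesting level alongside each token. It reuses the illustrative `TokenKind` enum from the earlier sketch, and the names and behavior are assumptions rather than Ruff's real implementation:

```rust
// Rough sketch: wrap a token iterator and expose the nesting level that was
// in effect when each token was produced (names are illustrative only).
struct TokenIterWithContext<I: Iterator<Item = TokenKind>> {
    tokens: I,
    nesting: u32,
}

impl<I: Iterator<Item = TokenKind>> TokenIterWithContext<I> {
    fn new(tokens: I) -> Self {
        Self { tokens, nesting: 0 }
    }
}

impl<I: Iterator<Item = TokenKind>> Iterator for TokenIterWithContext<I> {
    // Yield each token together with the nesting level before it.
    type Item = (TokenKind, u32);

    fn next(&mut self) -> Option<Self::Item> {
        let token = self.tokens.next()?;
        let level = self.nesting;
        match token {
            TokenKind::Lpar => self.nesting += 1,
            TokenKind::Rpar => self.nesting = self.nesting.saturating_sub(1),
            // If the parser recovered from an unclosed `(` mid-line, the level
            // stays elevated until the newline that ends the logical line.
            TokenKind::Newline => self.nesting = 0,
            TokenKind::Other => {}
        }
        Some((token, level))
    }
}
```

A rule can then iterate over `TokenIterWithContext::new(tokens)` and skip or adjust its checks based on the reported level, which is the kind of context the summary above refers to.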