Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pycodestyle] Add too many blank lines (E303) #4635

Closed

Conversation

hoel-bagard
Copy link
Contributor

Summary

This PR is part of #2402, it adds the E303 error rule with its fix.

Test Plan

It was tested on the added fixture: crates/ruff/resources/test/fixtures/pycodestyle/E303.py.

let range = locator.full_lines_range(TextRange::new(line.start(), last_blank_line));
let mut diagnostic = Diagnostic::new(TooManyBlankLines(nb_blank_lines), range);
diagnostic.set_fix(Fix::suggested(Edit::range_replacement(
"\n\n".to_string(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use stylist (should be on checker) to use the user's preferred newline character.

Comment on lines +61 to +64
&& !locator
.line(TextSize::new(line.start().to_u32() - 1))
.trim()
.is_empty()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Determining the line is fairly expensive using. You may also be unlucky if the previous line ends with \r\n because you then index right between the \r and \n and there's no good way for the locator to know that it is now between a \r\n.

I'm not quite sure what the best way is to implement this rule. Ideally, it could keep a state between each line, counting the empty lines to this point (CC: @charliermarsh).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe pycodestyle tracks the number of empty lines while building up logical lines -- we should probably do the same? That's now straightforward because we emit NonLogicalNewline whenever we see an empty line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do, but the physical lines rule operates on the String and I think the logical lines check intentionally skips empty lines... Where would you place this rule? Would this be something new? A rule that directly works on the tokens.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd place it at the next logical line -- you'd check the number of preceding empty lines, I think?

Copy link
Contributor Author

@hoel-bagard hoel-bagard May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the pycodestyle rule again, my implementation is wrong anyway. The following code should generate an error:

class MyClass(object):
    def func1():
        pass


    def func2():
        pass

-> E303 too many blank lines (2)

I'll try again using logical lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichaReiser @charliermarsh I've had another go using logical lines. It's a very, very rough draft (that might have broken a few other rules), but could you tell me if you think it's worth continuing on that path ?

The modifications I made to the logical_lines.rs file can be seen here, and the checking function is here. I've also had to remove the explicit skip of empty lines here.

I still haven't solved the issue with \r\n, but maybe I could count the number of characters to remove along with the number of blank lines ?

Comment on lines +70 to +71
previous_line_end = locator.full_line_end(previous_line_end);
let previous_line = locator.line(previous_line_end);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line will have to repeat the search for the start and end of the line, which is fairly expensive.

Is there a way that we would only need to find the end (or start) and can then construct the TextRange ourselves (because we know what e.g. the end of the current line is?).

@github-actions
Copy link
Contributor

PR Check Results

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
linter/all-rules/large/dataset.py          1.00     14.9±0.03ms     2.7 MB/sec    1.01     15.0±0.03ms     2.7 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      3.6±0.01ms     4.6 MB/sec    1.00      3.7±0.00ms     4.5 MB/sec
linter/all-rules/numpy/globals.py          1.00    375.2±2.92µs     7.9 MB/sec    1.00    376.1±0.93µs     7.8 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.3±0.01ms     4.1 MB/sec    1.01      6.3±0.01ms     4.0 MB/sec
linter/default-rules/large/dataset.py      1.01      7.5±0.02ms     5.4 MB/sec    1.00      7.5±0.01ms     5.4 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00   1597.4±3.56µs    10.4 MB/sec    1.00   1591.0±1.98µs    10.5 MB/sec
linter/default-rules/numpy/globals.py      1.00    173.8±0.23µs    17.0 MB/sec    1.02    176.8±6.95µs    16.7 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.4±0.01ms     7.5 MB/sec    1.00      3.4±0.01ms     7.5 MB/sec
parser/large/dataset.py                    1.00      5.7±0.01ms     7.2 MB/sec    1.02      5.8±0.01ms     7.0 MB/sec
parser/numpy/ctypeslib.py                  1.00   1126.2±0.69µs    14.8 MB/sec    1.01   1134.6±0.61µs    14.7 MB/sec
parser/numpy/globals.py                    1.00    115.3±0.31µs    25.6 MB/sec    1.02    117.4±0.38µs    25.1 MB/sec
parser/pydantic/types.py                   1.00      2.5±0.01ms    10.4 MB/sec    1.01      2.5±0.01ms    10.3 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
linter/all-rules/large/dataset.py          1.00     22.3±0.71ms  1867.3 KB/sec    1.02     22.7±0.83ms  1832.9 KB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      5.6±0.22ms     3.0 MB/sec    1.04      5.8±0.23ms     2.9 MB/sec
linter/all-rules/numpy/globals.py          1.00   668.5±31.94µs     4.4 MB/sec    1.04   693.1±40.81µs     4.3 MB/sec
linter/all-rules/pydantic/types.py         1.00      9.6±0.50ms     2.6 MB/sec    1.01      9.7±0.34ms     2.6 MB/sec
linter/default-rules/large/dataset.py      1.01     11.3±0.46ms     3.6 MB/sec    1.00     11.2±0.33ms     3.6 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00      2.4±0.09ms     7.0 MB/sec    1.00      2.4±0.11ms     7.0 MB/sec
linter/default-rules/numpy/globals.py      1.02   279.5±14.61µs    10.6 MB/sec    1.00   274.3±14.48µs    10.8 MB/sec
linter/default-rules/pydantic/types.py     1.01      5.1±0.25ms     5.0 MB/sec    1.00      5.0±0.20ms     5.1 MB/sec
parser/large/dataset.py                    1.00      9.0±0.35ms     4.5 MB/sec    1.17     10.5±0.38ms     3.9 MB/sec
parser/numpy/ctypeslib.py                  1.00  1717.3±72.78µs     9.7 MB/sec    1.14  1952.4±63.56µs     8.5 MB/sec
parser/numpy/globals.py                    1.00   174.5±11.14µs    16.9 MB/sec    1.12    194.6±8.65µs    15.2 MB/sec
parser/pydantic/types.py                   1.00      3.8±0.17ms     6.7 MB/sec    1.15      4.4±0.13ms     5.9 MB/sec

Comment on lines +16 to +25
/// ## Example
/// ```python
/// def func1():
/// pass
///
///
///
/// def func2():
/// pass
/// ```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this issue is fixed by black, the rule name should be added to the KNOWN_FORMATTING_VIOLATIONS list to stop CI flagging that it should be fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a list somewhere with black's rules names ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, you would add too_many_blank_lines to the list as KNOWN_FORMATTING_VIOLATIONS relates to Ruff's rules rather than black.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks!

@hoel-bagard
Copy link
Contributor Author

I'm closing this PR since I'm redoing it using logical lines instead of physical lines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants