docstring code formatter: remove "invalid Python" check #8857
Comments
Do we have a benchmark that tests the performance of docstring formatting? Adding one could help us understand the performance characteristics better and validate any improvements. Do we always perform the re-parse today, or do we use a heuristic to decide when the re-parse is necessary (for example, only when the source text contains triple-quoted strings)? A text search should be many times faster than a full re-parse.
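To make the heuristic concrete, here is a minimal sketch in Python rather than in the formatter's own Rust. It assumes that triple quotes are the only way a reformatted snippet can break out of its enclosing docstring. The names `needs_reparse`, `is_valid_python`, and `accept_formatted_snippet` are hypothetical and do not come from ruff, and `ast.parse` stands in for the formatter's re-parse.

```python
import ast

# Assumption: only triple quotes can terminate the enclosing docstring early.
TRIPLE_QUOTES = ('"""', "'''")


def needs_reparse(formatted_snippet: str) -> bool:
    """Cheap text search deciding whether the expensive check is needed."""
    return any(quote in formatted_snippet for quote in TRIPLE_QUOTES)


def is_valid_python(source: str) -> bool:
    """Stand-in for the full re-parse: does the source parse at all?"""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def accept_formatted_snippet(formatted_snippet: str, rebuilt_source: str) -> bool:
    """Re-parse the rebuilt source only when the snippet looks risky."""
    if not needs_reparse(formatted_snippet):
        return True  # fast path: no triple quotes, so no re-parse
    return is_valid_python(rebuilt_source)
```

Under that assumption, the fast path never runs the parser, so the cost of the check is paid only by snippets that contain triple-quoted strings.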
There aren't any benchmarks for code snippet formatting yet. Using a heuristic to avoid a re-parse seems plausible, but only if we buy that the cases where invalid Python can be produced are limited to triple quoting. (I think I buy that; otherwise I don't see how it could break out of the enclosing docstring.) For an ad hoc benchmark, if I run the formatter with and without
I assume you did that for a project that has docstrings? If you happen to have a test file with code snippets, then you could add it to this benchmark: ruff/crates/ruff_benchmark/benches/formatter.rs, lines 28 to 42 in e62e245
And change the options to always enable the docstring code option here
The benchmark will then run automatically as part of codspeed.
Yeah. I ran it on CPython and polars. I added a new issue to track adding benchmarks specifically targeted at code snippet reformatting: #8909
#8811 added a docstring code snippet formatter. In the initial implementation, it is possible for the reformatter to transform valid Python into invalid Python, usually as a result of corner cases related to triple quoting. Since these are odd cases, for expediency, the initial implementation checks whether the reformatted code is valid. If it isn't, then it bails out of reformatting and skips the code snippets entirely.
Ideally, we would have enough confidence in our code snippet reformatter that we could remove this check for invalid Python code. Doing this will likely require some refactoring of how nested triple quotes are handled.
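As a rough illustration of the check described above, the following Python sketch reformats a snippet, embeds the result in a triple-double-quoted docstring, and falls back to the original snippet if the result no longer parses. Everything here is hypothetical: `embed_in_docstring`, `format_docstring_code`, and the toy reformatter are not ruff's implementation, and `ast.parse` stands in for the formatter's own parser.

```python
import ast
from typing import Callable


def embed_in_docstring(snippet: str) -> str:
    """Wrap a snippet as doctest lines inside a function docstring.

    Simplification: every line gets a `>>>` prompt, which is only
    faithful for single-line snippets.
    """
    body = "\n".join(f"    >>> {line}" for line in snippet.splitlines())
    return f'def f():\n    """\n{body}\n    """\n'


def is_valid_python(source: str) -> bool:
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def format_docstring_code(original: str, reformat: Callable[[str], str]) -> str:
    """Return the reformatted snippet, or the original if embedding it breaks the file."""
    candidate = reformat(original)
    if is_valid_python(embed_in_docstring(candidate)):
        return candidate
    return original  # bail out: leave the snippet untouched


def toy_reformat(snippet: str) -> str:
    """Toy reformatter that normalizes ''' to \"\"\", the kind of change that can go wrong."""
    return snippet.replace("'''", '"""')
```

With this toy reformatter, `format_docstring_code("x = '''hi'''", toy_reformat)` returns the original snippet unchanged, because the rewritten inner `"""` would close the enclosing docstring early. Removing the check would mean the reformatter itself has to avoid producing that inner `"""`.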
Here's a good example from the tests that doesn't work today:
ruff/crates/ruff_python_formatter/resources/test/fixtures/ruff/docstring_code_examples.py, lines 307 to 316 in d6148b6
Namely, the code snippet there ought to be formatted, but today it is skipped because reformatting it generates invalid Python code.