Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic string formatting #5297

Merged
merged 3 commits into from
Jun 23, 2023
Merged

Basic string formatting #5297

merged 3 commits into from
Jun 23, 2023

Conversation

MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Jun 22, 2023

Summary

This PR implements formatting for non-f-string Strings that do not use implicit concatenation.

Docstring formatting is out of the scope of this PR.

Test Plan

I added a few tests for simple string literals.

Performance

Ouch. This is hitting performance somewhat hard. This is probably because we now iterate each string a couple of times:

  1. To detect if it is an implicit string continuation
  2. To detect if the string contains any new lines
  3. To detect the preferred quote
  4. To normalize the string

Edit: I integrated the detection of newlines into the preferred quote detection so that we only iterate the string three time.
We can probably do better by merging the implicit string continuation with the quote detection and new line detection by iterating till the end of the string part and returning the offset. We then use our simple tokenizer to skip over any comments or whitespace until we find the first non trivia token. From there we keep continue doing this in a loop until we reach the end o the string. I'll leave this improvement for later.

This was referenced Jun 22, 2023
@MichaReiser
Copy link
Member Author

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@github-actions
Copy link
Contributor

github-actions bot commented Jun 22, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.01      7.4±0.35ms     5.5 MB/sec    1.00      7.3±0.32ms     5.6 MB/sec
formatter/numpy/ctypeslib.py               1.00  1723.0±93.08µs     9.7 MB/sec    1.01  1741.2±93.18µs     9.6 MB/sec
formatter/numpy/globals.py                 1.00   207.5±13.20µs    14.2 MB/sec    1.02   210.7±13.52µs    14.0 MB/sec
formatter/pydantic/types.py                1.04      3.9±0.40ms     6.5 MB/sec    1.00      3.8±0.21ms     6.7 MB/sec
linter/all-rules/large/dataset.py          1.07     15.7±0.92ms     2.6 MB/sec    1.00     14.7±0.46ms     2.8 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      3.7±0.15ms     4.5 MB/sec    1.01      3.7±0.18ms     4.5 MB/sec
linter/all-rules/numpy/globals.py          1.00   472.5±24.15µs     6.2 MB/sec    1.04   491.9±28.96µs     6.0 MB/sec
linter/all-rules/pydantic/types.py         1.01      6.6±0.31ms     3.8 MB/sec    1.00      6.6±0.33ms     3.9 MB/sec
linter/default-rules/large/dataset.py      1.00      7.2±0.25ms     5.7 MB/sec    1.07      7.7±0.45ms     5.3 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1580.9±77.03µs    10.5 MB/sec    1.04  1650.6±122.86µs    10.1 MB/sec
linter/default-rules/numpy/globals.py      1.00   189.4±10.17µs    15.6 MB/sec    1.01   191.5±12.14µs    15.4 MB/sec
linter/default-rules/pydantic/types.py     1.01      3.4±0.18ms     7.5 MB/sec    1.00      3.4±0.14ms     7.5 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00      8.0±0.09ms     5.1 MB/sec    1.00      8.0±0.07ms     5.1 MB/sec
formatter/numpy/ctypeslib.py               1.00  1773.0±15.83µs     9.4 MB/sec    1.00  1777.6±17.46µs     9.4 MB/sec
formatter/numpy/globals.py                 1.00    203.9±5.46µs    14.5 MB/sec    1.00    203.9±4.40µs    14.5 MB/sec
formatter/pydantic/types.py                1.00      4.1±0.04ms     6.3 MB/sec    1.01      4.1±0.06ms     6.2 MB/sec
linter/all-rules/large/dataset.py          1.00     15.2±0.12ms     2.7 MB/sec    1.00     15.3±0.08ms     2.7 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      4.1±0.10ms     4.1 MB/sec    1.00      4.1±0.02ms     4.1 MB/sec
linter/all-rules/numpy/globals.py          1.00    427.5±6.56µs     6.9 MB/sec    1.00    428.0±8.00µs     6.9 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.9±0.04ms     3.7 MB/sec    1.01      6.9±0.06ms     3.7 MB/sec
linter/default-rules/large/dataset.py      1.00      8.0±0.09ms     5.1 MB/sec    1.00      8.1±0.05ms     5.0 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1671.0±15.33µs    10.0 MB/sec    1.00  1677.3±22.38µs     9.9 MB/sec
linter/default-rules/numpy/globals.py      1.00    181.6±3.37µs    16.2 MB/sec    1.00    181.5±1.88µs    16.3 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.6±0.02ms     7.0 MB/sec    1.01      3.7±0.02ms     7.0 MB/sec

@MichaReiser MichaReiser marked this pull request as ready for review June 22, 2023 14:50
@MichaReiser MichaReiser force-pushed the format-string branch 2 times, most recently from 0bea2d4 to a78246b Compare June 22, 2023 15:16
crates/ruff_python_formatter/src/expression/string.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/expression/string.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/expression/string.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/expression/string.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/expression/string.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/expression/string.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/expression/string.rs Outdated Show resolved Hide resolved
@@ -342,29 +377,8 @@ if True:
let printed = format_module(&content)?;
let formatted_code = printed.as_code();

let reformatted =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is that gone?

Copy link
Member Author

@MichaReiser MichaReiser Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that ensure_stability_when_formatting_twice now already asserts that reformatting the file twice yields the same result. Thanks to your refactor :)

@MichaReiser MichaReiser added internal An internal refactor or improvement formatter Related to the formatter labels Jun 22, 2023
@MichaReiser MichaReiser force-pushed the 06-21-Format_Compare_Op branch from 895d416 to 54b2b11 Compare June 23, 2023 07:18
Base automatically changed from 06-21-Format_Compare_Op to main June 23, 2023 07:35
@MichaReiser
Copy link
Member Author

Graphite rebased this pull request after merging its parent, because this pull request is set to merge when ready.

@MichaReiser MichaReiser merged commit c52aa8f into main Jun 23, 2023
@MichaReiser MichaReiser deleted the format-string branch June 23, 2023 07:46
@MichaReiser
Copy link
Member Author

@MichaReiser merged this pull request with Graphite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatter Related to the formatter internal An internal refactor or improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants