Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

format ExprJoinedStr #5932

Merged
merged 6 commits into from
Aug 1, 2023
Merged

Conversation

davidszotten
Copy link
Contributor

@davidszotten davidszotten commented Jul 20, 2023

reuses most of the (constant) string formatting, adding ability to skip changing quote style, if that would require escaping quotes inside formatted values which isn't allowed (in py < 3.12)

fixes #5913

@github-actions
Copy link
Contributor

github-actions bot commented Jul 20, 2023

PR Check Results

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00     10.1±0.26ms     4.0 MB/sec    1.01     10.2±0.35ms     4.0 MB/sec
formatter/numpy/ctypeslib.py               1.00  1986.8±66.24µs     8.4 MB/sec    1.02      2.0±0.12ms     8.2 MB/sec
formatter/numpy/globals.py                 1.00   225.8±19.42µs    13.1 MB/sec    1.00    225.3±7.48µs    13.1 MB/sec
formatter/pydantic/types.py                1.00      4.3±0.11ms     6.0 MB/sec    1.01      4.3±0.17ms     5.9 MB/sec
linter/all-rules/large/dataset.py          1.00     13.9±0.22ms     2.9 MB/sec    1.02     14.1±0.29ms     2.9 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      3.5±0.07ms     4.8 MB/sec    1.02      3.5±0.08ms     4.7 MB/sec
linter/all-rules/numpy/globals.py          1.00   487.3±17.97µs     6.1 MB/sec    1.02   499.2±26.02µs     5.9 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.3±0.18ms     4.0 MB/sec    1.02      6.4±0.27ms     4.0 MB/sec
linter/default-rules/large/dataset.py      1.00      7.4±0.17ms     5.5 MB/sec    1.03      7.6±0.15ms     5.3 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1545.6±56.34µs    10.8 MB/sec    1.02  1581.0±62.14µs    10.5 MB/sec
linter/default-rules/numpy/globals.py      1.00    180.0±6.65µs    16.4 MB/sec    1.04   186.7±11.27µs    15.8 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.3±0.10ms     7.8 MB/sec    1.03      3.4±0.10ms     7.6 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00     10.1±0.09ms     4.0 MB/sec    1.11     11.2±0.11ms     3.6 MB/sec
formatter/numpy/ctypeslib.py               1.00  1939.5±18.75µs     8.6 MB/sec    1.10      2.1±0.04ms     7.8 MB/sec
formatter/numpy/globals.py                 1.00    211.9±4.02µs    13.9 MB/sec    1.07    227.3±5.91µs    13.0 MB/sec
formatter/pydantic/types.py                1.00      4.3±0.05ms     6.0 MB/sec    1.11      4.7±0.07ms     5.4 MB/sec
linter/all-rules/large/dataset.py          1.00     13.6±0.11ms     3.0 MB/sec    1.00     13.6±0.12ms     3.0 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.01      3.6±0.05ms     4.6 MB/sec    1.00      3.6±0.03ms     4.7 MB/sec
linter/all-rules/numpy/globals.py          1.00    436.6±5.10µs     6.8 MB/sec    1.00    438.5±7.65µs     6.7 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.1±0.09ms     4.2 MB/sec    1.00      6.1±0.08ms     4.2 MB/sec
linter/default-rules/large/dataset.py      1.00      7.5±0.05ms     5.4 MB/sec    1.00      7.5±0.10ms     5.4 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1524.8±14.55µs    10.9 MB/sec    1.00  1531.8±26.14µs    10.9 MB/sec
linter/default-rules/numpy/globals.py      1.00    171.2±2.34µs    17.2 MB/sec    1.00    171.1±2.54µs    17.2 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.3±0.03ms     7.8 MB/sec    1.00      3.3±0.03ms     7.8 MB/sec

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow. Thanks for being so bold in tackling an as complicated formatting as f-strings.

@davidszotten
Copy link
Contributor Author

oh. not sure how i managed to miss this, but black doesn't (yet) format expressions inside f-strings

@davidszotten davidszotten force-pushed the format-expr-joined-str branch from 83da9a0 to ef0aaec Compare July 30, 2023 12:22
@davidszotten
Copy link
Contributor Author

so this approach works.. not sure what we think of it though

note that FormattedValue isn't used at all with this impl and none of my related refactorings are needed (though i think we still want them). That will all be needed when we decide to start formatting things inside formatted values

@davidszotten davidszotten marked this pull request as ready for review July 30, 2023 14:29
@davidszotten davidszotten force-pushed the format-expr-joined-str branch from a1f19c0 to 40a06c4 Compare July 31, 2023 09:08
@davidszotten davidszotten requested a review from MichaReiser July 31, 2023 09:40
if joined_str.values.iter().any(|value| match value {
Expr::FormattedValue(ast::ExprFormattedValue { range, .. }) => {
let string_content = locator.slice(*range);
string_content.contains('"') || string_content.contains('\'')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: To avoid iterating over the string twice

Suggested change
string_content.contains('"') || string_content.contains('\'')
string_content.contains(['"', '\''])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah thanks, was hoping something like this existed but couldn't find it

Comment on lines +22 to +26
enum Quoting {
CanChange,
Preserve,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
enum Quoting {
CanChange,
Preserve,
}
#[derive(Copy, Clone)]
enum Quoting {
CanChange,
Preserve,
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

open to better names (of the enum) btw, there are lots of related things around quoting behaviour

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like Preserve and tried to come up with a better name than CanChange but failed. I thought about Normalize to align with normalize_string but this isn't disabling normalization, it only affects the quotes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i meant more to replace Quoting (since we already have StringQuotes and QuoteStyle)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine. The use is very local and it conveys way more information than a plain boolean.

@davidszotten davidszotten force-pushed the format-expr-joined-str branch from 40a06c4 to 32d1060 Compare July 31, 2023 13:52
@davidszotten
Copy link
Contributor Author

hey @konstin did your recent ecosystem test pr also change (fix) it failing on errors? it looks like the most recently merged pr after that #6152 fails the checks (i was trying to figure out why i'm failing after rebasing onto main)

@konstin
Copy link
Member

konstin commented Jul 31, 2023

yes, sorry, fixed in #6202. Part of the problem is that i forgot to make the formatter ecosystem checks required for merging (but i don't think github properly shows in the PR UI that we could still merge your PR even with that check failing).

@davidszotten davidszotten force-pushed the format-expr-joined-str branch from 32d1060 to 6f79f32 Compare July 31, 2023 18:40
@MichaReiser
Copy link
Member

MichaReiser commented Aug 1, 2023

This is awesome. Are you interested in looking into "proper" f-string formatting? It isn't as high a priority to us as increasing coverage on other nodes (see August goals), but it would be a real differentiator to black

@MichaReiser MichaReiser merged commit 07468f8 into astral-sh:main Aug 1, 2023
@davidszotten
Copy link
Contributor Author

looking into "proper" f-string formatting

sure. do we need some mechanism to opt out of this behaviour for all the tests that are comparing to black?

@MichaReiser
Copy link
Member

looking into "proper" f-string formatting

sure. do we need some mechanism to opt out of this behaviour for all the tests that are comparing to black?

Are you expecting many differences? If not, then I don't think it's worth bothering. We can otherwise make it an option on PyFormatOptions which we already support loading for tests.

@davidszotten
Copy link
Contributor Author

i thought i'd seen a bunch but maybe i'm misremembering

i started looking (maybe we can make an issue to have these discussions there instead) and came across something to run by you:

currently f"{{" is parsed as

ExprJoinedStr {values: [Constant(ExprConstant {
    value: Str(
        "{",

ie the escaped curly is rendered into the constant. is that what we want? if so i guess we need special handling when rendering to reproduce the escaping. or do we want to parse as something closer to the source and store the string "{{" instead?

do we do anything corresponding for regular string escapes like '\"'?

@MichaReiser
Copy link
Member

A discussion would be good because I tend to not look at merged PRs and the discussion also goes beyond of what this PR addressed.

Generally, it is very common for ASTs to undo string escapes, because the goal is to represent the semantic of the program and not the representation of the program in the source (this is different to a CST that preserves the code as is).

You can see that Ruff's parser removes the escape sequences of quotes too: Playground. I assume this happens inside of parse_string because the lexer isn't performing this operation (But I think it would be more appropriate for the lexer to perform this, but that requires Python's 3.12 f string parsing)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Formatter: F-Strings (FormattedValue and JoinedStr)
3 participants