Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update line break heuristic when f-string has format specifier #10040

Closed
dhruvmanila opened this issue Feb 19, 2024 · 4 comments · Fixed by #11123
Closed

Update line break heuristic when f-string has format specifier #10040

dhruvmanila opened this issue Feb 19, 2024 · 4 comments · Fixed by #11123
Assignees
Labels
bug Something isn't working formatter Related to the formatter preview Related to preview mode features

Comments

@dhruvmanila
Copy link
Member

dhruvmanila commented Feb 19, 2024

Given:

aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { 
    aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd:.3f} cccccccccc"

Ruff (with --preview) would format is as:

aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
    aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd:.3f
} cccccccccc"

Here, we're changing the format specifier from .3f to .3f\n.

Now, if the user had the following code:

aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
    aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd:.3f
} cccccccccc"

Then, with the new heuristic to not have line breaks would mean that we'd format the above code as:

aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd:.3f
} cccccccccc"

The newline is already present in the format specifier .3f\n in the original source code.

Local or Global?

Another question is whether this heuristic should be applied globally or local to an individual expression element. For example, in the following code snippet,

f"""fooooooooooooooooooo barrrrrrrrrrrrrrrrrrr {
        xxxxxxxxxxxxxxx:.3f} aaaaaaaaaaaaaaaaa { xxxxxxx } bbbbbbbbbbbb { 
xxxxxxxxxxx + xxxxxxxxxxx } end"""

We've an expression element which has a line break and doesn't contain a format specifier (third field). This means that the f-string layout is multiline. So, the first element cannot have line breaks while the second and third can have. If so, then we'd potentially format it as:

f"""fooooooooooooooooooo barrrrrrrrrrrrrrrrrrr {xxxxxxxxxxxxxxx:.3f} aaaaaaaaaaaaaaaaa {xxxxxxx} bbbbbbbbbbbb {
	xxxxxxxxxxx + xxxxxxxxxxx
} end"""

If we apply the heuristic globally, then the formatter would collapse all of the expression elements.

Playground: https://play.ruff.rs/c2d0cc39-30ab-4c17-ac8c-0f4d8a9afbb6

@dhruvmanila dhruvmanila added bug Something isn't working formatter Related to the formatter preview Related to preview mode features labels Feb 19, 2024
@MichaReiser
Copy link
Member

Thanks for opening this as an issue. Is this something you plan to work on?

@dhruvmanila
Copy link
Member Author

Yeah, although not right away. It should be a simple fix.

@dhruvmanila dhruvmanila self-assigned this Apr 24, 2024
@dhruvmanila
Copy link
Member Author

I think with #7787, this issue isn't relevant anymore because both unformatted and formatted code produces the same AST as the lexer would stop at the newline token.

@dhruvmanila
Copy link
Member Author

Right, but only for single-quoted string. It's still relevant for triple-quoted string.

dhruvmanila added a commit that referenced this issue Apr 26, 2024
## Summary

This PR fixes the bug where the formatter would format an f-string and
could potentially change the AST.

For a triple-quoted f-string, the element can't be formatted into
multiline if it has a format specifier because otherwise the newline
would be treated as part of the format specifier.

Given the following f-string:
```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
    variable:.3f} ddddddddddddddd eeeeeeee"""
```

The formatter sees that the f-string is already multiline so it assumes
that it can contain line breaks i.e., broken into multiple lines. But,
in this specific case we can't format it as:

```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
    variable:.3f
} ddddddddddddddd eeeeeeee"""
```
                     
Because the format specifier string would become ".3f\n", which is not
the original string (`.3f`).

If the original source code already contained a newline, they'll be
preserved. For example:
```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
    variable:.3f
} ddddddddddddddd eeeeeeee"""
```

The above will be formatted as:
```py
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {variable:.3f
} ddddddddddddddd eeeeeeee"""
```

Note that the newline after `.3f` is part of the format specifier which
needs to be preserved.
The Python version is irrelevant in this case.

fixes: #10040 

## Test Plan

Add some test cases to verify this behavior.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working formatter Related to the formatter preview Related to preview mode features
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants