-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix f-string formatting in assignment statement #14454
Conversation
4f24b97
to
9f968a5
Compare
|
This is looking good! |
This comment was marked as resolved.
This comment was marked as resolved.
b7aa489
to
c2d8b38
Compare
19e6a99
to
4c71b5d
Compare
c2d8b38
to
55574a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This overall looks good, but I think we're now using the BestFit
layout in too many places, which leads to bad formatting in clause headers (see my inline comment).
I think we should only use the BestFit
layout if the FString
uses the flat layout. In all other cases, we should use the Multiline
layout.
We should add some more tests that cover BestFit
layout usages in non assignment positions to verify that we got the needs_parentheses
changes correct.
See
Lines 296 to 321 in 73ee72b
############################################################################## | |
# Regressions | |
############################################################################## | |
LEEEEEEEEEEEEEEEEEEEEEEFT = RRRRRRRRIIIIIIIIIIIIGGGGGHHHT | { | |
"entityNameeeeeeeeeeeeeeeeee", # comment must be long enough to | |
"some long implicit concatenated string" "that should join" | |
} | |
# Ensure that flipping between Multiline and BestFit layout results in stable formatting | |
# when using IfBreaksParenthesized layout. | |
assert False, "Implicit concatenated string" "uses {} layout on {} format".format( | |
"Multiline", "first" | |
) | |
assert False, await "Implicit concatenated string" "uses {} layout on {} format".format( | |
"Multiline", "first" | |
) | |
assert False, "Implicit concatenated stringuses {} layout on {} format"[ | |
aaaaaaaaa, bbbbbb | |
] | |
assert False, +"Implicit concatenated string" "uses {} layout on {} format".format( | |
"Multiline", "first" | |
) |
and
Lines 128 to 224 in 73ee72b
# Fits | |
with "aa" "bbb" "cccccccccccccccccccccccccccccccccccccccccccccc": | |
pass | |
# Parenthesize single-line | |
with "aa" "bbb" "ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc": | |
pass | |
# Multiline | |
with "aa" "bbb" "cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc": | |
pass | |
with f"aaaaaaa{expression}bbbb" f"ccc {20999}" "more": | |
pass | |
############################################################################## | |
# For loops | |
############################################################################## | |
# Flat | |
for a in "aaaaaaaaa" "bbbbbbbbb" "ccccccccc" "dddddddddd": | |
pass | |
# Parenthesize single-line | |
for a in "aaaaaaaaa" "bbbbbbbbb" "ccccccccc" "dddddddddd" "eeeeeeeeeeeeeee" "fffffffffffff" "ggggggggggggggg" "hh": | |
pass | |
# Multiline | |
for a in "aaaaaaaaa" "bbbbbbbbb" "ccccccccc" "dddddddddd" "eeeeeeeeeeeeeee" "fffffffffffff" "ggggggggggggggg" "hhhh": | |
pass | |
############################################################################## | |
# Assert statement | |
############################################################################## | |
# Fits | |
assert "aaaaaaaaa" "bbbbbbbbbbbb", "cccccccccccccccc" "dddddddddddddddd" | |
# Wrap right | |
assert "aaaaaaaaa" "bbbbbbbbbbbb", "cccccccccccccccc" "dddddddddddddddd" "eeeeeeeeeeeee" "fffffffffff" | |
# Right multiline | |
assert "aaaaaaaaa" "bbbbbbbbbbbb", "cccccccccccccccc" "dddddddddddddddd" "eeeeeeeeeeeee" "fffffffffffffff" "ggggggggggggg" "hhhhhhhhhhh" | |
# Wrap left | |
assert "aaaaaaaaa" "bbbbbbbbbbbb" "cccccccccccccccc" "dddddddddddddddd" "eeeeeeeeeeeee" "fffffffffffffff", "ggggggggggggg" "hhhhhhhhhhh" | |
# Left multiline | |
assert "aaaaaaaaa" "bbbbbbbbbbbb" "cccccccccccccccc" "dddddddddddddddd" "eeeeeeeeeeeee" "fffffffffffffff" "ggggggggggggg", "hhhhhhhhhhh" | |
# wrap both | |
assert "aaaaaaaaa" "bbbbbbbbbbbb" "cccccccccccccccc" "dddddddddddddddd" "eeeeeeeeeeeee" "fffffffffffffff", "ggggggggggggg" "hhhhhhhhhhh" "iiiiiiiiiiiiiiiiii" "jjjjjjjjjjjjj" "kkkkkkkkkkkkkkkkk" "llllllllllll" | |
# both multiline | |
assert "aaaaaaaaa" "bbbbbbbbbbbb" "cccccccccccccccc" "dddddddddddddddd" "eeeeeeeeeeeee" "fffffffffffffff" "ggggggggggggg", "hhhhhhhhhhh" "iiiiiiiiiiiiiiiiii" "jjjjjjjjjjjjj" "kkkkkkkkkkkkkkkkk" "llllllllllll" "mmmmmmmmmmmmmm" | |
############################################################################## | |
# In clause headers (can_omit_optional_parentheses) | |
############################################################################## | |
# Use can_omit_optional_parentheses layout to avoid an instability where the formatter | |
# picks the can_omit_optional_parentheses layout when the strings are joined. | |
if ( | |
f"implicit" | |
"concatenated" | |
"string" + f"implicit" | |
"concaddddddddddded" | |
"ring" | |
* len([aaaaaa, bbbbbbbbbbbbbbbb, cccccccccccccccccc, ddddddddddddddddddddddddddd]) | |
): | |
pass | |
# Keep parenthesizing multiline - implicit concatenated strings | |
if ( | |
f"implicit" | |
"""concatenate | |
d""" | |
"string" + f"implicit" | |
"concaddddddddddded" | |
"ring" | |
* len([aaaaaa, bbbbbbbbbbbbbbbb, cccccccccccccccccc, ddddddddddddddddddddddddddd]) | |
): | |
pass | |
if ( | |
[ | |
aaaaaa, | |
bbbbbbbbbbbbbbbb, | |
cccccccccccccccccc, | |
ddddddddddddddddddddddddddd, | |
] | |
+ "implicitconcat" | |
"enatedstriiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiing" | |
): | |
pass |
// This isn't decided yet, refer to the relevant discussion: | ||
// https://github.com/astral-sh/ruff/discussions/9785 | ||
else if StringLike::FString(self).is_multiline(context.source()) { | ||
} else if StringLike::FString(self).is_multiline(context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using BestFit
for all f-strings that contain multiline expressions isn't correct. For example. It results in
if f"aaaaaaaaaaa { ttttteeeeeeeeest} more {
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
}": pass
being formatted as
if f"aaaaaaaaaaa {ttttteeeeeeeeest} more {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}":
pass
Which seems worse than before.
I think we should keep using Multiline
if the f-string has any multiline expression to avoid collapsing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reason it was formatted like that previously is because the needs_parentheses
would return OptionalParentheses::Never
. If we would return Multiline
then I think parentheses will be added making it:
if (
f"aaaaaaaaaaa {ttttteeeeeeeeest} more {
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
}"
):
pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also decide to return Never
. It's not entirely clear to me what the better and more consistent layout is. I would have to play with a few layouts but I don't think it should be what it is now.
Using Never has the advantage that it avoids unnecessary parentheses and is closer to what we had today (and no one complained?). Adding parentheses is similar to having "aaaaa" + tttttt + "more(aaaaaaaaaaaaaaaaaaaaaaaaaa)
. It might be good to play with the formatting in other positions where we use BestFit
to decide what the best layout is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a good idea, I can do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a test for this
if string.is_implicit_concatenated() || !string.is_multiline(context) { | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your change is an improvement to what we had before.
For example, this was correctly formatted code before
call(f"{
testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
}")
But I don't think it's correct to use the "hug" layout if an inner expression is multiline
call(f"{
aaaaaa
+ '''test
more'''
}")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow here. Are you saying that both the code snippet should result in the formatting where the f-string is on the same line as the call expression (call(f"{
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not directly related to your PR, but I noticed it when reviewing it because you changed is_multiline
. We have to make a decision on the idiomatic formatting for "hugging multiline strings" for f-strings.
I'm leaning towards that we should only "hug" if the f-string itself contains any multiline string literal, but not if any inner-expression is multiline. Similar to prettier:
call(
`${
aaaaaaaaaaaaaaaaaaaaaaaaaaaa +
`test more
aaa` +
morrrrrrrrrrrrrrrrrrrrrr
}`,
);
So what I'm saying is that we should probably not hug for the both cases above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that we add an explicit is_triple_quoted
check here. I don't think we ever want this to apply to any non-triple quoted strings. And let's add some test cases that cover f-strings too (cases where we don't want the layout to apply as well as cases where it should apply)
crates/ruff_python_formatter/resources/test/fixtures/ruff/expression/fstring.py
Show resolved
Hide resolved
03896f0
to
4ce0c42
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. I still think we should add more tests for f-strings in clause headers and assert statements. For example:
if (f"teaaaaaaaaaaaaaaaaaaaaaaaa{
expressioneeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee # comment
}aast".contains('a')
):
pass
We now remove parentheses from:
aaaaaaaaaaaaaaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{[a, b,
# comment
]}moee" # comment
)
but keep them for
aaaaaaaaaaaaaaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{[a, b,
]}moee"
# comment
)
The second case is consistent with how we handle parentheses for all other non-fstring assignments.
But I think what you have now is consistent with how e.g. lists are formatted where the parentheses are removed as well
aaaaaaaaaaaaaaaaaa = (
[testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee,
# comment
]
)
if is_f_string_formatting_enabled(context) { | ||
// Expressions containing comments can't be joined. | ||
// | ||
// Format specifiers needs to be checked as well. For example, the | ||
// following should be considered multiline because the literal | ||
// part of the format specifier contains a newline at the end | ||
// (`.3f\n`): | ||
// | ||
// ```py | ||
// x = f"hello {a + b + c + d:.3f | ||
// } world" | ||
// ``` | ||
context.comments().contains_comments(expression.into()) | ||
|| expression.format_spec.as_deref().is_some_and(|spec| { | ||
contains_line_break_or_comments(&spec.elements, context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to consider debug expressions too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. If debug expressions are present, then there are no breakpoints in that f-string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But they could be multiline, no? The logic here isn't specific to the assignment formatting but is generally used to determine if a string is known to be multiline and it's unclear to me how a newline in a f-string literal is different from a newline in the debug expression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so something like the following which cannot be flattened:
aaaaaaaa = f"aaaaaaaaaa {
aaaaaaa + bbbbbbb + cccccccc = } dddddddddd"
So, I think we should check if there's a debug expression and if so then check for line breaks in the expression itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I think we also have to test in the debug expression because the above has no line break in the expression, only in the debug expression but the whole f-string should be considered multiline (whether it can be flattened is irrelevant here because this is a general purpose method to determine if a string contains a hard line break)
if string.is_implicit_concatenated() || !string.is_multiline(context) { | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that we add an explicit is_triple_quoted
check here. I don't think we ever want this to apply to any non-triple quoted strings. And let's add some test cases that cover f-strings too (cases where we don't want the layout to apply as well as cases where it should apply)
1e63fdc
to
0587004
Compare
I've added a bunch of test cases for f-strings in various positions. The formatting is different between the one in assignment vs e.g., a So, the following: aaaaaa = f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee"
aaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
)
for a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
for a in (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
):
pass will get formatted to: aaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee"
)
aaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
)
for a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression
}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
for a in (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
):
pass I wonder if this would need to be done instead at the f-string expression level. |
The for case seems fine to me. It's mostly consistent with the formatting in if-statements: for a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
for a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
if f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
if (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee"
):
pass for a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression
}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
for a in (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
):
pass
if f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression
}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
if f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass This seems fine to me and we can iterate on the exact formatting in future style guides based on actual user feedback |
The ecosystem change looks correct to me and is similar to what we do for regular strings: --- /Users/dhruv/playground/ruff/formatter/preview.py
+++ /Users/dhruv/playground/ruff/formatter/preview.py
@@ -1,6 +1,10 @@
if 1:
if 1:
- print("Expected num_entities: {len(self.batchs)*num_entities}. \
- Acutal num_entites: {self.get_thread_local_collection().num_entities}")
- print(f"Expected num_entities: {len(self.batchs)*num_entities}. \
- Acutal num_entites: {self.get_thread_local_collection().num_entities}")
+ print(
+ "Expected num_entities: {len(self.batchs)*num_entities}. \
+ Acutal num_entites: {self.get_thread_local_collection().num_entities}"
+ )
+ print(
+ f"Expected num_entities: {len(self.batchs) * num_entities}. \
+ Acutal num_entites: {self.get_thread_local_collection().num_entities}"
+ ) Going to merge this now. |
## Summary This PR fixes a bug in the f-string formatting to not consider the escaped newlines for `is_multiline`. This is done by checking if the f-string is triple-quoted or not similar to normal string literals. This is not required to be gated behind preview because the logic change for `is_multiline` was added in #14454. ## Test Plan Add a test case which formats differently on `main`: https://play.ruff.rs/ea3c55c2-f0fe-474e-b6b8-e3365e0ede5e
Summary
fixes: #13813
This PR fixes a bug in the formatting assignment statement when the value is an f-string.
This is resolved by using custom best fit layouts if the f-string is (a) not already a flat f-string (thus, cannot be multiline) and (b) is not a multiline string (thus, cannot be flattened). So, it is used in cases like the following:
Which is (a)
FStringLayout::Multiline
and (b) not a multiline.There are various other examples in the PR diff along with additional explanation and context as code comments.
Test Plan
Add multiple test cases for various scenarios.