-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parenthesize conditional expressions #2278
Parenthesize conditional expressions #2278
Conversation
I recognize that this is a moderate styling change and I know the Black maintainers are conservative when it comes to those, but anecdotally this issue has been a source of style nitpicking at my organization and (obviously 😄) we use Black to try to avoid that. Let me know if this is worth pursuing further |
Hi, We haven't been able to review your changeset just yet. Sorry about that, sometimes our time as maintainers ends up somewhere else leaving other important behind. At a first glance, I like the new formatting, my only major concern would be the level of disruptive this change may change looking at the black-primer results. Don't worry much about the primer or uvloop workflows failing, they fail when there is changes in projects we watch. All you will need to do to fix them is to flip the changes expected flag for relevant projects in Hopefully we will be able to review your work soon, thanks for submitting it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the submission! I'd be more inclined to accept this if the tests contained counterexamples and other cases as well. Like imploding a short enough line or exploding from a single line (not already multiline like in the test). How about test expressions elsewhere, like positional arguments, or in entirely different contexts? Anyways, I think this is a solid addition to the style!
Also, I'm not entirely familiar with the visiting code, should the new visit_test
be called somewhere? This is really TODO for me to dig around more than anything.
I'm not the one to give a go-ahead with the changes though, especially if this is considered disruptive. So take it with a grain of salt, and maybe wait until other maintainers express their opinions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, the formatting changes are a strong +1 personally. I found this example in the primer output and it's so much more readable now:
--- pyanalyze/typeshed.py 2021-05-30 03:06:13.755715 +0000
+++ pyanalyze/typeshed.py 2021-05-30 03:06:32.534930 +0000
@@ -448,13 +448,13 @@
arg = arg.replace(kind=SigParameter.POSITIONAL_OR_KEYWORD)
cleaned_arguments.append(arg)
return Signature.make(
cleaned_arguments,
callable=obj,
- return_annotation=GenericValue(Awaitable, [return_value])
- if is_async_fn
- else return_value,
+ return_annotation=(
+ GenericValue(Awaitable, [return_value]) if is_async_fn else return_value
+ ),
)
def _parse_param_list(
self,
args: Iterable[ast3.arg],
There are a few instances where heavy nesting gets even more nested with this logic but those are uncommon and aren't unreadable so I think this is still an overall improvement.
Looking at the code, it looks logically sound (and was quite easy to read!), although like Felix, I have a lack of experience regarding this part of the codebase so no clue whether it's performant or efficient. Speaking of performance, a quick benchmark of this PR against main shows a 0% change in overall performance (although it should be noted blackbench, a benchmarking suite, is still in development and isn't as representative as it could be):
❯ pyperf compare_to main-14072be.json pr-2278.json -G -v
Slower (4):
- [format]-[black/numerics.py]: 105 ms +- 2 ms -> 108 ms +- 4 ms: 1.03x slower
- [format]-[black/__init__.py]: 1.86 sec +- 0.04 sec -> 1.92 sec +- 0.06 sec: 1.03x slower
- [format]-[black/cache.py]: 136 ms +- 5 ms -> 139 ms +- 3 ms: 1.02x slower
- [format]-[black/linegen.py]: 1.81 sec +- 0.05 sec -> 1.84 sec +- 0.05 sec: 1.02x slower
Faster (1):
- [format]-[black/nodes.py]: 1.48 sec +- 0.11 sec -> 1.42 sec +- 0.03 sec: 1.05x faster
Benchmark hidden because not significant (15): [format]-[strings-list.py], [format]-[black/brackets.py], [format]-[black/comments.py], [format]-[black/concurrency.py], [format]-[black/const.py], [format]-[black/debug.py], [format]-[black/files.py], [format]-[black/lines.py], [format]-[black/mode.py], [format]-[black/output.py], [format]-[black/parsing.py], [format]-[black/report.py], [format]-[black/rusty.py], [format]-[black/strings.py], [format]-[black/trans.py]
Geometric mean: 1.00x slower
(I bet those slower / faster benchmarks are just due to noise)
I'd like to see more tests to exercise more cases (especially edge cases) but the output doesn't look like there's anything particularly bad. For the primer failure, just mark pyanalyze
and sqlalchemy
as expecting changes in src/black_primer/primer.json
.
Regarding the disruptiveness of this change, it doesn't actually look that bad. This is not for me to really comment on though.
Thank you for submitting the PR! I look forward to its merge. Also apologies for the long wait!
@felix-hilden while I don't know this visitor code well either, if you take a look at src/black/nodes.py
for the Visitor
class (which LineGenerator
subclasses), the visit
method is the one that dispatches to specialized visitor methods based off the node's type. The test
node is defined in the grammar here:
black/src/blib2to3/Grammar.txt
Line 135 in 14072be
test: or_test ['if' or_test 'else' test] | lambdef |
So when a node of type test
is visited, Visitor.visit
will dispatch to LineGenerator.visit_test
. For more information, you're better off asking Jelle :)
As I understand it, performance should only be affected when actually visiting a I'd be happy to add some more tests though! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, agree that this is an improvement in formatting. I might just go edit pyanalyze now to apply this formatting :)
I feel like this could cause stability problems if Black inserts newlines in the first run, then parens in the second. Could you add some tests where everything is in one line in the original input?
The str(node)
is indeed a bit awkward. An alternative could be to search for something like newline tokens in the children of the node, but I feel that that may be more fragile.
@JelleZijlstra yeah, I see what you're saying about stability. The test runner discovers that the following isn't stable (wrapping on the first call, then parenthesizing on the second) long_kwargs_single_line = my_function(
foo="test, this is a sample value",
bar=some_long_value_name_foo_bar_baz if some_boolean_variable else some_fallback_value_foo_bar_baz,
baz="hello, this is a another value",
) Interestingly, So when running it for real we do get the stable output long_kwargs_single_line = my_function(
foo="test, this is a sample value",
bar=some_long_value_name_foo_bar_baz if some_boolean_variable else some_fallback_value_foo_bar_baz,
baz="hello, this is a another value",
)
# output
long_kwargs_single_line = my_function(
foo="test, this is a sample value",
bar=(
some_long_value_name_foo_bar_baz
if some_boolean_variable
else some_fallback_value_foo_bar_baz
),
baz="hello, this is a another value",
) I'll do some more digging to see if I can make it stable in a single pass. Maybe it's possible to trigger the line wrapping logic before trying to visit the test node? |
Yes, we run it twice in normal mode to deal with some other intractable stability issues. I guess we could also rely on the double run for this PR, but then we could run into trouble if we trigger the other condition that causes stability problems. |
Circling back around to this. I was hoping to find a way to have the visitor always insert parenthesis, and then have the existing parenthesis-stripping logic deal with cases where the parenthesis are redundant (single line expressions). It seems like the "invisible parenthesis" mechanism may be suited for that (?). Anyway I've pushed an update removing the multiline check, and it seems stable in my local testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking so long! I'm pretty convinced that this is indeed doing the right thing now 👍 I'm still a bit hesitant to give you the go-ahead because of my ongoing lack of knowledge about the visitor code, but the logic seems simple enough and I trust the tests! But let's wait for another review from Jelle.
A small adjustment though: please move the changelog entry to the latest heading.
diff-shades report: https://gist.github.com/ichard26/c03fd345f8096cf800eb1d042c536059. |
hm this seems wrong in @ichard26's report (in django somewhere):
|
This one is also odd:
|
I still like most of what this change does, but there are a few weird cases (noted in my comment above). Also, we'll now have to make this go into the |
diff-shades results comparing this PR (2fecc15) to main (0abe85e). The full diff is available in the logs under the "Generate HTML diff report" step.
|
Sorry for the long delay, coming back around to this now :) @JelleZijlstra it should respect Responding to edge cases: for sql in (
- 'OFFSET %d ROWS' % offset if offset else None,
- 'FETCH FIRST %d ROWS ONLY' % fetch if fetch else None,
+ ('OFFSET %d ROWS' % offset) if offset else None,
+ ('FETCH FIRST %d ROWS ONLY' % fetch) if fetch else None,
) I wasn't able to reproduce this change. It actually seems like Django has accepted the new formatting as part of their black reformat. clone._iterable_class = (
NamedValuesListIterable
if named
- else FlatValuesListIterable
- if flat
- else ValuesListIterable
+ else FlatValuesListIterable if flat else ValuesListIterable
) I think what's happening here is that we're adding the invisible parens around the second test and then flattening it out because it's shorter than the line length limit. If we force it to be longer it formats a little bit differently: def something():
clone._iterable_class = (
NamedValuesListIterable
if named
else (
FlatValuesListIterable
if flat
else ValuesListIterableeeeeeeeeeeeeeeeeeeeeeeee
)
) I'm not sure exactly how to fix this. It might require a special case for chained tests if we want to try to keep them all at the same level of indentation. Either way I've added both these cases to the test data. |
Co-authored-by: Jordan Ephron <JEphron@users.noreply.github.com> Co-authored-by: Richard Si <63936253+ichard26@users.noreply.github.com> Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Noticeable style changes: 1. Parenthesize multiple context managers psf#3489. The following style changes are temporarily disabled when `--preview` is used together with `--pyink`: 2. Format unicode escape sequences psf#2916. 3. Parenthesize conditional expressions psf#2278. PiperOrigin-RevId: 507485670
Just chiming in to say that this edge case (which happens surprisingly often in my experience) is visually confusing due to the mismatched indent. Would be good to see if we can brainstorm an approach that maintains the same level of indenting. Will think on it... |
Context: #2248