Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parenthesize conditional expressions #2278

Merged
merged 19 commits into from
Dec 29, 2022

Conversation

JEphron
Copy link
Contributor

@JEphron JEphron commented May 30, 2021

Context: #2248

@JEphron
Copy link
Contributor Author

JEphron commented May 30, 2021

I recognize that this is a moderate styling change and I know the Black maintainers are conservative when it comes to those, but anecdotally this issue has been a source of style nitpicking at my organization and (obviously 😄) we use Black to try to avoid that.

Let me know if this is worth pursuing further

@ichard26
Copy link
Collaborator

ichard26 commented Jun 1, 2021

Hi,

We haven't been able to review your changeset just yet. Sorry about that, sometimes our time as maintainers ends up somewhere else leaving other important behind. At a first glance, I like the new formatting, my only major concern would be the level of disruptive this change may change looking at the black-primer results.

Don't worry much about the primer or uvloop workflows failing, they fail when there is changes in projects we watch. All you will need to do to fix them is to flip the changes expected flag for relevant projects in src/black_primer/primer.json.

Hopefully we will be able to review your work soon, thanks for submitting it!

Copy link
Collaborator

@felix-hilden felix-hilden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the submission! I'd be more inclined to accept this if the tests contained counterexamples and other cases as well. Like imploding a short enough line or exploding from a single line (not already multiline like in the test). How about test expressions elsewhere, like positional arguments, or in entirely different contexts? Anyways, I think this is a solid addition to the style!

Also, I'm not entirely familiar with the visiting code, should the new visit_test be called somewhere? This is really TODO for me to dig around more than anything.


I'm not the one to give a go-ahead with the changes though, especially if this is considered disruptive. So take it with a grain of salt, and maybe wait until other maintainers express their opinions.

@ichard26 ichard26 self-requested a review July 6, 2021 00:46
Copy link
Collaborator

@ichard26 ichard26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, the formatting changes are a strong +1 personally. I found this example in the primer output and it's so much more readable now:

--- pyanalyze/typeshed.py	2021-05-30 03:06:13.755715 +0000
+++ pyanalyze/typeshed.py	2021-05-30 03:06:32.534930 +0000
@@ -448,13 +448,13 @@
                 arg = arg.replace(kind=SigParameter.POSITIONAL_OR_KEYWORD)
             cleaned_arguments.append(arg)
         return Signature.make(
             cleaned_arguments,
             callable=obj,
-            return_annotation=GenericValue(Awaitable, [return_value])
-            if is_async_fn
-            else return_value,
+            return_annotation=(
+                GenericValue(Awaitable, [return_value]) if is_async_fn else return_value
+            ),
         )
 
     def _parse_param_list(
         self,
         args: Iterable[ast3.arg],

There are a few instances where heavy nesting gets even more nested with this logic but those are uncommon and aren't unreadable so I think this is still an overall improvement.

Looking at the code, it looks logically sound (and was quite easy to read!), although like Felix, I have a lack of experience regarding this part of the codebase so no clue whether it's performant or efficient. Speaking of performance, a quick benchmark of this PR against main shows a 0% change in overall performance (although it should be noted blackbench, a benchmarking suite, is still in development and isn't as representative as it could be):

❯ pyperf compare_to main-14072be.json pr-2278.json -G -v
Slower (4):
- [format]-[black/numerics.py]: 105 ms +- 2 ms -> 108 ms +- 4 ms: 1.03x slower
- [format]-[black/__init__.py]: 1.86 sec +- 0.04 sec -> 1.92 sec +- 0.06 sec: 1.03x slower
- [format]-[black/cache.py]: 136 ms +- 5 ms -> 139 ms +- 3 ms: 1.02x slower
- [format]-[black/linegen.py]: 1.81 sec +- 0.05 sec -> 1.84 sec +- 0.05 sec: 1.02x slower

Faster (1):
- [format]-[black/nodes.py]: 1.48 sec +- 0.11 sec -> 1.42 sec +- 0.03 sec: 1.05x faster

Benchmark hidden because not significant (15): [format]-[strings-list.py], [format]-[black/brackets.py], [format]-[black/comments.py], [format]-[black/concurrency.py], [format]-[black/const.py], [format]-[black/debug.py], [format]-[black/files.py], [format]-[black/lines.py], [format]-[black/mode.py], [format]-[black/output.py], [format]-[black/parsing.py], [format]-[black/report.py], [format]-[black/rusty.py], [format]-[black/strings.py], [format]-[black/trans.py]

Geometric mean: 1.00x slower

(I bet those slower / faster benchmarks are just due to noise)

I'd like to see more tests to exercise more cases (especially edge cases) but the output doesn't look like there's anything particularly bad. For the primer failure, just mark pyanalyze and sqlalchemy as expecting changes in src/black_primer/primer.json.

Regarding the disruptiveness of this change, it doesn't actually look that bad. This is not for me to really comment on though.

Thank you for submitting the PR! I look forward to its merge. Also apologies for the long wait!


@felix-hilden while I don't know this visitor code well either, if you take a look at src/black/nodes.py for the Visitor class (which LineGenerator subclasses), the visit method is the one that dispatches to specialized visitor methods based off the node's type. The test node is defined in the grammar here:

test: or_test ['if' or_test 'else' test] | lambdef

So when a node of type test is visited, Visitor.visit will dispatch to LineGenerator.visit_test. For more information, you're better off asking Jelle :)

@felix-hilden
Copy link
Collaborator

@ichard26 Oh yeah, thanks for the pointer! And here's where the call is made in the visitor I think.

@JEphron
Copy link
Contributor Author

JEphron commented Jul 6, 2021

As I understand it, performance should only be affected when actually visiting a test node. Since they're a relatively uncommon bit of syntax, they're probably not contributing much to the benchmark numbers. The stringification of the node in order to search for newlines is the only thing that stood out to me as a potential performance sink. I don't have quantitative data on this but I'd wager that it's pretty unusual to have a test node big enough for that to matter (I couldn't find a way to check for multiline-ness without calling str, though if one existed it'd probably be better to avoid stringifying)

I'd be happy to add some more tests though!

Copy link
Collaborator

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, agree that this is an improvement in formatting. I might just go edit pyanalyze now to apply this formatting :)

I feel like this could cause stability problems if Black inserts newlines in the first run, then parens in the second. Could you add some tests where everything is in one line in the original input?

The str(node) is indeed a bit awkward. An alternative could be to search for something like newline tokens in the children of the node, but I feel that that may be more fragile.

@JEphron
Copy link
Contributor Author

JEphron commented Jul 10, 2021

@JelleZijlstra yeah, I see what you're saying about stability. The test runner discovers that the following isn't stable (wrapping on the first call, then parenthesizing on the second)

long_kwargs_single_line = my_function(
    foo="test, this is a sample value",
    bar=some_long_value_name_foo_bar_baz if some_boolean_variable else some_fallback_value_foo_bar_baz,
    baz="hello, this is a another value",
)

Interestingly, black seems to perform two passes when called from the command line
image

So when running it for real we do get the stable output

long_kwargs_single_line = my_function(
    foo="test, this is a sample value",
    bar=some_long_value_name_foo_bar_baz if some_boolean_variable else some_fallback_value_foo_bar_baz,
    baz="hello, this is a another value",
)

# output

long_kwargs_single_line = my_function(
    foo="test, this is a sample value",
    bar=(
        some_long_value_name_foo_bar_baz
        if some_boolean_variable
        else some_fallback_value_foo_bar_baz
    ),
    baz="hello, this is a another value",
)

I'll do some more digging to see if I can make it stable in a single pass. Maybe it's possible to trigger the line wrapping logic before trying to visit the test node?

@JelleZijlstra
Copy link
Collaborator

Yes, we run it twice in normal mode to deal with some other intractable stability issues. I guess we could also rely on the double run for this PR, but then we could run into trouble if we trigger the other condition that causes stability problems.

@ichard26 ichard26 added the F: parentheses Too many parentheses, not enough parentheses, and so on. label Jul 16, 2021
@JEphron
Copy link
Contributor Author

JEphron commented Aug 7, 2021

@JelleZijlstra

Circling back around to this. I was hoping to find a way to have the visitor always insert parenthesis, and then have the existing parenthesis-stripping logic deal with cases where the parenthesis are redundant (single line expressions). It seems like the "invisible parenthesis" mechanism may be suited for that (?). Anyway I've pushed an update removing the multiline check, and it seems stable in my local testing.

@ichard26 ichard26 assigned ichard26 and unassigned ichard26 Aug 21, 2021
Copy link
Collaborator

@felix-hilden felix-hilden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking so long! I'm pretty convinced that this is indeed doing the right thing now 👍 I'm still a bit hesitant to give you the go-ahead because of my ongoing lack of knowledge about the visitor code, but the logic seems simple enough and I trust the tests! But let's wait for another review from Jelle.

A small adjustment though: please move the changelog entry to the latest heading.

@ichard26
Copy link
Collaborator

ichard26 commented Dec 2, 2021

diff-shades report: https://gist.github.com/ichard26/c03fd345f8096cf800eb1d042c536059. please note that the removals in the diff are from #2472 which was recently merged (I didn't update my base data). I fixed it. a quick scan of it looks pretty good to me! the run also didn't report any crashes so that's good 🙂

@JelleZijlstra
Copy link
Collaborator

hm this seems wrong in @ichard26's report (in django somewhere):

             for sql in (
-                'OFFSET %d ROWS' % offset if offset else None,
-                'FETCH FIRST %d ROWS ONLY' % fetch if fetch else None,
+                ('OFFSET %d ROWS' % offset) if offset else None,
+                ('FETCH FIRST %d ROWS ONLY' % fetch) if fetch else None,
             )

@JelleZijlstra
Copy link
Collaborator

This one is also odd:

         clone._iterable_class = (
             NamedValuesListIterable
             if named
-            else FlatValuesListIterable
-            if flat
-            else ValuesListIterable
+            else FlatValuesListIterable if flat else ValuesListIterable
         )

@ichard26 ichard26 added this to the Release 22.0 milestone Dec 12, 2021
@ichard26 ichard26 removed this from the Release 22.0 milestone Jan 1, 2022
@JelleZijlstra
Copy link
Collaborator

I still like most of what this change does, but there are a few weird cases (noted in my comment above).

Also, we'll now have to make this go into the --preview style only.

@github-actions
Copy link

github-actions bot commented Feb 21, 2022

diff-shades results comparing this PR (2fecc15) to main (0abe85e). The full diff is available in the logs under the "Generate HTML diff report" step.

╭────────────────────────── Summary ──────────────────────────╮
│ 13 projects & 142 files changed / 2265 changes [+1363/-902] │
│                                                             │
│ ... out of 2 363 829 lines, 11 046 files & 23 projects      │
╰─────────────────────────────────────────────────────────────╯

Differences found.

What is this? | Workflow run | diff-shades documentation

@JEphron
Copy link
Contributor Author

JEphron commented Feb 21, 2022

Sorry for the long delay, coming back around to this now :)

@JelleZijlstra it should respect --preview now.

Responding to edge cases:

             for sql in (
-                'OFFSET %d ROWS' % offset if offset else None,
-                'FETCH FIRST %d ROWS ONLY' % fetch if fetch else None,
+                ('OFFSET %d ROWS' % offset) if offset else None,
+                ('FETCH FIRST %d ROWS ONLY' % fetch) if fetch else None,
             )

I wasn't able to reproduce this change. It actually seems like Django has accepted the new formatting as part of their black reformat.


         clone._iterable_class = (
             NamedValuesListIterable
             if named
-            else FlatValuesListIterable
-            if flat
-            else ValuesListIterable
+            else FlatValuesListIterable if flat else ValuesListIterable
         )

I think what's happening here is that we're adding the invisible parens around the second test and then flattening it out because it's shorter than the line length limit. If we force it to be longer it formats a little bit differently:

def something():
    clone._iterable_class = (
        NamedValuesListIterable
        if named
        else (
            FlatValuesListIterable
            if flat
            else ValuesListIterableeeeeeeeeeeeeeeeeeeeeeeee
        )
    )

I'm not sure exactly how to fix this. It might require a special case for chained tests if we want to try to keep them all at the same level of indentation.

Either way I've added both these cases to the test data.

@ichard26 ichard26 added this to the Release 22.8.0 milestone Aug 1, 2022
@ichard26 ichard26 removed this from the Release 22.8.0 milestone Aug 15, 2022
@JelleZijlstra JelleZijlstra self-assigned this Dec 29, 2022
@JelleZijlstra JelleZijlstra merged commit 4e3303f into psf:main Dec 29, 2022
hugovk pushed a commit to hugovk/black that referenced this pull request Jan 16, 2023
Co-authored-by: Jordan Ephron <JEphron@users.noreply.github.com>
Co-authored-by: Richard Si <63936253+ichard26@users.noreply.github.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
copybara-service bot pushed a commit to google/pyink that referenced this pull request Feb 6, 2023
Noticeable style changes:

1. Parenthesize multiple context managers psf#3489.

The following style changes are temporarily disabled when `--preview` is used together with `--pyink`:

2. Format unicode escape sequences psf#2916.
3. Parenthesize conditional expressions psf#2278.

PiperOrigin-RevId: 507485670
@Andrew-S-Rosen
Copy link

I'm not sure exactly how to fix this. It might require a special case for chained tests if we want to try to keep them all at the same level of indentation.

Just chiming in to say that this edge case (which happens surprisingly often in my experience) is visually confusing due to the mismatched indent. Would be good to see if we can brainstorm an approach that maintains the same level of indenting. Will think on it...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
F: parentheses Too many parentheses, not enough parentheses, and so on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants