Improve String Handling #1132

Merged
merged 381 commits into from
May 8, 2020

bbugyi200
Contributor

@bbugyi200 bbugyi200 commented Nov 2, 2019

This pull request's main intention is to wrap long strings (as requested by #182); however, it also provides better string handling in general and, in doing so, closes the following issues:

Closes #26
Closes #182
Closes #933
Closes #1183
Closes #1243

Examples

  • f-strings will be split if they are too long (just like a normal long string would be) and the f prefix will be dropped when possible:
##### INPUT
fstring = f"f-strings definitely make things more {difficult} than they need to be for {{black}}. But boy they sure are handy. The problem is that some lines will need to have the 'f' whereas others do not. This {line}, for example, needs one."

##### OLD OUTPUT
fstring = f"f-strings definitely make things more {difficult} than they need to be for {{black}}. But boy they sure are handy. The problem is that some lines will need to have the 'f' whereas others do not. This {line}, for example, needs one."

##### NEW OUTPUT
fstring = (
    f"f-strings definitely make things more {difficult} than they need to be for"
    " {black}. But boy they sure are handy. The problem is that some lines will need"
    f" to have the 'f' whereas others do not. This {line}, for example, needs one."
)
  • Manual user splits will be respected when it is possible to do so without violating the line length limit:
##### INPUT
good_split_func(
    xxx, yyy, zzz,
    long_string_kwarg="But what should happen when code has already "
                      "been formatted but in the wrong way? Like "
                      "with a space at the end instead of the "
                      "beginning. Or what about when it is split too "
                      "soon?",
)

##### OLD OUTPUT
good_split_func(
    xxx,
    yyy,
    zzz,
    long_string_kwarg="But what should happen when code has already "
    "been formatted but in the wrong way? Like "
    "with a space at the end instead of the "
    "beginning. Or what about when it is split too "
    "soon?",
)

##### NEW OUTPUT
good_split_func(
    xxx,
    yyy,
    zzz,
    long_string_kwarg=(
        "But what should happen when code has already "
        "been formatted but in the wrong way? Like "
        "with a space at the end instead of the "
        "beginning. Or what about when it is split too "
        "soon?"
    ),
)
  • If a manual user split violates the line length limit, however, it will NOT be respected:
##### INPUT
bad_split_func(
    xxx, yyy, zzz,
    long_string_kwarg="But what should happen when code has already been formatted but in the wrong way? Like with "
                      "a space at the end instead of the beginning. Or what about when it is split too soon?",
)

##### OLD OUTPUT
bad_split_func(
    xxx,
    yyy,
    zzz,
    long_string_kwarg="But what should happen when code has already been formatted but in the wrong way? Like with "
    "a space at the end instead of the beginning. Or what about when it is split too soon?",
)

##### NEW OUTPUT
bad_split_func(
    xxx,
    yyy,
    zzz,
    long_string_kwarg=(
        "But what should happen when code has already been formatted but in the wrong"
        " way? Like with a space at the end instead of the beginning. Or what about"
        " when it is split too soon?"
    ),
)
  • Line continuation backslashes inside of strings will no longer be tolerated:
##### INPUT
bad_split = "\
But what should happen when code has already\
 been formatted but in the wrong way? Like\
 with a space at the beginning instead of the\
 end. Or what about when it is split too\
 soon? In the case of a split that is too\
 short, black will try to honer the custom\
 split.\
"

##### OLD OUTPUT
bad_split = "\
But what should happen when code has already\
 been formatted but in the wrong way? Like\
 with a space at the beginning instead of the\
 end. Or what about when it is split too\
 soon? In the case of a split that is too\
 short, black will try to honer the custom\
 split.\
"

##### NEW OUTPUT
bad_split = (
    "But what should happen when code has already been formatted but in the wrong way?"
    " Like with a space at the end instead of the beginning. Or what about when it is"
    " split too soon? In the case of a split that is too short, black will try to honer"
    " the custom split."
)
  • Unnecessary surrounding parens will be stripped from strings and adjacent short strings will be merged when possible:
##### INPUT
func_call(
    sss=(
        "Some "
        '"short" '
        f"{string}."
    )
)

##### OLD OUTPUT
func_call(sss=("Some " '"short" ' f"{string}."))

##### NEW OUTPUT
func_call(sss=f'Some "short" {string}.')
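
The f-string and backslash examples above change how a string is written, not what it evaluates to. The following is a minimal sketch, not taken from the PR, that checks this in plain Python. The values assigned to `difficult` and `line` and the shortened backslash example are made up for illustration. Note that `{{black}}` becomes `{black}` once its segment loses the `f` prefix, because doubled braces are an escape only inside f-strings.

```python
# Hypothetical values, chosen only so the f-strings can be evaluated.
difficult = "difficult"
line = "line"

# The INPUT from the f-string example, as one long f-string.
original = f"f-strings definitely make things more {difficult} than they need to be for {{black}}. But boy they sure are handy. The problem is that some lines will need to have the 'f' whereas others do not. This {line}, for example, needs one."

# The NEW OUTPUT: three adjacent literals, only two of which keep the f prefix.
split = (
    f"f-strings definitely make things more {difficult} than they need to be for"
    " {black}. But boy they sure are handy. The problem is that some lines will need"
    f" to have the 'f' whereas others do not. This {line}, for example, needs one."
)

# Both spellings produce the same runtime value.
assert original == split

# A backslash followed by a newline inside a string literal is dropped by the
# parser, so a continued string equals its wrapped, parenthesized form.
continued = "\
But what should happen\
 when a string is split\
 with backslashes?\
"
wrapped = (
    "But what should happen"
    " when a string is split"
    " with backslashes?"
)
assert continued == wrapped
```

If either assertion failed, the rewrite would have changed program behavior rather than only layout.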

@bbugyi200 bbugyi200 changed the title from "Wrap long strings" to "[WIP] Wrap long strings" Nov 2, 2019
@bbugyi200 bbugyi200 force-pushed the 182-wrap-long-strings branch from 274715a to 8592d46 Compare November 3, 2019 18:09
bbugyi200 added a commit to bbugyi200/black that referenced this pull request Nov 12, 2019
This commit significantly dirties up the code introduced in this PR
(psf#1132). I plan to clean all of this up considerably before merging with
master.
@bbugyi200 bbugyi200 force-pushed the 182-wrap-long-strings branch from c74192d to 4d42b81 Compare December 7, 2019 20:42
@bbugyi200 bbugyi200 force-pushed the 182-wrap-long-strings branch from 2a93aa3 to e0eec91 Compare December 29, 2019 22:03
@ichard26
Collaborator

@bbugyi200, sorry for the ping. I have a question about whether this formatting is intended.

(black) richard-26@ubuntu-laptop:~/programming/black$ black test.py --diff --color
--- test.py	2020-07-20 00:40:48.843094 +0000
+++ test.py	2020-07-20 00:40:50.391909 +0000
@@ -10,11 +10,6 @@
-raise ValueError(
-    'Invalid input:\n'
-    f' * x={x}\n'
-    f' * y={y}\n'
-    f' * z={z}'
-)
+raise ValueError(f"Invalid input:\n * x={x}\n * y={y}\n * z={z}")

At first, I assumed this wouldn't happen because of "Manual user splits will be respected when it is possible to do so without violating the line length limit". Is this string literal merging related to "Unnecessary surrounding parens will be stripped from strings and adjacent short strings will be merged when possible"? Or does the "manual user splits" protection apply only to assignments?

I'm asking because I was a bit surprised by Black's output in this issue: #1540

Thanks!

@bbugyi200
Contributor Author

bbugyi200 commented Jul 22, 2020

@ichard26 No problem. Let's compare how black handled this example before with how it handles it now:

##### INPUT
raise ValueError(
    'Invalid input:\n'
    f' * x={x}\n'
    f' * y={y}\n'
    f' * z={z}'
)

##### OLD OUTPUT
raise ValueError("Invalid input:\n" f" * x={x}\n" f" * y={y}\n" f" * z={z}")

##### NEW OUTPUT
raise ValueError(f"Invalid input:\n * x={x}\n * y={y}\n * z={z}")

So the only thing that #1132 has changed about this output is that the strings are merged together.
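
To make that concrete, here is a small sketch, not part of the original thread, showing that the manual split, the old output, and the new output all evaluate to the same message. The values of `x`, `y`, and `z` are hypothetical.

```python
# Hypothetical values so the f-strings can be evaluated.
x, y, z = 1, 2, 3

# The INPUT: a manual split across four adjacent literals.
manual = (
    'Invalid input:\n'
    f' * x={x}\n'
    f' * y={y}\n'
    f' * z={z}'
)

# The OLD OUTPUT: the same four literals joined onto one line.
old_output = "Invalid input:\n" f" * x={x}\n" f" * y={y}\n" f" * z={z}"

# The NEW OUTPUT: a single merged f-string.
new_output = f"Invalid input:\n * x={x}\n * y={y}\n * z={z}"

# Adjacent literals are concatenated at compile time, so all three are equal.
assert manual == old_output == new_output
print(new_output)
```

The difference between the three forms is purely one of source layout, which is why the question is about formatting preference and not about correctness.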

@skshetry skshetry mentioned this pull request Aug 17, 2020
@Hendler Hendler mentioned this pull request Nov 14, 2023