-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang-format] Stop breaking unbreakable strings in JS #66168
Conversation
Dictionary literal keys and strings in TypeScript type declarations can not be broken. The problem was pointed out by @alexfh and @e-kud here: https://reviews.llvm.org/D154093#4644512
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-format ChangesDictionary literal keys and strings in TypeScript type declarations can not be broken.The problem was pointed out by @alexfh and @e-kud here: https://reviews.llvm.org/D154093#4644512Full diff: https://github.com/llvm/llvm-project/pull/66168.diff 3 Files Affected:
diff --git a/clang/lib/Format/ContinuationIndenter.cpp b/clang/lib/Format/ContinuationIndenter.cpp index ac62dab1b07cdca..75ab08de42ea0e8 100644 --- a/clang/lib/Format/ContinuationIndenter.cpp +++ b/clang/lib/Format/ContinuationIndenter.cpp @@ -2241,6 +2241,12 @@ ContinuationIndenter::createBreakableToken(const FormatToken &Current, if (Style.isJson() || !Style.BreakStringLiterals || !AllowBreak) return nullptr; + // Strings in TypeScript types and dictionary keys can not be broken. + if (Style.isJavaScript() && (Current.is(TT_SelectorName) || + State.Line->startsWith(Keywords.kw_type))) { + return nullptr; + } + // Don't break string literals inside preprocessor directives (except for // #define directives, as their contents are stored in separate lines and // are not affected by this check). diff --git a/clang/unittests/Format/FormatTestJS.cpp b/clang/unittests/Format/FormatTestJS.cpp index bc4c7ff7ba6b30f..309326569143df6 100644 --- a/clang/unittests/Format/FormatTestJS.cpp +++ b/clang/unittests/Format/FormatTestJS.cpp @@ -1596,6 +1596,28 @@ TEST_F(FormatTestJS, StringLiteralConcatenation) { "var literal = `xxxxxx ${xxxxxxxxxxxxxxxxxxxxxx + " "xxxxxxxxxxxxxxxxxxxxxx} xxxxxx`;", getGoogleJSStyleWithColumns(14)); + + // Strings in a TypeScript type declaration can't be broken. + verifyFormat("type x =\n" + " 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';", + getGoogleJSStyleWithColumns(20)); + verifyFormat("/* type */ type x =\n" + " 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';", + getGoogleJSStyleWithColumns(20)); + // Dictionary keys can't be broken. Values can be broken. + verifyFormat("var w = {\n" + " 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx':\n" + " 'xxxxxxxxxx' +\n" + " 'xxxxxxxxxx' +\n" + " 'xxxxxxxxxx' +\n" + " 'xxxxxxxxxx' +\n" + " 'xxx',\n" + "};", + "var w = {\n" + " 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx':\n" + " 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',\n" + "};", + getGoogleJSStyleWithColumns(20)); } TEST_F(FormatTestJS, RegexLiteralClassification) { diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp index 434852983712940..9ab1eae85f29c4d 100644 --- a/clang/unittests/Format/TokenAnnotatorTest.cpp +++ b/clang/unittests/Format/TokenAnnotatorTest.cpp @@ -2040,6 +2040,25 @@ TEST_F(TokenAnnotatorTest, UnderstandDesignatedInitializers) { EXPECT_TOKEN(Tokens[13], tok::period, TT_DesignatedInitializerPeriod); } +TEST_F(TokenAnnotatorTest, UnderstandsJavaScript) { + auto Annotate = [this](llvm::StringRef Code) { + return annotate(Code, getLLVMStyle(FormatStyle::LK_JavaScript)); + }; + + // Dictionary. + auto Tokens = Annotate("var x = {'x' : 1, 'y' : 2};"); + ASSERT_EQ(Tokens.size(), 14u) << Tokens; + EXPECT_TOKEN(Tokens[3], tok::l_brace, TT_DictLiteral); + EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_SelectorName); + EXPECT_TOKEN(Tokens[5], tok::colon, TT_DictLiteral); + EXPECT_TOKEN(Tokens[8], tok::string_literal, TT_SelectorName); + EXPECT_TOKEN(Tokens[9], tok::colon, TT_DictLiteral); + // Change when we need to annotate these. + EXPECT_BRACE_KIND(Tokens[3], BK_Unknown); + EXPECT_BRACE_KIND(Tokens[11], BK_Unknown); + EXPECT_TOKEN(Tokens[11], tok::r_brace, TT_Unknown); +} + } // namespace } // namespace format } // namespace clang |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG. Thanks for the prompt fix!
Unfortunately, this is not enough. I found at least one more broken case (pun intended):
gets transformed into
which is also invalid. |
Here is the new fix. #66321 If it is not urgent, I prefer to wait a few days for reviews and then commit the patch myself. |
Thanks! I think, I found another pattern where string literal breaking is not allowed - in parameters of
I used to work on this part of clang-format and now the code a bit, thus I feel comfortable approving this sort of a change. Another thing is that we need to get this fixed really soon. Another (safer and actually preferred) alternative is to revert the change and address problematic use cases before recommitting it, but I'm also fine with forward-fixing this as long as fixes don't require major changes that have potential for other problems to appear. |
See the discussion [here](llvm#66168 (comment)). The functionality is not mature enough.
I see. In that case, I added #66372. However, why do you need it fixed so soon? The latest release is |
Here are string literal context from closure - This still misses contexts when string literal is required, for example https://github.com/search?q=repo%3Agoogle%2Fclosure-compiler%20%22%20must%20be%20a%20string%20literal%22&type=code I wonder, if splitting the literal with |
Maybe that means we should make the Can you set the option in your configuration as a workaround for now? If so, please don't approve #66372. Note that you can set a different style configuration for each language. |
I think, disabling the functionality for JS/TS is the right choice until all wrinkles are ironed.
In general, "as a community, we strongly value having the tip of tree in a good state while allowing rapid iterative development" (as opposed to allowing to break stuff after a release cut and trying to glue the pieces together before the next release, see https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy). Some LLVM users (toolchain teams at Google, for example) strongly rely on the ToT LLVM being close to (that is, at most a few patches away from) production quality. We are updating our toolchain ~weekly and expect all tools (including clang-format) to be free from critical defects (if you're interested, there's a talk about this: https://www.youtube.com/watch?v=zh91if43QLM). Other users (including multiple open-source projects) may take various parts of LLVM at arbitrary moments of time with similar expectations to the code quality. This particular problem happens to be critical, since it makes clang-format generate invalid code. Leaving this state for weeks after it was detected (or requiring users to modify their configuration just to work around the problem) is unacceptable, thus I think we should proceed with #66372 (or a revert of the original patch) for now. |
See the discussion [here](#66168 (comment)). The functionality is not mature enough.
Dictionary literal keys and strings in TypeScript type declarations can not be broken. The problem was pointed out by @alexfh and @e-kud here: https://reviews.llvm.org/D154093#4644512
See the discussion [here](llvm#66168 (comment)). The functionality is not mature enough.
Looks like we need the similar fix for verilog, see https://reviews.llvm.org/D154093#4648931 |
I was trying to stop the formatter from breaking the arguments to certain functions by marking the lines as not breakable. Then I found the commit 53c38f4 which says that lines containing |
Dictionary literal keys and strings in TypeScript type declarations can not be broken.
The problem was pointed out by @alexfh and @e-kud here:
https://reviews.llvm.org/D154093#4644512