Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code style to convert byte arrays to UTF8 strings #60647

Merged
merged 47 commits into from
May 3, 2022

Conversation

davidwengier
Copy link
Contributor

Relates to #58848
Part of #60582

I've never created an analyzer and fixer from scratch before, nor a code style analyzer, so please let me know where, not if, I've gone wrong.

@davidwengier davidwengier requested a review from a team as a code owner April 8, 2022 13:11
return null;
}

private static string GetStringLiteralRepresentation(Rune rune)
Copy link
Member

@Youssef1313 Youssef1313 Apr 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private static bool TryReplaceChar(char c, out string replaceWith)
{
replaceWith = null;
switch (c)
{
case '\\':
replaceWith = "\\\\";
break;
case '\0':
replaceWith = "\\0";
break;
case '\a':
replaceWith = "\\a";
break;
case '\b':
replaceWith = "\\b";
break;
case '\f':
replaceWith = "\\f";
break;
case '\n':
replaceWith = "\\n";
break;
case '\r':
replaceWith = "\\r";
break;
case '\t':
replaceWith = "\\t";
break;
case '\v':
replaceWith = "\\v";
break;
}
if (replaceWith != null)
{
return true;
}
if (NeedsEscaping(CharUnicodeInfo.GetUnicodeCategory(c)))
{
replaceWith = "\\u" + ((int)c).ToString("x4");
return true;
}
return false;
}

This is a similar method (not public from the compiler layer, so it cannot be reused). But it does more work:

     if (NeedsEscaping(CharUnicodeInfo.GetUnicodeCategory(c))) 
     { 
         replaceWith = "\\u" + ((int)c).ToString("x4"); 
         return true; 
     } 

So wondering if the extra work is necessary for this feature? I think the answer is no (I think difference is whether we output things like emojis as \uXXXX vs the character itself). I'm not sure so noting that you can confirm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i do think there is merit in using either \u or \U for things like control/formatting characters. Those will likely show up very weird once in string-literal form.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deliberately didnt do anything (yet?) about \u because it's not that simple. My HalfEmoji test is a good example. That byte array should be "\uD83D", but whilst that is a valid string literal, "\uD83D"u8 is not a valid UTF8 string literal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also worth mentioning similar code to this exists in IVirtualCharService.TryGetEscapeCharacter but it was slightly painful to use. We could definitely expand that system if we wanted to (allow it to take in strings, not just syntax, and produce escaped code as well as consuming it) and use it here instead of direct Rune usage potentially

Copy link
Contributor

@AlekseyTs AlekseyTs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes under Compilers LGTM (commit 45)

Copy link
Member

@cston cston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiler changes LGTM.

# Conflicts:
#	src/Features/CSharp/Portable/CSharpFeaturesResources.resx
@davidwengier davidwengier enabled auto-merge (squash) May 3, 2022 03:06
@davidwengier davidwengier merged commit 424bd8e into dotnet:main May 3, 2022
@ghost ghost added this to the Next milestone May 3, 2022
@davidwengier davidwengier deleted the UseUTF8Strings branch May 3, 2022 05:51
333fred added a commit that referenced this pull request May 4, 2022
…ures/required-members

* upstream/main: (368 commits)
  Use new helpers
  Cleanup
  Update src/EditorFeatures/CSharpTest/StringCopyPaste/StringCopyPasteCommandHandlerTests.cs
  Restore
  Add docs
  Fix
  Update Tools|Options code ordering to match waht is in the UI
  Use correct directory for loghub logs (#61104)
  Move usings on paste off the UI thread (#61092)
  Create shared helper to handle finding extensions in analyzer references
  Code style to convert byte arrays to UTF8 strings (#60647)
  Fixup
  Enable semantic token LSP tests + re-enable quick info tests (#61098)
  Simplify initialization logic
  Simplify common CWT pattern
  Simplify common CWT pattern
  Simplify common CWT pattern
  Simplify common CWT pattern
  Simplify common CWT pattern
  Simplify common CWT pattern
  ...
333fred added a commit to 333fred/roslyn that referenced this pull request May 13, 2022
…o poison

* upstream/features/required-members: (369 commits)
  Update tests after merge
  Use new helpers
  Cleanup
  Update src/EditorFeatures/CSharpTest/StringCopyPaste/StringCopyPasteCommandHandlerTests.cs
  Restore
  Add docs
  Fix
  Update Tools|Options code ordering to match waht is in the UI
  Use correct directory for loghub logs (dotnet#61104)
  Move usings on paste off the UI thread (dotnet#61092)
  Create shared helper to handle finding extensions in analyzer references
  Code style to convert byte arrays to UTF8 strings (dotnet#60647)
  Fixup
  Enable semantic token LSP tests + re-enable quick info tests (dotnet#61098)
  Simplify initialization logic
  Simplify common CWT pattern
  Simplify common CWT pattern
  Simplify common CWT pattern
  Simplify common CWT pattern
  Simplify common CWT pattern
  ...
@Cosifne Cosifne modified the milestones: Next, 17.3 P2 May 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.