Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code style to convert byte arrays to UTF8 strings #60647

Merged
merged 47 commits into from
May 3, 2022
Merged
Show file tree
Hide file tree
Changes from 45 commits
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
e997bd4
Initial analyzer to report diagnostics for byte arrays that can be co…
davidwengier Apr 7, 2022
7a7e785
Add a code fix provider that doesn't work
davidwengier Apr 7, 2022
8ba815c
Move more to the analyzer since it has the constant values already an…
davidwengier Apr 8, 2022
a648a47
More tests
davidwengier Apr 8, 2022
dee8140
Ensure we only offer when we can generate a valid string
davidwengier Apr 8, 2022
0891597
Fix order
davidwengier Apr 9, 2022
ed33897
FIx correctness
davidwengier Apr 9, 2022
3748700
Fix attributes
davidwengier Apr 9, 2022
527c33f
Fix declarations
davidwengier Apr 9, 2022
693f225
Apply trivia directly to the token
davidwengier Apr 9, 2022
b40d389
Simplify array type check
davidwengier Apr 9, 2022
9a04491
Cleanup
davidwengier Apr 9, 2022
9ec2f4d
Fix diagnostic id tests
davidwengier Apr 9, 2022
2885ee3
Move
davidwengier Apr 9, 2022
35c7ceb
Report diagnostic at first token
davidwengier Apr 9, 2022
f02a6b9
Add some more specific tests for real byte arrays found around the place
davidwengier Apr 9, 2022
9193027
Add test for collection initializer edge case
davidwengier Apr 9, 2022
55053d7
Add test for using statement edge case
davidwengier Apr 9, 2022
c8932ed
Support implicit arrays in parameter arrays
davidwengier Apr 9, 2022
80b20ad
Correctness
davidwengier Apr 9, 2022
7dbacfe
Fix code cleanup
davidwengier Apr 10, 2022
c12468f
Fix argument list rewriting
davidwengier Apr 10, 2022
ea42b95
More parameter array cases
davidwengier Apr 10, 2022
d145d63
Merge remote-tracking branch 'upstream/main' into UseUTF8Strings
davidwengier Apr 10, 2022
ad88b8e
Fix API break
davidwengier Apr 10, 2022
017037a
Fix
davidwengier Apr 10, 2022
b8a66a7
Move UTF8 string creation to the code fix, and just validate in the a…
davidwengier Apr 11, 2022
1dc375b
Share code for escaping characters
davidwengier Apr 11, 2022
9a78ad2
Fix language version check
davidwengier Apr 11, 2022
28daf9d
PR Feedback
davidwengier Apr 11, 2022
69afe46
Fix editorconfig tests
davidwengier Apr 12, 2022
0cddacd
Show option in VS and editor config UI
davidwengier Apr 12, 2022
0648b7a
Fixes
davidwengier Apr 12, 2022
2e8d442
Rename - plural sounds better
davidwengier Apr 12, 2022
437e576
Update UseUTF8StringLiteralDiagnosticAnalyzer.cs
davidwengier Apr 12, 2022
374be8f
GitHub editor swallowed my whitespace
davidwengier Apr 12, 2022
96d9b7b
Typo
davidwengier Apr 12, 2022
fa6112e
UTF8 -> UTF-8 is string resources
davidwengier Apr 26, 2022
4d10c90
Remove intermediate array(s)
davidwengier Apr 26, 2022
2919e48
Validate syntax directly, and use properties to tell the code fix wha…
davidwengier Apr 26, 2022
f2f952b
Multidimensional arrays
davidwengier Apr 26, 2022
f6dc999
Fix xlf files
davidwengier Apr 26, 2022
90bff0f
Update string for consistency
davidwengier Apr 27, 2022
7f76fff
Update tests
davidwengier Apr 27, 2022
b498cf2
Reduce allocations
davidwengier Apr 27, 2022
24ddd14
PR feedback
davidwengier May 3, 2022
c5844b4
Merge remote-tracking branch 'upstream/main' into UseUTF8Strings
davidwengier May 3, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/Analyzers/CSharp/Analyzers/CSharpAnalyzers.projitems
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,7 @@
<Compile Include="$(MSBuildThisFileDirectory)UsePatternMatching\CSharpUseNotPatternDiagnosticAnalyzer.cs" />
<Compile Include="$(MSBuildThisFileDirectory)UseSimpleUsingStatement\UseSimpleUsingStatementDiagnosticAnalyzer.cs" />
<Compile Include="$(MSBuildThisFileDirectory)UseThrowExpression\CSharpUseThrowExpressionDiagnosticAnalyzer.cs" />
<Compile Include="$(MSBuildThisFileDirectory)UseUTF8StringLiteral\UseUTF8StringLiteralDiagnosticAnalyzer.cs" />
<Compile Include="$(MSBuildThisFileDirectory)ValidateFormatString\CSharpValidateFormatStringDiagnosticAnalyzer.cs" />
<Compile Include="$(MSBuildThisFileDirectory)NewLines\EmbeddedStatementPlacement\EmbeddedStatementPlacementDiagnosticAnalyzer.cs" />
</ItemGroup>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -365,4 +365,10 @@
<data name="Convert_to_top_level_statements" xml:space="preserve">
<value>Convert to top-level statements</value>
</data>
<data name="Convert_to_UTF8_string_literal" xml:space="preserve">
<value>Convert to UTF-8 string literal</value>
</data>
<data name="Use_UTF8_string_literal" xml:space="preserve">
<value>Use UTF-8 string literal</value>
</data>
</root>
Original file line number Diff line number Diff line change
@@ -0,0 +1,204 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using System;
using System.Collections.Immutable;
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis.CodeStyle;
using Microsoft.CodeAnalysis.CSharp.CodeStyle;
using Microsoft.CodeAnalysis.CSharp.EmbeddedLanguages.VirtualChars;
using Microsoft.CodeAnalysis.CSharp.Extensions;
using Microsoft.CodeAnalysis.CSharp.Shared.Extensions;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;
using Microsoft.CodeAnalysis.Operations;
using Microsoft.CodeAnalysis.Shared.Collections;
using Microsoft.CodeAnalysis.Text;
using Roslyn.Utilities;

namespace Microsoft.CodeAnalysis.CSharp.UseUTF8StringLiteral
{
[DiagnosticAnalyzer(LanguageNames.CSharp)]
internal sealed class UseUTF8StringLiteralDiagnosticAnalyzer : AbstractBuiltInCodeStyleDiagnosticAnalyzer
{
public enum ArrayCreationOperationLocation
{
Ancestors,
Descendants,
Current
}

public UseUTF8StringLiteralDiagnosticAnalyzer()
: base(IDEDiagnosticIds.UseUTF8StringLiteralDiagnosticId,
EnforceOnBuildValues.UseUTF8StringLiteral,
CSharpCodeStyleOptions.PreferUTF8StringLiterals,
LanguageNames.CSharp,
new LocalizableResourceString(nameof(CSharpAnalyzersResources.Convert_to_UTF8_string_literal), CSharpAnalyzersResources.ResourceManager, typeof(CSharpAnalyzersResources)),
new LocalizableResourceString(nameof(CSharpAnalyzersResources.Use_UTF8_string_literal), CSharpAnalyzersResources.ResourceManager, typeof(CSharpAnalyzersResources)))
{
}

public override DiagnosticAnalyzerCategory GetAnalyzerCategory()
=> DiagnosticAnalyzerCategory.SemanticSpanAnalysis;

protected override void InitializeWorker(AnalysisContext context)
=> context.RegisterCompilationStartAction(context =>
{
if (!context.Compilation.LanguageVersion().IsCSharp11OrAbove())
return;

var expressionType = context.Compilation.GetTypeByMetadataName(typeof(System.Linq.Expressions.Expression<>).FullName!);

context.RegisterOperationAction(c => AnalyzeOperation(c, expressionType), OperationKind.ArrayCreation);
});

private void AnalyzeOperation(OperationAnalysisContext context, INamedTypeSymbol? expressionType)
{
var arrayCreationOperation = (IArrayCreationOperation)context.Operation;

// Don't offer if the user doesn't want it
var option = context.GetOption(CSharpCodeStyleOptions.PreferUTF8StringLiterals);
if (!option.Value)
return;

// Only replace arrays with initializers
if (arrayCreationOperation.Initializer is null)
return;

// Using UTF8 string literals as nested array initializers is invalid
davidwengier marked this conversation as resolved.
Show resolved Hide resolved
if (arrayCreationOperation.DimensionSizes.Length > 1)
return;

// Must be a byte array
if (arrayCreationOperation.Type is not IArrayTypeSymbol { ElementType.SpecialType: SpecialType.System_Byte })
return;

// UTF8 strings are not valid to use in attributes
davidwengier marked this conversation as resolved.
Show resolved Hide resolved
if (arrayCreationOperation.Syntax.Ancestors().OfType<AttributeSyntax>().Any())
return;

// Can't use a UTF8 string inside an expression tree.
var semanticModel = context.Operation.SemanticModel;
Contract.ThrowIfNull(semanticModel);
if (arrayCreationOperation.Syntax.IsInExpressionTree(semanticModel, expressionType, context.CancellationToken))
return;

var elements = arrayCreationOperation.Initializer.ElementValues;

// If the compiler has constructed this array creation, then we don't want to do anything
// if there aren't any elements, as we could just end up inserting ""u8 somewhere.
if (arrayCreationOperation.IsImplicit && elements.Length == 0)
return;

if (!TryConvertToUTF8String(builder: null, elements))
return;

if (arrayCreationOperation.Syntax is ImplicitArrayCreationExpressionSyntax or ArrayCreationExpressionSyntax)
{
ReportArrayCreationDiagnostic(context, arrayCreationOperation.Syntax, option.Notification.Severity);
}
else if (elements.Length > 0 && elements[0].Syntax.Parent is ArgumentSyntax)
{
// For regular parameter arrays the code fix will need to search down
ReportParameterArrayDiagnostic(context, arrayCreationOperation.Syntax, elements, option.Notification.Severity, ArrayCreationOperationLocation.Descendants);
}
else if (elements.Length > 0 && elements[0].Syntax.Parent.IsKind(SyntaxKind.CollectionInitializerExpression))
{
// For collection initializers where the Add method takes a parameter array, the code fix
// will have to search up
davidwengier marked this conversation as resolved.
Show resolved Hide resolved
ReportParameterArrayDiagnostic(context, arrayCreationOperation.Syntax, elements, option.Notification.Severity, ArrayCreationOperationLocation.Ancestors);
}

// Otherwise this is an unsupported case
davidwengier marked this conversation as resolved.
Show resolved Hide resolved
}

private void ReportParameterArrayDiagnostic(OperationAnalysisContext context, SyntaxNode syntaxNode, ImmutableArray<IOperation> elements, ReportDiagnostic severity, ArrayCreationOperationLocation operationLocation)
{
// When the first elements parent is as argument, or an edge case for collection
// initializers where the Add method takes a param array, it means we have a parameter array.
// We raise the diagnostic on all of the parameters that make up the array. We could do just
// the first element, but that might be odd seeing: M(1, 2, [|3|], 4, 5)
var span = TextSpan.FromBounds(elements[0].Syntax.SpanStart, elements[^1].Syntax.Span.End);
var location = Location.Create(syntaxNode.SyntaxTree, span);

ReportDiagnostic(context, syntaxNode, severity, location, operationLocation);
}

private void ReportArrayCreationDiagnostic(OperationAnalysisContext context, SyntaxNode syntaxNode, ReportDiagnostic severity)
{
// When the user writes the array creation we raise the diagnostic on the first token, which will be the "new" keyword
var location = syntaxNode.GetFirstToken().GetLocation();

ReportDiagnostic(context, syntaxNode, severity, location, ArrayCreationOperationLocation.Current);
}

private void ReportDiagnostic(OperationAnalysisContext context, SyntaxNode syntaxNode, ReportDiagnostic severity, Location location, ArrayCreationOperationLocation operationLocation)
{
// Store the original syntax location so the code fix can find the operation again
var additionalLocations = ImmutableArray.Create(syntaxNode.GetLocation());

// Also let the code fix where to look to find the operation that originally trigger this diagnostic
var properties = ImmutableDictionary<string, string?>.Empty.Add(nameof(ArrayCreationOperationLocation), operationLocation.ToString());

context.ReportDiagnostic(
DiagnosticHelper.Create(Descriptor, location, severity, additionalLocations, properties));
}

internal static bool TryConvertToUTF8String(StringBuilder? builder, ImmutableArray<IOperation> arrayCreationElements)
{
for (var i = 0; i < arrayCreationElements.Length;)
{
// Need to call a method to do the actual rune decoding as it uses stackalloc, and stackalloc
// in a loop is a bad idea
if (!TryGetNextRune(arrayCreationElements, i, out var rune, out var bytesConsumed))
return false;

i += bytesConsumed;

if (builder is not null)
{
if (rune.TryGetEscapeCharacter(out var escapeChar))
{
builder.Append('\\');
builder.Append(escapeChar);
}
else
{
builder.Append(rune.ToString());
davidwengier marked this conversation as resolved.
Show resolved Hide resolved
davidwengier marked this conversation as resolved.
Show resolved Hide resolved
}
}
}

return true;
}

private static bool TryGetNextRune(ImmutableArray<IOperation> arrayCreationElements, int startIndex, out Rune rune, out int bytesConsumed)
{
rune = default;
bytesConsumed = 0;

// We only need max 4 elements for a single Rune
var length = Math.Min(arrayCreationElements.Length - startIndex, 4);

var array = (Span<byte>)stackalloc byte[length];
davidwengier marked this conversation as resolved.
Show resolved Hide resolved
for (var i = 0; i < length; i++)
{
var element = arrayCreationElements[startIndex + i];

// First basic check is that the array element is actually a byte
if (element.ConstantValue.Value is not byte)
return false;

array[i] = (byte)element.ConstantValue.Value;
davidwengier marked this conversation as resolved.
Show resolved Hide resolved
}

// If we can't decode a rune from the array then it can't be represented as a string
if (Rune.DecodeFromUtf8(array, out rune, out bytesConsumed) != System.Buffers.OperationStatus.Done)
return false;

return true;
davidwengier marked this conversation as resolved.
Show resolved Hide resolved
}
}
}

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading