Update .NET 5 Unicode data to version 13.0.0 #33538

Merged (4 commits) on Mar 15, 2020
6 changes: 3 additions & 3 deletions THIRD-PARTY-NOTICES.TXT
Expand Up @@ -32,10 +32,10 @@ http://www.opensource.org/licenses/bsd-license.html.
License notice for Unicode data
-------------------------------

- http://www.unicode.org/copyright.html#License
+ https://www.unicode.org/license.html

- Copyright © 1991-2017 Unicode, Inc. All rights reserved.
- Distributed under the Terms of Use in http://www.unicode.org/copyright.html.
+ Copyright © 1991-2020 Unicode, Inc. All rights reserved.
+ Distributed under the Terms of Use in https://www.unicode.org/copyright.html.

Permission is hereby granted, free of charge, to any person obtaining
a copy of the Unicode data files and any associated documentation
Expand Down
Expand Up @@ -2,8 +2,8 @@

<PropertyGroup>
<OutputType>Exe</OutputType>
- <TargetFramework>netcoreapp3.0</TargetFramework>
- <UnicodeUcdVersion>12.1</UnicodeUcdVersion>
+ <TargetFramework>netcoreapp3.1</TargetFramework>
+ <UnicodeUcdVersion>13.0</UnicodeUcdVersion>
</PropertyGroup>

<ItemGroup>
Expand Down Expand Up @@ -36,7 +36,7 @@
<Link>UnicodeData\DerivedName.txt</Link>
<LogicalName>DerivedName.txt</LogicalName>
</EmbeddedResource>
- <EmbeddedResource Include="$(PkgSystem_Private_Runtime_UnicodeData)\contentFiles\any\any\emoji\$(UnicodeUcdVersion)\emoji-data.txt">
+ <EmbeddedResource Include="$(PkgSystem_Private_Runtime_UnicodeData)\contentFiles\any\any\$(UnicodeUcdVersion).0\ucd\emoji\emoji-data.txt">
<Link>UnicodeData\emoji-data.txt</Link>
<LogicalName>emoji-data.txt</LogicalName>
</EmbeddedResource>
Expand Down
20 changes: 20 additions & 0 deletions src/coreclr/src/pal/src/locale/unicodedata.cpp
Expand Up @@ -464,6 +464,7 @@ CONST UnicodeDataRec UnicodeData[] = {
{ 0x275, LOWER_CASE, 0x19F },
{ 0x27D, LOWER_CASE, 0x2C64 },
{ 0x280, LOWER_CASE, 0x1A6 },
+ { 0x282, LOWER_CASE, 0xA7C5 },
@am11 (Member) commented on Mar 14, 2020:

@MichalStrehovsky, could this be header-only, or does adding a .cpp file in addition to the .h file give some advantage? I realize that it is an auto-generated code file, but UnicodeData[] could still be packed into the header (i.e. the .h file could be auto-generated along with the glue structs that are currently declared there).
Just wondering about your thoughts on .cpp vs. header-only approach in this case. :)

Reply (Member):

Is there an advantage to header-only besides having one less file?

I generally prefer the .h/.cpp split because, a long time ago when I did a lot of C++, precompiled headers were a PITA to deal with, and from observing where C++ is heading with modules and all, people still haven't figured it out. This is a big data structure to re-parse every time the file is included. I now try to stay away from C++ as much as possible, so I might not be up to date.

{ 0x283, LOWER_CASE, 0x1A9 },
{ 0x287, LOWER_CASE, 0xA7B1 },
{ 0x288, LOWER_CASE, 0x1AE },
Expand Down Expand Up @@ -1203,6 +1204,7 @@ CONST UnicodeDataRec UnicodeData[] = {
{ 0x1CBF, UPPER_CASE, 0x10FF },
{ 0x1D79, LOWER_CASE, 0xA77D },
{ 0x1D7D, LOWER_CASE, 0x2C63 },
+ { 0x1D8E, LOWER_CASE, 0xA7C6 },
{ 0x1E00, UPPER_CASE, 0x1E01 },
{ 0x1E01, LOWER_CASE, 0x1E00 },
{ 0x1E02, UPPER_CASE, 0x1E03 },
Expand Down Expand Up @@ -2170,6 +2172,7 @@ CONST UnicodeDataRec UnicodeData[] = {
{ 0xA791, LOWER_CASE, 0xA790 },
{ 0xA792, UPPER_CASE, 0xA793 },
{ 0xA793, LOWER_CASE, 0xA792 },
+ { 0xA794, LOWER_CASE, 0xA7C4 },
{ 0xA796, UPPER_CASE, 0xA797 },
{ 0xA797, LOWER_CASE, 0xA796 },
{ 0xA798, UPPER_CASE, 0xA799 },
Expand Down Expand Up @@ -2205,6 +2208,23 @@ CONST UnicodeDataRec UnicodeData[] = {
{ 0xA7B7, LOWER_CASE, 0xA7B6 },
{ 0xA7B8, UPPER_CASE, 0xA7B9 },
{ 0xA7B9, LOWER_CASE, 0xA7B8 },
+ { 0xA7BA, UPPER_CASE, 0xA7BB },
+ { 0xA7BB, LOWER_CASE, 0xA7BA },
+ { 0xA7BC, UPPER_CASE, 0xA7BD },
+ { 0xA7BD, LOWER_CASE, 0xA7BC },
+ { 0xA7BE, UPPER_CASE, 0xA7BF },
+ { 0xA7BF, LOWER_CASE, 0xA7BE },
+ { 0xA7C2, UPPER_CASE, 0xA7C3 },
+ { 0xA7C3, LOWER_CASE, 0xA7C2 },
+ { 0xA7C4, UPPER_CASE, 0xA794 },
+ { 0xA7C5, UPPER_CASE, 0x282 },
+ { 0xA7C6, UPPER_CASE, 0x1D8E },
+ { 0xA7C7, UPPER_CASE, 0xA7C8 },
+ { 0xA7C8, LOWER_CASE, 0xA7C7 },
+ { 0xA7C9, UPPER_CASE, 0xA7CA },
+ { 0xA7CA, LOWER_CASE, 0xA7C9 },
+ { 0xA7F5, UPPER_CASE, 0xA7F6 },
+ { 0xA7F6, LOWER_CASE, 0xA7F5 },
{ 0xAB53, LOWER_CASE, 0xA7B3 },
{ 0xAB70, LOWER_CASE, 0x13A0 },
{ 0xAB71, LOWER_CASE, 0x13A1 },
Expand Down
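For readers unfamiliar with the table touched above, here is a minimal sketch of how such sorted case records could be consulted. It assumes each row is `{ code point, case of that code point, counterpart in the other case }`, which matches pairs such as `{ 0xA7BA, UPPER_CASE, 0xA7BB }` and `{ 0xA7BB, LOWER_CASE, 0xA7BA }` in this diff. The names `CaseRecord`, `findRecord`, `toUpperSketch`, and `toLowerSketch` are hypothetical and are not the PAL's actual API; the real lookup code is not shown in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Hypothetical stand-ins for the PAL types; names and layout are assumptions.
enum CaseFlag : std::uint8_t { UPPER_CASE, LOWER_CASE };

struct CaseRecord {
    std::uint32_t codePoint;     // character being described
    CaseFlag      flag;          // case of codePoint itself
    std::uint32_t oppositeCase;  // its counterpart in the other case
};

// A few rows copied from this diff, kept sorted by codePoint.
const CaseRecord kSample[] = {
    { 0x282,  LOWER_CASE, 0xA7C5 },
    { 0x1D8E, LOWER_CASE, 0xA7C6 },
    { 0xA794, LOWER_CASE, 0xA7C4 },
    { 0xA7C4, UPPER_CASE, 0xA794 },
    { 0xA7C5, UPPER_CASE, 0x282  },
    { 0xA7C6, UPPER_CASE, 0x1D8E },
};

// Binary search for a record; returns nullptr when cp has no entry.
const CaseRecord* findRecord(std::uint32_t cp) {
    const CaseRecord* first = std::begin(kSample);
    const CaseRecord* last  = std::end(kSample);
    // Binary search is only valid if the table is sorted by code point.
    assert(std::is_sorted(first, last,
        [](const CaseRecord& a, const CaseRecord& b) { return a.codePoint < b.codePoint; }));
    const CaseRecord* it = std::lower_bound(first, last, cp,
        [](const CaseRecord& r, std::uint32_t value) { return r.codePoint < value; });
    return (it != last && it->codePoint == cp) ? it : nullptr;
}

// Map a lowercase character to its uppercase form; anything else is returned unchanged.
std::uint32_t toUpperSketch(std::uint32_t cp) {
    const CaseRecord* r = findRecord(cp);
    return (r != nullptr && r->flag == LOWER_CASE) ? r->oppositeCase : cp;
}

// Map an uppercase character to its lowercase form; anything else is returned unchanged.
std::uint32_t toLowerSketch(std::uint32_t cp) {
    const CaseRecord* r = findRecord(cp);
    return (r != nullptr && r->flag == UPPER_CASE) ? r->oppositeCase : cp;
}
```

Under these assumptions, `toUpperSketch(0x282)` yields `0xA7C5` and `toLowerSketch(0xA7C5)` yields `0x282`, which is the round trip the three new `LOWER_CASE` rows and their `UPPER_CASE` partners are meant to enable.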
@@ -1,7 +1,7 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
- <UnicodeUcdVersion>12.1</UnicodeUcdVersion>
+ <UnicodeUcdVersion>13.0</UnicodeUcdVersion>
<TargetFrameworks>$(NetCoreAppCurrent)</TargetFrameworks>
</PropertyGroup>
<ItemGroup>
Expand Down Expand Up @@ -41,7 +41,7 @@
<Link>UnicodeData\DerivedName.txt</Link>
<LogicalName>DerivedName.txt</LogicalName>
</EmbeddedResource>
- <EmbeddedResource Include="$(PkgSystem_Private_Runtime_UnicodeData)\contentFiles\any\any\emoji\$(UnicodeUcdVersion)\emoji-data.txt">
+ <EmbeddedResource Include="$(PkgSystem_Private_Runtime_UnicodeData)\contentFiles\any\any\$(UnicodeUcdVersion).0\ucd\emoji\emoji-data.txt">
<Link>UnicodeData\emoji-data.txt</Link>
<LogicalName>emoji-data.txt</LogicalName>
</EmbeddedResource>
Expand Down
Expand Up @@ -4,7 +4,7 @@
<TestRuntime>true</TestRuntime>
<IncludeRemoteExecutor>true</IncludeRemoteExecutor>
<TargetFrameworks>$(NetCoreAppCurrent)</TargetFrameworks>
- <UnicodeUcdVersion>12.1</UnicodeUcdVersion>
+ <UnicodeUcdVersion>13.0</UnicodeUcdVersion>
</PropertyGroup>
<ItemGroup>
<Compile Include="CompareInfo\CompareInfoTests.cs" />
Expand Down

Large diffs are not rendered by default.

Expand Up @@ -28,17 +28,17 @@ internal static partial class UnicodeHelpers
0xFF, 0xBF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xE7, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+0700..U+077F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x03, 0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xE7, // U+0780..U+07FF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x3F, 0xFF, 0x7F, 0xFF, 0xFF, 0xFF, 0x4F, 0xFF, 0x07, 0x00, 0x00, // U+0800..U+087F
- 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xDF, 0x3F, 0x00, 0x00, 0xF8, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+0880..U+08FF
+ 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xDF, 0xFF, 0xFF, 0x00, 0xF8, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+0880..U+08FF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+0900..U+097F
0xEF, 0x9F, 0xF9, 0xFF, 0xFF, 0xFD, 0xC5, 0xF3, 0x9F, 0x79, 0x80, 0xB0, 0xCF, 0xFF, 0xFF, 0x7F, // U+0980..U+09FF
0xEE, 0x87, 0xF9, 0xFF, 0xFF, 0xFD, 0x6D, 0xD3, 0x87, 0x39, 0x02, 0x5E, 0xC0, 0xFF, 0x7F, 0x00, // U+0A00..U+0A7F
0xEE, 0xBF, 0xFB, 0xFF, 0xFF, 0xFD, 0xED, 0xF3, 0xBF, 0x3B, 0x01, 0x00, 0xCF, 0xFF, 0x03, 0xFE, // U+0A80..U+0AFF
- 0xEE, 0x9F, 0xF9, 0xFF, 0xFF, 0xFD, 0xED, 0xF3, 0x9F, 0x39, 0xC0, 0xB0, 0xCF, 0xFF, 0xFF, 0x00, // U+0B00..U+0B7F
+ 0xEE, 0x9F, 0xF9, 0xFF, 0xFF, 0xFD, 0xED, 0xF3, 0x9F, 0x39, 0xE0, 0xB0, 0xCF, 0xFF, 0xFF, 0x00, // U+0B00..U+0B7F
0xEC, 0xC7, 0x3D, 0xD6, 0x18, 0xC7, 0xFF, 0xC3, 0xC7, 0x3D, 0x81, 0x00, 0xC0, 0xFF, 0xFF, 0x07, // U+0B80..U+0BFF
0xFF, 0xDF, 0xFD, 0xFF, 0xFF, 0xFD, 0xFF, 0xE3, 0xDF, 0x3D, 0x60, 0x07, 0xCF, 0xFF, 0x80, 0xFF, // U+0C00..U+0C7F
0xFF, 0xDF, 0xFD, 0xFF, 0xFF, 0xFD, 0xEF, 0xF3, 0xDF, 0x3D, 0x60, 0x40, 0xCF, 0xFF, 0x06, 0x00, // U+0C80..U+0CFF
- 0xEF, 0xDF, 0xFD, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xDF, 0xFD, 0xF0, 0xFF, 0xCF, 0xFF, 0xFF, 0xFF, // U+0D00..U+0D7F
- 0xEC, 0xFF, 0x7F, 0xFC, 0xFF, 0xFF, 0xFB, 0x2F, 0x7F, 0x84, 0x5F, 0xFF, 0xC0, 0xFF, 0x1C, 0x00, // U+0D80..U+0DFF
+ 0xFF, 0xDF, 0xFD, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xDF, 0xFD, 0xF0, 0xFF, 0xCF, 0xFF, 0xFF, 0xFF, // U+0D00..U+0D7F
+ 0xEE, 0xFF, 0x7F, 0xFC, 0xFF, 0xFF, 0xFB, 0x2F, 0x7F, 0x84, 0x5F, 0xFF, 0xC0, 0xFF, 0x1C, 0x00, // U+0D80..U+0DFF
0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x87, 0xFF, 0xFF, 0xFF, 0x0F, 0x00, 0x00, 0x00, 0x00, // U+0E00..U+0E7F
0xD6, 0xF7, 0xFF, 0xFF, 0xAF, 0xFF, 0xFF, 0x3F, 0x5F, 0x3F, 0xFF, 0xF3, 0x00, 0x00, 0x00, 0x00, // U+0E80..U+0EFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE, 0xFF, 0xFF, 0xFF, 0x1F, 0xFE, 0xFF, // U+0F00..U+0F7F
Expand All @@ -64,7 +64,7 @@ internal static partial class UnicodeHelpers
0xFF, 0xFF, 0xFF, 0x7F, 0xFF, 0x0F, 0xFF, 0x0F, 0xF1, 0xFF, 0xFF, 0xFF, 0xFF, 0x3F, 0x1F, 0x00, // U+1900..U+197F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0xFF, 0xFF, 0xFF, 0x03, 0xFF, 0xC7, 0xFF, 0xFF, 0xFF, 0xFF, // U+1980..U+19FF
0xFF, 0xFF, 0xFF, 0xCF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x7F, 0xFF, 0xFF, 0xFF, 0x9F, // U+1A00..U+1A7F
- 0xFF, 0x03, 0xFF, 0x03, 0xFF, 0x3F, 0xFF, 0x7F, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // U+1A80..U+1AFF
+ 0xFF, 0x03, 0xFF, 0x03, 0xFF, 0x3F, 0xFF, 0xFF, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // U+1A80..U+1AFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x1F, // U+1B00..U+1B7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0xF0, // U+1B80..U+1BFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xF8, 0xFF, 0xE3, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+1C00..U+1C7F
Expand Down Expand Up @@ -98,19 +98,19 @@ internal static partial class UnicodeHelpers
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+2A00..U+2A7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+2A80..U+2AFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xCF, 0xFF, // U+2B00..U+2B7F
- 0xFF, 0xFF, 0x3F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+2B80..U+2BFF
+ 0xFF, 0xFF, 0xBF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+2B80..U+2BFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, // U+2C00..U+2C7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0xFE, // U+2C80..U+2CFF
0xFF, 0xFF, 0xFF, 0xFF, 0xBF, 0x20, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x80, 0x01, 0x80, // U+2D00..U+2D7F
0xFF, 0xFF, 0x7F, 0x00, 0x7F, 0x7F, 0x7F, 0x7F, 0x7F, 0x7F, 0x7F, 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, // U+2D80..U+2DFF
- 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // U+2E00..U+2E7F
+ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x07, 0x00, 0x00, 0x00, 0x00, 0x00, // U+2E00..U+2E7F
0xFF, 0xFF, 0xFF, 0xFB, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0x00, // U+2E80..U+2EFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+2F00..U+2F7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x3F, 0x00, 0x00, 0x00, 0xFF, 0x0F, // U+2F80..U+2FFF
0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+3000..U+307F
0xFF, 0xFF, 0x7F, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+3080..U+30FF
0xE0, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+3100..U+317F
- 0xFF, 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x07, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0x00, 0xFF, 0xFF, // U+3180..U+31FF
+ 0xFF, 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0x00, 0xFF, 0xFF, // U+3180..U+31FF
0xFF, 0xFF, 0xFF, 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+3200..U+327F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+3280..U+32FF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+3300..U+337F
Expand Down Expand Up @@ -166,7 +166,7 @@ internal static partial class UnicodeHelpers
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+4C00..U+4C7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+4C80..U+4CFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+4D00..U+4D7F
- 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x3F, 0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+4D80..U+4DFF
+ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+4D80..U+4DFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+4E00..U+4E7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+4E80..U+4EFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+4F00..U+4F7F
Expand Down Expand Up @@ -330,7 +330,7 @@ internal static partial class UnicodeHelpers
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+9E00..U+9E7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+9E80..U+9EFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+9F00..U+9F7F
- 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, // U+9F80..U+9FFF
+ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x1F, // U+9F80..U+9FFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+A000..U+A07F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+A080..U+A0FF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+A100..U+A17F
Expand All @@ -346,14 +346,14 @@ internal static partial class UnicodeHelpers
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+A600..U+A67F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, // U+A680..U+A6FF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+A700..U+A77F
- 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x7C, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0xFF, // U+A780..U+A7FF
- 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0xFF, 0x03, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, // U+A800..U+A87F
+ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFC, 0x07, 0x00, 0x00, 0x00, 0x00, 0xE0, 0xFF, // U+A780..U+A7FF
+ 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x1F, 0xFF, 0x03, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, // U+A800..U+A87F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x3F, 0xC0, 0xFF, 0x03, 0xFF, 0xFF, 0xFF, 0xFF, // U+A880..U+A8FF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0x80, 0xFF, 0xFF, 0xFF, 0x1F, // U+A900..U+A97F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xBF, 0xFF, 0xC3, 0xFF, 0xFF, 0xFF, 0x7F, // U+A980..U+A9FF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x7F, 0x00, 0xFF, 0x3F, 0xFF, 0xF3, 0xFF, 0xFF, 0xFF, 0xFF, // U+AA00..U+AA7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x07, 0x00, 0x00, 0xF8, 0xFF, 0xFF, 0x7F, 0x00, // U+AA80..U+AAFF
- 0x7E, 0x7E, 0x7E, 0x00, 0x7F, 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0xFF, 0xFF, // U+AB00..U+AB7F
+ 0x7E, 0x7E, 0x7E, 0x00, 0x7F, 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0xFF, 0xFF, // U+AB00..U+AB7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x3F, 0xFF, 0x03, // U+AB80..U+ABFF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+AC00..U+AC7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // U+AC80..U+ACFF
Expand Down
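The rows above can be read as a bitmap of "defined" code points. The following is a minimal sketch of that reading, assuming each row covers 128 code points (as the trailing `// U+xxxx..U+yyyy` comments suggest), one bit per code point, least significant bit first within each byte. The bit order is an inference from the data (it makes the unassigned U+08B5 come out as undefined), not something this diff states. The names `kRowStart`, `kRow`, and `isDefinedInRow` are hypothetical; the real accessor lives in `UnicodeHelpers` and is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// First code point covered by the sample row.
const std::uint32_t kRowStart = 0x0880;

// Row for U+0880..U+08FF as it appears after this change (Unicode 13.0).
const std::uint8_t kRow[16] = {
    0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xDF, 0xFF,
    0xFF, 0x00, 0xF8, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
};

// Returns true if cp falls in the sample row and its bit is set.
// Assumption: bit (cp % 8) of byte (cp / 8), counted from the least significant bit.
bool isDefinedInRow(std::uint32_t cp) {
    if (cp < kRowStart || cp >= kRowStart + 128) {
        return false;  // outside the single row this sketch carries
    }
    const std::uint32_t offset = cp - kRowStart;
    const std::size_t byteIndex = offset >> 3;
    assert(byteIndex < sizeof(kRow));
    return ((kRow[byteIndex] >> (offset & 7)) & 1u) != 0;
}
```

With this reading, the change to the eighth and ninth bytes of the row (`0x3F, 0x00` becoming `0xFF, 0xFF`) marks U+08BE through U+08C7 as defined, while U+08B5 stays undefined because bit 5 of `0xDF` is clear.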
Expand Up @@ -6,7 +6,7 @@
<SolutionDir Condition="$(SolutionDir) == '' Or $(SolutionDir) == '*Undefined*'">..\..\</SolutionDir>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
<TargetFrameworks>$(NetCoreAppCurrent);$(NetFrameworkCurrent)</TargetFrameworks>
- <UnicodeUcdVersion>12.1</UnicodeUcdVersion>
+ <UnicodeUcdVersion>13.0</UnicodeUcdVersion>
</PropertyGroup>
<ItemGroup>
<CodeAnalysisDependentAssemblyPaths Condition=" '$(VS100COMNTOOLS)' != '' " Include="$(VS100COMNTOOLS)..\IDE\PrivateAssemblies">
Expand Down
@@ -1,7 +1,7 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
- <TargetFrameworks>netcoreapp3.0</TargetFrameworks>
+ <TargetFrameworks>netcoreapp3.1</TargetFrameworks>
<EnableDefaultCompileItems>false</EnableDefaultCompileItems>
</PropertyGroup>
<PropertyGroup>
Expand Down
@@ -1,7 +1,7 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
- <TargetFrameworks>netcoreapp3.0</TargetFrameworks>
+ <TargetFrameworks>netcoreapp3.1</TargetFrameworks>
<EnableDefaultCompileItems>false</EnableDefaultCompileItems>
</PropertyGroup>
<PropertyGroup>
Expand Down
Expand Up @@ -4,13 +4,13 @@ This folder contains tools which allow updating the Unicode data within the __Sy

### Current implementation

- The current version of the Unicode data checked in is __12.1.0__. The archived files can be found at https://unicode.org/Public/12.1.0/.
+ The current version of the Unicode data checked in is __13.0.0__. The archived files can be found at https://unicode.org/Public/13.0.0/.

### Updating the implementation

Updating the implementation consists of three steps: checking in a new version of the Unicode data files (into the [runtime-assets](https://github.com/dotnet/runtime-assets) repo), generating the shared files used by the runtime and the unit tests, and pointing the unit test files to the correct version of the data files.

- As a prerequisite for updating the tools, you will need the _dotnet_ tool (version 3.0 or above) available from your local command prompt.
+ As a prerequisite for updating the tools, you will need the _dotnet_ tool (version 3.1 or above) available from your local command prompt.

1. Update the [runtime-assets](https://github.com/dotnet/runtime-assets) repo with the new Unicode data files. Instructions for generating new packages are listed at the repo root. Preserve the directory structure already present at https://github.com/dotnet/runtime-assets/tree/master/src/System.Private.Runtime.UnicodeData when making the change.

Expand All @@ -19,13 +19,13 @@ As a prerequisite for updating the tools, you will need the _dotnet_ tool (versi
3. Open a command prompt and navigate to the __src/libraries/System.Text.Encodings.Web/tools/GenDefinedCharList__ directory, then run the following command, replacing the first parameter with the path to the _UnicodeData.txt_ file you downloaded in the previous step. This command will update the "defined characters" bitmap within the runtime folder. The test project also consumes the file from the _src_ folder, so running this command will update both the runtime and the test project.

```txt
- dotnet run -- "path_to_UnicodeData.txt" ../../src/System/Text/Unicode/UnicodeHelpers.generated.cs
+ dotnet run --framework netcoreapp3.1 -- "path_to_UnicodeData.txt" ../../src/System/Text/Unicode/UnicodeHelpers.generated.cs
```

4. Open a command prompt and navigate to the __src/libraries/System.Text.Encodings.Web/tools/GenUnicodeRanges__ directory, then run the following command, replacing the first parameter with the path to the _Blocks.txt_ file you downloaded earlier. This command will update the `UnicodeRanges` type in the runtime folder and update the unit tests to exercise the new APIs.

```txt
- dotnet run -- "path_to_Blocks.txt" ../../src/System/Text/Unicode/UnicodeRanges.generated.cs ../../tests/UnicodeRangesTests.generated.cs
+ dotnet run --framework netcoreapp3.1 -- "path_to_Blocks.txt" ../../src/System/Text/Unicode/UnicodeRanges.generated.cs ../../tests/UnicodeRangesTests.generated.cs
```

5. Update the __ref__ APIs to reflect any new `UnicodeRanges` static properties which were added in the previous step, otherwise the unit test project will not be able to reference them. See https://github.com/dotnet/runtime/blob/master/docs/coding-guidelines/updating-ref-source.md for instructions on how to update the reference assemblies.
Expand Down